[jira] [Comment Edited] (YARN-10761) Add more event type to RM Dispatcher event metrics.

2021-05-14 Thread Qi Zhu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10761?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17344483#comment-17344483 ] Qi Zhu edited comment on YARN-10761 at 5/14/21, 9:19 AM: - Thanks [~snemeth] for

[jira] [Commented] (YARN-9698) [Umbrella] Tools to help migration from Fair Scheduler to Capacity Scheduler

2021-05-14 Thread Qi Zhu (Jira)
[ https://issues.apache.org/jira/browse/YARN-9698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17344630#comment-17344630 ] Qi Zhu commented on YARN-9698: -- Thanks [~pbacsko] for the reminder. I agree with you that we can create a new

[jira] [Created] (YARN-10764) Add rm dispatcher event metrics in SLS

2021-05-08 Thread Qi Zhu (Jira)
Qi Zhu created YARN-10764: - Summary: Add rm dispatcher event metrics in SLS Key: YARN-10764 URL: https://issues.apache.org/jira/browse/YARN-10764 Project: Hadoop YARN Issue Type: Sub-task

[jira] [Updated] (YARN-10764) Add rm dispatcher event metrics in SLS

2021-05-08 Thread Qi Zhu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10764?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Qi Zhu updated YARN-10764: -- Description: We should use SLS to confirm whether we get a performance improvement in event consumption time, etc. >

[jira] [Commented] (YARN-10761) Add more event type to RM Dispatcher event metrics.

2021-05-10 Thread Qi Zhu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10761?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17341986#comment-17341986 ] Qi Zhu commented on YARN-10761: --- Thanks [~gandras] for your review.  > Add more event type to RM

[jira] [Commented] (YARN-10781) The Thread of the NM aggregate log is exhausted and no other Application can aggregate the log

2021-05-25 Thread Qi Zhu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17351456#comment-17351456 ] Qi Zhu commented on YARN-10781: --- [~zhangxiping] If you enabled rolling log aggregation for long running

[jira] [Commented] (YARN-10770) container-executor permission is wrong in SecureContainer.md

2021-05-25 Thread Qi Zhu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10770?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17350807#comment-17350807 ] Qi Zhu commented on YARN-10770: --- Thanks [~aajisaka] for the good finding, and [~sahuja] for the patch. The patch LGTM

[jira] [Created] (YARN-10785) Yarn NodeManager aux-services should support trim.

2021-05-24 Thread Qi Zhu (Jira)
Qi Zhu created YARN-10785: - Summary: Yarn NodeManager aux-services should support trim. Key: YARN-10785 URL: https://issues.apache.org/jira/browse/YARN-10785 Project: Hadoop YARN Issue Type: Bug

[jira] [Commented] (YARN-10786) Federation:We can't access the AM page while using federation

2021-05-25 Thread Qi Zhu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17350861#comment-17350861 ] Qi Zhu commented on YARN-10786: --- Thanks [~Song Jiacheng] for reporting this. Can you add some images to

[jira] [Commented] (YARN-10786) Federation:We can't access the AM page while using federation

2021-05-25 Thread Qi Zhu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17350939#comment-17350939 ] Qi Zhu commented on YARN-10786: --- Thanks [~Song Jiacheng] for contribution. The patch LGTM. +1 cc 

[jira] [Commented] (YARN-10795) Improve Capacity Scheduler reinitialisation performance

2021-05-31 Thread Qi Zhu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17354809#comment-17354809 ] Qi Zhu commented on YARN-10795: --- Thanks [~gandras] for this work. It will be very helpful to clusters with

[jira] [Commented] (YARN-10781) The Thread of the NM aggregate log is exhausted and no other Application can aggregate the log

2021-05-24 Thread Qi Zhu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17350255#comment-17350255 ] Qi Zhu commented on YARN-10781: --- [~zhangxiping] It only initializes the app and creates the thread pool when the AM

[jira] [Comment Edited] (YARN-10781) The Thread of the NM aggregate log is exhausted and no other Application can aggregate the log

2021-05-24 Thread Qi Zhu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17350255#comment-17350255 ] Qi Zhu edited comment on YARN-10781 at 5/24/21, 6:02 AM: - [~zhangxiping] It only

[jira] [Commented] (YARN-10783) Allow definition of auto queue template properties in root

2021-05-24 Thread Qi Zhu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17350262#comment-17350262 ] Qi Zhu commented on YARN-10783: --- Thanks [~gandras] for this. The patch LGTM +1. > Allow definition of

[jira] [Commented] (YARN-10796) Capacity Scheduler: dynamic queue cannot scale out properly if its capacity is 0%

2021-06-03 Thread Qi Zhu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17356214#comment-17356214 ] Qi Zhu commented on YARN-10796: --- Thanks [~pbacsko], the latest patch LGTM +1. And I agree with you the

[jira] [Commented] (YARN-10522) Document for Flexible Auto Queue Creation in Capacity Scheduler.

2021-06-03 Thread Qi Zhu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17356213#comment-17356213 ] Qi Zhu commented on YARN-10522: --- Thanks [~bteke] for taking this. I assigned it to you. > Document for

[jira] [Assigned] (YARN-10522) Document for Flexible Auto Queue Creation in Capacity Scheduler.

2021-06-03 Thread Qi Zhu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10522?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Qi Zhu reassigned YARN-10522: - Assignee: (was: Ankit Kumar) > Document for Flexible Auto Queue Creation in Capacity Scheduler. >

[jira] [Assigned] (YARN-10522) Document for Flexible Auto Queue Creation in Capacity Scheduler.

2021-06-03 Thread Qi Zhu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10522?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Qi Zhu reassigned YARN-10522: - Assignee: Benjamin Teke > Document for Flexible Auto Queue Creation in Capacity Scheduler. >

[jira] [Commented] (YARN-10807) Parents node labels are incorrectly added to child queues in weight mode

2021-06-07 Thread Qi Zhu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10807?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17358486#comment-17358486 ] Qi Zhu commented on YARN-10807: --- Thanks [~bteke] for this work. If we can skip the not existed also in sum

[jira] [Comment Edited] (YARN-10807) Parents node labels are incorrectly added to child queues in weight mode

2021-06-07 Thread Qi Zhu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10807?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17358486#comment-17358486 ] Qi Zhu edited comment on YARN-10807 at 6/7/21, 9:54 AM: Thanks [~bteke] for this

[jira] [Commented] (YARN-10789) RM HA startup can fail due to race conditions in ZKConfigurationStore

2021-06-03 Thread Qi Zhu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17356504#comment-17356504 ] Qi Zhu commented on YARN-10789: --- Thanks [~tarunparimi] for this work. The latest patch LGTM. +1 > RM HA

[jira] [Commented] (YARN-10771) Add cluster metric for size of SchedulerEventQueue and RMEventQueue

2021-05-24 Thread Qi Zhu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17350482#comment-17350482 ] Qi Zhu commented on YARN-10771: --- Thanks [~chaosju] for contribution and [~pbacsko] for review. The test is

[jira] [Commented] (YARN-10846) Add dispatcher metrics to NM

2021-07-06 Thread Qi Zhu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17376168#comment-17376168 ] Qi Zhu commented on YARN-10846: --- Thanks [~chaosju] for this patch. We'd better add a unit test for

[jira] [Commented] (YARN-10657) We should make max application per queue to support node label.

2021-07-08 Thread Qi Zhu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17377715#comment-17377715 ] Qi Zhu commented on YARN-10657: --- Thanks [~gandras] for digging deep into this. I agree with you that it's hard to have

[jira] [Commented] (YARN-10701) The yarn.resource-types should support multi types without trimmed.

2021-05-19 Thread Qi Zhu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10701?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17347679#comment-17347679 ] Qi Zhu commented on YARN-10701: --- The test is not related to this Jira. Committed to branch-3.3. Thanks

[jira] [Commented] (YARN-6523) Optimize system credentials sent in node heartbeat responses

2021-04-25 Thread Qi Zhu (Jira)
[ https://issues.apache.org/jira/browse/YARN-6523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17331735#comment-17331735 ] Qi Zhu commented on YARN-6523: -- [~maniraj...@gmail.com] [~Naganarasimha] [~jlowe] Can we backport this to

[jira] [Commented] (YARN-10637) We should support fs to cs support for auto refresh queues when conf changed, after YARN-10623 finished.

2021-04-26 Thread Qi Zhu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17331974#comment-17331974 ] Qi Zhu commented on YARN-10637: --- Thanks [~gandras] for confirming. [~pbacsko] do you have any comments?

[jira] [Commented] (YARN-9615) Add dispatcher metrics to RM

2021-04-26 Thread Qi Zhu (Jira)
[ https://issues.apache.org/jira/browse/YARN-9615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17331981#comment-17331981 ] Qi Zhu commented on YARN-9615: -- Thanks [~chaosju] for the concern. I will improve it later. And I will also

[jira] [Commented] (YARN-10739) GenericEventHandler.printEventQueueDetails cause RM recovery cost too much time

2021-04-26 Thread Qi Zhu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17331984#comment-17331984 ] Qi Zhu commented on YARN-10739: --- Thanks [~gandras] for the review. Good suggestion, I have updated it in the latest

[jira] [Updated] (YARN-10739) GenericEventHandler.printEventQueueDetails cause RM recovery cost too much time

2021-04-26 Thread Qi Zhu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10739?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Qi Zhu updated YARN-10739: -- Attachment: YARN-10739.004.patch > GenericEventHandler.printEventQueueDetails cause RM recovery cost too much

[jira] [Commented] (YARN-10739) GenericEventHandler.printEventQueueDetails cause RM recovery cost too much time

2021-04-24 Thread Qi Zhu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17331235#comment-17331235 ] Qi Zhu commented on YARN-10739: --- [~ebadger] [~pbacsko] [~gandras] Could you take a look at this when you are

[jira] [Updated] (YARN-10739) GenericEventHandler.printEventQueueDetails cause RM recovery cost too much time

2021-04-26 Thread Qi Zhu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10739?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Qi Zhu updated YARN-10739: -- Attachment: YARN-10739.005.patch > GenericEventHandler.printEventQueueDetails cause RM recovery cost too much

[jira] [Commented] (YARN-10739) GenericEventHandler.printEventQueueDetails cause RM recovery cost too much time

2021-04-26 Thread Qi Zhu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17332433#comment-17332433 ] Qi Zhu commented on YARN-10739: --- Thanks a lot [~pbacsko] for the review. Very valuable suggestions, updated in

[jira] [Updated] (YARN-10739) GenericEventHandler.printEventQueueDetails cause RM recovery cost too much time

2021-04-26 Thread Qi Zhu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10739?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Qi Zhu updated YARN-10739: -- Attachment: YARN-10739.006.patch > GenericEventHandler.printEventQueueDetails cause RM recovery cost too much

[jira] [Updated] (YARN-10754) RM Renew Delegation token thread should timeout and retry should also consider app new submitted.

2021-04-26 Thread Qi Zhu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10754?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Qi Zhu updated YARN-10754: -- Description: As  Delegation token renewer thread in RM (DelegationTokenRenewer.java) renews HDFS tokens

[jira] [Created] (YARN-10754) RM Renew Delegation token thread should timeout and retry should also consider app new submitted.

2021-04-26 Thread Qi Zhu (Jira)
Qi Zhu created YARN-10754: - Summary: RM Renew Delegation token thread should timeout and retry should also consider app new submitted. Key: YARN-10754 URL: https://issues.apache.org/jira/browse/YARN-10754

[jira] [Commented] (YARN-10754) RM Renew Delegation token thread should timeout and retry should also consider app new submitted.

2021-04-26 Thread Qi Zhu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17332904#comment-17332904 ] Qi Zhu commented on YARN-10754: --- cc [~ebadger] [~epayne]   [~Jim_Brennan]  [~snemeth] [~pbacsko] [~gandras]

[jira] [Updated] (YARN-10754) RM Renew Delegation token thread should timeout and retry should also consider app new submitted.

2021-04-26 Thread Qi Zhu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10754?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Qi Zhu updated YARN-10754: -- Description: As  YARN-9768 described: Delegation token renewer thread in RM (DelegationTokenRenewer.java)

[jira] [Updated] (YARN-10754) RM Renew Delegation token thread should timeout and retry should also consider app new submitted.

2021-04-26 Thread Qi Zhu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10754?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Qi Zhu updated YARN-10754: -- Attachment: image-2021-04-27-11-38-29-162.png > RM Renew Delegation token thread should timeout and retry

[jira] [Commented] (YARN-10722) Improvement to DelegationTokenRenewer in RM

2021-04-26 Thread Qi Zhu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17332907#comment-17332907 ] Qi Zhu commented on YARN-10722: --- [~fengnanli] You can improve this with YARN-9768, but it only considers the

[jira] [Commented] (YARN-10707) Support custom resources in ResourceUtilization, and update Node GPU Utilization to use.

2021-04-27 Thread Qi Zhu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17332972#comment-17332972 ] Qi Zhu commented on YARN-10707: --- Thanks [~ebadger] for patient review. We could add all plugin

[jira] [Comment Edited] (YARN-10707) Support custom resources in ResourceUtilization, and update Node GPU Utilization to use.

2021-04-27 Thread Qi Zhu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17332972#comment-17332972 ] Qi Zhu edited comment on YARN-10707 at 4/27/21, 6:25 AM: - Thanks [~ebadger] for

[jira] [Updated] (YARN-10707) Support custom resources in ResourceUtilization, and update Node GPU Utilization to use.

2021-04-27 Thread Qi Zhu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10707?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Qi Zhu updated YARN-10707: -- Attachment: YARN-10707.008.patch > Support custom resources in ResourceUtilization, and update Node GPU >

[jira] [Updated] (YARN-10707) Support custom resources in ResourceUtilization, and update Node GPU Utilization to use.

2021-04-27 Thread Qi Zhu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10707?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Qi Zhu updated YARN-10707: -- Attachment: YARN-10707.009.patch > Support custom resources in ResourceUtilization, and update Node GPU >

[jira] [Commented] (YARN-10707) Support custom resources in ResourceUtilization, and update Node GPU Utilization to use.

2021-04-27 Thread Qi Zhu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17333064#comment-17333064 ] Qi Zhu commented on YARN-10707: --- Fixed checkstyle in latest patch. :D > Support custom resources in

[jira] [Commented] (YARN-10739) GenericEventHandler.printEventQueueDetails causes RM recovery to take too much time

2021-04-27 Thread Qi Zhu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17333194#comment-17333194 ] Qi Zhu commented on YARN-10739: --- Thanks [~pbacsko] for commit. >

[jira] [Commented] (YARN-10743) Add a policy for not aggregating for containers which are killed because exceeding container log size limit.

2021-04-23 Thread Qi Zhu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17330871#comment-17330871 ] Qi Zhu commented on YARN-10743: --- Thanks [~Jim_Brennan] for the review. Fixed checkstyle and added documentation in

[jira] [Updated] (YARN-10743) Add a policy for not aggregating for containers which are killed because exceeding container log size limit.

2021-04-23 Thread Qi Zhu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10743?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Qi Zhu updated YARN-10743: -- Attachment: YARN-10743.003.patch > Add a policy for not aggregating for containers which are killed because >

[jira] [Created] (YARN-10751) Add document for yarn log aggregation policies.

2021-04-23 Thread Qi Zhu (Jira)
Qi Zhu created YARN-10751: - Summary: Add document for yarn log aggregation policies. Key: YARN-10751 URL: https://issues.apache.org/jira/browse/YARN-10751 Project: Hadoop YARN Issue Type:

[jira] [Commented] (YARN-10743) Add a policy for not aggregating for containers which are killed because exceeding container log size limit.

2021-04-23 Thread Qi Zhu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17331141#comment-17331141 ] Qi Zhu commented on YARN-10743: --- Thanks [~Jim_Brennan] for the commit. I created a follow-up Jira, YARN-10751

[jira] [Updated] (YARN-10751) Add document for yarn log aggregation policies.

2021-04-23 Thread Qi Zhu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10751?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Qi Zhu updated YARN-10751: -- Description: As discussed in 

[jira] [Updated] (YARN-9927) RM multi-thread event processing mechanism

2021-04-28 Thread Qi Zhu (Jira)
[ https://issues.apache.org/jira/browse/YARN-9927?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Qi Zhu updated YARN-9927: - Attachment: YARN-9927.005.patch > RM multi-thread event processing mechanism >

[jira] [Comment Edited] (YARN-9927) RM multi-thread event processing mechanism

2021-04-28 Thread Qi Zhu (Jira)
[ https://issues.apache.org/jira/browse/YARN-9927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17334565#comment-17334565 ] Qi Zhu edited comment on YARN-9927 at 4/28/21, 8:29 AM: [~ebadger]  [~pbacsko] 

[jira] [Commented] (YARN-10707) Support custom resources in ResourceUtilization, and update Node GPU Utilization to use.

2021-04-28 Thread Qi Zhu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17334606#comment-17334606 ] Qi Zhu commented on YARN-10707: --- Fixed java doc in latest patch. > Support custom resources in

[jira] [Updated] (YARN-10707) Support custom resources in ResourceUtilization, and update Node GPU Utilization to use.

2021-04-28 Thread Qi Zhu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10707?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Qi Zhu updated YARN-10707: -- Attachment: YARN-10707.011.patch > Support custom resources in ResourceUtilization, and update Node GPU >

[jira] [Commented] (YARN-9927) RM multi-thread event processing mechanism

2021-04-28 Thread Qi Zhu (Jira)
[ https://issues.apache.org/jira/browse/YARN-9927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17334565#comment-17334565 ] Qi Zhu commented on YARN-9927: -- [~ebadger] [~gandras] [~epayne]  Updated a patch to improve: 1. The event

[jira] [Commented] (YARN-10707) Support custom resources in ResourceUtilization, and update Node GPU Utilization to use.

2021-04-27 Thread Qi Zhu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17334430#comment-17334430 ] Qi Zhu commented on YARN-10707: --- Thanks [~ebadger] very good suggestions. Updated in latest patch. I also

[jira] [Comment Edited] (YARN-7713) Add parallel copying of directories into FSDownload

2021-04-27 Thread Qi Zhu (Jira)
[ https://issues.apache.org/jira/browse/YARN-7713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17334461#comment-17334461 ] Qi Zhu edited comment on YARN-7713 at 4/28/21, 5:54 AM: [~ChrisKarampeazis] I

[jira] [Commented] (YARN-7713) Add parallel copying of directories into FSDownload

2021-04-27 Thread Qi Zhu (Jira)
[ https://issues.apache.org/jira/browse/YARN-7713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17334461#comment-17334461 ] Qi Zhu commented on YARN-7713: -- [~ChrisKarampeazis] I agree with [~ebadger] that we'd better split the list

[jira] [Updated] (YARN-10707) Support custom resources in ResourceUtilization, and update Node GPU Utilization to use.

2021-04-27 Thread Qi Zhu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10707?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Qi Zhu updated YARN-10707: -- Attachment: YARN-10707.010.patch > Support custom resources in ResourceUtilization, and update Node GPU >

[jira] [Comment Edited] (YARN-10738) When multi thread scheduling with multi node, we should shuffle with a gap to prevent hot accessing nodes.

2021-04-28 Thread Qi Zhu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17335104#comment-17335104 ] Qi Zhu edited comment on YARN-10738 at 4/29/21, 2:46 AM: - Thanks [~Jim_Brennan] 

[jira] [Comment Edited] (YARN-10738) When multi thread scheduling with multi node, we should shuffle with a gap to prevent hot accessing nodes.

2021-04-28 Thread Qi Zhu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17335104#comment-17335104 ] Qi Zhu edited comment on YARN-10738 at 4/29/21, 2:48 AM: - Thanks [~Jim_Brennan] 

[jira] [Commented] (YARN-10738) When multi thread scheduling with multi node, we should shuffle with a gap to prevent hot accessing nodes.

2021-04-28 Thread Qi Zhu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17335104#comment-17335104 ] Qi Zhu commented on YARN-10738: --- Thanks [~Jim_Brennan] for review and very patient investigation. The

[jira] [Comment Edited] (YARN-10738) When multi thread scheduling with multi node, we should shuffle with a gap to prevent hot accessing nodes.

2021-04-28 Thread Qi Zhu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17335104#comment-17335104 ] Qi Zhu edited comment on YARN-10738 at 4/29/21, 2:45 AM: - Thanks [~Jim_Brennan] 

[jira] [Commented] (YARN-9443) Fast RM Failover using Ratis (Raft protocol)

2021-04-29 Thread Qi Zhu (Jira)
[ https://issues.apache.org/jira/browse/YARN-9443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17335205#comment-17335205 ] Qi Zhu commented on YARN-9443: -- [~prabhujoseph] [~ztang] [~ebadger] [~epayne] Is this going on, now the

[jira] [Commented] (YARN-7713) Add parallel copying of directories into FSDownload

2021-04-25 Thread Qi Zhu (Jira)
[ https://issues.apache.org/jira/browse/YARN-7713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17331460#comment-17331460 ] Qi Zhu commented on YARN-7713: -- [~ChrisKarampeazis] Thanks for your work here.  cc [~ebadger] [~epayne]

[jira] [Comment Edited] (YARN-10637) We should support fs to cs support for auto refresh queues when conf changed, after YARN-10623 finished.

2021-04-25 Thread Qi Zhu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17316917#comment-17316917 ] Qi Zhu edited comment on YARN-10637 at 4/25/21, 9:35 AM: - cc  [~gandras] Could

[jira] [Commented] (YARN-10524) Support multi resource type based weight mode in CS.

2021-05-04 Thread Qi Zhu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17339376#comment-17339376 ] Qi Zhu commented on YARN-10524: --- Thanks [~gandras] for concern.  I think YARN-9936 will cover all this use

[jira] [Comment Edited] (YARN-9927) RM multi-thread event processing mechanism

2021-05-04 Thread Qi Zhu (Jira)
[ https://issues.apache.org/jira/browse/YARN-9927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17339383#comment-17339383 ] Qi Zhu edited comment on YARN-9927 at 5/5/21, 3:05 AM: --- Great review and

[jira] [Commented] (YARN-10517) QueueMetrics has incorrect Allocated Resource when labelled partitions updated

2021-05-04 Thread Qi Zhu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17339378#comment-17339378 ] Qi Zhu commented on YARN-10517: --- Thanks [~zhanqi.cai] for report. Could you apply the latest patch to your

[jira] [Commented] (YARN-9927) RM multi-thread event processing mechanism

2021-05-04 Thread Qi Zhu (Jira)
[ https://issues.apache.org/jira/browse/YARN-9927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17339383#comment-17339383 ] Qi Zhu commented on YARN-9927: -- Great review and investigation! Thanks very much  [~ebadger] [~ebadger] . I

[jira] [Commented] (YARN-10592) Support QueueCapacities to use vector based multi resource types, and update absolute related to use first.

2021-05-04 Thread Qi Zhu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10592?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17339375#comment-17339375 ] Qi Zhu commented on YARN-10592: --- [~gandras] I think  YARN-9936 would cover this. > Support

[jira] [Commented] (YARN-10707) Support custom resources in ResourceUtilization, and update Node GPU Utilization to use.

2021-04-28 Thread Qi Zhu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17334756#comment-17334756 ] Qi Zhu commented on YARN-10707: --- The failed timeout test is not related; it passed locally. > Support custom

[jira] [Commented] (YARN-10532) Capacity Scheduler Auto Queue Creation: Allow auto delete queue when queue is not being used

2021-02-08 Thread Qi Zhu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17280899#comment-17280899 ] Qi Zhu commented on YARN-10532: --- [~gandras] I have updated it in testAutoCreateQueueAfterRemoval in latest

[jira] [Comment Edited] (YARN-10609) Update the document for YARN-10531(Be able to disable user limit factor for CapacityScheduler Leaf Queue)

2021-02-08 Thread Qi Zhu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17277137#comment-17277137 ] Qi Zhu edited comment on YARN-10609 at 2/8/21, 10:44 AM: - cc [~wangda] [~snemeth] 

[jira] [Commented] (YARN-9927) RM multi-thread event processing mechanism

2021-02-08 Thread Qi Zhu (Jira)
[ https://issues.apache.org/jira/browse/YARN-9927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17280942#comment-17280942 ] Qi Zhu commented on YARN-9927: -- [~hcarrot] [~leftnoteasy] [~adam.antal] [~epayne] Is it going on? I think it

[jira] [Commented] (YARN-10532) Capacity Scheduler Auto Queue Creation: Allow auto delete queue when queue is not being used

2021-02-08 Thread Qi Zhu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17280855#comment-17280855 ] Qi Zhu commented on YARN-10532: --- Thanks a lot [~gandras] for your patient review. You are right, after the

[jira] [Commented] (YARN-10513) CS Flexible Auto Queue Creation RM UIv2 modifications

2021-02-08 Thread Qi Zhu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17280900#comment-17280900 ] Qi Zhu commented on YARN-10513: --- Thanks  [~gandras] for your contribution. I agree with you, this should

[jira] [Comment Edited] (YARN-9927) RM multi-thread event processing mechanism

2021-02-08 Thread Qi Zhu (Jira)
[ https://issues.apache.org/jira/browse/YARN-9927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17280942#comment-17280942 ] Qi Zhu edited comment on YARN-9927 at 2/8/21, 10:49 AM: [~hcarrot] [~leftnoteasy]

[jira] [Updated] (YARN-10532) Capacity Scheduler Auto Queue Creation: Allow auto delete queue when queue is not being used

2021-02-08 Thread Qi Zhu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10532?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Qi Zhu updated YARN-10532: -- Attachment: YARN-10532.020.patch > Capacity Scheduler Auto Queue Creation: Allow auto delete queue when queue

[jira] [Comment Edited] (YARN-10609) Update the document for YARN-10531(Be able to disable user limit factor for CapacityScheduler Leaf Queue)

2021-02-08 Thread Qi Zhu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17277137#comment-17277137 ] Qi Zhu edited comment on YARN-10609 at 2/8/21, 10:43 AM: - cc [~wangda] [~snemeth] 

[jira] [Commented] (YARN-10593) Fix incorrect string comparison in GpuDiscoverer

2021-02-08 Thread Qi Zhu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17281042#comment-17281042 ] Qi Zhu commented on YARN-10593: --- Thanks [~pbacsko] for contribution. The patch LGTM. > Fix incorrect

[jira] [Commented] (YARN-10609) Update the document for YARN-10531(Be able to disable user limit factor for CapacityScheduler Leaf Queue)

2021-02-08 Thread Qi Zhu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17281039#comment-17281039 ] Qi Zhu commented on YARN-10609: --- Thanks  a lot [~bteke] for review. Your description is more reasonable

[jira] [Comment Edited] (YARN-10532) Capacity Scheduler Auto Queue Creation: Allow auto delete queue when queue is not being used

2021-02-08 Thread Qi Zhu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17280899#comment-17280899 ] Qi Zhu edited comment on YARN-10532 at 2/8/21, 1:11 PM: [~gandras] I have

[jira] [Commented] (YARN-10532) Capacity Scheduler Auto Queue Creation: Allow auto delete queue when queue is not being used

2021-02-04 Thread Qi Zhu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17279370#comment-17279370 ] Qi Zhu commented on YARN-10532: --- Thanks for [~gandras] valid suggestions! I have updated a new patch.  1.

[jira] [Commented] (YARN-9650) Set thread names for CapacityScheduler AsyncScheduleThread

2021-02-04 Thread Qi Zhu (Jira)
[ https://issues.apache.org/jira/browse/YARN-9650?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17279386#comment-17279386 ] Qi Zhu commented on YARN-9650: -- Thanks for [~amoghdesai] contribution. LGTM +1.   > Set thread names for

[jira] [Commented] (YARN-10589) Improve logic of multi-node allocation

2021-02-04 Thread Qi Zhu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17279389#comment-17279389 ] Qi Zhu commented on YARN-10589: --- Thanks for [~tanu.ajmera] new patch. The new code LGTM. Should also fix

[jira] [Updated] (YARN-10532) Capacity Scheduler Auto Queue Creation: Allow auto delete queue when queue is not being used

2021-02-04 Thread Qi Zhu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10532?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Qi Zhu updated YARN-10532: -- Attachment: YARN-10532.015.patch > Capacity Scheduler Auto Queue Creation: Allow auto delete queue when queue

[jira] [Commented] (YARN-10615) Fix Auto Queue Creation hierarchy construction to use queue path instead of short queue name

2021-02-04 Thread Qi Zhu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17279380#comment-17279380 ] Qi Zhu commented on YARN-10615: --- Thanks for [~gandras] patch. I agree with that we should use full path to

[jira] [Updated] (YARN-10610) Add queuePath to restful api for CapacityScheduler consistent with FairScheduler queuePath.

2021-02-04 Thread Qi Zhu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10610?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Qi Zhu updated YARN-10610: -- Attachment: YARN-10610.003.patch > Add queuePath to restful api for CapacityScheduler consistent with >

[jira] [Commented] (YARN-10611) Fix that shaded should be used for google guava imports in YARN-10352.

2021-02-04 Thread Qi Zhu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17279372#comment-17279372 ] Qi Zhu commented on YARN-10611: --- Thanks for [~ebadger]  review and commit. > Fix that shaded should be

[jira] [Comment Edited] (YARN-10610) Add queuePath to restful api for CapacityScheduler consistent with FairScheduler queuePath.

2021-02-04 Thread Qi Zhu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17278079#comment-17278079 ] Qi Zhu edited comment on YARN-10610 at 2/5/21, 6:25 AM:  [~snemeth]  [~shuzirra]  

[jira] [Commented] (YARN-10610) Add queuePath to restful api for CapacityScheduler consistent with FairScheduler queuePath.

2021-02-04 Thread Qi Zhu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17279402#comment-17279402 ] Qi Zhu commented on YARN-10610: --- Thanks for [~ztang] review. I have updated a patch to fix checkstyle. >

[jira] [Updated] (YARN-10532) Capacity Scheduler Auto Queue Creation: Allow auto delete queue when queue is not being used

2021-02-05 Thread Qi Zhu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10532?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Qi Zhu updated YARN-10532: -- Attachment: YARN-10532.016.patch > Capacity Scheduler Auto Queue Creation: Allow auto delete queue when queue

[jira] [Commented] (YARN-10610) Add queuePath to restful api for CapacityScheduler consistent with FairScheduler queuePath.

2021-02-05 Thread Qi Zhu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17279538#comment-17279538 ] Qi Zhu commented on YARN-10610: --- Thanks [~gandras] for review. Now there are non-bindings, I think.

[jira] [Updated] (YARN-10178) Global Scheduler async thread crash caused by 'Comparison method violates its general contract'

2021-02-05 Thread Qi Zhu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10178?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Qi Zhu updated YARN-10178: -- Attachment: YARN-10178.005.patch > Global Scheduler async thread crash caused by 'Comparison method violates

[jira] [Commented] (YARN-10178) Global Scheduler async thread crash caused by 'Comparison method violates its general contract'

2021-02-05 Thread Qi Zhu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17279476#comment-17279476 ] Qi Zhu commented on YARN-10178: --- Fixed check style and related things in latest patch. > Global Scheduler

[jira] [Comment Edited] (YARN-10178) Global Scheduler async thread crash caused by 'Comparison method violates its general contract'

2021-02-05 Thread Qi Zhu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17270581#comment-17270581 ] Qi Zhu edited comment on YARN-10178 at 2/5/21, 8:54 AM: cc [~wangda]  [~ztang]

[jira] [Comment Edited] (YARN-10178) Global Scheduler async thread crash caused by 'Comparison method violates its general contract'

2021-02-05 Thread Qi Zhu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17270581#comment-17270581 ] Qi Zhu edited comment on YARN-10178 at 2/5/21, 8:54 AM: cc [~wangda] 

[jira] [Commented] (YARN-10532) Capacity Scheduler Auto Queue Creation: Allow auto delete queue when queue is not being used

2021-02-05 Thread Qi Zhu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17279521#comment-17279521 ] Qi Zhu commented on YARN-10532: --- [~gandras] The test error is not related to this jira. Fixed check style
