[jira] [Commented] (YARN-9605) Add ZkConfiguredFailoverProxyProvider for RM HA
[ https://issues.apache.org/jira/browse/YARN-9605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16968138#comment-16968138 ]

zhoukang commented on YARN-9605:

A new patch has been pushed, but I cannot figure out the cause of the errors reported below:
{code:java}
[WARNING] /testptch/hadoop/hadoop-hdfs-project/hadoop-hdfs-native-client/target/main/native/libhdfspp/lib/proto/acl.pb.cc:533:13: warning: 'dynamic_init_dummy_acl_2eproto' defined but not used [-Wunused-variable]
[WARNING] /testptch/hadoop/hadoop-hdfs-project/hadoop-hdfs-native-client/target/main/native/libhdfspp/lib/proto/inotify.pb.cc:467:13: warning: 'dynamic_init_dummy_inotify_2eproto' defined but not used [-Wunused-variable]
[WARNING] /testptch/hadoop/hadoop-hdfs-project/hadoop-hdfs-native-client/target/main/native/libhdfspp/lib/proto/HAServiceProtocol.pb.cc:404:13: warning: 'dynamic_init_dummy_HAServiceProtocol_2eproto' defined but not used [-Wunused-variable]
{code}

> Add ZkConfiguredFailoverProxyProvider for RM HA
> ---
>
> Key: YARN-9605
> URL: https://issues.apache.org/jira/browse/YARN-9605
> Project: Hadoop YARN
> Issue Type: Improvement
> Reporter: zhoukang
> Assignee: zhoukang
> Priority: Major
> Fix For: 3.2.0, 3.1.2
>
> Attachments: YARN-9605.001.patch, YARN-9605.002.patch, YARN-9605.003.patch
>
>
> In this issue, i will track a new feature to support ZkConfiguredFailoverProxyProvider for RM HA

--
This message was sent by Atlassian Jira (v8.3.4#803005)
-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-9605) Add ZkConfiguredFailoverProxyProvider for RM HA
[ https://issues.apache.org/jira/browse/YARN-9605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhoukang updated YARN-9605: --- Attachment: YARN-9605.003.patch > Add ZkConfiguredFailoverProxyProvider for RM HA > --- > > Key: YARN-9605 > URL: https://issues.apache.org/jira/browse/YARN-9605 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: zhoukang >Assignee: zhoukang >Priority: Major > Fix For: 3.2.0, 3.1.2 > > Attachments: YARN-9605.001.patch, YARN-9605.002.patch, > YARN-9605.003.patch > > > In this issue, i will track a new feature to support > ZkConfiguredFailoverProxyProvider for RM HA -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-9605) Add ZkConfiguredFailoverProxyProvider for RM HA
[ https://issues.apache.org/jira/browse/YARN-9605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhoukang updated YARN-9605: --- Attachment: (was: YARN-9605.003.patch) > Add ZkConfiguredFailoverProxyProvider for RM HA > --- > > Key: YARN-9605 > URL: https://issues.apache.org/jira/browse/YARN-9605 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: zhoukang >Assignee: zhoukang >Priority: Major > Fix For: 3.2.0, 3.1.2 > > Attachments: YARN-9605.001.patch, YARN-9605.002.patch, > YARN-9605.003.patch > > > In this issue, i will track a new feature to support > ZkConfiguredFailoverProxyProvider for RM HA -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-9605) Add ZkConfiguredFailoverProxyProvider for RM HA
[ https://issues.apache.org/jira/browse/YARN-9605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhoukang updated YARN-9605: --- Attachment: YARN-9605.003.patch > Add ZkConfiguredFailoverProxyProvider for RM HA > --- > > Key: YARN-9605 > URL: https://issues.apache.org/jira/browse/YARN-9605 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: zhoukang >Assignee: zhoukang >Priority: Major > Fix For: 3.2.0, 3.1.2 > > Attachments: YARN-9605.001.patch, YARN-9605.002.patch, > YARN-9605.003.patch > > > In this issue, i will track a new feature to support > ZkConfiguredFailoverProxyProvider for RM HA -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-9605) Add ZkConfiguredFailoverProxyProvider for RM HA
[ https://issues.apache.org/jira/browse/YARN-9605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16968088#comment-16968088 ] Hadoop QA commented on YARN-9605: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 39s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 1m 2s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 19m 11s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 16m 14s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 2m 32s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 3m 41s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 19m 59s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 6m 9s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 3m 18s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 21s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 2m 40s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 15m 38s{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} cc {color} | {color:red} 15m 38s{color} | {color:red} root generated 3 new + 23 unchanged - 3 fixed = 26 total (was 26) {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 15m 38s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 2m 35s{color} | {color:orange} root: The patch generated 22 new + 22 unchanged - 0 fixed = 44 total (was 22) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 3m 36s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 12m 48s{color} | {color:green} patch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 6m 45s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 3m 20s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 9m 0s{color} | {color:green} hadoop-common in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 55s{color} | {color:green} hadoop-yarn-api in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 3m 48s{color} | {color:green} hadoop-yarn-common in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 86m 0s{color} | {color:green} hadoop-yarn-server-resourcemanager in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 47s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}218m 9s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=19.03.4 Server=19.03.4 Image:yetus/hadoop:104ccca9169 | | JIRA Issue | YARN-9605 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12982939/YARN-9605.002.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite uni
[jira] [Commented] (YARN-9955) LogAggregationService Thread OOM
[ https://issues.apache.org/jira/browse/YARN-9955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16968075#comment-16968075 ]

Hadoop QA commented on YARN-9955:

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} patch {color} | {color:red} 0m 8s{color} | {color:red} YARN-9955 does not apply to trunk. Rebase required? Wrong Branch? See https://wiki.apache.org/hadoop/HowToContribute for help. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | YARN-9955 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12985019/9955.patch |
| Console output | https://builds.apache.org/job/PreCommit-YARN-Build/25108/console |
| Powered by | Apache Yetus 0.8.0 http://yetus.apache.org |

This message was automatically generated.

> LogAggregationService Thread OOM
>
> Key: YARN-9955
> URL: https://issues.apache.org/jira/browse/YARN-9955
> Project: Hadoop YARN
> Issue Type: Bug
> Affects Versions: 2.7.2, 2.7.3
> Reporter: wangxiangchun
> Priority: Major
> Attachments: 9955.patch, 9f6ef9ac-7b25-4aa0-a6db-f03b0bf003e0-1092898.jpg, e04cffec-d7d9-4817-a483-4b0c6d8001f5-1092898.jpg
>
>
> Because of an IPC problem, our recovery directory came to store too much container information. When we restart the NodeManager, it fails with the error:
> _java.lang.OutOfMemoryError: Unable to create new native thread_
[jira] [Updated] (YARN-9955) LogAggregationService Thread OOM
[ https://issues.apache.org/jira/browse/YARN-9955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

wangxiangchun updated YARN-9955:

Attachment: e04cffec-d7d9-4817-a483-4b0c6d8001f5-1092898.jpg
            9f6ef9ac-7b25-4aa0-a6db-f03b0bf003e0-1092898.jpg

> LogAggregationService Thread OOM
>
> Key: YARN-9955
> URL: https://issues.apache.org/jira/browse/YARN-9955
> Project: Hadoop YARN
> Issue Type: Bug
> Affects Versions: 2.7.2, 2.7.3
> Reporter: wangxiangchun
> Priority: Major
> Attachments: 9f6ef9ac-7b25-4aa0-a6db-f03b0bf003e0-1092898.jpg, e04cffec-d7d9-4817-a483-4b0c6d8001f5-1092898.jpg
>
>
> Because of an IPC problem, our recovery directory came to store too much container information. When we restart the NodeManager, it fails with the error:
> _java.lang.OutOfMemoryError: Unable to create new native thread_
[jira] [Created] (YARN-9955) LogAggregationService Thread OOM
wangxiangchun created YARN-9955:
---

Summary: LogAggregationService Thread OOM
Key: YARN-9955
URL: https://issues.apache.org/jira/browse/YARN-9955
Project: Hadoop YARN
Issue Type: Bug
Affects Versions: 2.7.3, 2.7.2
Reporter: wangxiangchun

Because of an IPC problem, our recovery directory came to store too much container information. When we restart the NodeManager, it fails with the error:
_java.lang.OutOfMemoryError: Unable to create new native thread_
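The error reported above usually means the JVM could not start another OS thread, which can happen when one thread is started per recovered application or container. The following is a minimal sketch of one common mitigation, capping the number of worker threads with a fixed-size pool. It is not the attached patch: the class name `BoundedRecovery`, the pool size, and the task body are assumptions made for illustration only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration, not NodeManager code.
public class BoundedRecovery {

  /**
   * Runs taskCount placeholder tasks on at most maxThreads threads and
   * returns the highest number of tasks that ran at the same time.
   */
  public static int runAll(int taskCount, int maxThreads) throws Exception {
    // A fixed pool never creates more than maxThreads threads; extra
    // tasks wait in the pool's queue instead of getting their own thread.
    ExecutorService pool = Executors.newFixedThreadPool(maxThreads);
    AtomicInteger running = new AtomicInteger();
    AtomicInteger peak = new AtomicInteger();
    List<Future<?>> futures = new ArrayList<>();
    try {
      for (int i = 0; i < taskCount; i++) {
        futures.add(pool.submit(() -> {
          int now = running.incrementAndGet();
          peak.accumulateAndGet(now, Math::max);
          try {
            Thread.sleep(1); // stand-in for per-application recovery work
          } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
          } finally {
            running.decrementAndGet();
          }
        }));
      }
      for (Future<?> f : futures) {
        f.get(); // wait for completion and surface task failures
      }
    } finally {
      pool.shutdown();
    }
    return peak.get();
  }

  public static void main(String[] args) throws Exception {
    System.out.println("peak concurrent tasks: " + runAll(1000, 8));
  }
}
```

With a bounded pool the recovery work takes longer when the backlog is large, but the thread count stays independent of how much state the recovery directory holds.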
[jira] [Commented] (YARN-9940) avoid continuous scheduling thread crashes while sorting nodes get 'Comparison method violates its general contract'
[ https://issues.apache.org/jira/browse/YARN-9940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16968027#comment-16968027 ]

kailiu_dev commented on YARN-9940:

Dear [~yufeigu], thanks for your reply! :) I am looking forward to your review and reply.

> avoid continuous scheduling thread crashes while sorting nodes get 'Comparison method violates its general contract'
>
> Key: YARN-9940
> URL: https://issues.apache.org/jira/browse/YARN-9940
> Project: Hadoop YARN
> Issue Type: Bug
> Components: fairscheduler
> Affects Versions: 2.7.2
> Reporter: kailiu_dev
> Assignee: kailiu_dev
> Priority: Major
> Fix For: 2.7.2
>
> Attachments: YARN-9940-branch-2.7.2.001.patch
>
>
> 2019-10-16 09:14:51,215 ERROR org.apache.hadoop.yarn.YarnUncaughtExceptionHandler: Thread Thread[FairSchedulerContinuousScheduling,5,main] threw an Exception.
> java.lang.IllegalArgumentException: Comparison method violates its general contract!
> at java.util.TimSort.mergeHi(TimSort.java:868)
> at java.util.TimSort.mergeAt(TimSort.java:485)
> at java.util.TimSort.mergeForceCollapse(TimSort.java:426)
> at java.util.TimSort.sort(TimSort.java:223)
> at java.util.TimSort.sort(TimSort.java:173)
> at java.util.Arrays.sort(Arrays.java:659)
> at java.util.Collections.sort(Collections.java:217)
> at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.continuousSchedulingAttempt(FairScheduler.java:1117)
> at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler$ContinuousSchedulingThread.run(FairScheduler.java:296)
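TimSort can throw this exception when a comparator gives inconsistent answers during a single sort, for example because the values it reads are updated by another thread while sorting is in progress. Whether that is the cause here is not confirmed by this message. The sketch below shows one general way to avoid the problem: read each sort key once into an immutable snapshot and sort the snapshot. The `Node` class and its `availableMb` field are hypothetical stand-ins, not FairScheduler types, and this is not the attached patch.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical illustration, not FairScheduler code.
public class SnapshotSort {

  /** Stand-in for a scheduler node whose free memory may change at any time. */
  public static final class Node {
    public final String name;
    public volatile long availableMb;

    public Node(String name, long availableMb) {
      this.name = name;
      this.availableMb = availableMb;
    }
  }

  /** Immutable pair of a node and the key value read once before sorting. */
  private static final class Keyed {
    final Node node;
    final long key;

    Keyed(Node node) {
      this.node = node;
      this.key = node.availableMb; // single read; later updates are ignored
    }
  }

  /** Returns the nodes ordered by most available memory first. */
  public static List<Node> sortByAvailable(List<Node> nodes) {
    List<Keyed> snapshot = new ArrayList<>(nodes.size());
    for (Node n : nodes) {
      snapshot.add(new Keyed(n));
    }
    // The comparator only sees frozen keys, so its answers stay consistent
    // for the whole sort even if availableMb changes concurrently.
    snapshot.sort(Comparator.comparingLong((Keyed k) -> k.key).reversed());
    List<Node> sorted = new ArrayList<>(snapshot.size());
    for (Keyed k : snapshot) {
      sorted.add(k.node);
    }
    return sorted;
  }
}
```

The resulting order can be slightly stale by the time it is used, which is usually acceptable for a scheduling heuristic; the alternative of holding a lock across the sort trades that staleness for contention.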
[jira] [Commented] (YARN-9605) Add ZkConfiguredFailoverProxyProvider for RM HA
[ https://issues.apache.org/jira/browse/YARN-9605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16968014#comment-16968014 ] Zhankun Tang commented on YARN-9605: [~cane], I triggered a new build and let's see. > Add ZkConfiguredFailoverProxyProvider for RM HA > --- > > Key: YARN-9605 > URL: https://issues.apache.org/jira/browse/YARN-9605 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: zhoukang >Assignee: zhoukang >Priority: Major > Fix For: 3.2.0, 3.1.2 > > Attachments: YARN-9605.001.patch, YARN-9605.002.patch > > > In this issue, i will track a new feature to support > ZkConfiguredFailoverProxyProvider for RM HA -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-9768) RM Renew Delegation token thread should timeout and retry
[ https://issues.apache.org/jira/browse/YARN-9768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16968002#comment-16968002 ] Íñigo Goiri commented on YARN-9768: --- We can fix the checkstyles. And as we are changing that, let's also do: {code} 60s {code} > RM Renew Delegation token thread should timeout and retry > - > > Key: YARN-9768 > URL: https://issues.apache.org/jira/browse/YARN-9768 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: CR Hota >Assignee: Manikandan R >Priority: Major > Attachments: YARN-9768.001.patch, YARN-9768.002.patch, > YARN-9768.003.patch, YARN-9768.004.patch, YARN-9768.005.patch, > YARN-9768.006.patch, YARN-9768.007.patch > > > Delegation token renewer thread in RM (DelegationTokenRenewer.java) renews > HDFS tokens received to check for validity and expiration time. > This call is made to an underlying HDFS NN or Router Node (which has exact > APIs as HDFS NN). If one of the nodes is bad and the renew call is stuck the > thread remains stuck indefinitely. The thread should ideally timeout the > renewToken and retry from the client's perspective. > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
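The behaviour requested in the issue description, a renew call that times out and is retried instead of blocking forever, can be sketched with `Future.get(timeout)` from the JDK. This is a simplified illustration under stated assumptions, not the code in the attached patches: the class `TimedRetry`, its method name, and its parameters are hypothetical.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical illustration, not DelegationTokenRenewer code.
public class TimedRetry {

  /**
   * Runs the call with a per-attempt timeout, retrying on timeout up to
   * maxAttempts times. Rethrows the last TimeoutException if all attempts
   * time out; other failures propagate as ExecutionException.
   */
  public static <T> T callWithTimeout(Callable<T> call, long timeoutMs,
      int maxAttempts) throws Exception {
    if (maxAttempts < 1) {
      throw new IllegalArgumentException("maxAttempts must be >= 1");
    }
    // Daemon threads so a permanently stuck attempt cannot block JVM exit.
    ExecutorService executor = Executors.newCachedThreadPool(r -> {
      Thread t = new Thread(r, "timed-retry");
      t.setDaemon(true);
      return t;
    });
    try {
      TimeoutException last = null;
      for (int attempt = 1; attempt <= maxAttempts; attempt++) {
        Future<T> future = executor.submit(call);
        try {
          return future.get(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
          future.cancel(true); // interrupt the stuck attempt, then retry
          last = e;
        }
      }
      throw last;
    } finally {
      executor.shutdownNow();
    }
  }
}
```

Note that `cancel(true)` only interrupts the worker thread. A call blocked in I/O that ignores interruption keeps its thread until the I/O returns, so the caller is unblocked but the underlying resource is not necessarily released.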
[jira] [Created] (YARN-9954) Configurable max application tags and max tag length
Jonathan Hung created YARN-9954:
---

Summary: Configurable max application tags and max tag length
Key: YARN-9954
URL: https://issues.apache.org/jira/browse/YARN-9954
Project: Hadoop YARN
Issue Type: Improvement
Reporter: Jonathan Hung

Currently the maximum number of tags and the maximum tag length are hardcoded; they should be configurable.
{noformat}
@Evolving
public static final int APPLICATION_MAX_TAGS = 10;

@Evolving
public static final int APPLICATION_MAX_TAG_LENGTH = 100;
{noformat}

--
This message was sent by Atlassian Jira (v8.3.4#803005)
-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
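A rough sketch of what making these limits configurable could look like: read each limit from configuration and fall back to the current hardcoded value when the key is absent. To stay self-contained it uses `java.util.Properties` in place of Hadoop's `Configuration`. The property names and the `TagLimits` class are hypothetical and are not taken from any patch on this issue.

```java
import java.util.Collection;
import java.util.Properties;

// Hypothetical illustration; key names are invented for this example.
public class TagLimits {
  public static final String MAX_TAGS_KEY = "example.application.max-tags";
  public static final String MAX_TAG_LENGTH_KEY =
      "example.application.max-tag-length";

  // Defaults match the values that are hardcoded today.
  public static final int DEFAULT_MAX_TAGS = 10;
  public static final int DEFAULT_MAX_TAG_LENGTH = 100;

  private final int maxTags;
  private final int maxTagLength;

  public TagLimits(Properties conf) {
    this.maxTags = getInt(conf, MAX_TAGS_KEY, DEFAULT_MAX_TAGS);
    this.maxTagLength =
        getInt(conf, MAX_TAG_LENGTH_KEY, DEFAULT_MAX_TAG_LENGTH);
  }

  private static int getInt(Properties conf, String key, int defaultValue) {
    String value = conf.getProperty(key);
    return value == null ? defaultValue : Integer.parseInt(value.trim());
  }

  public int getMaxTags() {
    return maxTags;
  }

  public int getMaxTagLength() {
    return maxTagLength;
  }

  /** Throws IllegalArgumentException if the tags exceed either limit. */
  public void validate(Collection<String> tags) {
    if (tags.size() > maxTags) {
      throw new IllegalArgumentException(
          "Too many tags: " + tags.size() + " > " + maxTags);
    }
    for (String tag : tags) {
      if (tag.length() > maxTagLength) {
        throw new IllegalArgumentException(
            "Tag longer than " + maxTagLength + " characters: " + tag);
      }
    }
  }
}
```

Keeping the existing values as defaults means clusters that never set the new keys see no change in behaviour.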
[jira] [Commented] (YARN-9562) Add Java changes for the new RuncContainerRuntime
[ https://issues.apache.org/jira/browse/YARN-9562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16967937#comment-16967937 ]

Shane Kumpf commented on YARN-9562:
---

Hey [~ebadger]. Thanks for your (and everyone else's) hard work here. Overall this looks to be coming together nicely. I've taken a look at the code and have a couple of items, but nothing blocking.

However, I'm having a bit of trouble getting runC containers working so far. I'm out of time to continue troubleshooting right now, but this is what I'm seeing; both dshell and MR pi do the same. Docker MR jobs are working. I am running all containers as the nobody user in this case.
{code:java}
2019-11-05 22:40:14,225 WARN org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.privileged.PrivilegedOperationExecutor: Shell execution returned exit code: 35. Privileged Execution Operation Stderr:
Bad/Missing runC int
Could not create container dirs
Could not create local files and directories
Nonzero exit code=35, error message='Could not create work dirs'

Stdout: Can't create directory /tmp/hadoop-yarn/nm-local-dir/usercache/hadoopuser/appcache/application_1572993484434_0003/container_e04_1572993484434_0003_01_02 - Permission denied

Full command array for failed execution:
[/usr/local/hadoop/bin/container-executor, --run-runc-container, /tmp/hadoop-yarn/nm-local-dir/nmPrivate/application_1572993484434_0003/container_e04_1572993484434_0003_01_02/runc-config.json]
{code}
Here are some questions/nits on the patch. None of these are blockers IMO.

Questions/Comments:

1) Why are the keystore and truststore needed within RuncContainerExecutorConfig?

2) I'm not a big fan of hard-coded mounts like this. This would also be problematic for systemd-based containers, where systemd expects /tmp to be a tmpfs.
{code:java}
addRuncMountLocation(mounts, containerWorkDir.toString() + "/private_slash_tmp", "/tmp", true, true);
addRuncMountLocation(mounts, containerWorkDir.toString() + "/private_var_slash_tmp", "/var/tmp", true, true);
{code}
3) It would be great to track these disabled features for future implementation.
{code:java}
public String getExposedPorts(Container container) {
  return null;
}

public String[] getIpAndHost(Container container) {
  return null;
}

public IOStreamPair execContainer(ContainerExecContext ctx)
    throws ContainerExecutionException {
  return null;
}

public void reapContainer(ContainerRuntimeContext ctx)
    throws ContainerExecutionException {
}

public void relaunchContainer(ContainerRuntimeContext ctx)
    throws ContainerExecutionException {
}
{code}
Nits:

1) Clean up the whitespace around Container#getContainerRuntimeData.

2) RuncContainerExecutorConfig: typo in the class javadoc.

3) YarnConfiguration DEFAULT_NM_RUNC_ALLOWED_CONTAINER_NETWORKS and DEFAULT_NM_RUNC_ALLOWED_CONTAINER_RUNTIMES: copy-and-paste error in the javadoc.

4) Many of the tests create tmpDirs but don't appear to clean them up. TestRuncContainerRuntime creates two temp dirs, one via mkdirs and the other via a Rule.
{code:java}
TestDockerContainerRuntime mkdirs for tmpDir
TestHdfsManifestToResourcesPlugin creates a tmpDir but doesn't clean it up
TestRuncContainerRuntime has both a tmpDir and TempDir created by a @Rule
{code}
5) Docs
* Overview: "if created", newline after runC in second paragraph.
* Docker to squash section: first paragraph "Getting" newline.
* I'm fine with leaving the reference to the patch to docker_to_squash.py for now until we have a better story, but I did need to do a few steps to get that tool working: 1) create the hdfs runc-root as root; 2) install skopeo, squashfs-tools, and attr.
> Add Java changes for the new RuncContainerRuntime > - > > Key: YARN-9562 > URL: https://issues.apache.org/jira/browse/YARN-9562 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Eric Badger >Assignee: Eric Badger >Priority: Major > Attachments: YARN-9562.001.patch, YARN-9562.002.patch, > YARN-9562.003.patch, YARN-9562.004.patch, YARN-9562.005.patch, > YARN-9562.006.patch, YARN-9562.007.patch, YARN-9562.008.patch, > YARN-9562.009.patch, YARN-9562.010.patch, YARN-9562.011.patch, > YARN-9562.012.patch, YARN-9562.013.patch > > > This JIRA will be used to add the Java changes for the new > RuncContainerRuntime. This will work off of YARN-9560 to use much of the > existing DockerLinuxContainerRuntime code once it is moved up into an > abstract class that can be extended. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-9930) Support max running app logic for CapacityScheduler
[ https://issues.apache.org/jira/browse/YARN-9930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16967930#comment-16967930 ]

Eric Payne commented on YARN-9930:
--

bq. When we migrate from FS to CS, this difference will make users be confused.

[~cane] I see. So, if I understand correctly, you are saying the following:
- In FS, submitting more than {{maxRunningApps}} per user will just leave the apps waiting in the submitted state and will run them once other apps from that user have completed.
- The CS will refuse to submit more than {{Max Applications Per User}}.

Is that correct?

If it is a requirement to change this behavior in the CS, I would at least like to see this change in behavior surrounded by a config property, with the default being the old CS behavior.

> Support max running app logic for CapacityScheduler
> ---
>
> Key: YARN-9930
> URL: https://issues.apache.org/jira/browse/YARN-9930
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: capacity scheduler, capacityscheduler
> Affects Versions: 3.1.0, 3.1.1
> Reporter: zhoukang
> Assignee: zhoukang
> Priority: Major
>
> FairScheduler has a limit on the maximum number of running applications; applications beyond that limit stay pending.
> CapacityScheduler has no comparable max-running-apps feature. It only has a maximum number of applications, and jobs beyond that limit are rejected directly at the client.
> In this JIRA I want to implement the same semantics for CapacityScheduler.
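The two behaviours contrasted in this comment, and the suggestion to guard the new one behind a config property that defaults to the old behaviour, can be modelled in a few lines. This is a deliberately simplified sketch and not either scheduler's real logic: the class `AppAdmission`, the `queueWhenFull` flag, and the single shared counter (instead of per-user and per-queue accounting) are assumptions made for illustration.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical model of the two admission policies, not scheduler code.
public class AppAdmission {

  public enum Result { RUNNING, PENDING, REJECTED }

  private final int maxRunning;
  // false models the reject-at-submission behaviour described for CS;
  // true models the keep-pending behaviour described for FS.
  private final boolean queueWhenFull;
  private final Deque<String> pending = new ArrayDeque<>();
  private int running;

  public AppAdmission(int maxRunning, boolean queueWhenFull) {
    this.maxRunning = maxRunning;
    this.queueWhenFull = queueWhenFull;
  }

  /** Admits the app, parks it as pending, or rejects it. */
  public Result submit(String appId) {
    if (running < maxRunning) {
      running++;
      return Result.RUNNING;
    }
    if (queueWhenFull) {
      pending.addLast(appId);
      return Result.PENDING;
    }
    return Result.REJECTED;
  }

  /**
   * Marks one running app as finished and promotes the oldest pending app.
   * Returns the promoted app id, or null if nothing was pending.
   */
  public String complete() {
    if (running == 0) {
      throw new IllegalStateException("no running applications");
    }
    running--;
    String next = pending.pollFirst();
    if (next != null) {
      running++;
    }
    return next;
  }
}
```

Constructing the model with `queueWhenFull` set to false by default mirrors the request that existing CS users see no change unless they opt in.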
[jira] [Comment Edited] (YARN-9949) Add missing queue configs for root queue in RMWebService#CapacitySchedulerInfo
[ https://issues.apache.org/jira/browse/YARN-9949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16967878#comment-16967878 ] Jonathan Turner Eagles edited comment on YARN-9949 at 11/5/19 9:18 PM: --- Helping to get branch-3.2 compiling again to unblock other committers. I've reverted commit 11c763c22055fea367b19b338a3d8067f9386ba4 in branch-3.2. It seems like there is more work to be done for the back-port so this seems the cleanest way until it's resolved. Thanks for understanding the reason for the revert. was (Author: jeagles): Helping to get branch-3.2 compiling again to unblock other commits. I've reverted commit 11c763c22055fea367b19b338a3d8067f9386ba4 in branch-3.2. It seems like there is more work to be done for the back-port so this seems the cleaned way until it's resolved. Thanks for understanding the reason for the revert. > Add missing queue configs for root queue in > RMWebService#CapacitySchedulerInfo > --- > > Key: YARN-9949 > URL: https://issues.apache.org/jira/browse/YARN-9949 > Project: Hadoop YARN > Issue Type: Bug > Components: capacity scheduler >Affects Versions: 3.3.0 >Reporter: Prabhu Joseph >Assignee: Prabhu Joseph >Priority: Minor > Fix For: 3.3.0, 3.2.2 > > Attachments: YARN-9949-001.patch, YARN-9949-002.patch > > > YARN-9937 has added below missing queue configs but missed to add for root > queue. > 1. Maximum Allocation > 2. Queue ACLs > 3. Queue Priority > 4. Application Lifetime > 5. Ordering Policy. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-9949) Add missing queue configs for root queue in RMWebService#CapacitySchedulerInfo
[ https://issues.apache.org/jira/browse/YARN-9949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16967878#comment-16967878 ] Jonathan Turner Eagles commented on YARN-9949: -- Helping to get branch-3.2 compiling again to unblock other commits. I've reverted commit 11c763c22055fea367b19b338a3d8067f9386ba4 in branch-3.2. It seems like there is more work to be done for the back-port so this seems the cleaned way until it's resolved. Thanks for understanding the reason for the revert. > Add missing queue configs for root queue in > RMWebService#CapacitySchedulerInfo > --- > > Key: YARN-9949 > URL: https://issues.apache.org/jira/browse/YARN-9949 > Project: Hadoop YARN > Issue Type: Bug > Components: capacity scheduler >Affects Versions: 3.3.0 >Reporter: Prabhu Joseph >Assignee: Prabhu Joseph >Priority: Minor > Fix For: 3.3.0, 3.2.2 > > Attachments: YARN-9949-001.patch, YARN-9949-002.patch > > > YARN-9937 has added below missing queue configs but missed to add for root > queue. > 1. Maximum Allocation > 2. Queue ACLs > 3. Queue Priority > 4. Application Lifetime > 5. Ordering Policy. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-9953) YARN Service dependency should be configurable for each app
[ https://issues.apache.org/jira/browse/YARN-9953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16967829#comment-16967829 ]

Eric Yang commented on YARN-9953:
-

The programmable API for YARN service maintains backward compatibility at the yarnfile level rather than at the level of private Java API calls. YARN service depends on the API server, which is part of the Resource Manager process. There are a fair number of dependencies between the YARN framework and the YARN service application. Making the YARN service version configurable would make upgrades harder to manage and create more obstacles for future versions of the YARN framework, because an older version of YARN service uses internal YARN APIs which may not work in a future version of YARN. Given those reasons, we can't move forward with this patch. Sorry.

> YARN Service dependency should be configurable for each app
> ---
>
> Key: YARN-9953
> URL: https://issues.apache.org/jira/browse/YARN-9953
> Project: Hadoop YARN
> Issue Type: Bug
> Affects Versions: 3.1.2
> Reporter: kyungwan nam
> Assignee: kyungwan nam
> Priority: Major
> Attachments: YARN-9953.001.patch
>
>
> Currently, the YARN Service dependency can be set through yarn.service.framework.path, but it only takes effect as configured in the RM.
> This makes it impossible for users to choose their own YARN Service dependency.
> It should be configurable for each app.
[jira] [Commented] (YARN-9937) Add missing queue configs in RMWebService#CapacitySchedulerQueueInfo
[ https://issues.apache.org/jira/browse/YARN-9937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16967827#comment-16967827 ] Hadoop QA commented on YARN-9937: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 41s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} branch-3.2 Compile Tests {color} || | {color:red}-1{color} | {color:red} mvninstall {color} | {color:red} 6m 39s{color} | {color:red} root in branch-3.2 failed. {color} | | {color:red}-1{color} | {color:red} compile {color} | {color:red} 0m 22s{color} | {color:red} hadoop-yarn-server-resourcemanager in branch-3.2 failed. {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 31s{color} | {color:green} branch-3.2 passed {color} | | {color:red}-1{color} | {color:red} mvnsite {color} | {color:red} 0m 23s{color} | {color:red} hadoop-yarn-server-resourcemanager in branch-3.2 failed. {color} | | {color:red}-1{color} | {color:red} shadedclient {color} | {color:red} 4m 19s{color} | {color:red} branch has errors when building and testing our client artifacts. {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 0m 22s{color} | {color:red} hadoop-yarn-server-resourcemanager in branch-3.2 failed. 
{color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 26s{color} | {color:green} branch-3.2 passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 46s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 35s{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} javac {color} | {color:red} 0m 35s{color} | {color:red} hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager generated 15 new + 2 unchanged - 10 fixed = 17 total (was 12) {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 0m 30s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: The patch generated 12 new + 166 unchanged - 1 fixed = 178 total (was 167) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 40s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 13m 53s{color} | {color:green} patch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 13s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 25s{color} | {color:green} hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager generated 0 new + 4 unchanged - 2 fixed = 4 total (was 6) {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red} 73m 36s{color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 23s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}105m 58s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.yarn.server.resourcemanager.metrics.TestSystemMetricsPublisherForV2 | | | hadoop.yarn.server.resourcemanager.metrics.TestCombinedSystemMetricsPublisher | \\ \\ || Subsystem || Report/Notes || | Docker | Client=19.03.4 Server=19.03.4 Image:yetus/hadoop:63396beab41 | | JIRA Issue | YARN-9937 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12984986/YARN-9937-branch-3.2.002.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux d134ce98fe38 4.15.0-66-generic #75-Ubuntu SMP Tue Oct 1 05:24:09 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/p
[jira] [Commented] (YARN-9768) RM Renew Delegation token thread should timeout and retry
[ https://issues.apache.org/jira/browse/YARN-9768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16967810#comment-16967810 ] Hadoop QA commented on YARN-9768: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 37s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 1s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 15s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 19m 54s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 10m 29s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 26s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 26s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 17m 2s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 4m 31s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 0s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 15s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 56s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 7m 39s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 7m 39s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 1m 19s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn: The patch generated 2 new + 308 unchanged - 0 fixed = 310 total (was 308) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 15s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 1s{color} | {color:green} The patch has no ill-formed XML file. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 13m 21s{color} | {color:green} patch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 4m 53s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 55s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 52s{color} | {color:green} hadoop-yarn-api in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 3m 45s{color} | {color:green} hadoop-yarn-common in the patch passed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 87m 18s{color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 40s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}183m 24s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.yarn.server.resourcemanager.scheduler.capacity.TestIncreaseAllocationExpirer | \\ \\ || Subsystem || Report/Notes || | Docker | Client=19.03.4 Server=19.03.4 Image:yetus/hadoop:104ccca9169 | | JIRA Issue | YARN-9768 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12984971/YARN-9768.007.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shade
[jira] [Commented] (YARN-9677) Make FpgaDevice and GpuDevice classes more similar to each other
[ https://issues.apache.org/jira/browse/YARN-9677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16967808#comment-16967808 ] kevin su commented on YARN-9677: [~pbacsko] Thanks for the review > Make FpgaDevice and GpuDevice classes more similar to each other > > > Key: YARN-9677 > URL: https://issues.apache.org/jira/browse/YARN-9677 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Szilard Nemeth >Assignee: kevin su >Priority: Major > Labels: newbie, newbie++ > Attachments: YARN-9677.001.patch > > > org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.resources.fpga.FpgaResourceAllocator.FpgaDevice > is an inner class of FpgaResourceAllocator. > It is not only being used from its parent class but from other classes as > well so we are losing the purpose of the inner class, it does not really make > sense. > We also have > org.apache.hadoop.yarn.server.nodemanager.containermanager.resourceplugin.gpu.GpuDevice > which is a similar class, but for GPU devices. > What we could do here is to make FpgaDevice a single class and harmonize the > packages of these 2 classes, meaning they should be "closer" to each other in > terms of packaging. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-9937) Add missing queue configs in RMWebService#CapacitySchedulerQueueInfo
[ https://issues.apache.org/jira/browse/YARN-9937?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prabhu Joseph updated YARN-9937: Attachment: YARN-9937-branch-3.2.002.patch > Add missing queue configs in RMWebService#CapacitySchedulerQueueInfo > > > Key: YARN-9937 > URL: https://issues.apache.org/jira/browse/YARN-9937 > Project: Hadoop YARN > Issue Type: Improvement > Components: capacity scheduler >Affects Versions: 3.3.0 >Reporter: Prabhu Joseph >Assignee: Prabhu Joseph >Priority: Major > Fix For: 3.3.0 > > Attachments: Screen Shot 2019-10-28 at 8.54.53 PM.png, > YARN-9937-001.patch, YARN-9937-002.patch, YARN-9937-003.patch, > YARN-9937-004.patch, YARN-9937-branch-3.2.001.patch, > YARN-9937-branch-3.2.002.patch > > > Below are the missing queue configs which are not part of RMWebServices > scheduler endpoint. > 1. Maximum Allocation > 2. Queue ACLs > 3. Queue Priority > 4. Application Lifetime
[jira] [Commented] (YARN-9940) avoid continuous scheduling thread crashes while sorting nodes get 'Comparison method violates its general contract'
[ https://issues.apache.org/jira/browse/YARN-9940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16967729#comment-16967729 ] Yufei Gu commented on YARN-9940: Hi [~kailiu_dev], I added you to the contributor role and assigned this to you. I will try to review this later. > avoid continuous scheduling thread crashes while sorting nodes get > 'Comparison method violates its general contract' > > > Key: YARN-9940 > URL: https://issues.apache.org/jira/browse/YARN-9940 > Project: Hadoop YARN > Issue Type: Bug > Components: fairscheduler >Affects Versions: 2.7.2 >Reporter: kailiu_dev >Assignee: kailiu_dev >Priority: Major > Fix For: 2.7.2 > > Attachments: YARN-9940-branch-2.7.2.001.patch > > > 2019-10-16 09:14:51,215 ERROR > org.apache.hadoop.yarn.YarnUncaughtExceptionHandler: Thread > Thread[FairSchedulerContinuousScheduling,5,main] threw an Exception. > java.lang.IllegalArgumentException: Comparison method violates its general > contract! > at java.util.TimSort.mergeHi(TimSort.java:868) > at java.util.TimSort.mergeAt(TimSort.java:485) > at java.util.TimSort.mergeForceCollapse(TimSort.java:426) > at java.util.TimSort.sort(TimSort.java:223) > at java.util.TimSort.sort(TimSort.java:173) > at java.util.Arrays.sort(Arrays.java:659) > at java.util.Collections.sort(Collections.java:217) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.continuousSchedulingAttempt(FairScheduler.java:1117) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler$ContinuousSchedulingThread.run(FairScheduler.java:296)
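For readers unfamiliar with this exception: TimSort can throw it when the comparator gives inconsistent answers during a single sort, and one common way that happens is a comparator that reads values which another thread changes mid-sort. The thread above does not say that this is the cause here, nor what the attached patch does, so the following is only a minimal sketch of one general mitigation: capture each sort key once, then sort on the captured values. The names `SnapshotSort`, `Node`, `Keyed`, and `availableMb` are hypothetical and are not Hadoop classes.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical illustration only; not taken from FairScheduler.
class SnapshotSort {

    /** Stand-in for a node whose free memory may change while we sort. */
    static final class Node {
        final String name;
        final AtomicLong availableMb;

        Node(String name, long availableMb) {
            this.name = name;
            this.availableMb = new AtomicLong(availableMb);
        }
    }

    /** Immutable pair: a node plus the key value captured for it. */
    static final class Keyed {
        final Node node;
        final long key;

        Keyed(Node node, long key) {
            this.node = node;
            this.key = key;
        }
    }

    /** Returns the nodes ordered by available memory, largest first. */
    static List<Node> sortByAvailableDesc(List<Node> nodes) {
        // Read every key exactly once, before sorting starts.
        List<Keyed> snapshot = new ArrayList<>(nodes.size());
        for (Node n : nodes) {
            snapshot.add(new Keyed(n, n.availableMb.get()));
        }
        // The comparator only sees frozen values, so it stays consistent
        // even if availableMb is updated concurrently.
        snapshot.sort(Comparator.comparingLong((Keyed k) -> k.key).reversed());

        List<Node> sorted = new ArrayList<>(snapshot.size());
        for (Keyed k : snapshot) {
            sorted.add(k.node);
        }
        return sorted;
    }
}
```

The trade-off is that the resulting order reflects the values at snapshot time, which may already be slightly stale when the sorted list is used.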
[jira] [Assigned] (YARN-9940) avoid continuous scheduling thread crashes while sorting nodes get 'Comparison method violates its general contract'
[ https://issues.apache.org/jira/browse/YARN-9940?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yufei Gu reassigned YARN-9940: -- Assignee: kailiu_dev > avoid continuous scheduling thread crashes while sorting nodes get > 'Comparison method violates its general contract' > > > Key: YARN-9940 > URL: https://issues.apache.org/jira/browse/YARN-9940 > Project: Hadoop YARN > Issue Type: Bug > Components: fairscheduler >Affects Versions: 2.7.2 >Reporter: kailiu_dev >Assignee: kailiu_dev >Priority: Major > Fix For: 2.7.2 > > Attachments: YARN-9940-branch-2.7.2.001.patch > > > 2019-10-16 09:14:51,215 ERROR > org.apache.hadoop.yarn.YarnUncaughtExceptionHandler: Thread > Thread[FairSchedulerContinuousScheduling,5,main] threw an Exception. > java.lang.IllegalArgumentException: Comparison method violates its general > contract! > at java.util.TimSort.mergeHi(TimSort.java:868) > at java.util.TimSort.mergeAt(TimSort.java:485) > at java.util.TimSort.mergeForceCollapse(TimSort.java:426) > at java.util.TimSort.sort(TimSort.java:223) > at java.util.TimSort.sort(TimSort.java:173) > at java.util.Arrays.sort(Arrays.java:659) > at java.util.Collections.sort(Collections.java:217) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.continuousSchedulingAttempt(FairScheduler.java:1117) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler$ContinuousSchedulingThread.run(FairScheduler.java:296)
[jira] [Commented] (YARN-9899) Migration tool that help to generate CS config based on FS config [Phase 2]
[ https://issues.apache.org/jira/browse/YARN-9899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16967707#comment-16967707 ] Hadoop QA commented on YARN-9899: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 39s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 5 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 19s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 19m 39s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 8m 13s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 23s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 5m 6s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 12m 54s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s{color} | {color:blue} Skipped patched modules with no Java source: hadoop-yarn-project/hadoop-yarn {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 12s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 38s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 18s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 4m 46s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 7m 18s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 7m 18s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 22s{color} | {color:green} hadoop-yarn-project/hadoop-yarn: The patch generated 0 new + 63 unchanged - 1 fixed = 63 total (was 64) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 4m 55s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} shellcheck {color} | {color:green} 0m 25s{color} | {color:green} There were no new shellcheck issues. {color} | | {color:green}+1{color} | {color:green} shelldocs {color} | {color:green} 0m 17s{color} | {color:green} There were no new shelldocs issues. {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. 
{color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 13m 24s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s{color} | {color:blue} Skipped patched modules with no Java source: hadoop-yarn-project/hadoop-yarn {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 19s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 36s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red} 83m 32s{color} | {color:red} hadoop-yarn in the patch failed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 90m 45s{color} | {color:green} hadoop-yarn-server-resourcemanager in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 42s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}265m 17s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.yarn.server.webproxy.TestWebAppProxyServlet
[jira] [Commented] (YARN-9768) RM Renew Delegation token thread should timeout and retry
[ https://issues.apache.org/jira/browse/YARN-9768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16967695#comment-16967695 ] Íñigo Goiri commented on YARN-9768: --- [^YARN-9768.007.patch] looks good. Let's see what Yetus says. > RM Renew Delegation token thread should timeout and retry > - > > Key: YARN-9768 > URL: https://issues.apache.org/jira/browse/YARN-9768 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: CR Hota >Assignee: Manikandan R >Priority: Major > Attachments: YARN-9768.001.patch, YARN-9768.002.patch, > YARN-9768.003.patch, YARN-9768.004.patch, YARN-9768.005.patch, > YARN-9768.006.patch, YARN-9768.007.patch > > > Delegation token renewer thread in RM (DelegationTokenRenewer.java) renews > HDFS tokens received to check for validity and expiration time. > This call is made to an underlying HDFS NN or Router Node (which has the same > APIs as HDFS NN). If one of the nodes is bad and the renew call is stuck, the > thread remains stuck indefinitely. The thread should ideally time out the > renewToken call and retry from the client's perspective.
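The behaviour requested in the description (bound each renew call by a timeout, then retry) can be sketched with plain `java.util.concurrent`. This is not the code in YARN-9768.007.patch, whose contents are not shown in this thread; `TimedRenew`, `callWithTimeoutAndRetry`, `timeoutMs`, and `maxAttempts` are hypothetical names chosen for illustration, and the `Callable` stands in for whatever performs the actual renewal.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical illustration only; not taken from DelegationTokenRenewer.
class TimedRenew {

    /**
     * Runs the call on a worker thread, waits at most timeoutMs for it,
     * and retries up to maxAttempts times when the wait times out.
     */
    static <T> T callWithTimeoutAndRetry(Callable<T> call, long timeoutMs,
                                         int maxAttempts) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        // Daemon threads so a stuck call cannot keep the JVM alive.
        ExecutorService pool = Executors.newCachedThreadPool(r -> {
            Thread t = new Thread(r);
            t.setDaemon(true);
            return t;
        });
        try {
            TimeoutException last = null;
            for (int attempt = 1; attempt <= maxAttempts; attempt++) {
                Future<T> future = pool.submit(call);
                try {
                    return future.get(timeoutMs, TimeUnit.MILLISECONDS);
                } catch (TimeoutException e) {
                    // Give up on this attempt and interrupt the worker.
                    future.cancel(true);
                    last = e;
                }
            }
            throw last; // every attempt timed out
        } finally {
            pool.shutdownNow();
        }
    }
}
```

Note that `cancel(true)` only interrupts the worker; a call blocked in I/O that ignores interruption would still occupy its thread, which is why the sketch uses a fresh worker per attempt instead of reusing a single thread.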
[jira] [Comment Edited] (YARN-9949) Add missing queue configs for root queue in RMWebService#CapacitySchedulerInfo
[ https://issues.apache.org/jira/browse/YARN-9949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16967684#comment-16967684 ] Siyao Meng edited comment on YARN-9949 at 11/5/19 5:16 PM: --- [~prabhujoseph] [~sunilg] Yep, this commit breaks the compilation of branch-3.2 due to YARN-9937 hasn't landed in branch-3.2 yet: {code:title=mvn clean install -Pdist -DskipTests -e -Dmaven.javadoc.skip=true} [ERROR] Failed to execute goal org.apache.maven.plugins:maven-compiler-plugin:3.1:compile (default-compile) on project hadoop-yarn-server-resourcemanager: Compilation failure: Compilation failure: [ERROR] /Users/smeng/repo/branch-3.2/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/dao/CapacitySchedulerInfo.java:[53,13] cannot find symbol [ERROR] symbol: class QueueAclsInfo [ERROR] location: class org.apache.hadoop.yarn.server.resourcemanager.webapp.dao.CapacitySchedulerInfo [ERROR] /Users/smeng/repo/branch-3.2/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/dao/CapacitySchedulerInfo.java:[127,10] cannot find symbol [ERROR] symbol: class QueueAclsInfo [ERROR] location: class org.apache.hadoop.yarn.server.resourcemanager.webapp.dao.CapacitySchedulerInfo [ERROR] /Users/smeng/repo/branch-3.2/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/dao/CapacitySchedulerInfo.java:[76,48] cannot find symbol [ERROR] symbol: method getMaximumAllocation() [ERROR] location: variable parent of type org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CSQueue [ERROR] 
/Users/smeng/repo/branch-3.2/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/dao/CapacitySchedulerInfo.java:[79,21] cannot find symbol [ERROR] symbol: class QueueAclsInfo [ERROR] location: class org.apache.hadoop.yarn.server.resourcemanager.webapp.dao.CapacitySchedulerInfo [ERROR] /Users/smeng/repo/branch-3.2/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/dao/CapacitySchedulerInfo.java:[82,7] cannot find symbol [ERROR] symbol: class QueueAclInfo [ERROR] location: class org.apache.hadoop.yarn.server.resourcemanager.webapp.dao.CapacitySchedulerInfo [ERROR] /Users/smeng/repo/branch-3.2/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/dao/CapacitySchedulerInfo.java:[82,35] cannot find symbol [ERROR] symbol: class QueueAclInfo [ERROR] location: class org.apache.hadoop.yarn.server.resourcemanager.webapp.dao.CapacitySchedulerInfo [ERROR] /Users/smeng/repo/branch-3.2/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/dao/CapacitySchedulerInfo.java:[92,5] cannot find symbol [ERROR] symbol: class QueueAclInfo [ERROR] location: class org.apache.hadoop.yarn.server.resourcemanager.webapp.dao.CapacitySchedulerInfo [ERROR] /Users/smeng/repo/branch-3.2/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/dao/CapacitySchedulerInfo.java:[92,33] cannot find symbol [ERROR] symbol: class QueueAclInfo [ERROR] location: class org.apache.hadoop.yarn.server.resourcemanager.webapp.dao.CapacitySchedulerInfo [ERROR] -> [Help 1] org.apache.maven.lifecycle.LifecycleExecutionException: 
Failed to execute goal org.apache.maven.plugins:maven-compiler-plugin:3.1:compile (default-compile) on project hadoop-yarn-server-resourcemanager: Compilation failure at org.apache.maven.lifecycle.internal.MojoExecutor.execute (MojoExecutor.java:215) {code} I was working on another jira (backporting HADOOP-16152 to 3.2) and found this to be a potential issue for precommits. YARN-9937 doesn't seem to be a clean backport. If that's the case, please revert this on branch-3.2 so that it can be compiled. We can get this back in 3.2 after YARN-9937 is in. Thanks! was (Author: smeng): [~prabhujoseph] [~sunilg] Yep, this commit breaks the compilation of branch-3.2 due to YARN-9937 hasn't landed in branch-3.2 yet: {code:mvn clean install -Pdist -DskipTests -e -Dmaven.javadoc.skip=true} [ERROR] Failed to execute goal org.apache.maven.plugins:maven-compiler-plugin:3.1:compile (default-compile) on project hadoop-yarn-server-resourcemanager: Compilation failure: Compilation failure: [ERROR] /Users/smeng/repo/bra
[jira] [Commented] (YARN-9949) Add missing queue configs for root queue in RMWebService#CapacitySchedulerInfo
[ https://issues.apache.org/jira/browse/YARN-9949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16967684#comment-16967684 ] Siyao Meng commented on YARN-9949: -- [~prabhujoseph] [~sunilg] Yep, this commit breaks the compilation of branch-3.2 due to YARN-9937 hasn't landed in branch-3.2 yet: {code:mvn clean install -Pdist -DskipTests -e -Dmaven.javadoc.skip=true} [ERROR] Failed to execute goal org.apache.maven.plugins:maven-compiler-plugin:3.1:compile (default-compile) on project hadoop-yarn-server-resourcemanager: Compilation failure: Compilation failure: [ERROR] /Users/smeng/repo/branch-3.2/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/dao/CapacitySchedulerInfo.java:[53,13] cannot find symbol [ERROR] symbol: class QueueAclsInfo [ERROR] location: class org.apache.hadoop.yarn.server.resourcemanager.webapp.dao.CapacitySchedulerInfo [ERROR] /Users/smeng/repo/branch-3.2/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/dao/CapacitySchedulerInfo.java:[127,10] cannot find symbol [ERROR] symbol: class QueueAclsInfo [ERROR] location: class org.apache.hadoop.yarn.server.resourcemanager.webapp.dao.CapacitySchedulerInfo [ERROR] /Users/smeng/repo/branch-3.2/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/dao/CapacitySchedulerInfo.java:[76,48] cannot find symbol [ERROR] symbol: method getMaximumAllocation() [ERROR] location: variable parent of type org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CSQueue [ERROR] 
/Users/smeng/repo/branch-3.2/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/dao/CapacitySchedulerInfo.java:[79,21] cannot find symbol [ERROR] symbol: class QueueAclsInfo [ERROR] location: class org.apache.hadoop.yarn.server.resourcemanager.webapp.dao.CapacitySchedulerInfo [ERROR] /Users/smeng/repo/branch-3.2/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/dao/CapacitySchedulerInfo.java:[82,7] cannot find symbol [ERROR] symbol: class QueueAclInfo [ERROR] location: class org.apache.hadoop.yarn.server.resourcemanager.webapp.dao.CapacitySchedulerInfo [ERROR] /Users/smeng/repo/branch-3.2/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/dao/CapacitySchedulerInfo.java:[82,35] cannot find symbol [ERROR] symbol: class QueueAclInfo [ERROR] location: class org.apache.hadoop.yarn.server.resourcemanager.webapp.dao.CapacitySchedulerInfo [ERROR] /Users/smeng/repo/branch-3.2/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/dao/CapacitySchedulerInfo.java:[92,5] cannot find symbol [ERROR] symbol: class QueueAclInfo [ERROR] location: class org.apache.hadoop.yarn.server.resourcemanager.webapp.dao.CapacitySchedulerInfo [ERROR] /Users/smeng/repo/branch-3.2/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/dao/CapacitySchedulerInfo.java:[92,33] cannot find symbol [ERROR] symbol: class QueueAclInfo [ERROR] location: class org.apache.hadoop.yarn.server.resourcemanager.webapp.dao.CapacitySchedulerInfo [ERROR] -> [Help 1] org.apache.maven.lifecycle.LifecycleExecutionException: 
Failed to execute goal org.apache.maven.plugins:maven-compiler-plugin:3.1:compile (default-compile) on project hadoop-yarn-server-resourcemanager: Compilation failure at org.apache.maven.lifecycle.internal.MojoExecutor.execute (MojoExecutor.java:215) {code} YARN-9937 doesn't seem to be a clean backport. If that's the case, please revert this on branch-3.2 so that it can be compiled. We can get this back in 3.2 after YARN-9937 is in. Thanks! > Add missing queue configs for root queue in > RMWebService#CapacitySchedulerInfo > --- > > Key: YARN-9949 > URL: https://issues.apache.org/jira/browse/YARN-9949 > Project: Hadoop YARN > Issue Type: Bug > Components: capacity scheduler >Affects Versions: 3.3.0 >Reporter: Prabhu Joseph >Assignee: Prabhu Joseph >Priority: Minor > Fix For: 3.3.0, 3.2.2 > > Attachments: YARN-9949-001.patch, YARN-9949-002.patch > > > YARN-9937 has a
[jira] [Reopened] (YARN-8292) Fix the dominant resource preemption cannot happen when some of the resource vector becomes negative
[ https://issues.apache.org/jira/browse/YARN-8292?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Payne reopened YARN-8292: -- Sorry for reopening, but this issue also exists in branch-2 and branch-2.10 (see the VOTE discussion thread for Hadoop 2.10.0 RC1). Unfortunately, the patch does not backport cleanly to branch-2/2.10, so I would like to reopen this JIRA and put up a patch for branch-2/2.10. I am preparing a patch now. > Fix the dominant resource preemption cannot happen when some of the resource > vector becomes negative > > > Key: YARN-8292 > URL: https://issues.apache.org/jira/browse/YARN-8292 > Project: Hadoop YARN > Issue Type: Bug > Components: yarn >Reporter: Sumana Sathish >Assignee: Wangda Tan >Priority: Critical > Fix For: 3.2.0, 3.1.1 > > Attachments: YARN-8292.001.patch, YARN-8292.002.patch, > YARN-8292.003.patch, YARN-8292.004.patch, YARN-8292.005.patch, > YARN-8292.006.patch, YARN-8292.007.patch, YARN-8292.008.patch, > YARN-8292.009.patch > > > This is an example of the problem: > > {code} > // guaranteed, max, used, pending > "root(=[30:18:6 30:18:6 12:12:6 1:1:1]);" + //root > "-a(=[10:6:2 10:6:2 6:6:3 0:0:0]);" + // a > "-b(=[10:6:2 10:6:2 6:6:3 0:0:0]);" + // b > "-c(=[10:6:2 10:6:2 0:0:0 1:1:1])"; // c > {code} > There are 3 resource types. The total resource of the cluster is 30:18:6. > For both a and b, there are 3 containers running, and each container is 2:2:1. > Queue c uses 0 resources and has 1:1:1 pending. > Under the existing logic, preemption cannot happen.
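To see where a negative component can come from in the example above, it helps to subtract queue a's guaranteed vector (10:6:2) from its used vector (6:6:3) one resource type at a time. The sketch below does only that arithmetic. It is an assumption that this difference is the quantity the issue title refers to, and the sketch does not reproduce the scheduler's preemption logic; `ResourceVectorDiff` and its methods are hypothetical names.

```java
import java.util.Arrays;

// Hypothetical illustration only; not a Hadoop class.
class ResourceVectorDiff {

    /** Component-wise used minus guaranteed. */
    static long[] subtract(long[] used, long[] guaranteed) {
        long[] diff = new long[used.length];
        for (int i = 0; i < used.length; i++) {
            diff[i] = used[i] - guaranteed[i];
        }
        return diff;
    }

    /** True if at least one component is below zero. */
    static boolean anyNegative(long[] v) {
        for (long x : v) {
            if (x < 0) {
                return true;
            }
        }
        return false;
    }

    /** True if at least one component is above zero. */
    static boolean anyPositive(long[] v) {
        for (long x : v) {
            if (x > 0) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        long[] used = {6, 6, 3};        // queue a, "used" column
        long[] guaranteed = {10, 6, 2}; // queue a, "guaranteed" column
        long[] diff = subtract(used, guaranteed);
        System.out.println(Arrays.toString(diff)); // prints [-4, 0, 1]
        System.out.println(anyNegative(diff) && anyPositive(diff)); // prints true
    }
}
```

Queue a is under its guarantee for the first resource type and over it for the third, so the difference vector has mixed signs.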
[jira] [Commented] (YARN-8004) Add unit tests for inter queue preemption for dominant resource calculator
[ https://issues.apache.org/jira/browse/YARN-8004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16967660#comment-16967660 ] Eric Payne commented on YARN-8004: -- Backport completed to branch-2 and branch-2.10. > Add unit tests for inter queue preemption for dominant resource calculator > -- > > Key: YARN-8004 > URL: https://issues.apache.org/jira/browse/YARN-8004 > Project: Hadoop YARN > Issue Type: Bug > Components: yarn >Reporter: Sumana Sathish >Assignee: Zian Chen >Priority: Critical > Fix For: 3.2.0, 3.1.1, 3.0.3, 2.10.1, 2.11.0 > > Attachments: YARN-8004.001.patch, YARN-8004.002.patch, > YARN-8004.003.patch
[jira] [Updated] (YARN-8004) Add unit tests for inter queue preemption for dominant resource calculator
[ https://issues.apache.org/jira/browse/YARN-8004?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Payne updated YARN-8004: - Fix Version/s: 2.11.0 2.10.1 > Add unit tests for inter queue preemption for dominant resource calculator > -- > > Key: YARN-8004 > URL: https://issues.apache.org/jira/browse/YARN-8004 > Project: Hadoop YARN > Issue Type: Bug > Components: yarn >Reporter: Sumana Sathish >Assignee: Zian Chen >Priority: Critical > Fix For: 3.2.0, 3.1.1, 3.0.3, 2.10.1, 2.11.0 > > Attachments: YARN-8004.001.patch, YARN-8004.002.patch, > YARN-8004.003.patch > > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-9768) RM Renew Delegation token thread should timeout and retry
[ https://issues.apache.org/jira/browse/YARN-9768?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manikandan R updated YARN-9768: --- Attachment: YARN-9768.007.patch > RM Renew Delegation token thread should timeout and retry > - > > Key: YARN-9768 > URL: https://issues.apache.org/jira/browse/YARN-9768 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: CR Hota >Assignee: Manikandan R >Priority: Major > Attachments: YARN-9768.001.patch, YARN-9768.002.patch, > YARN-9768.003.patch, YARN-9768.004.patch, YARN-9768.005.patch, > YARN-9768.006.patch, YARN-9768.007.patch > > > Delegation token renewer thread in RM (DelegationTokenRenewer.java) renews > HDFS tokens received to check for validity and expiration time. > This call is made to an underlying HDFS NN or Router Node (which has exact > APIs as HDFS NN). If one of the nodes is bad and the renew call is stuck the > thread remains stuck indefinitely. The thread should ideally timeout the > renewToken and retry from the client's perspective. > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-9768) RM Renew Delegation token thread should timeout and retry
[ https://issues.apache.org/jira/browse/YARN-9768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16967650#comment-16967650 ] Manikandan R commented on YARN-9768: Attached .007.patch. Please review. > RM Renew Delegation token thread should timeout and retry > - > > Key: YARN-9768 > URL: https://issues.apache.org/jira/browse/YARN-9768 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: CR Hota >Assignee: Manikandan R >Priority: Major > Attachments: YARN-9768.001.patch, YARN-9768.002.patch, > YARN-9768.003.patch, YARN-9768.004.patch, YARN-9768.005.patch, > YARN-9768.006.patch, YARN-9768.007.patch > > > Delegation token renewer thread in RM (DelegationTokenRenewer.java) renews > HDFS tokens received to check for validity and expiration time. > This call is made to an underlying HDFS NN or Router Node (which has exact > APIs as HDFS NN). If one of the nodes is bad and the renew call is stuck the > thread remains stuck indefinitely. The thread should ideally timeout the > renewToken and retry from the client's perspective. > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
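The "timeout and retry" behaviour asked for in YARN-9768 can be sketched with plain `java.util.concurrent`. This is a minimal illustration under stated assumptions, not the attached patch: `TimedRetry` and `callWithTimeoutAndRetry` are hypothetical names, and the `Callable` stands in for the renew call made by the renewer thread.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Hypothetical helper: bound each attempt by a timeout and retry on timeout. */
final class TimedRetry {

  static <T> T callWithTimeoutAndRetry(Callable<T> call, long timeoutMs, int maxAttempts)
      throws Exception {
    if (maxAttempts < 1) {
      throw new IllegalArgumentException("maxAttempts must be >= 1");
    }
    // Each attempt gets its own worker so a stuck call cannot block the next one.
    ExecutorService pool = Executors.newCachedThreadPool(r -> {
      Thread t = new Thread(r, "timed-retry-worker");
      t.setDaemon(true); // do not keep the JVM alive for a stuck call
      return t;
    });
    try {
      TimeoutException last = null;
      for (int attempt = 1; attempt <= maxAttempts; attempt++) {
        Future<T> future = pool.submit(call);
        try {
          return future.get(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
          future.cancel(true); // interrupt the stuck attempt, then retry
          last = e;
        }
      }
      throw last; // every attempt timed out
    } finally {
      pool.shutdownNow();
    }
  }
}
```

The caller's thread only ever waits `timeoutMs` per attempt. A call that ignores interruption still occupies its worker thread, so a sketch like this limits how long the caller is stuck but does not reclaim the stuck worker.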
[jira] [Comment Edited] (YARN-8004) Add unit tests for inter queue preemption for dominant resource calculator
[ https://issues.apache.org/jira/browse/YARN-8004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16967606#comment-16967606 ] Eric Payne edited comment on YARN-8004 at 11/5/19 4:16 PM: --- {quote} bq. could i backport this to branch-2.9/2.8 as well? Sunil G, sure that would be fine. Thanks for the reviews and commits. {quote} It doesn't look like this happened. It backports cleanly to branch-2. Unless there are objections, I would like to backport this to branch-2. was (Author: eepayne): {quote} bq. could i backport this to branch-2.9/2.8 as well? Sunil G, sure that would be fine. Thanks for the reviews and commits. {quote} It doesn't look like this happened. I backports cleanly to branch-2. Unless there are objections, I would like to backport this to branch-2. > Add unit tests for inter queue preemption for dominant resource calculator > -- > > Key: YARN-8004 > URL: https://issues.apache.org/jira/browse/YARN-8004 > Project: Hadoop YARN > Issue Type: Bug > Components: yarn >Reporter: Sumana Sathish >Assignee: Zian Chen >Priority: Critical > Fix For: 3.2.0, 3.1.1, 3.0.3 > > Attachments: YARN-8004.001.patch, YARN-8004.002.patch, > YARN-8004.003.patch > > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8004) Add unit tests for inter queue preemption for dominant resource calculator
[ https://issues.apache.org/jira/browse/YARN-8004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16967606#comment-16967606 ] Eric Payne commented on YARN-8004: -- {quote} bq. could i backport this to branch-2.9/2.8 as well? Sunil G, sure that would be fine. Thanks for the reviews and commits. {quote} It doesn't look like this happened. I backports cleanly to branch-2. Unless there are objections, I would like to backport this to branch-2. > Add unit tests for inter queue preemption for dominant resource calculator > -- > > Key: YARN-8004 > URL: https://issues.apache.org/jira/browse/YARN-8004 > Project: Hadoop YARN > Issue Type: Bug > Components: yarn >Reporter: Sumana Sathish >Assignee: Zian Chen >Priority: Critical > Fix For: 3.2.0, 3.1.1, 3.0.3 > > Attachments: YARN-8004.001.patch, YARN-8004.002.patch, > YARN-8004.003.patch > > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-9677) Make FpgaDevice and GpuDevice classes more similar to each other
[ https://issues.apache.org/jira/browse/YARN-9677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16967556#comment-16967556 ] Peter Bacsko commented on YARN-9677: Thanks for the patch [~pingsutw]. Looks like a straightforward refactor. +1 (non-binding) > Make FpgaDevice and GpuDevice classes more similar to each other > > > Key: YARN-9677 > URL: https://issues.apache.org/jira/browse/YARN-9677 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Szilard Nemeth >Assignee: kevin su >Priority: Major > Labels: newbie, newbie++ > Attachments: YARN-9677.001.patch > > > org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.resources.fpga.FpgaResourceAllocator.FpgaDevice > is an inner class of FpgaResourceAllocator. > It is not only being used from its parent class but from other classes as > well so we are losing the purpose of the inner class, it does not really make > sense. > We also have > org.apache.hadoop.yarn.server.nodemanager.containermanager.resourceplugin.gpu.GpuDevice > which is a similar class, but for GPU devices. > What we could do here is to make FpgaDevice a single class and harmonize the > packages of these 2 classes, meaning they should be "closer" to each other in > terms of packaging. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-9362) Code cleanup in TestNMLeveldbStateStoreService
[ https://issues.apache.org/jira/browse/YARN-9362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16967545#comment-16967545 ] Peter Bacsko commented on YARN-9362: Thanks for the patch [~denes.gerencser]. I don't have too much context, but having smaller tests instead of a big one is certainly much better. I just have two comments right now: 1) Would be good to see some nice assertion messages. However, it's missing everywhere and I don't think it's worth the hassle, so let's keep it that way. 2) I'm wondering if using {{System.currentTimeMillis()}} is necessary: {noformat} long containerStartTime = System.currentTimeMillis(); {noformat} can't we just use a constant? > Code cleanup in TestNMLeveldbStateStoreService > -- > > Key: YARN-9362 > URL: https://issues.apache.org/jira/browse/YARN-9362 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Szilard Nemeth >Assignee: Denes Gerencser >Priority: Minor > Attachments: YARN-9362.001.patch > > > There are many ways to improve TestNMLeveldbStateStoreService: > 1. RecoveredContainerState fields are asserted many times repeatedly. Some > simple method extractions would definitely make this more readable. > 2. The tests are very long and hard to read in general: Again, finding how > methods could be extracted to avoid code repetition could help. > 3. You name it. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
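The second review comment (use a constant rather than `System.currentTimeMillis()`) could look roughly like the sketch below. It assumes the test only needs some fixed timestamp to store and read back; the class, the constant value, and the `storeAndRecover` stand-in are hypothetical and are not taken from TestNMLeveldbStateStoreService.

```java
/** Hypothetical illustration of a fixed start time in a store-and-recover test. */
final class ContainerStartTimeExample {

  // Arbitrary fixed timestamp (an assumption): the same value on every run,
  // so failures are reproducible and the assertion has a known expected value.
  static final long CONTAINER_START_TIME = 1573000000000L;

  /** Stand-in for the state a store would persist and later recover. */
  static final class RecoveredState {
    final long startTime;

    RecoveredState(long startTime) {
      this.startTime = startTime;
    }
  }

  /** Stand-in for "store the start time, then recover it". */
  static RecoveredState storeAndRecover(long startTime) {
    return new RecoveredState(startTime);
  }

  public static void main(String[] args) {
    RecoveredState recovered = storeAndRecover(CONTAINER_START_TIME);
    if (recovered.startTime != CONTAINER_START_TIME) {
      throw new AssertionError("recovered start time differs from stored value");
    }
    System.out.println("recovered start time matches");
  }
}
```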
[jira] [Comment Edited] (YARN-9537) Add configuration to disable AM preemption
[ https://issues.apache.org/jira/browse/YARN-9537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16967534#comment-16967534 ] zhoukang edited comment on YARN-9537 at 11/5/19 1:39 PM: - [~snemeth] [~adam.antal] The style problem has been fixed. Any more suggestions? Thanks. was (Author: cane): [~snemeth][~adam.antal] the style problem has been fixed. any more suggestion? > Add configuration to disable AM preemption > -- > > Key: YARN-9537 > URL: https://issues.apache.org/jira/browse/YARN-9537 > Project: Hadoop YARN > Issue Type: Improvement > Components: fairscheduler >Affects Versions: 3.2.0, 3.1.2 >Reporter: zhoukang >Assignee: zhoukang >Priority: Major > Attachments: YARN-9537-002.patch, YARN-9537.001.patch, > YARN-9537.003.patch > > > In this issue, I will add a configuration to support disabling AM preemption. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-9537) Add configuration to disable AM preemption
[ https://issues.apache.org/jira/browse/YARN-9537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16967534#comment-16967534 ] zhoukang commented on YARN-9537: [~snemeth] [~adam.antal] The style problem has been fixed. Any more suggestions? > Add configuration to disable AM preemption > -- > > Key: YARN-9537 > URL: https://issues.apache.org/jira/browse/YARN-9537 > Project: Hadoop YARN > Issue Type: Improvement > Components: fairscheduler >Affects Versions: 3.2.0, 3.1.2 >Reporter: zhoukang >Assignee: zhoukang >Priority: Major > Attachments: YARN-9537-002.patch, YARN-9537.001.patch, > YARN-9537.003.patch > > > In this issue, I will add a configuration to support disabling AM preemption. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-9537) Add configuration to disable AM preemption
[ https://issues.apache.org/jira/browse/YARN-9537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16967533#comment-16967533 ] Hadoop QA commented on YARN-9537: -
| (/) *+1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 36s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
|| || || || trunk Compile Tests ||
| +1 | mvninstall | 20m 32s | trunk passed |
| +1 | compile | 0m 45s | trunk passed |
| +1 | checkstyle | 0m 37s | trunk passed |
| +1 | mvnsite | 0m 55s | trunk passed |
| +1 | shadedclient | 14m 26s | branch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 1m 17s | trunk passed |
| +1 | javadoc | 0m 29s | trunk passed |
|| || || || Patch Compile Tests ||
| +1 | mvninstall | 0m 43s | the patch passed |
| +1 | compile | 0m 39s | the patch passed |
| +1 | javac | 0m 39s | the patch passed |
| +1 | checkstyle | 0m 29s | the patch passed |
| +1 | mvnsite | 0m 42s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedclient | 13m 38s | patch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 1m 17s | the patch passed |
| +1 | javadoc | 0m 27s | the patch passed |
|| || || || Other Tests ||
| +1 | unit | 85m 27s | hadoop-yarn-server-resourcemanager in the patch passed. |
| +1 | asflicense | 0m 30s | The patch does not generate ASF License warnings. |
| | | 143m 14s | |

|| Subsystem || Report/Notes ||
| Docker | Client=19.03.4 Server=19.03.4 Image:yetus/hadoop:104ccca9169 |
| JIRA Issue | YARN-9537 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12984918/YARN-9537.003.patch |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux ffcd33807bf3 4.15.0-66-generic #75-Ubuntu SMP Tue Oct 1 05:24:09 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / d17ba85 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_222 |
| findbugs | v3.1.0-RC1 |
| Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/25103/testReport/ |
| Max. process+thread count | 816 (vs. ulimit of 5500) |
| modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager |
| Console output | https://builds.apache.org/job/PreCommit-YARN-Build/25103/console |
| Powered by | Apache Yetus 0.8.0 http://yetus.apache.org |

This message was automatically generated. > Add configuration to disable AM preemp
[jira] [Commented] (YARN-9899) Migration tool that help to generate CS config based on FS config [Phase 2]
[ https://issues.apache.org/jira/browse/YARN-9899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16967513#comment-16967513 ] Peter Bacsko commented on YARN-9899: Fixed a couple of minor things in v5. Also, if args.length == 0, we now display the help text instead of throwing an exception. [~snemeth] could you check out patch v5 when you have some time? > Migration tool that help to generate CS config based on FS config [Phase 2] > > > Key: YARN-9899 > URL: https://issues.apache.org/jira/browse/YARN-9899 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Szilard Nemeth >Assignee: Peter Bacsko >Priority: Major > Attachments: YARN-9899-001.patch, YARN-9899-002.patch, > YARN-9899-003.patch, YARN-9899-004.patch, YARN-9899-005.patch > > > YARN-9699 laid down the groundworks of a converter from FS to CS config. > During the development of the converter, we came up with the following things > to fix. > 1. If we don't specify a mandatory option, we have this stacktrace for > example: > > {code:java} > org.apache.commons.cli.MissingOptionException: Missing required option: o > at org.apache.commons.cli.Parser.checkRequiredOptions(Parser.java:299) > at org.apache.commons.cli.Parser.parse(Parser.java:231) > at org.apache.commons.cli.Parser.parse(Parser.java:85) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.converter.FSConfigToCSConfigArgumentHandler.parseAndConvert(FSConfigToCSConfigArgumentHandler.java:100) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.main(ResourceManager.java:1572){code} > > We should provide a more concise and meaningful error message (without > stacktrace on the CLI, but we should log the exception with stacktrace to the > RM log). > An explanation of the missing option is also required. > 2. We may think about how to handle exceptions from commons CLI: > MissingArgumentException vs. MissingOptionException > 3. We need to provide a -h / --help option for the CLI that prints all the > possible options / arguments. > 4. Last but not least: We should move the CLI command to a more reasonable > place: > As YARN-9699 implemented it, the command can be invoked like: > {code:java} > /opt/hadoop/bin/yarn resourcemanager -convert-fs-configuration -y > /opt/hadoop/etc/hadoop/yarn-site.xml -f > /opt/hadoop/etc/hadoop/fair-scheduler.xml -r > ~systest/sample-rules-config.properties -o /tmp/fs-cs-output > {code} > This is problematic, as if YARN RM is already running, we need to stop it in > order to start the RM again with the conversion switch. > 5. Add unit test coverage for {{QueuePlacementConverter}} > 6. Close some feature gaps. > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
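Items 1 to 3 of the description, together with the "help when args.length == 0" behaviour mentioned in the comment, can be sketched without any CLI library. This is a hedged illustration, not the code in the attached patches: the class name is hypothetical, the option letters are copied from the command line quoted in the issue, and which of them are mandatory is an assumption (the stack trace only confirms `o`).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical argument handling: help on empty input, concise errors, no stack trace. */
final class ConverterCliSketch {

  // Assumption: -y, -f and -o are mandatory; only -o is confirmed by the issue.
  static final List<String> REQUIRED = Arrays.asList("-y", "-f", "-o");

  static final String USAGE =
      "Usage: -y <yarn-site.xml> -f <fair-scheduler.xml> [-r <rules file>] -o <output dir> [-h]";

  /** Parses the converter options only; returns an exit code and writes messages to out. */
  static int run(String[] args, StringBuilder out) {
    List<String> argList = Arrays.asList(args);
    if (args.length == 0 || argList.contains("-h") || argList.contains("--help")) {
      out.append(USAGE); // help instead of an exception
      return 0;
    }
    Map<String, String> options = new LinkedHashMap<>();
    for (int i = 0; i < args.length; i++) {
      if (!args[i].startsWith("-")) {
        continue; // stray values are ignored in this sketch
      }
      if (i + 1 >= args.length || args[i + 1].startsWith("-")) {
        // Option present but its value is missing.
        out.append("Missing argument for option: ").append(args[i]);
        return 1;
      }
      options.put(args[i], args[i + 1]);
      i++; // skip the value just consumed
    }
    List<String> missing = new ArrayList<>();
    for (String required : REQUIRED) {
      if (!options.containsKey(required)) {
        missing.add(required);
      }
    }
    if (!missing.isEmpty()) {
      // Mandatory option absent altogether: short message plus usage.
      out.append("Missing required option(s): ").append(String.join(", ", missing))
          .append('\n').append(USAGE);
      return 1;
    }
    out.append("OK");
    return 0;
  }
}
```

Keeping "option without a value" and "mandatory option absent" as separate messages mirrors the MissingArgumentException vs. MissingOptionException distinction raised in item 2. Per item 1, a real implementation would also log the full exception to the RM log.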
[jira] [Updated] (YARN-9899) Migration tool that help to generate CS config based on FS config [Phase 2]
[ https://issues.apache.org/jira/browse/YARN-9899?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Bacsko updated YARN-9899: --- Attachment: YARN-9899-005.patch > Migration tool that help to generate CS config based on FS config [Phase 2] > > > Key: YARN-9899 > URL: https://issues.apache.org/jira/browse/YARN-9899 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Szilard Nemeth >Assignee: Peter Bacsko >Priority: Major > Attachments: YARN-9899-001.patch, YARN-9899-002.patch, > YARN-9899-003.patch, YARN-9899-004.patch, YARN-9899-005.patch > > > YARN-9699 laid down the groundworks of a converter from FS to CS config. > During the development of the converter, we came up with the following things > to fix. > 1. If we don't specify a mandatory option, we have this stacktrace for > example: > > {code:java} > org.apache.commons.cli.MissingOptionException: Missing required option: o > at org.apache.commons.cli.Parser.checkRequiredOptions(Parser.java:299) > at org.apache.commons.cli.Parser.parse(Parser.java:231) > at org.apache.commons.cli.Parser.parse(Parser.java:85) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.converter.FSConfigToCSConfigArgumentHandler.parseAndConvert(FSConfigToCSConfigArgumentHandler.java:100) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.main(ResourceManager.java:1572){code} > > We should provide a more concise and meaningful error message (without > stacktrace on the CLI, but we should log the exception with stacktrace to the > RM log). > An explanation of the missing option is also required. > 2. We may think about how to handle exceptions from commons CLI: > MissingArgumentException vs. MissingOptionException > 3. We need to provide a -h / --help option for the CLI that prints all the > possible options / arguments. > 4. Last but not least: We should move the CLI command to a more reasonable > place: > As YARN-9699 implemented it, the command can be invoked like: > {code:java} > /opt/hadoop/bin/yarn resourcemanager -convert-fs-configuration -y > /opt/hadoop/etc/hadoop/yarn-site.xml -f > /opt/hadoop/etc/hadoop/fair-scheduler.xml -r > ~systest/sample-rules-config.properties -o /tmp/fs-cs-output > {code} > This is problematic, as if YARN RM is already running, we need to stop it in > order to start the RM again with the conversion switch. > 5. Add unit test coverage for {{QueuePlacementConverter}} > 6. Close some feature gaps. > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-9937) Add missing queue configs in RMWebService#CapacitySchedulerQueueInfo
[ https://issues.apache.org/jira/browse/YARN-9937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16967472#comment-16967472 ] Hadoop QA commented on YARN-9937: -
| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 10m 7s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
|| || || || branch-3.2 Compile Tests ||
| -1 | mvninstall | 6m 32s | root in branch-3.2 failed. |
| -1 | compile | 0m 21s | hadoop-yarn-server-resourcemanager in branch-3.2 failed. |
| +1 | checkstyle | 0m 32s | branch-3.2 passed |
| -1 | mvnsite | 0m 22s | hadoop-yarn-server-resourcemanager in branch-3.2 failed. |
| -1 | shadedclient | 4m 19s | branch has errors when building and testing our client artifacts. |
| -1 | findbugs | 0m 22s | hadoop-yarn-server-resourcemanager in branch-3.2 failed. |
| +1 | javadoc | 0m 25s | branch-3.2 passed |
|| || || || Patch Compile Tests ||
| +1 | mvninstall | 0m 43s | the patch passed |
| +1 | compile | 0m 35s | the patch passed |
| -1 | javac | 0m 35s | hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager generated 15 new + 2 unchanged - 10 fixed = 17 total (was 12) |
| -0 | checkstyle | 0m 29s | hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: The patch generated 39 new + 166 unchanged - 1 fixed = 205 total (was 167) |
| +1 | mvnsite | 0m 37s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedclient | 13m 18s | patch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 1m 14s | the patch passed |
| +1 | javadoc | 0m 26s | hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager generated 0 new + 4 unchanged - 2 fixed = 4 total (was 6) |
|| || || || Other Tests ||
| -1 | unit | 74m 35s | hadoop-yarn-server-resourcemanager in the patch failed. |
| +1 | asflicense | 0m 25s | The patch does not generate ASF License warnings. |
| | | 115m 30s | |

|| Reason || Tests ||
| Failed junit tests | hadoop.yarn.server.resourcemanager.metrics.TestSystemMetricsPublisherForV2 |
| | hadoop.yarn.server.resourcemanager.metrics.TestCombinedSystemMetricsPublisher |

|| Subsystem || Report/Notes ||
| Docker | Client=19.03.4 Server=19.03.4 Image:yetus/hadoop:63396beab41 |
| JIRA Issue | YARN-9937 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12984910/YARN-9937-branch-3.2.001.patch |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux eb53e1e659af 4.15.0-66-generic #75-Ubuntu SMP Tue Oct 1 05:24:09 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/p
[jira] [Issue Comment Deleted] (YARN-9605) Add ZkConfiguredFailoverProxyProvider for RM HA
[ https://issues.apache.org/jira/browse/YARN-9605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhoukang updated YARN-9605: --- Comment: was deleted (was: ping [~prabhujoseph] [~subru] [~tangzhankun] could you help review or retest this? thanks! I have run the related unit test locally) > Add ZkConfiguredFailoverProxyProvider for RM HA > --- > > Key: YARN-9605 > URL: https://issues.apache.org/jira/browse/YARN-9605 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: zhoukang >Assignee: zhoukang >Priority: Major > Fix For: 3.2.0, 3.1.2 > > Attachments: YARN-9605.001.patch, YARN-9605.002.patch > > > In this issue, i will track a new feature to support > ZkConfiguredFailoverProxyProvider for RM HA -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-9605) Add ZkConfiguredFailoverProxyProvider for RM HA
[ https://issues.apache.org/jira/browse/YARN-9605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16967436#comment-16967436 ] zhoukang commented on YARN-9605: ping [~prabhujoseph] [~subru] [~tangzhankun] could you help review or retest this? thanks! I have run the related unit test locally > Add ZkConfiguredFailoverProxyProvider for RM HA > --- > > Key: YARN-9605 > URL: https://issues.apache.org/jira/browse/YARN-9605 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: zhoukang >Assignee: zhoukang >Priority: Major > Fix For: 3.2.0, 3.1.2 > > Attachments: YARN-9605.001.patch, YARN-9605.002.patch > > > In this issue, i will track a new feature to support > ZkConfiguredFailoverProxyProvider for RM HA -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-9537) Add configuration to disable AM preemption
[ https://issues.apache.org/jira/browse/YARN-9537?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhoukang updated YARN-9537: --- Attachment: YARN-9537.003.patch > Add configuration to disable AM preemption > -- > > Key: YARN-9537 > URL: https://issues.apache.org/jira/browse/YARN-9537 > Project: Hadoop YARN > Issue Type: Improvement > Components: fairscheduler >Affects Versions: 3.2.0, 3.1.2 >Reporter: zhoukang >Assignee: zhoukang >Priority: Major > Attachments: YARN-9537-002.patch, YARN-9537.001.patch, > YARN-9537.003.patch > > > In this issue, I will add a configuration to support disabling AM preemption. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-9949) Add missing queue configs for root queue in RMWebService#CapacitySchedulerInfo
[ https://issues.apache.org/jira/browse/YARN-9949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16967408#comment-16967408 ] Prabhu Joseph commented on YARN-9949: - Thanks [~ferhui] for finding the issue. This patch requires YARN-9937 as well. I have backported YARN-9937 to branch-3.2. > Add missing queue configs for root queue in > RMWebService#CapacitySchedulerInfo > --- > > Key: YARN-9949 > URL: https://issues.apache.org/jira/browse/YARN-9949 > Project: Hadoop YARN > Issue Type: Bug > Components: capacity scheduler >Affects Versions: 3.3.0 >Reporter: Prabhu Joseph >Assignee: Prabhu Joseph >Priority: Minor > Fix For: 3.3.0, 3.2.2 > > Attachments: YARN-9949-001.patch, YARN-9949-002.patch > > > YARN-9937 added the missing queue configs below but did not add them for the root > queue. > 1. Maximum Allocation > 2. Queue ACLs > 3. Queue Priority > 4. Application Lifetime > 5. Ordering Policy. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-9937) Add missing queue configs in RMWebService#CapacitySchedulerQueueInfo
[ https://issues.apache.org/jira/browse/YARN-9937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16967407#comment-16967407 ] Prabhu Joseph commented on YARN-9937: - Reopened to submit branch-3.2 patch. > Add missing queue configs in RMWebService#CapacitySchedulerQueueInfo > > > Key: YARN-9937 > URL: https://issues.apache.org/jira/browse/YARN-9937 > Project: Hadoop YARN > Issue Type: Improvement > Components: capacity scheduler >Affects Versions: 3.3.0 >Reporter: Prabhu Joseph >Assignee: Prabhu Joseph >Priority: Major > Fix For: 3.3.0 > > Attachments: Screen Shot 2019-10-28 at 8.54.53 PM.png, > YARN-9937-001.patch, YARN-9937-002.patch, YARN-9937-003.patch, > YARN-9937-004.patch, YARN-9937-branch-3.2.001.patch > > > Below are the missing queue configs which are not part of RMWebServices > scheduler endpoint. > 1. Maximum Allocation > 2. Queue ACLs > 3. Queue Priority > 4. Application Lifetime -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-9937) Add missing queue configs in RMWebService#CapacitySchedulerQueueInfo
[ https://issues.apache.org/jira/browse/YARN-9937?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prabhu Joseph updated YARN-9937: Attachment: YARN-9937-branch-3.2.001.patch > Add missing queue configs in RMWebService#CapacitySchedulerQueueInfo > > > Key: YARN-9937 > URL: https://issues.apache.org/jira/browse/YARN-9937 > Project: Hadoop YARN > Issue Type: Improvement > Components: capacity scheduler >Affects Versions: 3.3.0 >Reporter: Prabhu Joseph >Assignee: Prabhu Joseph >Priority: Major > Fix For: 3.3.0 > > Attachments: Screen Shot 2019-10-28 at 8.54.53 PM.png, > YARN-9937-001.patch, YARN-9937-002.patch, YARN-9937-003.patch, > YARN-9937-004.patch, YARN-9937-branch-3.2.001.patch > > > Below are the missing queue configs which are not part of RMWebServices > scheduler endpoint. > 1. Maximum Allocation > 2. Queue ACLs > 3. Queue Priority > 4. Application Lifetime -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Reopened] (YARN-9937) Add missing queue configs in RMWebService#CapacitySchedulerQueueInfo
[ https://issues.apache.org/jira/browse/YARN-9937?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prabhu Joseph reopened YARN-9937: - > Add missing queue configs in RMWebService#CapacitySchedulerQueueInfo > > > Key: YARN-9937 > URL: https://issues.apache.org/jira/browse/YARN-9937 > Project: Hadoop YARN > Issue Type: Improvement > Components: capacity scheduler >Affects Versions: 3.3.0 >Reporter: Prabhu Joseph >Assignee: Prabhu Joseph >Priority: Major > Fix For: 3.3.0 > > Attachments: Screen Shot 2019-10-28 at 8.54.53 PM.png, > YARN-9937-001.patch, YARN-9937-002.patch, YARN-9937-003.patch, > YARN-9937-004.patch, YARN-9937-branch-3.2.001.patch > > > Below are the missing queue configs which are not part of RMWebServices > scheduler endpoint. > 1. Maximum Allocation > 2. Queue ACLs > 3. Queue Priority > 4. Application Lifetime -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (YARN-9948) Remove attempts that are beyond max-attempt limit from RMAppImpl
[ https://issues.apache.org/jira/browse/YARN-9948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16967374#comment-16967374 ] Jun Gong edited comment on YARN-9948 at 11/5/19 9:31 AM: - [~ziqian hu] Thanks for the patch. How much memory does it consume? If not much, it would be better to keep them; the reasons can be found at [comment 1|https://issues.apache.org/jira/browse/YARN-3480?focusedCommentId=15059193&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15059193] and [comment 2|https://issues.apache.org/jira/browse/YARN-3480?focusedCommentId=15059399&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15059399]. was (Author: hex108): [~ziqian hu] Thanks for the patch. How much memory does it consume? If not much, it would be better to keep them; the reasons can be found at comment 1 and comment 2. > Remove attempts that are beyond max-attempt limit from RMAppImpl > > > Key: YARN-9948 > URL: https://issues.apache.org/jira/browse/YARN-9948 > Project: Hadoop YARN > Issue Type: Improvement > Components: resourcemanager >Affects Versions: 3.1.3 >Reporter: Hu Ziqian >Priority: Major > Attachments: YARN-9948.001.patch > > > The RM stores app attempts in both the state store and RMAppImpl. YARN-3480 > removes attempts that are beyond the max-attempt limit from the state store. In this > issue we delete those attempts from RMAppImpl to reduce the memory usage > of the RM. > We introduce the flag yarn.resourcemanager.am.delete-old-attempts.enabled to > enable this logic; the default value is false. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
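The trimming idea described in YARN-9948 can be sketched roughly as follows. This is only an illustration and not the code in YARN-9948.001.patch: the class `AttemptTrimmer`, the method `trimOldAttempts`, and the use of a `LinkedHashMap` as a stand-in for the attempt map inside RMAppImpl are all assumptions, and the `enabled` parameter merely mimics the proposed yarn.resourcemanager.am.delete-old-attempts.enabled flag.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for the in-memory attempt bookkeeping; not the real RMAppImpl.
class AttemptTrimmer {
  /**
   * Removes the oldest entries until at most maxAttempts remain.
   * Relies on LinkedHashMap iterating in insertion order (oldest attempt first).
   * Returns the number of entries removed; does nothing when the flag is off.
   */
  static <K, V> int trimOldAttempts(LinkedHashMap<K, V> attempts,
                                    int maxAttempts, boolean enabled) {
    if (!enabled || maxAttempts < 0) {
      return 0;
    }
    int removed = 0;
    Iterator<Map.Entry<K, V>> it = attempts.entrySet().iterator();
    while (attempts.size() > maxAttempts && it.hasNext()) {
      it.next();
      it.remove(); // drops the oldest remaining attempt
      removed++;
    }
    return removed;
  }
}
```

With the flag off the map is left untouched, which matches the proposed default of false. Whether old attempts should be kept at all is exactly the question raised in the comment above.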
[jira] [Commented] (YARN-9923) Detect missing Docker binary or not running Docker daemon
[ https://issues.apache.org/jira/browse/YARN-9923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16967375#comment-16967375 ] Szilard Nemeth commented on YARN-9923: -- Hi [~adam.antal] / [~ebadger]! If I understood Adam's latest comment correctly, if a user set the flag to NONE, it would have the current behaviour. If it's set to RUNTIME, the NM health check script would run, as suggested by [~ebadger]. The STARTUP option is clear, it just checks the daemon at startup. I would vote to implement this option as well, for better flexibility. What do you guys think? > Detect missing Docker binary or not running Docker daemon > - > > Key: YARN-9923 > URL: https://issues.apache.org/jira/browse/YARN-9923 > Project: Hadoop YARN > Issue Type: New Feature > Components: nodemanager, yarn >Affects Versions: 3.2.1 >Reporter: Adam Antal >Assignee: Adam Antal >Priority: Major > > Currently if a NodeManager is enabled to allocate Docker containers, but the > specified binary (docker.binary in the container-executor.cfg) is missing the > container allocation fails with the following error message: > {noformat} > Container launch fails > Exit code: 29 > Exception message: Launch container failed > Shell error output: sh: : No > such file or directory > Could not inspect docker network to get type /usr/bin/docker network inspect > host --format='{{.Driver}}'. > Error constructing docker command, docker error code=-1, error > message='Unknown error' > {noformat} > I suggest to add a property say "yarn.nodemanager.runtime.linux.docker.check" > to have the following options: > - STARTUP: setting this option the NodeManager would not start if Docker > binaries are missing or the Docker daemon is not running (the exception is > considered FATAL during startup) > - RUNTIME: would give a more detailed/user-friendly exception in > NodeManager's side (NM logs) if Docker binaries are missing or the daemon is > not working. 
This would also prevent further Docker container allocation as > long as the binaries do not exist and the docker daemon is not running. > - NONE (default): preserving the current behaviour, throwing exception during > container allocation, carrying on using the default retry procedure. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
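The three proposed options could be modeled along these lines. This is a minimal sketch under stated assumptions: the names `DockerCheckSketch`, `Mode`, `Action`, `parseMode` and `decide` are hypothetical, and the availability probe is passed in as a `BooleanSupplier` because the issue does not say how the binary or daemon would actually be checked.

```java
import java.util.Locale;
import java.util.function.BooleanSupplier;

// Hypothetical model of the proposed "yarn.nodemanager.runtime.linux.docker.check" values.
class DockerCheckSketch {
  enum Mode { STARTUP, RUNTIME, NONE }

  // What the NodeManager would do after a check (illustrative names only).
  enum Action { PROCEED, ABORT_STARTUP, BLOCK_DOCKER_CONTAINERS }

  // An unset or blank value falls back to NONE, the proposed default.
  static Mode parseMode(String raw) {
    if (raw == null || raw.trim().isEmpty()) {
      return Mode.NONE;
    }
    return Mode.valueOf(raw.trim().toUpperCase(Locale.ROOT));
  }

  static Action decide(Mode mode, boolean atStartup, BooleanSupplier dockerAvailable) {
    switch (mode) {
      case STARTUP:
        // Only checked while starting; a missing binary or daemon is fatal.
        return (atStartup && !dockerAvailable.getAsBoolean())
            ? Action.ABORT_STARTUP : Action.PROCEED;
      case RUNTIME:
        // Checked repeatedly; Docker containers are refused while Docker is unavailable.
        return dockerAvailable.getAsBoolean()
            ? Action.PROCEED : Action.BLOCK_DOCKER_CONTAINERS;
      default:
        // NONE: keep the current behaviour and let the container launch fail.
        return Action.PROCEED;
    }
  }
}
```

Passing the probe as a supplier keeps the decision logic testable without a Docker installation.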
[jira] [Commented] (YARN-9948) Remove attempts that are beyond max-attempt limit from RMAppImpl
[ https://issues.apache.org/jira/browse/YARN-9948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16967374#comment-16967374 ] Jun Gong commented on YARN-9948: [~ziqian hu] Thanks for the patch. How much memory does it consume? If not much, it would be better to keep them; the reasons can be found at comment 1 and comment 2. > Remove attempts that are beyond max-attempt limit from RMAppImpl > > > Key: YARN-9948 > URL: https://issues.apache.org/jira/browse/YARN-9948 > Project: Hadoop YARN > Issue Type: Improvement > Components: resourcemanager >Affects Versions: 3.1.3 >Reporter: Hu Ziqian >Priority: Major > Attachments: YARN-9948.001.patch > > > The RM stores app attempts in both the state store and RMAppImpl. YARN-3480 > removes attempts that are beyond the max-attempt limit from the state store. In this > issue we delete those attempts from RMAppImpl to reduce the memory usage > of the RM. > We introduce the flag yarn.resourcemanager.am.delete-old-attempts.enabled to > enable this logic; the default value is false. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Issue Comment Deleted] (YARN-9940) avoid continuous scheduling thread crashes while sorting nodes get 'Comparison method violates its general contract'
[ https://issues.apache.org/jira/browse/YARN-9940?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] kailiu_dev updated YARN-9940: - Comment: was deleted (was: The exception is the same, but my problem is about continuous scheduling, whereas that one is about FSParentQueue) > avoid continuous scheduling thread crashes while sorting nodes get > 'Comparison method violates its general contract' > > > Key: YARN-9940 > URL: https://issues.apache.org/jira/browse/YARN-9940 > Project: Hadoop YARN > Issue Type: Bug > Components: fairscheduler >Affects Versions: 2.7.2 >Reporter: kailiu_dev >Priority: Major > Fix For: 2.7.2 > > Attachments: YARN-9940-branch-2.7.2.001.patch > > > 2019-10-16 09:14:51,215 ERROR > org.apache.hadoop.yarn.YarnUncaughtExceptionHandler: Thread > Thread[FairSchedulerContinuousScheduling,5,main] threw an Exception. > java.lang.IllegalArgumentException: Comparison method violates its general > contract! > at java.util.TimSort.mergeHi(TimSort.java:868) > at java.util.TimSort.mergeAt(TimSort.java:485) > at java.util.TimSort.mergeForceCollapse(TimSort.java:426) > at java.util.TimSort.sort(TimSort.java:223) > at java.util.TimSort.sort(TimSort.java:173) > at java.util.Arrays.sort(Arrays.java:659) > at java.util.Collections.sort(Collections.java:217) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.continuousSchedulingAttempt(FairScheduler.java:1117) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler$ContinuousSchedulingThread.run(FairScheduler.java:296) -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (YARN-9940) avoid continuous scheduling thread crashes while sorting nodes get 'Comparison method violates its general contract'
[ https://issues.apache.org/jira/browse/YARN-9940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16967348#comment-16967348 ] kailiu_dev edited comment on YARN-9940 at 11/5/19 9:20 AM: --- YARN-8436 and YARN-9940 throw the same exception, but my problem is about continuous scheduling, whereas YARN-8436 is about FSParentQueue. was (Author: kailiu_dev): The exception is the same, but my problem is about continuous scheduling, whereas that one is about FSParentQueue. > avoid continuous scheduling thread crashes while sorting nodes get > 'Comparison method violates its general contract' > > > Key: YARN-9940 > URL: https://issues.apache.org/jira/browse/YARN-9940 > Project: Hadoop YARN > Issue Type: Bug > Components: fairscheduler >Affects Versions: 2.7.2 >Reporter: kailiu_dev >Priority: Major > Fix For: 2.7.2 > > Attachments: YARN-9940-branch-2.7.2.001.patch > > > 2019-10-16 09:14:51,215 ERROR > org.apache.hadoop.yarn.YarnUncaughtExceptionHandler: Thread > Thread[FairSchedulerContinuousScheduling,5,main] threw an Exception. > java.lang.IllegalArgumentException: Comparison method violates its general > contract! > at java.util.TimSort.mergeHi(TimSort.java:868) > at java.util.TimSort.mergeAt(TimSort.java:485) > at java.util.TimSort.mergeForceCollapse(TimSort.java:426) > at java.util.TimSort.sort(TimSort.java:223) > at java.util.TimSort.sort(TimSort.java:173) > at java.util.Arrays.sort(Arrays.java:659) > at java.util.Collections.sort(Collections.java:217) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.continuousSchedulingAttempt(FairScheduler.java:1117) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler$ContinuousSchedulingThread.run(FairScheduler.java:296) -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-9940) avoid continuous scheduling thread crashes while sorting nodes get 'Comparison method violates its general contract'
[ https://issues.apache.org/jira/browse/YARN-9940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16967348#comment-16967348 ] kailiu_dev commented on YARN-9940: -- The exception is the same, but my problem is about continuous scheduling, whereas that one is about FSParentQueue. > avoid continuous scheduling thread crashes while sorting nodes get > 'Comparison method violates its general contract' > > > Key: YARN-9940 > URL: https://issues.apache.org/jira/browse/YARN-9940 > Project: Hadoop YARN > Issue Type: Bug > Components: fairscheduler >Affects Versions: 2.7.2 >Reporter: kailiu_dev >Priority: Major > Fix For: 2.7.2 > > Attachments: YARN-9940-branch-2.7.2.001.patch > > > 2019-10-16 09:14:51,215 ERROR > org.apache.hadoop.yarn.YarnUncaughtExceptionHandler: Thread > Thread[FairSchedulerContinuousScheduling,5,main] threw an Exception. > java.lang.IllegalArgumentException: Comparison method violates its general > contract! > at java.util.TimSort.mergeHi(TimSort.java:868) > at java.util.TimSort.mergeAt(TimSort.java:485) > at java.util.TimSort.mergeForceCollapse(TimSort.java:426) > at java.util.TimSort.sort(TimSort.java:223) > at java.util.TimSort.sort(TimSort.java:173) > at java.util.Arrays.sort(Arrays.java:659) > at java.util.Collections.sort(Collections.java:217) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.continuousSchedulingAttempt(FairScheduler.java:1117) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler$ContinuousSchedulingThread.run(FairScheduler.java:296) -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-9940) avoid continuous scheduling thread crashes while sorting nodes get 'Comparison method violates its general contract'
[ https://issues.apache.org/jira/browse/YARN-9940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16967347#comment-16967347 ] kailiu_dev commented on YARN-9940: -- The exception is the same, but my problem is about continuous scheduling, whereas that one is about FSParentQueue. > avoid continuous scheduling thread crashes while sorting nodes get > 'Comparison method violates its general contract' > > > Key: YARN-9940 > URL: https://issues.apache.org/jira/browse/YARN-9940 > Project: Hadoop YARN > Issue Type: Bug > Components: fairscheduler >Affects Versions: 2.7.2 >Reporter: kailiu_dev >Priority: Major > Fix For: 2.7.2 > > Attachments: YARN-9940-branch-2.7.2.001.patch > > > 2019-10-16 09:14:51,215 ERROR > org.apache.hadoop.yarn.YarnUncaughtExceptionHandler: Thread > Thread[FairSchedulerContinuousScheduling,5,main] threw an Exception. > java.lang.IllegalArgumentException: Comparison method violates its general > contract! > at java.util.TimSort.mergeHi(TimSort.java:868) > at java.util.TimSort.mergeAt(TimSort.java:485) > at java.util.TimSort.mergeForceCollapse(TimSort.java:426) > at java.util.TimSort.sort(TimSort.java:223) > at java.util.TimSort.sort(TimSort.java:173) > at java.util.Arrays.sort(Arrays.java:659) > at java.util.Collections.sort(Collections.java:217) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.continuousSchedulingAttempt(FairScheduler.java:1117) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler$ContinuousSchedulingThread.run(FairScheduler.java:296) -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-9948) Remove attempts that are beyond max-attempt limit from RMAppImpl
[ https://issues.apache.org/jira/browse/YARN-9948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16967341#comment-16967341 ] Hu Ziqian commented on YARN-9948: - [~hex108], [~jianhe], this issue is based on YARN-3480. Could you review it and give some advice? > Remove attempts that are beyond max-attempt limit from RMAppImpl > > > Key: YARN-9948 > URL: https://issues.apache.org/jira/browse/YARN-9948 > Project: Hadoop YARN > Issue Type: Improvement > Components: resourcemanager >Affects Versions: 3.1.3 >Reporter: Hu Ziqian >Priority: Major > Attachments: YARN-9948.001.patch > > > The RM stores app attempts in both the state store and RMAppImpl. YARN-3480 > removes attempts that are beyond the max-attempt limit from the state store. In this > issue we delete those attempts from RMAppImpl to reduce the memory usage > of the RM. > We introduce the flag yarn.resourcemanager.am.delete-old-attempts.enabled to > enable this logic; the default value is false. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8436) FSParentQueue: Comparison method violates its general contract
[ https://issues.apache.org/jira/browse/YARN-8436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16967339#comment-16967339 ] kailiu_dev commented on YARN-8436: -- Dear [~wilfreds], could you please help me review this code? It is a fix that prevents the continuous scheduling thread from crashing with 'Comparison method violates its general contract' while sorting nodes in Hadoop 2.7.2: https://issues.apache.org/jira/browse/YARN-9940 > FSParentQueue: Comparison method violates its general contract > -- > > Key: YARN-8436 > URL: https://issues.apache.org/jira/browse/YARN-8436 > Project: Hadoop YARN > Issue Type: Bug > Components: fairscheduler >Affects Versions: 3.1.0 >Reporter: Wilfred Spiegelenburg >Assignee: Wilfred Spiegelenburg >Priority: Minor > Fix For: 3.2.0 > > Attachments: YARN-8436.001.patch, YARN-8436.002.patch, > YARN-8436.003.patch > > > The ResourceManager can fail while sorting queues if an update comes in: > {code:java} > FATAL org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Error in > handling event type NODE_UPDATE to the scheduler > java.lang.IllegalArgumentException: Comparison method violates its general > contract! > at java.util.TimSort.mergeLo(TimSort.java:777) > at java.util.TimSort.mergeAt(TimSort.java:514) > ... > at java.util.Collections.sort(Collections.java:175) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSParentQueue.assignContainer(FSParentQueue.java:223){code} > The reason it breaks is a change in the sorted object itself. > This is why it fails: > * an update from a node comes in as a heartbeat. > * the update triggers a check to see if we can assign a container on the > node. > * walk over the queue hierarchy to find a queue to assign a container to: > top down. > * for each parent queue we sort the child queues in {{assignContainer}} to > decide which queue to descend into. 
> * we lock the parent queue when sorting to prevent changes, but we do not lock > the child queues that we are sorting. > If during this sorting a different node update changes a child queue then we > allow that. This means that the objects that we are trying to sort now might > be out of order. That causes the issue with the comparator. The comparator > itself is not broken. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
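One general way to keep a sort stable when the sorted objects can change underneath it is to read each sort key once and sort on that snapshot. The sketch below illustrates that technique only; it is not the approach taken in the YARN-8436 patches, and `SnapshotSort`, `Keyed` and `sortBySnapshot` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.function.ToLongFunction;

// Illustrative helper: sorts on keys captured before the sort starts.
class SnapshotSort {
  // Pairs an item with the key value that was read exactly once.
  private static final class Keyed<T> {
    final T item;
    final long key;

    Keyed(T item, long key) {
      this.item = item;
      this.key = key;
    }
  }

  static <T> List<T> sortBySnapshot(List<T> items, ToLongFunction<T> liveKey) {
    List<Keyed<T>> keyed = new ArrayList<>(items.size());
    for (T item : items) {
      // The live, possibly changing value is read here and never again.
      keyed.add(new Keyed<>(item, liveKey.applyAsLong(item)));
    }
    // The comparator only sees immutable snapshot keys, so its results stay consistent.
    keyed.sort(Comparator.comparingLong((Keyed<T> k) -> k.key));
    List<T> sorted = new ArrayList<>(keyed.size());
    for (Keyed<T> k : keyed) {
      sorted.add(k.item);
    }
    return sorted;
  }
}
```

The resulting order may already be slightly stale when it is used, but the comparator can no longer observe a change in the middle of a sort, which is the condition described above.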
[jira] [Comment Edited] (YARN-6448) Continuous scheduling thread crashes while sorting nodes
[ https://issues.apache.org/jira/browse/YARN-6448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16967334#comment-16967334 ] kailiu_dev edited comment on YARN-6448 at 11/5/19 8:44 AM: --- Dear [~yufeigu], could you please help me review this code? It is a fix that prevents the continuous scheduling thread from crashing with 'Comparison method violates its general contract' while sorting nodes in Hadoop 2.7.2: https://issues.apache.org/jira/browse/YARN-9940 was (Author: kailiu_dev): Dear [~yufeigu], could you please help me review this code? It is a fix that prevents the continuous scheduling thread from crashing with 'Comparison method violates its general contract' while sorting nodes in Hadoop 2.7.2. > Continuous scheduling thread crashes while sorting nodes > > > Key: YARN-6448 > URL: https://issues.apache.org/jira/browse/YARN-6448 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 2.9.0, 3.0.0-alpha2 >Reporter: Yufei Gu >Assignee: Yufei Gu >Priority: Major > Fix For: 2.9.0, 3.0.0-alpha4 > > Attachments: YARN-6448.001.patch, YARN-6448.002.patch, > YARN-6448.003.patch, YARN-6448.004.patch > > > YARN-4719 removed the lock in continuous scheduling while sorting nodes. It > breaks the comparison order if nodes change while sorting. > {code} > 2017-04-04 23:42:26,123 FATAL > org.apache.hadoop.yarn.server.resourcemanager.RMCriticalThreadUncaughtExceptionHandler: > Critical thread FairSchedulerContinuousScheduling crashed! > java.lang.IllegalArgumentException: Comparison method violates its general > contract! 
> at java.util.TimSort.mergeHi(TimSort.java:899) > at java.util.TimSort.mergeAt(TimSort.java:516) > at java.util.TimSort.mergeForceCollapse(TimSort.java:457) > at java.util.TimSort.sort(TimSort.java:254) > at java.util.Arrays.sort(Arrays.java:1512) > at java.util.ArrayList.sort(ArrayList.java:1454) > at java.util.Collections.sort(Collections.java:175) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.ClusterNodeTracker.sortedNodeList(ClusterNodeTracker.java:306) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.continuousSchedulingAttempt(FairScheduler.java:884) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler$ContinuousSchedulingThread.run(FairScheduler.java:316) > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-9927) RM multi-thread event processing mechanism
[ https://issues.apache.org/jira/browse/YARN-9927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16967336#comment-16967336 ] Hadoop QA commented on YARN-9927: - | (x) *-1 overall* | || Vote || Subsystem || Runtime || Comment || | 0 | reexec | 0m 0s | Docker mode activated. | | -1 | patch | 0m 7s | YARN-9927 does not apply to trunk. Rebase required? Wrong Branch? See https://wiki.apache.org/hadoop/HowToContribute for help. | || Subsystem || Report/Notes || | JIRA Issue | YARN-9927 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12984902/YARN-9927.001.patch | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/25101/console | | Powered by | Apache Yetus 0.8.0 http://yetus.apache.org | This message was automatically generated. > RM multi-thread event processing mechanism > -- > > Key: YARN-9927 > URL: https://issues.apache.org/jira/browse/YARN-9927 > Project: Hadoop YARN > Issue Type: Improvement > Components: yarn >Affects Versions: 3.0.0, 2.9.2 >Reporter: hcarrot >Priority: Major > Attachments: RM multi-thread event processing mechanism.pdf, > YARN-9927.001.patch > > > Recently, we observed serious event blocking in the RM event dispatcher > queue. After analyzing RM event monitoring data and the RM event processing > logic, we found that: > 1) Environment: a cluster with thousands of nodes. > 2) RMNodeStatusEvent accounts for 90% of the time consumed by the RM event scheduler. > 3) RM event processing is single-threaded, which leaves > little headroom in the RM event scheduler and therefore limits RM performance. > So we propose an RM multi-thread event processing mechanism to improve RM > performance. 
-- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
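As a rough illustration of what multi-threaded event processing can look like, the sketch below shards events across several single-threaded executors by a key such as a node ID, so that events for the same key are still handled in order. It is not the design from the attached PDF or from YARN-9927.001.patch; `ShardedDispatcher`, `shardFor` and `dispatch` are hypothetical names, and only standard `java.util.concurrent` classes are used.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical dispatcher: one single-threaded worker per shard.
class ShardedDispatcher implements AutoCloseable {
  private final ExecutorService[] workers;

  ShardedDispatcher(int threads) {
    if (threads <= 0) {
      throw new IllegalArgumentException("threads must be positive");
    }
    workers = new ExecutorService[threads];
    for (int i = 0; i < threads; i++) {
      workers[i] = Executors.newSingleThreadExecutor();
    }
  }

  // The same key always maps to the same worker, which preserves per-key ordering.
  int shardFor(Object key) {
    return Math.floorMod(key.hashCode(), workers.length);
  }

  void dispatch(Object key, Runnable handler) {
    workers[shardFor(key)].execute(handler);
  }

  // Stops accepting events and waits up to 10 seconds per worker for queued ones.
  @Override
  public void close() {
    for (ExecutorService w : workers) {
      w.shutdown();
    }
    try {
      for (ExecutorService w : workers) {
        w.awaitTermination(10, TimeUnit.SECONDS);
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
  }
}
```

Events for different keys may run concurrently, so any state shared across keys would still need its own synchronization; that trade-off is the main design question in a scheme like this.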
[jira] [Updated] (YARN-9927) RM multi-thread event processing mechanism
[ https://issues.apache.org/jira/browse/YARN-9927?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] hcarrot updated YARN-9927: -- Attachment: (was: YARN-9927.001.patch) > RM multi-thread event processing mechanism > -- > > Key: YARN-9927 > URL: https://issues.apache.org/jira/browse/YARN-9927 > Project: Hadoop YARN > Issue Type: Improvement > Components: yarn >Affects Versions: 3.0.0, 2.9.2 >Reporter: hcarrot >Priority: Major > Attachments: RM multi-thread event processing mechanism.pdf, > YARN-9927.001.patch > > > Recently, we observed serious event blocking in the RM event dispatcher > queue. After analyzing RM event monitoring data and the RM event processing > logic, we found that: > 1) Environment: a cluster with thousands of nodes. > 2) RMNodeStatusEvent accounts for 90% of the time consumed by the RM event scheduler. > 3) RM event processing is single-threaded, which leaves > little headroom in the RM event scheduler and therefore limits RM performance. > So we propose an RM multi-thread event processing mechanism to improve RM > performance. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6448) Continuous scheduling thread crashes while sorting nodes
[ https://issues.apache.org/jira/browse/YARN-6448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16967334#comment-16967334 ] kailiu_dev commented on YARN-6448: -- Dear [~yufeigu], could you please help me review this code? It is a fix that prevents the continuous scheduling thread from crashing with 'Comparison method violates its general contract' while sorting nodes in Hadoop 2.7.2. > Continuous scheduling thread crashes while sorting nodes > > > Key: YARN-6448 > URL: https://issues.apache.org/jira/browse/YARN-6448 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 2.9.0, 3.0.0-alpha2 >Reporter: Yufei Gu >Assignee: Yufei Gu >Priority: Major > Fix For: 2.9.0, 3.0.0-alpha4 > > Attachments: YARN-6448.001.patch, YARN-6448.002.patch, > YARN-6448.003.patch, YARN-6448.004.patch > > > YARN-4719 removed the lock in continuous scheduling while sorting nodes. It > breaks the comparison order if nodes change while sorting. > {code} > 2017-04-04 23:42:26,123 FATAL > org.apache.hadoop.yarn.server.resourcemanager.RMCriticalThreadUncaughtExceptionHandler: > Critical thread FairSchedulerContinuousScheduling crashed! > java.lang.IllegalArgumentException: Comparison method violates its general > contract! 
> at java.util.TimSort.mergeHi(TimSort.java:899) > at java.util.TimSort.mergeAt(TimSort.java:516) > at java.util.TimSort.mergeForceCollapse(TimSort.java:457) > at java.util.TimSort.sort(TimSort.java:254) > at java.util.Arrays.sort(Arrays.java:1512) > at java.util.ArrayList.sort(ArrayList.java:1454) > at java.util.Collections.sort(Collections.java:175) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.ClusterNodeTracker.sortedNodeList(ClusterNodeTracker.java:306) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.continuousSchedulingAttempt(FairScheduler.java:884) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler$ContinuousSchedulingThread.run(FairScheduler.java:316) > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-9927) RM multi-thread event processing mechanism
[ https://issues.apache.org/jira/browse/YARN-9927?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] hcarrot updated YARN-9927: -- Attachment: (was: YARN-9927-addMultiEventDispatcher.patch) > RM multi-thread event processing mechanism > -- > > Key: YARN-9927 > URL: https://issues.apache.org/jira/browse/YARN-9927 > Project: Hadoop YARN > Issue Type: Improvement > Components: yarn >Affects Versions: 3.0.0, 2.9.2 >Reporter: hcarrot >Priority: Major > Attachments: RM multi-thread event processing mechanism.pdf, > YARN-9927.001.patch > > > Recently, we observed serious event blocking in the RM event dispatcher > queue. After analyzing RM event monitoring data and the RM event processing > logic, we found that: > 1) Environment: a cluster with thousands of nodes. > 2) RMNodeStatusEvent accounts for 90% of the time consumed by the RM event scheduler. > 3) RM event processing is single-threaded, which leaves > little headroom in the RM event scheduler and therefore limits RM performance. > So we propose an RM multi-thread event processing mechanism to improve RM > performance. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-9927) RM multi-thread event processing mechanism
[ https://issues.apache.org/jira/browse/YARN-9927?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] hcarrot updated YARN-9927: -- Attachment: YARN-9927.001.patch > RM multi-thread event processing mechanism > -- > > Key: YARN-9927 > URL: https://issues.apache.org/jira/browse/YARN-9927 > Project: Hadoop YARN > Issue Type: Improvement > Components: yarn >Affects Versions: 3.0.0, 2.9.2 >Reporter: hcarrot >Priority: Major > Attachments: RM multi-thread event processing mechanism.pdf, > YARN-9927-addMultiEventDispatcher.patch, YARN-9927.001.patch > > > Recently, we observed serious event blocking in the RM event dispatcher > queue. After analyzing RM event monitoring data and the RM event processing > logic, we found that: > 1) Environment: a cluster with thousands of nodes. > 2) RMNodeStatusEvent accounts for 90% of the time consumed by the RM event scheduler. > 3) RM event processing is single-threaded, which leaves > little headroom in the RM event scheduler and therefore limits RM performance. > So we propose an RM multi-thread event processing mechanism to improve RM > performance. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-9940) avoid continuous scheduling thread crashes while sorting nodes get 'Comparison method violates its general contract'
[ https://issues.apache.org/jira/browse/YARN-9940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16967331#comment-16967331 ] kailiu_dev commented on YARN-9940: -- In the report above, when you see "Failed to execute goal org.apache.hadoop:hadoop-maven-plugins:2.7.2:protoc (compile-protoc) on project hadoop-yarn-server-resourcemanager: org.apache.maven.plugin.MojoExecutionException: 'protoc --version' did not return a version -> [Help 1]", you can ignore it. It is not a problem with my patch; it occurs because the test environment of the Jira server is not *suitable*. > avoid continuous scheduling thread crashes while sorting nodes get > 'Comparison method violates its general contract' > > > Key: YARN-9940 > URL: https://issues.apache.org/jira/browse/YARN-9940 > Project: Hadoop YARN > Issue Type: Bug > Components: fairscheduler >Affects Versions: 2.7.2 >Reporter: kailiu_dev >Priority: Major > Fix For: 2.7.2 > > Attachments: YARN-9940-branch-2.7.2.001.patch > > > 2019-10-16 09:14:51,215 ERROR > org.apache.hadoop.yarn.YarnUncaughtExceptionHandler: Thread > Thread[FairSchedulerContinuousScheduling,5,main] threw an Exception. > java.lang.IllegalArgumentException: Comparison method violates its general > contract! 
> at java.util.TimSort.mergeHi(TimSort.java:868) > at java.util.TimSort.mergeAt(TimSort.java:485) > at java.util.TimSort.mergeForceCollapse(TimSort.java:426) > at java.util.TimSort.sort(TimSort.java:223) > at java.util.TimSort.sort(TimSort.java:173) > at java.util.Arrays.sort(Arrays.java:659) > at java.util.Collections.sort(Collections.java:217) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.continuousSchedulingAttempt(FairScheduler.java:1117) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler$ContinuousSchedulingThread.run(FairScheduler.java:296) -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
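For readers unfamiliar with this exception: TimSort throws it when it detects that the comparator gave inconsistent answers during a single sort. One common cause is a comparator that reads values which another thread changes while the sort is running. Whether that is the cause here, and what the attached patch does about it, is not shown in this thread. The sketch below illustrates one general defensive technique under that assumption: copy each node's sort key once into an immutable wrapper, then sort the wrappers. The names SnapshotSort, Node and Keyed are hypothetical and are not FairScheduler classes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical sketch: sort on frozen keys so concurrent updates cannot break the comparator. */
final class SnapshotSort {

    /** Stand-in for a scheduler node whose free memory may change on another thread. */
    static final class Node {
        final String id;
        final AtomicLong availableMb;

        Node(String id, long availableMb) {
            this.id = id;
            this.availableMb = new AtomicLong(availableMb);
        }
    }

    /** Immutable (node, key) pair; the key is read exactly once, before sorting. */
    static final class Keyed {
        final Node node;
        final long availableMb;

        Keyed(Node node) {
            this.node = node;
            this.availableMb = node.availableMb.get();
        }
    }

    /** Returns the nodes ordered by descending available memory, ties broken by id. */
    static List<Node> sortByAvailableDesc(List<Node> nodes) {
        List<Keyed> snapshot = new ArrayList<>(nodes.size());
        for (Node n : nodes) {
            snapshot.add(new Keyed(n));
        }
        // The comparator only touches the frozen fields, so it stays consistent
        // for the whole sort even if node.availableMb changes in the meantime.
        snapshot.sort((a, b) -> {
            int c = Long.compare(b.availableMb, a.availableMb);
            return c != 0 ? c : a.node.id.compareTo(b.node.id);
        });
        List<Node> sorted = new ArrayList<>(snapshot.size());
        for (Keyed k : snapshot) {
            sorted.add(k.node);
        }
        return sorted;
    }
}
```

The trade-off is that the resulting order reflects the values at snapshot time and may be slightly stale by the time it is used, which is usually acceptable for a scheduling heuristic.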
[jira] [Commented] (YARN-9940) avoid continuous scheduling thread crashes while sorting nodes get 'Comparison method violates its general contract'
[ https://issues.apache.org/jira/browse/YARN-9940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16967326#comment-16967326 ] Hadoop QA commented on YARN-9940: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 27s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} branch-2.7.2 Compile Tests {color} || | {color:red}-1{color} | {color:red} mvninstall {color} | {color:red} 2m 10s{color} | {color:red} root in branch-2.7.2 failed. {color} | | {color:red}-1{color} | {color:red} compile {color} | {color:red} 0m 37s{color} | {color:red} hadoop-yarn-server-resourcemanager in branch-2.7.2 failed. {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 31s{color} | {color:green} branch-2.7.2 passed {color} | | {color:red}-1{color} | {color:red} mvnsite {color} | {color:red} 0m 14s{color} | {color:red} hadoop-yarn-server-resourcemanager in branch-2.7.2 failed. {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 0m 9s{color} | {color:red} hadoop-yarn-server-resourcemanager in branch-2.7.2 failed. {color} | | {color:red}-1{color} | {color:red} javadoc {color} | {color:red} 0m 10s{color} | {color:red} hadoop-yarn-server-resourcemanager in branch-2.7.2 failed. 
{color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:red}-1{color} | {color:red} mvninstall {color} | {color:red} 0m 9s{color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} | | {color:red}-1{color} | {color:red} compile {color} | {color:red} 0m 10s{color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} | | {color:red}-1{color} | {color:red} javac {color} | {color:red} 0m 10s{color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 0m 25s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: The patch generated 9 new + 614 unchanged - 0 fixed = 623 total (was 614) {color} | | {color:red}-1{color} | {color:red} mvnsite {color} | {color:red} 0m 11s{color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 1s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 0m 11s{color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} | | {color:red}-1{color} | {color:red} javadoc {color} | {color:red} 0m 10s{color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red} 0m 11s{color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 19s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 6m 53s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=19.03.4 Server=19.03.4 Image:yetus/hadoop:date2019-11-05 | | JIRA Issue | YARN-9940 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12984900/YARN-9940-branch-2.7.2.001.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux 7db85f5e17e6 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | branch-2.7.2 / b165c4f | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_222 | | mvninstall | https://builds.apache.org/job/PreCommit-YARN-Build/25100/artifact/out/branch-mvninstall-root.txt | | compile | https://builds.apache.org/job/PreCommit-YARN-Build/25100/artifact/out/branch-compile-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt | | mvnsite | https://builds.apache.org/job/PreCommit-YARN-Build/25100/artifact/out/branch-mvnsite-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-se
[jira] [Comment Edited] (YARN-9940) avoid continuous scheduling thread crashes while sorting nodes get 'Comparison method violates its general contract'
[ https://issues.apache.org/jira/browse/YARN-9940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16966386#comment-16966386 ] kailiu_dev edited comment on YARN-9940 at 11/5/19 8:25 AM: --- h5. Dear [~zxu] [~snemeth] [~bibinchundatt], can you please help me review this code? This is a fix to avoid the continuous scheduling thread crashing when sorting nodes throws 'Comparison method violates its general contract' in Hadoop version 2.7.2. was (Author: kailiu_dev): h5. [~zxu] [~snemeth] can you please help me review this code? This is a fix to avoid the continuous scheduling thread crashing when sorting nodes throws 'Comparison method violates its general contract'. > avoid continuous scheduling thread crashes while sorting nodes get > 'Comparison method violates its general contract' > > > Key: YARN-9940 > URL: https://issues.apache.org/jira/browse/YARN-9940 > Project: Hadoop YARN > Issue Type: Bug > Components: fairscheduler >Affects Versions: 2.7.2 >Reporter: kailiu_dev >Priority: Major > Fix For: 2.7.2 > > Attachments: YARN-9940-branch-2.7.2.001.patch > > > 2019-10-16 09:14:51,215 ERROR > org.apache.hadoop.yarn.YarnUncaughtExceptionHandler: Thread > Thread[FairSchedulerContinuousScheduling,5,main] threw an Exception. > java.lang.IllegalArgumentException: Comparison method violates its general > contract! 
> at java.util.TimSort.mergeHi(TimSort.java:868) > at java.util.TimSort.mergeAt(TimSort.java:485) > at java.util.TimSort.mergeForceCollapse(TimSort.java:426) > at java.util.TimSort.sort(TimSort.java:223) > at java.util.TimSort.sort(TimSort.java:173) > at java.util.Arrays.sort(Arrays.java:659) > at java.util.Collections.sort(Collections.java:217) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.continuousSchedulingAttempt(FairScheduler.java:1117) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler$ContinuousSchedulingThread.run(FairScheduler.java:296)
[jira] [Comment Edited] (YARN-9940) avoid continuous scheduling thread crashes while sorting nodes get 'Comparison method violates its general contract'
[ https://issues.apache.org/jira/browse/YARN-9940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16966386#comment-16966386 ] kailiu_dev edited comment on YARN-9940 at 11/5/19 8:24 AM: --- h5. [~zxu] [~snemeth] can you please help me review this code? This is a fix to avoid the continuous scheduling thread crashing when sorting nodes throws 'Comparison method violates its general contract'. was (Author: kailiu_dev): [~zxu] [~snemeth] can you please help me review this code? > avoid continuous scheduling thread crashes while sorting nodes get > 'Comparison method violates its general contract' > > > Key: YARN-9940 > URL: https://issues.apache.org/jira/browse/YARN-9940 > Project: Hadoop YARN > Issue Type: Bug > Components: fairscheduler >Affects Versions: 2.7.2 >Reporter: kailiu_dev >Priority: Major > Fix For: 2.7.2 > > Attachments: YARN-9940-branch-2.7.2.001.patch > > > 2019-10-16 09:14:51,215 ERROR > org.apache.hadoop.yarn.YarnUncaughtExceptionHandler: Thread > Thread[FairSchedulerContinuousScheduling,5,main] threw an Exception. > java.lang.IllegalArgumentException: Comparison method violates its general > contract! > at java.util.TimSort.mergeHi(TimSort.java:868) > at java.util.TimSort.mergeAt(TimSort.java:485) > at java.util.TimSort.mergeForceCollapse(TimSort.java:426) > at java.util.TimSort.sort(TimSort.java:223) > at java.util.TimSort.sort(TimSort.java:173) > at java.util.Arrays.sort(Arrays.java:659) > at java.util.Collections.sort(Collections.java:217) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.continuousSchedulingAttempt(FairScheduler.java:1117) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler$ContinuousSchedulingThread.run(FairScheduler.java:296)
[jira] [Updated] (YARN-9940) avoid continuous scheduling thread crashes while sorting nodes get 'Comparison method violates its general contract'
[ https://issues.apache.org/jira/browse/YARN-9940?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] kailiu_dev updated YARN-9940: - Attachment: (was: YARN-9940-branch-2.7.2.001.patch) > avoid continuous scheduling thread crashes while sorting nodes get > 'Comparison method violates its general contract' > > > Key: YARN-9940 > URL: https://issues.apache.org/jira/browse/YARN-9940 > Project: Hadoop YARN > Issue Type: Bug > Components: fairscheduler >Affects Versions: 2.7.2 >Reporter: kailiu_dev >Priority: Major > Fix For: 2.7.2 > > Attachments: YARN-9940-branch-2.7.2.001.patch > > > 2019-10-16 09:14:51,215 ERROR > org.apache.hadoop.yarn.YarnUncaughtExceptionHandler: Thread > Thread[FairSchedulerContinuousScheduling,5,main] threw an Exception. > java.lang.IllegalArgumentException: Comparison method violates its general > contract! > at java.util.TimSort.mergeHi(TimSort.java:868) > at java.util.TimSort.mergeAt(TimSort.java:485) > at java.util.TimSort.mergeForceCollapse(TimSort.java:426) > at java.util.TimSort.sort(TimSort.java:223) > at java.util.TimSort.sort(TimSort.java:173) > at java.util.Arrays.sort(Arrays.java:659) > at java.util.Collections.sort(Collections.java:217) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.continuousSchedulingAttempt(FairScheduler.java:1117) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler$ContinuousSchedulingThread.run(FairScheduler.java:296)