[jira] [Updated] (YARN-6288) Refactor AppLogAggregatorImpl#uploadLogsForContainers
[ https://issues.apache.org/jira/browse/YARN-6288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Akira Ajisaka updated YARN-6288:
    Attachment: YARN-6288.03.patch

03 patch
* Removed unnecessary initialization for mock
* Refactored LogWriter.close

> Refactor AppLogAggregatorImpl#uploadLogsForContainers
> -
>
> Key: YARN-6288
> URL: https://issues.apache.org/jira/browse/YARN-6288
> Project: Hadoop YARN
> Issue Type: Improvement
> Reporter: Akira Ajisaka
> Assignee: Akira Ajisaka
> Priority: Minor
> Labels: supportability
> Attachments: YARN-6288.01.patch, YARN-6288.02.patch, YARN-6288.03.patch
>
>
> In AppLogAggregatorImpl.java, if an exception occurs in writing container log
> to remote filesystem, the exception is not caught and ignored.
> https://github.com/apache/hadoop/blob/f59e36b4ce71d3019ab91b136b6d7646316954e7/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/logaggregation/AppLogAggregatorImpl.java#L398

--
This message was sent by Atlassian JIRA (v6.3.15#6346)
-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-6288) Refactor AppLogAggregatorImpl#uploadLogsForContainers
[ https://issues.apache.org/jira/browse/YARN-6288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Akira Ajisaka updated YARN-6288:
    Attachment: YARN-6288.02.patch

Reflected Haibo's comment. Thanks!

> Refactor AppLogAggregatorImpl#uploadLogsForContainers
> -
>
> Key: YARN-6288
> URL: https://issues.apache.org/jira/browse/YARN-6288
> Project: Hadoop YARN
> Issue Type: Improvement
> Reporter: Akira Ajisaka
> Assignee: Akira Ajisaka
> Priority: Minor
> Labels: supportability
> Attachments: YARN-6288.01.patch, YARN-6288.02.patch
>
>
> In AppLogAggregatorImpl.java, if an exception occurs in writing container log
> to remote filesystem, the exception is not caught and ignored.
> https://github.com/apache/hadoop/blob/f59e36b4ce71d3019ab91b136b6d7646316954e7/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/logaggregation/AppLogAggregatorImpl.java#L398
[jira] [Created] (YARN-6329) Remove unnecessary TODO comment from AppLogAggregatorImpl.java
Akira Ajisaka created YARN-6329:
---

Summary: Remove unnecessary TODO comment from AppLogAggregatorImpl.java
Key: YARN-6329
URL: https://issues.apache.org/jira/browse/YARN-6329
Project: Hadoop YARN
Issue Type: Improvement
Affects Versions: 2.8.0
Reporter: Akira Ajisaka
Priority: Minor

After YARN-3116, this TODO comment is unnecessary.
{code}
  // TODO: The condition: containerId.getId() == 1 to determine an AM container
  // is not always true.
  private boolean shouldUploadLogs(ContainerLogContext logContext) {
    return logAggPolicy.shouldDoLogAggregation(logContext);
  }
{code}
[jira] [Commented] (YARN-6319) race condition between deleting app dir and deleting container dir
[ https://issues.apache.org/jira/browse/YARN-6319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15906816#comment-15906816 ]

Hong Zhiguo commented on YARN-6319:
---

The race condition can be reproduced with the script below:
{code}
USER=xxx
GRP=yyy
CE=/PATH/TO/container-executor
SIZE=200
mkdir app
mkdir -p app/container
dd if=/dev/zero of=app/container/a count=$SIZE bs=1M
dd if=/dev/zero of=app/container/b count=$SIZE bs=1M
dd if=/dev/zero of=app/container/c count=$SIZE bs=1M
dd if=/dev/zero of=app/container/d count=$SIZE bs=1M
dd if=/dev/zero of=app/container/e count=$SIZE bs=1M
chown $USER:$GRP -R app/
$CE $USER 3 ./app/container &
$CE $USER 3 ./app
{code}

> race condition between deleting app dir and deleting container dir
> --
>
> Key: YARN-6319
> URL: https://issues.apache.org/jira/browse/YARN-6319
> Project: Hadoop YARN
> Issue Type: Bug
> Components: nodemanager
> Reporter: Hong Zhiguo
> Assignee: Hong Zhiguo
>
> The last container (on one node) of an app completes
> |--> triggers async deletion of container dir (container cleanup)
> |--> triggers async deletion of app dir (app cleanup)
> For LCE, deletion is done by container-executor. The "app cleanup" lists
> the sub-dirs (step 1), and then unlinks the items one by one (step 2). If a file is
> deleted by "container cleanup" between step 1 and step 2, it reports the
> error below and aborts the deletion.
> {code}
> ContainerExecutor: Couldn't delete file
> $LOCAL/usercache/$USER/appcache/application_1481785469354_353539/container_1481785469354_353539_01_28/$FILE
> - No such file or directory
> {code}
> The app dir then escapes the cleanup, which is why many app dirs are
> always left behind.
> solution 1: just ignore the error without breaking in
> container-executor.c::delete_path()
> solution 2: use a lock to serialize the cleanup of the same app dir.
> solution 3: back off and retry on error
> Suggestions are welcome.
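Solution 1 from the YARN-6319 description targets container-executor.c, which is C code not shown in this thread. The same idea can be sketched in Java under stated assumptions: `TolerantDelete` and `deleteTree` are hypothetical names, not Hadoop APIs, and the sketch only illustrates treating "already gone" as success during a recursive delete, so that two cleanups racing over the same tree do not abort each other.

```java
import java.io.IOException;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;

// Hypothetical helper for illustration only; it is not part of Hadoop.
final class TolerantDelete {
  private TolerantDelete() {}

  /** Recursively deletes root, skipping entries that another deleter already removed. */
  static void deleteTree(Path root) throws IOException {
    Files.walkFileTree(root, new SimpleFileVisitor<Path>() {
      @Override
      public FileVisitResult visitFile(Path file, BasicFileAttributes attrs)
          throws IOException {
        // deleteIfExists does not fail when a concurrent cleanup won the race.
        Files.deleteIfExists(file);
        return FileVisitResult.CONTINUE;
      }

      @Override
      public FileVisitResult visitFileFailed(Path file, IOException exc)
          throws IOException {
        if (exc instanceof NoSuchFileException) {
          // The entry vanished between listing (step 1) and unlinking (step 2).
          return FileVisitResult.CONTINUE;
        }
        throw exc; // any other failure is still reported
      }

      @Override
      public FileVisitResult postVisitDirectory(Path dir, IOException exc)
          throws IOException {
        if (exc != null && !(exc instanceof NoSuchFileException)) {
          throw exc;
        }
        Files.deleteIfExists(dir);
        return FileVisitResult.CONTINUE;
      }
    });
  }
}
```

Calling `TolerantDelete.deleteTree` on the container dir and then on the app dir, in either order, should leave nothing behind. Whether silently ignoring the error is preferable to solution 2 or 3 is the open question in the issue; this sketch does not settle it.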
[jira] [Comment Edited] (YARN-6315) Improve LocalResourcesTrackerImpl#isResourcePresent to return false for corrupted files
[ https://issues.apache.org/jira/browse/YARN-6315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15906665#comment-15906665 ]

Kuhu Shukla edited comment on YARN-6315 at 3/12/17 7:36 PM:

Some performance numbers from instrumenting the test and profiling it through YourKit on my MacBook Pro.
The current patch spends an average of 1900 ms for 10,002 runs (189 microseconds per call). An equivalent patch that uses file.isDirectory(), file.exists(), file.length() as shown below takes 2080.8 ms for 10,002 runs (208 microseconds per call).
{code}
if ((!file.isDirectory() && file.length() != req.getSize()) || !file.exists()) {
  ret = false;
} else if (dirsHandler != null) {
  ret = checkLocalResource(rsrc);
}
{code}

was (Author: kshukla):
Some performance numbers from instrumenting the test and profiling it through YourKit on my Macbook Pro.
The current patch spends an average of 1900 ms for 10,002 runs (189 micro-seconds per call). An equivalent patch that uses file.isDirectory(), file.exists(), file.length() as shown below takes 2080.8 ms for 10,002 runs (0.208 micro seconds per call).
{code}
if ((!file.isDirectory() && file.length() != req.getSize()) || !file.exists()) {
  ret = false;
} else if (dirsHandler != null) {
  ret = checkLocalResource(rsrc);
}
{code}

> Improve LocalResourcesTrackerImpl#isResourcePresent to return false for corrupted files
> ---
>
> Key: YARN-6315
> URL: https://issues.apache.org/jira/browse/YARN-6315
> Project: Hadoop YARN
> Issue Type: Bug
> Affects Versions: 2.7.3, 2.8.1
> Reporter: Kuhu Shukla
> Assignee: Kuhu Shukla
> Attachments: YARN-6315.001.patch
>
>
> We currently check if a resource is present by making sure that the file
> exists locally. There can be a case where the LocalizationTracker thinks that
> it has the resource if the file exists but with size 0 or less than the
> "expected" size of the LocalResource. This JIRA tracks the change to harden
> the isResourcePresent call to address that case.
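The per-call figures in the comment above follow from dividing the total time by the number of runs. A minimal sketch of that arithmetic, where `PerCallCost` and `microsPerCall` are hypothetical names introduced only for this illustration: 1900 ms over 10,002 runs comes to roughly 190 microseconds per call, and 2080.8 ms over 10,002 runs to roughly 208 microseconds per call, which matches the edited comment rather than the earlier 0.208 figure.

```java
// Hypothetical helper for illustration; the inputs are the totals quoted in the comment.
final class PerCallCost {
  private PerCallCost() {}

  /** Converts a total duration in milliseconds into microseconds per call. */
  static double microsPerCall(double totalMillis, long runs) {
    return totalMillis * 1000.0 / runs;
  }

  public static void main(String[] args) {
    // Patch using the readAttributes bulk operation.
    System.out.println(microsPerCall(1900.0, 10_002));
    // Variant using file.isDirectory(), file.exists(), file.length().
    System.out.println(microsPerCall(2080.8, 10_002));
  }
}
```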
[jira] [Commented] (YARN-6315) Improve LocalResourcesTrackerImpl#isResourcePresent to return false for corrupted files
[ https://issues.apache.org/jira/browse/YARN-6315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15906665#comment-15906665 ]

Kuhu Shukla commented on YARN-6315:
---

Some performance numbers from instrumenting the test and profiling it through YourKit on my Macbook Pro.
The current patch spends an average of 1900 ms for 10,002 runs (189 micro-seconds per call). An equivalent patch that uses file.isDirectory(), file.exists(), file.length() as shown below takes 2080.8 ms for 10,002 runs (0.208 micro seconds per call).
{code}
if ((!file.isDirectory() && file.length() != req.getSize()) || !file.exists()) {
  ret = false;
} else if (dirsHandler != null) {
  ret = checkLocalResource(rsrc);
}
{code}

> Improve LocalResourcesTrackerImpl#isResourcePresent to return false for corrupted files
> ---
>
> Key: YARN-6315
> URL: https://issues.apache.org/jira/browse/YARN-6315
> Project: Hadoop YARN
> Issue Type: Bug
> Affects Versions: 2.7.3, 2.8.1
> Reporter: Kuhu Shukla
> Assignee: Kuhu Shukla
> Attachments: YARN-6315.001.patch
>
>
> We currently check if a resource is present by making sure that the file
> exists locally. There can be a case where the LocalizationTracker thinks that
> it has the resource if the file exists but with size 0 or less than the
> "expected" size of the LocalResource. This JIRA tracks the change to harden
> the isResourcePresent call to address that case.
[jira] [Commented] (YARN-6315) Improve LocalResourcesTrackerImpl#isResourcePresent to return false for corrupted files
[ https://issues.apache.org/jira/browse/YARN-6315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15906636#comment-15906636 ]

Hadoop QA commented on YARN-6315:
-

| (/) *+1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 15s | Docker mode activated. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
| +1 | mvninstall | 12m 24s | trunk passed |
| +1 | compile | 0m 26s | trunk passed |
| +1 | checkstyle | 0m 18s | trunk passed |
| +1 | mvnsite | 0m 26s | trunk passed |
| +1 | mvneclipse | 0m 14s | trunk passed |
| +1 | findbugs | 0m 42s | trunk passed |
| +1 | javadoc | 0m 17s | trunk passed |
| +1 | mvninstall | 0m 22s | the patch passed |
| +1 | compile | 0m 24s | the patch passed |
| +1 | javac | 0m 24s | the patch passed |
| -0 | checkstyle | 0m 15s | hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager: The patch generated 5 new + 33 unchanged - 1 fixed = 38 total (was 34) |
| +1 | mvnsite | 0m 24s | the patch passed |
| +1 | mvneclipse | 0m 11s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | findbugs | 0m 47s | the patch passed |
| +1 | javadoc | 0m 14s | the patch passed |
| +1 | unit | 12m 59s | hadoop-yarn-server-nodemanager in the patch passed. |
| +1 | asflicense | 0m 16s | The patch does not generate ASF License warnings. |
| | | 32m 12s | |

|| Subsystem || Report/Notes ||
| Docker | Image:yetus/hadoop:a9ad5d6 |
| JIRA Issue | YARN-6315 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12857528/YARN-6315.001.patch |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle |
| uname | Linux e707f5b2ce67 3.13.0-106-generic #153-Ubuntu SMP Tue Dec 6 15:44:32 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 4db9cc7 |
| Default Java | 1.8.0_121 |
| findbugs | v3.0.0 |
| checkstyle | https://builds.apache.org/job/PreCommit-YARN-Build/15239/artifact/patchprocess/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt |
| Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/15239/testReport/ |
| modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager |
| Console output | https://builds.apache.org/job/PreCommit-YARN-Build/15239/console |
| Powered by | Apache Yetus 0.5.0-SNAPSHOT http://yetus.apache.org |

This message was automatically generated.

> Improve LocalResourcesTrackerImpl#isResourcePresent to return false for corrupted files
> ---
[jira] [Updated] (YARN-6315) Improve LocalResourcesTrackerImpl#isResourcePresent to return false for corrupted files
[ https://issues.apache.org/jira/browse/YARN-6315?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kuhu Shukla updated YARN-6315:
--
    Attachment: YARN-6315.001.patch

First version of the patch that uses readAttributes bulk operation to match the size for resources that are not directories since the size of the directory may not always match up. It maintains the exists() behavior by setting ret = false when file not found exception is thrown. The method also catches IOException to maintain previous behavior/signature.

> Improve LocalResourcesTrackerImpl#isResourcePresent to return false for corrupted files
> ---
>
> Key: YARN-6315
> URL: https://issues.apache.org/jira/browse/YARN-6315
> Project: Hadoop YARN
> Issue Type: Bug
> Affects Versions: 2.7.3, 2.8.1
> Reporter: Kuhu Shukla
> Assignee: Kuhu Shukla
> Attachments: YARN-6315.001.patch
>
>
> We currently check if a resource is present by making sure that the file
> exists locally. There can be a case where the LocalizationTracker thinks that
> it has the resource if the file exists but with size 0 or less than the
> "expected" size of the LocalResource. This JIRA tracks the change to harden
> the isResourcePresent call to address that case.
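The 001 patch itself is not quoted in this thread, so the following is only a sketch of the approach the update describes, not the patch's actual code. It reads the file attributes in one `Files.readAttributes` call, compares the size only for non-directories, and reports a missing file as not present. `ResourcePresenceCheck`, `isPresent`, and `expectedSize` are hypothetical names, and returning false on any other IOException is an assumption, since the update does not say what the patch does in that case.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;
import java.nio.file.attribute.BasicFileAttributes;

// Hypothetical helper for illustration only; not the YARN-6315 patch.
final class ResourcePresenceCheck {
  private ResourcePresenceCheck() {}

  /** Returns true if file exists and, when it is a regular file, has the expected size. */
  static boolean isPresent(Path file, long expectedSize) {
    try {
      // One bulk call yields both the file type and the size.
      BasicFileAttributes attrs = Files.readAttributes(file, BasicFileAttributes.class);
      if (attrs.isDirectory()) {
        return true; // directory sizes are not comparable, so existence is enough
      }
      return attrs.size() == expectedSize;
    } catch (NoSuchFileException e) {
      return false; // same answer the old exists() check would give
    } catch (IOException e) {
      return false; // assumption: an unreadable resource counts as not present
    }
  }
}
```

A zero-length or truncated file then yields false, which is the corrupted-file case the issue description is concerned with.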
[jira] [Commented] (YARN-6328) Update a spelling mistake
[ https://issues.apache.org/jira/browse/YARN-6328?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15906592#comment-15906592 ]

ASF GitHub Bot commented on YARN-6328:
--

GitHub user NJUJYB opened a pull request:

    https://github.com/apache/hadoop/pull/202

    [YARN-6328] Update a spelling mistake

    JIRA Issue: https://issues.apache.org/jira/browse/YARN-6328
    Fix a spelling mistake, doesnt should be doesn't.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/NJUJYB/hadoop newbranch

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/hadoop/pull/202.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #202

commit dd36fcd1eb743bcfae47602e28b83dbf239c9295
Author: NJUJYB <131220...@smail.nju.edu.cn>
Date: 2017-03-12T17:10:59Z

    fix a spelling mistake

> Update a spelling mistake
> -
>
> Key: YARN-6328
> URL: https://issues.apache.org/jira/browse/YARN-6328
> Project: Hadoop YARN
> Issue Type: Bug
> Components: capacity scheduler
> Reporter: Jin Yibo
> Priority: Trivial
>
> Update a spelling mistake, doesnt should be doesn't.
[jira] [Created] (YARN-6328) Update a spelling mistake
Jin Yibo created YARN-6328:
--

Summary: Update a spelling mistake
Key: YARN-6328
URL: https://issues.apache.org/jira/browse/YARN-6328
Project: Hadoop YARN
Issue Type: Bug
Components: capacity scheduler
Reporter: Jin Yibo
Priority: Trivial

Update a spelling mistake, doesnt should be doesn't.