[jira] [Commented] (YARN-6191) CapacityScheduler preemption by container priority can be problematic for MapReduce

2017-02-14 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-6191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15866746#comment-15866746 ] Jason Lowe commented on YARN-6191: -- This is similar to the FairScheduler problem described in YARN-3054.

[jira] [Created] (YARN-6191) CapacityScheduler preemption by container priority can be problematic for MapReduce

2017-02-14 Thread Jason Lowe (JIRA)
Jason Lowe created YARN-6191:
Summary: CapacityScheduler preemption by container priority can be problematic for MapReduce
Key: YARN-6191
URL: https://issues.apache.org/jira/browse/YARN-6191
Project:

[jira] [Commented] (YARN-5501) Container Pooling in YARN

2017-02-09 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-5501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15859587#comment-15859587 ] Jason Lowe commented on YARN-5501: -- Thanks for the detailed answers. I highly recommend these get

[jira] [Commented] (YARN-5501) Container Pooling in YARN

2017-02-08 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-5501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15858624#comment-15858624 ] Jason Lowe commented on YARN-5501: -- bq. As part of the detachContainer all the resources associated with

[jira] [Commented] (YARN-6137) Yarn client implicitly invoke ATS client which accesses HDFS

2017-02-08 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-6137?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15858535#comment-15858535 ] Jason Lowe commented on YARN-6137: -- +1 lgtm. Committing this. > Yarn client implicitly invoke ATS client

[jira] [Commented] (YARN-5501) Container Pooling in YARN

2017-02-08 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-5501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15858174#comment-15858174 ] Jason Lowe commented on YARN-5501: -- Thanks for posting the design document! I am confused on how this

[jira] [Commented] (YARN-6137) Yarn client implicitly invoke ATS client which accesses HDFS

2017-02-06 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-6137?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15854969#comment-15854969 ] Jason Lowe commented on YARN-6137: -- Thanks for the patch! I think overall the approach is reasonable.

[jira] [Commented] (YARN-6125) The application attempt's diagnostic message should have a maximum size

2017-02-06 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-6125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15854295#comment-15854295 ] Jason Lowe commented on YARN-6125: -- Seems fine to me. > The application attempt's diagnostic message

[jira] [Commented] (YARN-6125) The application attempt's diagnostic message should have a maximum size

2017-02-03 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-6125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15851732#comment-15851732 ] Jason Lowe commented on YARN-6125: -- I would argue the most important part of many stacktraces isn't the

[jira] [Commented] (YARN-6125) The application attempt's diagnostic message should have a maximum size

2017-02-03 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-6125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15851674#comment-15851674 ] Jason Lowe commented on YARN-6125: -- I suppose we could do ellipses, but I'm still a bit confused. Are we

[jira] [Commented] (YARN-6125) The application attempt's diagnostic message should have a maximum size

2017-02-03 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-6125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15851556#comment-15851556 ] Jason Lowe commented on YARN-6125: -- I think it's not necessary to distinguish between message truncation

[jira] [Commented] (YARN-5641) Localizer leaves behind tarballs after container is complete

2017-01-27 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-5641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15842996#comment-15842996 ] Jason Lowe commented on YARN-5641: -- +1 for the latest branch-2 patch. I'll fix up the extra import during

[jira] [Commented] (YARN-6125) The application attempt's diagnostic message should have a maximum size

2017-01-26 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-6125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15840473#comment-15840473 ] Jason Lowe commented on YARN-6125: -- So I assume the tail end of the diagnostics would have been just as

[jira] [Commented] (YARN-6125) The application attempt's diagnostic message should have a maximum size

2017-01-26 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-6125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15840265#comment-15840265 ] Jason Lowe commented on YARN-6125: -- For the huge examples that have been encountered so far, what would

[jira] [Reopened] (YARN-5641) Localizer leaves behind tarballs after container is complete

2017-01-26 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-5641?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe reopened YARN-5641: -- Reopening to run Jenkins on the branch-2 patch. > Localizer leaves behind tarballs after container is

[jira] [Commented] (YARN-5641) Localizer leaves behind tarballs after container is complete

2017-01-26 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-5641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15840052#comment-15840052 ] Jason Lowe commented on YARN-5641: -- The branch-2 patch looks reasonable to me. Nit: the new isAlive

[jira] [Updated] (YARN-5641) Localizer leaves behind tarballs after container is complete

2017-01-26 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-5641?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated YARN-5641: - Fix Version/s: (was: 2.9.0) My apologies, I missed the isAlive method calls on Process which are only

[jira] [Commented] (YARN-574) PrivateLocalizer does not support parallel resource download via ContainerLocalizer

2017-01-25 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15838706#comment-15838706 ] Jason Lowe commented on YARN-574: - Thanks for updating the patch! I don't think this while loop is desired:

[jira] [Commented] (YARN-5641) Localizer leaves behind tarballs after container is complete

2017-01-25 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-5641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15838502#comment-15838502 ] Jason Lowe commented on YARN-5641: -- +1 for the latest patch. The unit test failures are unrelated.

[jira] [Commented] (YARN-5617) AMs only intended to run one attempt can be run more than once

2017-01-24 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-5617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15836835#comment-15836835 ] Jason Lowe commented on YARN-5617: -- It would be good to get some feedback on this proposal. Also pinging

[jira] [Commented] (YARN-3053) [Security] Review and implement authentication in ATS v.2

2017-01-23 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15834671#comment-15834671 ] Jason Lowe commented on YARN-3053: -- bq. Tokens will be renewed by YARN i.e. by collector manager at each

[jira] [Commented] (YARN-3053) [Security] Review and implement security in ATS v.2

2017-01-20 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15832407#comment-15832407 ] Jason Lowe commented on YARN-3053: -- Sorry to jump in relatively late on this. A couple of questions came

[jira] [Commented] (YARN-5641) Localizer leaves behind tarballs after container is complete

2017-01-20 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-5641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15832071#comment-15832071 ] Jason Lowe commented on YARN-5641: -- We should not be using org.eclipse.jetty.util.ConcurrentHashSet.

[jira] [Commented] (YARN-5547) NMLeveldbStateStore should be more tolerant of unknown keys

2017-01-20 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-5547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15832050#comment-15832050 ] Jason Lowe commented on YARN-5547: -- Thanks for updating the patch! We're still storing a redundant killed

[jira] [Commented] (YARN-5910) Support for multi-cluster delegation tokens

2017-01-20 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-5910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15832014#comment-15832014 ] Jason Lowe commented on YARN-5910: -- Thanks for updating the patch! Nit: I think it should be more clear

[jira] [Commented] (YARN-574) PrivateLocalizer does not support parallel resource download via ContainerLocalizer

2017-01-20 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15831792#comment-15831792 ] Jason Lowe commented on YARN-574: - No, that's still racy. There's a window where the worker thread has

[jira] [Commented] (YARN-5641) Localizer leaves behind tarballs after container is complete

2017-01-19 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-5641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15830718#comment-15830718 ] Jason Lowe commented on YARN-5641: -- Thanks for updating the patch! I think getThread and the description

[jira] [Commented] (YARN-5910) Support for multi-cluster delegation tokens

2017-01-19 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-5910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15830535#comment-15830535 ] Jason Lowe commented on YARN-5910: -- Thanks for updating the patch! Last I knew, the descriptions for

[jira] [Commented] (YARN-5910) Support for multi-cluster delegation tokens

2017-01-18 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-5910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15828831#comment-15828831 ] Jason Lowe commented on YARN-5910: -- bq. Regarding the if security enabled check in ClientRMSerivce, do

[jira] [Commented] (YARN-5910) Support for multi-cluster delegation tokens

2017-01-18 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-5910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15828772#comment-15828772 ] Jason Lowe commented on YARN-5910: -- bq. Currently, the RM DelegationTokenRewener will only add the tokens

[jira] [Commented] (YARN-5910) Support for multi-cluster delegation tokens

2017-01-18 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-5910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15828243#comment-15828243 ] Jason Lowe commented on YARN-5910: -- Thanks for updating the patch! It's confusing to see a

[jira] [Updated] (YARN-5617) AMs only intended to run one attempt can be run more than once

2017-01-13 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-5617?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated YARN-5617: - Attachment: YARN-5617.002.patch Minor tweaks to fix the checkstyle issues. > AMs only intended to run one

[jira] [Updated] (YARN-5617) AMs only intended to run one attempt can be run more than once

2017-01-13 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-5617?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated YARN-5617: - Attachment: YARN-5617.001.patch Here's a patch that implements the "one max attempt really means one

[jira] [Commented] (YARN-574) PrivateLocalizer does not support parallel resource download via ContainerLocalizer

2017-01-11 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15819248#comment-15819248 ] Jason Lowe commented on YARN-574: - Thanks for picking this up [~ajithshetty]. I took a quick look at the

[jira] [Commented] (YARN-5416) TestRMRestart#testRMRestartWaitForPreviousAMToFinish failed intermittently due to not wait SchedulerApplicationAttempt to be stopped

2017-01-11 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-5416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15818824#comment-15818824 ] Jason Lowe commented on YARN-5416: -- +1 lgtm. I'll fix the unused import checkstyle nits during the

[jira] [Updated] (YARN-4148) When killing app, RM releases app's resource before they are released by NM

2017-01-10 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4148?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated YARN-4148: - Attachment: YARN-4148-branch-2.8.003.patch Attaching the patch for branch-2.8. > When killing app, RM

[jira] [Commented] (YARN-4148) When killing app, RM releases app's resource before they are released by NM

2017-01-09 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15813092#comment-15813092 ] Jason Lowe commented on YARN-4148: -- The unit test failures appear to be unrelated. They pass for me

[jira] [Updated] (YARN-4990) Re-direction of a particular log file within in a container in NM UI does not redirect properly to Log Server ( history ) on container completion

2017-01-05 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4990?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated YARN-4990: - Fix Version/s: (was: 2.9.0) 2.8.0 Thanks, Xuan! I committed this to branch-2.8 as

[jira] [Updated] (YARN-5246) NMWebAppFilter web redirects drop query parameters

2017-01-05 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-5246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated YARN-5246: - Fix Version/s: (was: 2.9.0) 2.8.0 Thanks, Varun! I committed this to branch-2.8 as

[jira] [Commented] (YARN-5547) NMLeveldbStateStore should be more tolerant of unknown keys

2017-01-05 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-5547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15801788#comment-15801788 ] Jason Lowe commented on YARN-5547: -- bq. for deleting the unknown keys, would it be ok to remove unknown

[jira] [Commented] (YARN-4990) Re-direction of a particular log file within in a container in NM UI does not redirect properly to Log Server ( history ) on container completion

2017-01-05 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15801743#comment-15801743 ] Jason Lowe commented on YARN-4990: -- This would be a nice fix to get into 2.8 and seems to be low risk.

[jira] [Commented] (YARN-2902) Killing a container that is localizing can orphan resources in the DOWNLOADING state

2017-01-04 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-2902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15798347#comment-15798347 ] Jason Lowe commented on YARN-2902: -- bq. In trunk we "ignore" a non existing directory in delete_as_user()

[jira] [Commented] (YARN-5889) Improve user-limit calculation in capacity scheduler

2016-12-14 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-5889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15749453#comment-15749453 ] Jason Lowe commented on YARN-5889: -- I think I can get behind a proposal that preserves the FIFO/priority

[jira] [Commented] (YARN-5889) Improve user-limit calculation in capacity scheduler

2016-12-14 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-5889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15748766#comment-15748766 ] Jason Lowe commented on YARN-5889: -- +1 for keeping the user limit behavior consistent with what it does

[jira] [Commented] (YARN-5889) Improve user-limit calculation in capacity scheduler

2016-12-13 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-5889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15745552#comment-15745552 ] Jason Lowe commented on YARN-5889: -- bq. To solve the problem, we need to compute user limit considering

[jira] [Updated] (YARN-4148) When killing app, RM releases app's resource before they are released by NM

2016-12-12 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4148?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated YARN-4148: - Attachment: YARN-4148.003.patch Updating the patch to cleanup the javadoc and checkstyle issues and fixed

[jira] [Commented] (YARN-5889) Improve user-limit calculation in capacity scheduler

2016-12-02 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-5889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15715215#comment-15715215 ] Jason Lowe commented on YARN-5889: -- bq. This means that we will be doing same as what we do earlier too

[jira] [Commented] (YARN-5910) Support for multi-cluster delegation tokens

2016-12-01 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-5910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15713359#comment-15713359 ] Jason Lowe commented on YARN-5910: -- Pinging [~daryn] since I'm sure he has an opinion on this. I'm not

[jira] [Commented] (YARN-5915) ATS 1.5 FileSystemTimelineWriter causes flush() to be called after every event write

2016-12-01 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-5915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15713282#comment-15713282 ] Jason Lowe commented on YARN-5915: -- Sure if the output stream could be unbuffered then having it buffer at

[jira] [Updated] (YARN-4148) When killing app, RM releases app's resource before they are released by NM

2016-11-21 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4148?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated YARN-4148: - Attachment: YARN-4148.002.patch Sorry for the delay. I rebased the patch on trunk and added a unit test.

[jira] [Commented] (YARN-5859) TestResourceLocalizationService#testParallelDownloadAttemptsForPublicResource sometimes fails

2016-11-21 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-5859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15684030#comment-15684030 ] Jason Lowe commented on YARN-5859: -- +1 lgtm. Committing this. >

[jira] [Commented] (YARN-5915) ATS 1.5 FileSystemTimelineWriter causes flush() to be called after every event write

2016-11-21 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-5915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15683683#comment-15683683 ] Jason Lowe commented on YARN-5915: -- Sorry for missing this in YARN-4814. I thought the fix was equivalent

[jira] [Commented] (YARN-5859) TestResourceLocalizationService#testParallelDownloadAttemptsForPublicResource sometimes fails

2016-11-18 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-5859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15676809#comment-15676809 ] Jason Lowe commented on YARN-5859: -- Thanks for the patch! I noticed there are other rather low timeouts

[jira] [Commented] (YARN-5900) Configuring minimum-allocation-mb at queue level

2016-11-17 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-5900?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15673838#comment-15673838 ] Jason Lowe commented on YARN-5900: -- I don't believe this would simplify anything, rather just make it more

[jira] [Commented] (YARN-5836) Malicious AM can kill containers of other apps running in any node its containers are running

2016-11-16 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-5836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15671849#comment-15671849 ] Jason Lowe commented on YARN-5836: -- +1 lgtm. Committing this. > Malicious AM can kill containers of

[jira] [Updated] (YARN-5836) Malicious AM can kill containers of other apps running in any node its containers are running

2016-11-15 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-5836?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated YARN-5836: - Summary: Malicious AM can kill containers of other apps running in any node its containers are running

[jira] [Commented] (YARN-4355) NPE while processing localizer heartbeat

2016-11-14 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15665092#comment-15665092 ] Jason Lowe commented on YARN-4355: -- +1, latest trunk and 2.7 patches look good to me. > NPE while

[jira] [Commented] (YARN-5547) NMLeveldbStateStore should be more tolerant of unknown keys

2016-11-14 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-5547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15665043#comment-15665043 ] Jason Lowe commented on YARN-5547: -- Thanks for updating the patch! Is there a good reason to store the

[jira] [Commented] (YARN-5867) DirectoryCollection#checkDirs can cause incorrect permission of nmlocal dir

2016-11-11 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-5867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15657124#comment-15657124 ] Jason Lowe commented on YARN-5867: -- If the disk was wiped and re-introduced then this may be more

[jira] [Commented] (YARN-5867) DirectoryCollection#checkDirs can cause incorrect permission of nmlocal dir

2016-11-10 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-5867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15654405#comment-15654405 ] Jason Lowe commented on YARN-5867: -- I'm curious how the top-level local directory was deleted in the first

[jira] [Commented] (YARN-5836) NMToken passwd not checked in ContainerManagerImpl, malicious AM can fake the Token and kill containers of other apps at will

2016-11-09 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-5836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15651095#comment-15651095 ] Jason Lowe commented on YARN-5836: -- So have you verified that a faked NM token "works" or was this

[jira] [Commented] (YARN-5356) NodeManager should communicate physical resource capability to ResourceManager

2016-11-08 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-5356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15648923#comment-15648923 ] Jason Lowe commented on YARN-5356: -- Thanks for updating the patch! The unit test failures appear to be

[jira] [Commented] (YARN-5859) TestResourceLocalizationService#testParallelDownloadAttemptsForPublicResource sometimes fails

2016-11-08 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-5859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15648913#comment-15648913 ] Jason Lowe commented on YARN-5859: -- The test output: {noformat} 2016-11-07 20:00:01,393 INFO [Thread-275]

[jira] [Created] (YARN-5859) TestResourceLocalizationService#testParallelDownloadAttemptsForPublicResource sometimes fails

2016-11-08 Thread Jason Lowe (JIRA)
Jason Lowe created YARN-5859:
Summary: TestResourceLocalizationService#testParallelDownloadAttemptsForPublicResource sometimes fails
Key: YARN-5859
URL: https://issues.apache.org/jira/browse/YARN-5859

[jira] [Commented] (YARN-5836) NMToken passwd not checked in ContainerManagerImpl, malicious AM can fake the Token and kill containers of other apps at will

2016-11-07 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-5836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15644391#comment-15644391 ] Jason Lowe commented on YARN-5836: -- As I understand it, the NM token should be getting verified by the

[jira] [Commented] (YARN-5356) NodeManager should communicate physical resource capability to ResourceManager

2016-11-04 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-5356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15637606#comment-15637606 ] Jason Lowe commented on YARN-5356: -- Thanks for updating the patch! I don't think

[jira] [Commented] (YARN-5837) NPE when getting node status of a decommissioned node after an RM restart

2016-11-04 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-5837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15637465#comment-15637465 ] Jason Lowe commented on YARN-5837: -- Thanks for the patch! My apologies for missing this when reviewing

[jira] [Updated] (YARN-5608) TestAMRMClient.setup() fails with ArrayOutOfBoundsException

2016-11-04 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-5608?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated YARN-5608: - Fix Version/s: (was: 2.9.0) 2.8.0 Thanks [~templedf]! I committed this to

[jira] [Commented] (YARN-5356) NodeManager should communicate physical resource capability to ResourceManager

2016-11-03 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-5356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15632892#comment-15632892 ] Jason Lowe commented on YARN-5356: -- Thanks for the patch, [~elgoiri]! Looks good overall, but I wonder

[jira] [Commented] (YARN-4862) Handle duplicate completed containers in RMNodeImpl

2016-11-02 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15629078#comment-15629078 ] Jason Lowe commented on YARN-4862: -- I don't think we need to worry too much about optimizing the case

[jira] [Commented] (YARN-4862) Handle duplicate completed containers in RMNodeImpl

2016-11-01 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15626639#comment-15626639 ] Jason Lowe commented on YARN-4862: -- Thanks for the update! Containers were removed from the test but the

[jira] [Commented] (YARN-5001) Aggregated Logs root directory is created with wrong group if nonexistent

2016-11-01 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-5001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15626571#comment-15626571 ] Jason Lowe commented on YARN-5001: -- +1 lgtm. Committing this. > Aggregated Logs root directory is

[jira] [Commented] (YARN-5368) memory leak at timeline server

2016-11-01 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-5368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15626529#comment-15626529 ] Jason Lowe commented on YARN-5368: -- bq. Recently I noticed same issue with NodeManger when recovery is

[jira] [Commented] (YARN-5001) Aggregated Logs root directory is created with wrong group if nonexistent

2016-10-31 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-5001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15623085#comment-15623085 ] Jason Lowe commented on YARN-5001: -- Thanks for updating the patch! Looks good with one nit: we don't need

[jira] [Commented] (YARN-65) Reduce RM app memory footprint once app has completed

2016-10-31 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-65?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15622866#comment-15622866 ] Jason Lowe commented on YARN-65: I believe the request is still valid. This isn't so much about adjusting

[jira] [Commented] (YARN-4862) Handle duplicate completed containers in RMNodeImpl

2016-10-31 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15622643#comment-15622643 ] Jason Lowe commented on YARN-4862: -- Getting close. The javadoc nit needs to be fixed. I assume the test

[jira] [Commented] (YARN-4857) Add missing default configuration regarding preemption of CapacityScheduler

2016-10-31 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15622617#comment-15622617 ] Jason Lowe commented on YARN-4857: -- Update looks good, just need to clean up the new warnings caused by

[jira] [Commented] (YARN-4857) Add missing default configuration regarding preemption of CapacityScheduler

2016-10-28 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15615897#comment-15615897 ] Jason Lowe commented on YARN-4857: -- Thanks for updating the patch! It looks OK except I noticed we added

[jira] [Commented] (YARN-4963) capacity scheduler: Make number of OFF_SWITCH assignments per heartbeat configurable

2016-10-27 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15613526#comment-15613526 ] Jason Lowe commented on YARN-4963: -- +1 lgtm. Will commit this tomorrow if there are no objections. >

[jira] [Updated] (YARN-4467) Shell.checkIsBashSupported swallowed an interrupted exception

2016-10-27 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4467?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated YARN-4467:
Target Version/s: 2.8.0
Priority: Blocker (was: Major)
+1, kicked Jenkins again to get a

[jira] [Commented] (YARN-5027) NM should clean up app log dirs after NM restart

2016-10-27 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-5027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15613298#comment-15613298 ] Jason Lowe commented on YARN-5027: -- +1 for the latest patch. I'll commit this tomorrow if there are no

[jira] [Commented] (YARN-5027) NM should clean up app log dirs after NM restart

2016-10-27 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-5027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15613285#comment-15613285 ] Jason Lowe commented on YARN-5027: -- I don't believe it will leak those directories. The exists check is

[jira] [Commented] (YARN-5027) NM should clean up app log dirs after NM restart

2016-10-27 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-5027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15613182#comment-15613182 ] Jason Lowe commented on YARN-5027: -- Thanks for the patch! Looks good to me, kicking another Jenkins run

[jira] [Commented] (YARN-4831) Recovered containers will be killed after NM stateful restart

2016-10-27 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15613149#comment-15613149 ] Jason Lowe commented on YARN-4831: -- The unit test failure is unrelated, and the test passes for me locally

[jira] [Commented] (YARN-5001) Aggregated Logs root directory is created with wrong group if nonexistent

2016-10-27 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-5001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15613068#comment-15613068 ] Jason Lowe commented on YARN-5001: -- What if the user has no primary group? In a secure environment the

[jira] [Commented] (YARN-5172) Update yarn daemonlog documentation due to HADOOP-12847

2016-10-27 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-5172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15612963#comment-15612963 ] Jason Lowe commented on YARN-5172: -- +1 lgtm. Committing this. > Update yarn daemonlog documentation due

[jira] [Commented] (YARN-4668) Reuse objectMapper instance in Yarn

2016-10-27 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15612905#comment-15612905 ] Jason Lowe commented on YARN-4668: -- Thanks for the patch, [~linyiqun]! Apologies in the delay for review.

[jira] [Updated] (YARN-5177) Make Node-Manager Download-Resource Component extensible.

2016-10-27 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-5177?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated YARN-5177: - Assignee: Emeka Labels: oct16-medium (was: ) > Make Node-Manager Download-Resource Component

[jira] [Updated] (YARN-5172) Update yarn daemonlog documentation due to HADOOP-12847

2016-10-27 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-5172?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated YARN-5172: - Labels: oct16-easy (was: ) > Update yarn daemonlog documentation due to HADOOP-12847 >

[jira] [Updated] (YARN-5153) [YARN-3368] Add a toggle to switch timeline view / table view for containers information inside application-attempt page

2016-10-27 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-5153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated YARN-5153: - Labels: oct16-easy (was: ) Component/s: webapp > [YARN-3368] Add a toggle to switch timeline

[jira] [Updated] (YARN-5152) [YARN-3368] Avoid reloading web page after selecting different queues in scheduler page

2016-10-27 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-5152?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated YARN-5152: - Labels: oct16-medium (was: ) Component/s: webapp > [YARN-3368] Avoid reloading web page after

[jira] [Updated] (YARN-4858) start-yarn and stop-yarn scripts to support timeline and sharedcachemanager

2016-10-27 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4858?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated YARN-4858: - Labels: oct16-easy (was: ) > start-yarn and stop-yarn scripts to support timeline and sharedcachemanager

[jira] [Updated] (YARN-4896) ProportionalPreemptionPolicy needs to handle AMResourcePercentage per partition

2016-10-27 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4896?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated YARN-4896: - Labels: oct16-easy (was: ) > ProportionalPreemptionPolicy needs to handle AMResourcePercentage per >

[jira] [Updated] (YARN-4882) Change the log level to DEBUG for recovering completed applications

2016-10-27 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4882?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated YARN-4882: - Labels: oct16-easy (was: ) > Change the log level to DEBUG for recovering completed applications >

[jira] [Updated] (YARN-4876) Decoupled Init / Destroy of Containers from Start / Stop

2016-10-27 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4876?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated YARN-4876: - Labels: oct16-hard (was: ) Component/s: nodemanager api > Decoupled Init /

[jira] [Commented] (YARN-4831) Recovered containers will be killed after NM stateful restart

2016-10-27 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15612685#comment-15612685 ] Jason Lowe commented on YARN-4831: -- Sorry for the long delay in reviewing. +1 lgtm. Will commit this

[jira] [Commented] (YARN-4862) Handle duplicate completed containers in RMNodeImpl

2016-10-27 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15612520#comment-15612520 ] Jason Lowe commented on YARN-4862: -- +1 latest patch lgtm. I can cleanup the whitespace and checkstyle

[jira] [Resolved] (YARN-1468) TestRMRestart.testRMRestartWaitForPreviousAMToFinish get failed.

2016-10-27 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe resolved YARN-1468. -- Resolution: Duplicate Closing this as a duplicate of YARN-5416 since that other JIRA has a proposed

[jira] [Commented] (YARN-5767) Fix the order that resources are cleaned up from the local Public/Private caches

2016-10-27 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-5767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15612061#comment-15612061 ] Jason Lowe commented on YARN-5767: -- +1 lgtm. I'll commit this tomorrow if there are no objections. > Fix

[jira] [Commented] (YARN-5767) Fix the order that resources are cleaned up from the local Public/Private caches

2016-10-26 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-5767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15609727#comment-15609727 ] Jason Lowe commented on YARN-5767: -- Ah, I missed the findbugs connection with respect to comparator
