[jira] [Commented] (YARN-1959) Fix headroom calculation in Fair Scheduler

2014-04-18 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13974467#comment-13974467 ] Jason Lowe commented on YARN-1959: Yes, over-reporting of the headroom in the

[jira] [Commented] (YARN-1959) Fix headroom calculation in Fair Scheduler

2014-04-18 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13974550#comment-13974550 ] Jason Lowe commented on YARN-1959: Good point, it would also need to min against the

[jira] [Resolved] (YARN-1966) Capacity Scheduler acl_submit_applications in Leaf Queue finally considers root queue default always

2014-04-21 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1966?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe resolved YARN-1966. Resolution: Duplicate This is a duplicate of YARN-1269 and related to YARN-1941 and YARN-1951. In any

[jira] [Commented] (YARN-1978) TestLogAggregationService#testLocalFileDeletionAfterUpload fails sometimes

2014-04-23 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1978?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13978982#comment-13978982 ] Jason Lowe commented on YARN-1978: I've recently seen this happen on my single-node

[jira] [Commented] (YARN-1978) TestLogAggregationService#testLocalFileDeletionAfterUpload fails sometimes

2014-04-23 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1978?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13979004#comment-13979004 ] Jason Lowe commented on YARN-1978: Is calling sched.awaitTermination(10, SECONDS) without

[jira] [Commented] (YARN-1975) Used resources shows escaped html in CapacityScheduler and FairScheduler page

2014-04-24 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13980176#comment-13980176 ] Jason Lowe commented on YARN-1975: +1, committing this. Used resources shows escaped

[jira] [Created] (YARN-1981) Nodemanager version is not updated when a node reconnects

2014-04-24 Thread Jason Lowe (JIRA)
Jason Lowe created YARN-1981: Summary: Nodemanager version is not updated when a node reconnects Key: YARN-1981 URL: https://issues.apache.org/jira/browse/YARN-1981 Project: Hadoop YARN Issue

[jira] [Updated] (YARN-1981) Nodemanager version is not updated when a node reconnects

2014-04-24 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1981?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated YARN-1981: Attachment: YARN-1981.patch Patch that updates the nodemanager version when a node reconnects.

[jira] [Commented] (YARN-1354) Recover applications upon nodemanager restart

2014-04-24 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13980360#comment-13980360 ] Jason Lowe commented on YARN-1354: Yes, we can't rely on any active containers to tell us

[jira] [Created] (YARN-1984) LeveldbTimelineStore does not handle db exceptions properly

2014-04-25 Thread Jason Lowe (JIRA)
Jason Lowe created YARN-1984: Summary: LeveldbTimelineStore does not handle db exceptions properly Key: YARN-1984 URL: https://issues.apache.org/jira/browse/YARN-1984 Project: Hadoop YARN Issue

[jira] [Commented] (YARN-1984) LeveldbTimelineStore does not handle db exceptions properly

2014-04-25 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13981185#comment-13981185 ] Jason Lowe commented on YARN-1984: Ran across this while working with leveldb as part of

[jira] [Commented] (YARN-1985) YARN issues wrong state when running beyond virtual memory limits

2014-04-25 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13981296#comment-13981296 ] Jason Lowe commented on YARN-1985: Do you have the relevant portions of the RM log for

[jira] [Commented] (YARN-1985) YARN issues wrong state when running beyond virtual memory limits

2014-04-25 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13981312#comment-13981312 ] Jason Lowe commented on YARN-1985: There are only three states for a container: NEW,

[jira] [Commented] (YARN-1985) YARN issues wrong state when running beyond virtual memory limits

2014-04-25 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13981497#comment-13981497 ] Jason Lowe commented on YARN-1985: The exit status should be whatever exit status came

[jira] [Created] (YARN-1987) Wrapper for leveldb DBIterator to aid in handling database exceptions

2014-04-25 Thread Jason Lowe (JIRA)
Jason Lowe created YARN-1987: Summary: Wrapper for leveldb DBIterator to aid in handling database exceptions Key: YARN-1987 URL: https://issues.apache.org/jira/browse/YARN-1987 Project: Hadoop YARN

[jira] [Updated] (YARN-1987) Wrapper for leveldb DBIterator to aid in handling database exceptions

2014-04-29 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1987?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated YARN-1987: Attachment: YARN-1987.patch Wrapper for leveldb DBIterator to aid in handling database exceptions

[jira] [Updated] (YARN-1362) Distinguish between nodemanager shutdown for decommission vs shutdown for restart

2014-04-29 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1362?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated YARN-1362: Attachment: YARN-1362.patch Small patch that enhances the NM context that provides get/set for a decomm

[jira] [Commented] (YARN-2002) Support for passing Job priority through Application Submission Context in Mapreduce Side

2014-04-30 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-2002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13985485#comment-13985485 ] Jason Lowe commented on YARN-2002: Moving this to MAPREDUCE since that's where the

[jira] [Updated] (YARN-2002) Support for passing Job priority through Application Submission Context in Mapreduce Side

2014-04-30 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-2002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated YARN-2002: Issue Type: Improvement (was: Sub-task) Parent: (was: YARN-1963) Support for passing Job

[jira] [Created] (YARN-2005) Blacklisting support for scheduling AMs

2014-04-30 Thread Jason Lowe (JIRA)
Jason Lowe created YARN-2005: Summary: Blacklisting support for scheduling AMs Key: YARN-2005 URL: https://issues.apache.org/jira/browse/YARN-2005 Project: Hadoop YARN Issue Type: Improvement

[jira] [Commented] (YARN-2005) Blacklisting support for scheduling AMs

2014-04-30 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-2005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13985575#comment-13985575 ] Jason Lowe commented on YARN-2005: This is particularly helpful on a busy cluster where

[jira] [Updated] (YARN-1338) Recover localized resource cache state upon nodemanager restart

2014-04-30 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1338?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated YARN-1338: Attachment: YARN-1338v3-and-YARN-1987.patch Updating the patch to address the DBException handling that

[jira] [Updated] (YARN-1341) Recover NMTokens upon nodemanager restart

2014-04-30 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1341?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated YARN-1341: Attachment: YARN-1341v4-and-YARN-1987.patch Updating the patch to address the DBException handling that

[jira] [Commented] (YARN-1987) Wrapper for leveldb DBIterator to aid in handling database exceptions

2014-04-30 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13985977#comment-13985977 ] Jason Lowe commented on YARN-1987: Thanks for the feedback, Ming! bq.

[jira] [Updated] (YARN-1342) Recover container tokens upon nodemanager restart

2014-04-30 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated YARN-1342: Attachment: YARN-1342v3-and-YARN-1987.patch Updating the patch to address the DBException handling that

[jira] [Updated] (YARN-1339) Recover DeletionService state upon nodemanager restart

2014-05-01 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1339?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated YARN-1339: Attachment: YARN-1339v3-and-YARN-1987.patch Updating the patch to address the DBException handling that

[jira] [Updated] (YARN-1354) Recover applications upon nodemanager restart

2014-05-01 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1354?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated YARN-1354: Attachment: YARN-1354-v2-and-YARN-1987-and-YARN-1362.patch Updating the patch to address the DBException

[jira] [Updated] (YARN-1987) Wrapper for leveldb DBIterator to aid in handling database exceptions

2014-05-06 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1987?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated YARN-1987: Attachment: YARN-1987v2.patch Updated the patch to add the Evolving annotation. Wrapper for leveldb

[jira] [Created] (YARN-2046) Out of band heartbeats are sent only on container kill and possibly too early

2014-05-12 Thread Jason Lowe (JIRA)
Jason Lowe created YARN-2046: Summary: Out of band heartbeats are sent only on container kill and possibly too early Key: YARN-2046 URL: https://issues.apache.org/jira/browse/YARN-2046 Project: Hadoop

[jira] [Commented] (YARN-2046) Out of band heartbeats are sent only on container kill and possibly too early

2014-05-12 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-2046?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13995147#comment-13995147 ] Jason Lowe commented on YARN-2046: We should consider sending out of band heartbeats

[jira] [Resolved] (YARN-2040) Recover information about finished containers

2014-05-12 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-2040?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe resolved YARN-2040. Resolution: Duplicate This will be covered by YARN-1337. Recover information about finished

[jira] [Commented] (YARN-1515) Ability to dump the container threads and stop the containers in a single RPC

2014-05-13 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1515?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13993762#comment-13993762 ] Jason Lowe commented on YARN-1515: I apologize for the long delay in reviewing and

[jira] [Commented] (YARN-182) Unnecessary Container killed by the ApplicationMaster message for successful containers

2014-05-13 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13996398#comment-13996398 ] Jason Lowe commented on YARN-182: bq. In my case the reducers were moved to COMPLETED state

[jira] [Commented] (YARN-1362) Distinguish between nodemanager shutdown for decommission vs shutdown for restart

2014-05-13 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13996395#comment-13996395 ] Jason Lowe commented on YARN-1362: Yes, that's the intended behavior. If ops is shutting

[jira] [Commented] (YARN-1751) Improve MiniYarnCluster for log aggregation testing

2014-05-13 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1751?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13996587#comment-13996587 ] Jason Lowe commented on YARN-1751: +1, committing this. Improve MiniYarnCluster for log

[jira] [Updated] (YARN-1337) Recover containers upon nodemanager restart

2014-05-13 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1337?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated YARN-1337: Description: To support work-preserving NM restart we need to recover the state of the containers when

[jira] [Commented] (YARN-2014) Performance: AM scaleability is 10% slower in 2.4 compared to 0.23.9

2014-05-14 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-2014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13996965#comment-13996965 ] Jason Lowe commented on YARN-2014: HADOOP-7549 added service loading of filesystems, and

[jira] [Commented] (YARN-2014) Performance: AM scaleability is 10% slower in 2.4 compared to 0.23.9

2014-05-14 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-2014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13996494#comment-13996494 ] Jason Lowe commented on YARN-2014: I did a bit of investigation on this, and the problem

[jira] [Created] (YARN-2034) Description for yarn.nodemanager.localizer.cache.target-size-mb is incorrect

2014-05-15 Thread Jason Lowe (JIRA)
Jason Lowe created YARN-2034: Summary: Description for yarn.nodemanager.localizer.cache.target-size-mb is incorrect Key: YARN-2034 URL: https://issues.apache.org/jira/browse/YARN-2034 Project: Hadoop

[jira] [Commented] (YARN-1751) Improve MiniYarnCluster and LogCLIHelpers for log aggregation testing

2014-05-15 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1751?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13993565#comment-13993565 ] Jason Lowe commented on YARN-1751: Despite them both being small changes, I think these

[jira] [Updated] (YARN-2034) Description for yarn.nodemanager.localizer.cache.target-size-mb is incorrect

2014-05-15 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-2034?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated YARN-2034: Description: The description in yarn-default.xml for yarn.nodemanager.localizer.cache.target-size-mb says

[jira] [Commented] (YARN-2050) Fix LogCLIHelpers to create the correct FileContext

2014-05-15 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-2050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13996685#comment-13996685 ] Jason Lowe commented on YARN-2050: bq. remoteAppLogDir.toUri().getScheme() returns null

[jira] [Commented] (YARN-182) Unnecessary Container killed by the ApplicationMaster message for successful containers

2014-05-15 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13995087#comment-13995087 ] Jason Lowe commented on YARN-182: I don't believe this is related to YARN-903, rather it

[jira] [Commented] (YARN-2034) Description for yarn.nodemanager.localizer.cache.target-size-mb is incorrect

2014-05-16 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-2034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13993020#comment-13993020 ] Jason Lowe commented on YARN-2034: While updating it we may also want to clarify that it

[jira] [Updated] (YARN-1338) Recover localized resource cache state upon nodemanager restart

2014-05-16 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1338?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated YARN-1338: Attachment: YARN-1338v4.patch Updating patch to trunk. Recover localized resource cache state upon

[jira] [Updated] (YARN-1354) Recover applications upon nodemanager restart

2014-05-16 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1354?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated YARN-1354: Attachment: YARN-1354-v3.patch Updated patch now that YARN-1987 and YARN-1362 have been committed.

[jira] [Commented] (YARN-1962) Timeline server is enabled by default

2014-05-16 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13994006#comment-13994006 ] Jason Lowe commented on YARN-1962: +1 lgtm. Will commit this early next week to give

[jira] [Updated] (YARN-1339) Recover DeletionService state upon nodemanager restart

2014-05-16 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1339?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated YARN-1339: Attachment: YARN-1339v4.patch Updated patch now that YARN-1987 has been committed. Recover

[jira] [Created] (YARN-2079) Recover NonAggregatingLogHandler state upon nodemanager restart

2014-05-20 Thread Jason Lowe (JIRA)
Jason Lowe created YARN-2079: Summary: Recover NonAggregatingLogHandler state upon nodemanager restart Key: YARN-2079 URL: https://issues.apache.org/jira/browse/YARN-2079 Project: Hadoop YARN

[jira] [Commented] (YARN-2050) Fix LogCLIHelpers to create the correct FileContext

2014-05-20 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-2050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14003548#comment-14003548 ] Jason Lowe commented on YARN-2050: +1 lgtm. Committing this. Fix LogCLIHelpers to

[jira] [Updated] (YARN-1338) Recover localized resource cache state upon nodemanager restart

2014-05-20 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1338?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated YARN-1338: Attachment: YARN-1338v5.patch Thanks for the review, Junping! Attaching a patch to address your comments

[jira] [Updated] (YARN-1338) Recover localized resource cache state upon nodemanager restart

2014-05-28 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1338?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated YARN-1338: Attachment: YARN-1338v6.patch Thanks for the additional comments, Junping. bq. Do we have any code to

[jira] [Commented] (YARN-1801) NPE in public localizer

2014-05-28 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14011601#comment-14011601 ] Jason Lowe commented on YARN-1801: Strictly speaking, the patch does prevent the NPE.

[jira] [Created] (YARN-2114) Inform container of container-specific local directories

2014-05-29 Thread Jason Lowe (JIRA)
Jason Lowe created YARN-2114: Summary: Inform container of container-specific local directories Key: YARN-2114 URL: https://issues.apache.org/jira/browse/YARN-2114 Project: Hadoop YARN Issue

[jira] [Commented] (YARN-2114) Inform container of container-specific local directories

2014-05-29 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-2114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14012544#comment-14012544 ] Jason Lowe commented on YARN-2114: Currently a container can obtain a list of local

[jira] [Commented] (YARN-1424) RMAppAttemptImpl should precompute a zeroed ApplicationResourceUsageReport to return when attempt not active

2014-06-09 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1424?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14025314#comment-14025314 ] Jason Lowe commented on YARN-1424: -1 was chosen to explicitly mark the field as having

[jira] [Updated] (YARN-1339) Recover DeletionService state upon nodemanager restart

2014-06-09 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1339?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated YARN-1339: Attachment: YARN-1339v5.patch Refreshed patch to trunk. Recover DeletionService state upon nodemanager

[jira] [Updated] (YARN-1339) Recover DeletionService state upon nodemanager restart

2014-06-10 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1339?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated YARN-1339: Attachment: YARN-1339v6.patch Thanks for the review, Junping! bq. Shall we add if

[jira] [Created] (YARN-2147) client lacks delegation token exception details when application submit fails

2014-06-11 Thread Jason Lowe (JIRA)
Jason Lowe created YARN-2147: Summary: client lacks delegation token exception details when application submit fails Key: YARN-2147 URL: https://issues.apache.org/jira/browse/YARN-2147 Project: Hadoop

[jira] [Commented] (YARN-2147) client lacks delegation token exception details when application submit fails

2014-06-11 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-2147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14027979#comment-14027979 ] Jason Lowe commented on YARN-2147: For example, here's a sample log from a client

[jira] [Updated] (YARN-853) maximum-am-resource-percent doesn't work after refreshQueues command

2014-06-16 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-853?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated YARN-853: Fix Version/s: 0.23.11 Thanks, Deveraj! I committed this to branch-0.23 as well.

[jira] [Commented] (YARN-2167) LeveldbIterator should get closed in NMLeveldbStateStoreService#loadLocalizationState() within finally block

2014-06-16 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-2167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14033127#comment-14033127 ] Jason Lowe commented on YARN-2167: +1 pending Jenkins. LeveldbIterator should get

[jira] [Commented] (YARN-2167) LeveldbIterator should get closed in NMLeveldbStateStoreService#loadLocalizationState() within finally block

2014-06-16 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-2167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14033328#comment-14033328 ] Jason Lowe commented on YARN-2167: +1 lgtm. Committing this. LeveldbIterator should

[jira] [Created] (YARN-2171) AMs block on the CapacityScheduler lock during allocate()

2014-06-17 Thread Jason Lowe (JIRA)
Jason Lowe created YARN-2171: Summary: AMs block on the CapacityScheduler lock during allocate() Key: YARN-2171 URL: https://issues.apache.org/jira/browse/YARN-2171 Project: Hadoop YARN Issue

[jira] [Commented] (YARN-2171) AMs block on the CapacityScheduler lock during allocate()

2014-06-17 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-2171?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14033864#comment-14033864 ] Jason Lowe commented on YARN-2171: When the CapacityScheduler scheduler thread is running

[jira] [Updated] (YARN-2171) AMs block on the CapacityScheduler lock during allocate()

2014-06-17 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-2171?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated YARN-2171: Attachment: YARN-2171.patch Patch to use AtomicInteger for the number of nodes so we can avoid grabbing

[jira] [Updated] (YARN-365) Each NM heartbeat should not generate an event for the Scheduler

2014-06-17 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-365?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated YARN-365: Attachment: YARN-365.branch-0.23.patch Patch for branch-0.23. RM unit tests pass, and I manually tested it

[jira] [Updated] (YARN-2171) AMs block on the CapacityScheduler lock during allocate()

2014-06-17 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-2171?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated YARN-2171: Attachment: YARN-2171v2.patch The point of the unit test was to catch regressions at a high level. If

[jira] [Created] (YARN-2176) CapacityScheduler loops over all running applications rather than actively requesting apps

2014-06-17 Thread Jason Lowe (JIRA)
Jason Lowe created YARN-2176: Summary: CapacityScheduler loops over all running applications rather than actively requesting apps Key: YARN-2176 URL: https://issues.apache.org/jira/browse/YARN-2176

[jira] [Updated] (YARN-1341) Recover NMTokens upon nodemanager restart

2014-06-17 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1341?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated YARN-1341: Attachment: YARN-1341v5.patch Thanks for taking a look, Junping! I've updated the patch to trunk.

[jira] [Commented] (YARN-2176) CapacityScheduler loops over all running applications rather than actively requesting apps

2014-06-18 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-2176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14035799#comment-14035799 ] Jason Lowe commented on YARN-2176: AppSchedulingInfo is already determining when an app

[jira] [Commented] (YARN-2176) CapacityScheduler loops over all running applications rather than actively requesting apps

2014-06-18 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-2176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14035905#comment-14035905 ] Jason Lowe commented on YARN-2176: That proposal would work for the deactivate path, but

[jira] [Commented] (YARN-2176) CapacityScheduler loops over all running applications rather than actively requesting apps

2014-06-18 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-2176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14036093#comment-14036093 ] Jason Lowe commented on YARN-2176: Sure, that works if we think that's cleaner. It's a

[jira] [Commented] (YARN-2176) CapacityScheduler loops over all running applications rather than actively requesting apps

2014-06-18 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-2176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14036451#comment-14036451 ] Jason Lowe commented on YARN-2176: ActiveUsersManager doesn't have a reference to the

[jira] [Updated] (YARN-1341) Recover NMTokens upon nodemanager restart

2014-06-18 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1341?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated YARN-1341: Attachment: YARN-1341v6.patch Thanks for reviewing, Junping! bq. The change in

[jira] [Commented] (YARN-2175) Container localization has no timeouts and tasks can be stuck there for a long time

2014-06-19 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-2175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14037337#comment-14037337 ] Jason Lowe commented on YARN-2175: I also wonder if there's been a regression, since at

[jira] [Commented] (YARN-1341) Recover NMTokens upon nodemanager restart

2014-06-20 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14039138#comment-14039138 ] Jason Lowe commented on YARN-1341: bq. The worst case seems to me is: NM restart with

[jira] [Commented] (YARN-2176) CapacityScheduler loops over all running applications rather than actively requesting apps

2014-06-20 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-2176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14039143#comment-14039143 ] Jason Lowe commented on YARN-2176: Ah, yes. AppSchedulingInfo should only be created by

[jira] [Commented] (YARN-1341) Recover NMTokens upon nodemanager restart

2014-06-20 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14039235#comment-14039235 ] Jason Lowe commented on YARN-1341: bq. Application state - If we failed to store the

[jira] [Created] (YARN-2185) Use pipes when localizing archives

2014-06-20 Thread Jason Lowe (JIRA)
Jason Lowe created YARN-2185: Summary: Use pipes when localizing archives Key: YARN-2185 URL: https://issues.apache.org/jira/browse/YARN-2185 Project: Hadoop YARN Issue Type: Improvement

[jira] [Created] (YARN-2202) Metrics recovery for nodemanager restart

2014-06-24 Thread Jason Lowe (JIRA)
Jason Lowe created YARN-2202: Summary: Metrics recovery for nodemanager restart Key: YARN-2202 URL: https://issues.apache.org/jira/browse/YARN-2202 Project: Hadoop YARN Issue Type: Sub-task

[jira] [Commented] (YARN-1341) Recover NMTokens upon nodemanager restart

2014-06-24 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14042559#comment-14042559 ] Jason Lowe commented on YARN-1341: bq. So far from I know, RM restart didn't track this

[jira] [Resolved] (YARN-2210) resource manager fails to start if core-site.xml contains an xi:include

2014-06-25 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-2210?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe resolved YARN-2210. Resolution: Duplicate Resolving as a dup of YARN-1741, as that has more discussion around how this was

[jira] [Updated] (YARN-1741) XInclude support broken for YARN ResourceManager

2014-06-25 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1741?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated YARN-1741: Priority: Critical (was: Minor) Bumping the priority of this based on YARN-2210 and the fact that

[jira] [Commented] (YARN-1341) Recover NMTokens upon nodemanager restart

2014-06-27 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14045959#comment-14045959 ] Jason Lowe commented on YARN-1341: Agree it's not ideal to discuss handling state store

[jira] [Commented] (YARN-2104) Scheduler queue filter failed to work because index of queue column changed

2014-06-27 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-2104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14046512#comment-14046512 ] Jason Lowe commented on YARN-2104: -- +1 lgtm. The test failure is unrelated. Committing

[jira] [Commented] (YARN-2263) CSQueueUtils.computeMaxActiveApplicationsPerUser may cause deadlock for nested MapReduce jobs

2014-07-10 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-2263?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14058132#comment-14058132 ] Jason Lowe commented on YARN-2263: -- 1 is an appropriate lower bound since we don't ever

[jira] [Commented] (YARN-2259) NM-Local dir cleanup failing when Resourcemanager switches

2014-07-10 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-2259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14058200#comment-14058200 ] Jason Lowe commented on YARN-2259: -- This sounds like the NM wasn't notified of the

[jira] [Commented] (YARN-1421) Node managers will not receive application finish event where containers ran before RM restart

2014-07-10 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14058201#comment-14058201 ] Jason Lowe commented on YARN-1421: -- Was this fixed by YARN-1885? Node managers will not

[jira] [Commented] (YARN-2045) Data persisted in NM should be versioned

2014-07-11 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-2045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14058952#comment-14058952 ] Jason Lowe commented on YARN-2045: -- Thanks for the patch, Junping! Is the schema version

[jira] [Commented] (YARN-2045) Data persisted in NM should be versioned

2014-07-14 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-2045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14060694#comment-14060694 ] Jason Lowe commented on YARN-2045: -- bq. If any incompatible changes happen (not matter

[jira] [Commented] (YARN-2152) Recover missing container information

2014-07-14 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-2152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14061354#comment-14061354 ] Jason Lowe commented on YARN-2152: -- I think this may have broken backwards compatibility

[jira] [Commented] (YARN-2152) Recover missing container information

2014-07-15 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-2152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14062060#comment-14062060 ] Jason Lowe commented on YARN-2152: -- Yeah that's what I suspected as well, but I wanted to

[jira] [Commented] (YARN-2045) Data persisted in NM should be versioned

2014-07-15 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-2045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14062212#comment-14062212 ] Jason Lowe commented on YARN-2045: -- bq. I agree the concept is not quite the same but I

[jira] [Commented] (YARN-2293) Scoring for NMs to identify a better candidate to launch AMs

2014-07-15 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-2293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14062530#comment-14062530 ] Jason Lowe commented on YARN-2293: -- This sounds very similar to YARN-2005, if a bit more

[jira] [Updated] (YARN-1336) Work-preserving nodemanager restart

2014-07-15 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1336?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated YARN-1336: - Attachment: NMRestartDesignOverview.pdf Attaching a PDF that briefly describes the approach and how the

[jira] [Commented] (YARN-1341) Recover NMTokens upon nodemanager restart

2014-07-15 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14062861#comment-14062861 ] Jason Lowe commented on YARN-1341: -- Thanks for commenting, Devaraj! My apologies for the

[jira] [Moved] (YARN-1243) ResourceManager: Error in handling event type NODE_UPDATE to the scheduler - NPE at SchedulerApp.java:411

2013-09-26 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1243?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe moved MAPREDUCE-5539 to YARN-1243: - Component/s: (was: capacity-sched)

[jira] [Updated] (YARN-1243) ResourceManager: Error in handling event type NODE_UPDATE to the scheduler - NPE at SchedulerApp.java:411

2013-09-26 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1243?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated YARN-1243: - Attachment: YARN-1243.branch-0.23.patch Patch that backports YARN-845 fix to branch-0.23.

[jira] [Commented] (YARN-1243) ResourceManager: Error in handling event type NODE_UPDATE to the scheduler - NPE at SchedulerApp.java:411

2013-09-26 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13778921#comment-13778921 ] Jason Lowe commented on YARN-1243: -- Jenkins only handles trunk patches. I manually ran
