[jira] [Commented] (YARN-4728) MapReduce job doesn't make any progress for a very very long time after one Node become unusable.

2016-02-23 Thread Varun Saxena (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4728?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15160343#comment-15160343 ] Varun Saxena commented on YARN-4728: [~Silnov], in addition to above, can you check your AM logs and

[jira] [Commented] (YARN-4728) MapReduce job doesn't make any progress for a very very long time after one Node become unusable.

2016-02-23 Thread zhihai xu (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4728?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15160305#comment-15160305 ] zhihai xu commented on YARN-4728: - Thanks for reporting this issue [~Silnov]! It looks like this issue is

[jira] [Created] (YARN-4730) YARN preemption based on instantaneous fair share

2016-02-23 Thread Prabhu Joseph (JIRA)
Prabhu Joseph created YARN-4730: --- Summary: YARN preemption based on instantaneous fair share Key: YARN-4730 URL: https://issues.apache.org/jira/browse/YARN-4730 Project: Hadoop YARN Issue

[jira] [Commented] (YARN-4630) Remove useless boxing/unboxing code (Hadoop YARN)

2016-02-23 Thread Akira AJISAKA (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15160246#comment-15160246 ] Akira AJISAKA commented on YARN-4630: - Kicked QA test manually. +1 pending Jenkins. > Remove useless

[jira] [Updated] (YARN-4484) Available Resource calculation for a queue is not correct when used with labels

2016-02-23 Thread Sunil G (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4484?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sunil G updated YARN-4484: -- Attachment: 0003-YARN-4484.patch Thank you very much [~leftnoteasy] for helping in clarifying the comments.

[jira] [Commented] (YARN-4722) AsyncDispatcher logs redundant event queue sizes

2016-02-23 Thread Hadoop QA (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15160208#comment-15160208 ] Hadoop QA commented on YARN-4722: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem ||

[jira] [Commented] (YARN-4624) NPE in PartitionQueueCapacitiesInfo while accessing Schduler UI

2016-02-23 Thread Sunil G (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4624?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15160198#comment-15160198 ] Sunil G commented on YARN-4624: --- Uploaded a screen shot from my test cluster where I removed label-mapping

[jira] [Updated] (YARN-4624) NPE in PartitionQueueCapacitiesInfo while accessing Schduler UI

2016-02-23 Thread Sunil G (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4624?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sunil G updated YARN-4624: -- Attachment: SchedulerUIWithOutLabelMapping.png > NPE in PartitionQueueCapacitiesInfo while accessing Schduler UI

[jira] [Commented] (YARN-4697) NM aggregation thread pool is not bound by limits

2016-02-23 Thread Wilfred Spiegelenburg (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15160196#comment-15160196 ] Wilfred Spiegelenburg commented on YARN-4697: - +1 LGTM > NM aggregation thread pool is not

[jira] [Commented] (YARN-4624) NPE in PartitionQueueCapacitiesInfo while accessing Schduler UI

2016-02-23 Thread Sunil G (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4624?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15160195#comment-15160195 ] Sunil G commented on YARN-4624: --- Hi [~brahmareddy] Thanks for updating the patch. Latest patch looks good for

[jira] [Updated] (YARN-4729) SchedulerApplicationAttempt#getTotalRequiredResources can throw an NPE

2016-02-23 Thread Karthik Kambatla (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4729?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthik Kambatla updated YARN-4729: --- Attachment: yarn-4729.patch Straight-forward patch. >

[jira] [Created] (YARN-4729) SchedulerApplicationAttempt#getTotalRequiredResources can throw an NPE

2016-02-23 Thread Karthik Kambatla (JIRA)
Karthik Kambatla created YARN-4729: -- Summary: SchedulerApplicationAttempt#getTotalRequiredResources can throw an NPE Key: YARN-4729 URL: https://issues.apache.org/jira/browse/YARN-4729 Project:

[jira] [Commented] (YARN-4651) movetoqueue option does not documented in 'YARN Commands'

2016-02-23 Thread Takashi Ohnishi (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4651?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15160154#comment-15160154 ] Takashi Ohnishi commented on YARN-4651: --- Thank you, [~rohithsharma] for reviewing and committing !!

[jira] [Created] (YARN-4728) MapReduce job doesn't make any progress for a very very long time after one Node become unusable.

2016-02-23 Thread Silnov (JIRA)
Silnov created YARN-4728: Summary: MapReduce job doesn't make any progress for a very very long time after one Node become unusable. Key: YARN-4728 URL: https://issues.apache.org/jira/browse/YARN-4728

[jira] [Commented] (YARN-4108) CapacityScheduler: Improve preemption to preempt only those containers that would satisfy the incoming request

2016-02-23 Thread Hadoop QA (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15160088#comment-15160088 ] Hadoop QA commented on YARN-4108: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem ||

[jira] [Commented] (YARN-1489) [Umbrella] Work-preserving ApplicationMaster restart

2016-02-23 Thread Vinod Kumar Vavilapalli (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15160061#comment-15160061 ] Vinod Kumar Vavilapalli commented on YARN-1489: --- That and the "Old running containers don't

[jira] [Commented] (YARN-3863) Support complex filters in TimelineReader

2016-02-23 Thread Sangjin Lee (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3863?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15160053#comment-15160053 ] Sangjin Lee commented on YARN-3863: --- Just to clarify my mental model, I am trying to view the logic as

[jira] [Resolved] (YARN-2736) Job.getHistoryUrl returns empty string

2016-02-23 Thread Robert Kanter (JIRA)
[ https://issues.apache.org/jira/browse/YARN-2736?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Kanter resolved YARN-2736. - Resolution: Fixed > Job.getHistoryUrl returns empty string >

[jira] [Commented] (YARN-3863) Support complex filters in TimelineReader

2016-02-23 Thread Sangjin Lee (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3863?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15160034#comment-15160034 ] Sangjin Lee commented on YARN-3863: --- If I'm reading this right, the key changes seem to be in

[jira] [Updated] (YARN-4727) Unable to override the $HADOOP_CONF_DIR env variable for container

2016-02-23 Thread Terence Yim (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4727?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Terence Yim updated YARN-4727: -- Description: Given the default config of "yarn.nodemanager.env-whitelist", application should be able

[jira] [Created] (YARN-4727) Unable to override the $HADOOP_CONF_DIR env variable for container

2016-02-23 Thread Terence Yim (JIRA)
Terence Yim created YARN-4727: - Summary: Unable to override the $HADOOP_CONF_DIR env variable for container Key: YARN-4727 URL: https://issues.apache.org/jira/browse/YARN-4727 Project: Hadoop YARN

[jira] [Commented] (YARN-3863) Support complex filters in TimelineReader

2016-02-23 Thread Sangjin Lee (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3863?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15159986#comment-15159986 ] Sangjin Lee commented on YARN-3863: --- Sorry for taking a long time to review this [~varun_saxena]. I've

[jira] [Commented] (YARN-4705) ATS 1.5 parse pipeline to consider handling open() events recoverably

2016-02-23 Thread Li Lu (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15159913#comment-15159913 ] Li Lu commented on YARN-4705: - Once the WARN log is gone I'm fine with it. One quick question is, are we

[jira] [Commented] (YARN-3998) Add retry-times to let NM re-launch container when it fails to run

2016-02-23 Thread Vinod Kumar Vavilapalli (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15159891#comment-15159891 ] Vinod Kumar Vavilapalli commented on YARN-3998: --- [~vvasudev], [~hex108] bq. Unification with

[jira] [Updated] (YARN-1292) De-link container life cycle from the process it runs

2016-02-23 Thread Vinod Kumar Vavilapalli (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1292?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kumar Vavilapalli updated YARN-1292: -- Issue Type: Sub-task (was: Improvement) Parent: YARN-4726 > De-link

[jira] [Commented] (YARN-3998) Add retry-times to let NM re-launch container when it fails to run

2016-02-23 Thread Arun Suresh (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15159810#comment-15159810 ] Arun Suresh commented on YARN-3998: --- By the way, Thanks a ton for raising this [~hex108].. Extremely

[jira] [Commented] (YARN-3998) Add retry-times to let NM re-launch container when it fails to run

2016-02-23 Thread Arun Suresh (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15159791#comment-15159791 ] Arun Suresh commented on YARN-3998: --- Was spending some time thinking about this.. Would it make sense to

[jira] [Updated] (YARN-1079) Fix progress bar for long-lived services in YARN

2016-02-23 Thread Vinod Kumar Vavilapalli (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1079?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kumar Vavilapalli updated YARN-1079: -- Issue Type: Sub-task (was: Bug) Parent: YARN-4724 > Fix progress bar

[jira] [Updated] (YARN-1040) De-link container life cycle from the process and add ability to execute multiple processes in the same long-lived container

2016-02-23 Thread Vinod Kumar Vavilapalli (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1040?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kumar Vavilapalli updated YARN-1040: -- Parent Issue: YARN-4726 (was: YARN-4692) > De-link container life cycle from

[jira] [Updated] (YARN-3417) AM to be able to exit with a request saying "restart me with these (possibly updated) resource requirements"

2016-02-23 Thread Vinod Kumar Vavilapalli (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3417?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kumar Vavilapalli updated YARN-3417: -- Parent Issue: YARN-4726 (was: YARN-4692) > AM to be able to exit with a request

[jira] [Created] (YARN-4726) [Umbrella] Allocation reuse for application upgrades

2016-02-23 Thread Vinod Kumar Vavilapalli (JIRA)
Vinod Kumar Vavilapalli created YARN-4726: - Summary: [Umbrella] Allocation reuse for application upgrades Key: YARN-4726 URL: https://issues.apache.org/jira/browse/YARN-4726 Project: Hadoop

[jira] [Commented] (YARN-3998) Add retry-times to let NM re-launch container when it fails to run

2016-02-23 Thread Vinod Kumar Vavilapalli (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15159768#comment-15159768 ] Vinod Kumar Vavilapalli commented on YARN-3998: --- Making this a sub-task of YARN-4725 where we

[jira] [Updated] (YARN-3998) Add retry-times to let NM re-launch container when it fails to run

2016-02-23 Thread Vinod Kumar Vavilapalli (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3998?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kumar Vavilapalli updated YARN-3998: -- Issue Type: Sub-task (was: New Feature) Parent: YARN-4725 > Add

[jira] [Updated] (YARN-4725) [Umbrella] Auto-restart of containers

2016-02-23 Thread Vinod Kumar Vavilapalli (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4725?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kumar Vavilapalli updated YARN-4725: -- Description: See overview doc at YARN-4692, copying the sub-section to track all

[jira] [Created] (YARN-4725) [Umbrella] Auto-restart of containers

2016-02-23 Thread Vinod Kumar Vavilapalli (JIRA)
Vinod Kumar Vavilapalli created YARN-4725: - Summary: [Umbrella] Auto-restart of containers Key: YARN-4725 URL: https://issues.apache.org/jira/browse/YARN-4725 Project: Hadoop YARN

[jira] [Updated] (YARN-1039) Add parameter for YARN resource requests to indicate "long lived"

2016-02-23 Thread Vinod Kumar Vavilapalli (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1039?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kumar Vavilapalli updated YARN-1039: -- Parent Issue: YARN-4724 (was: YARN-4692) > Add parameter for YARN resource

[jira] [Created] (YARN-4724) [Umbrella] Recognizing services: Special handling of preemption, container reservations etc.

2016-02-23 Thread Vinod Kumar Vavilapalli (JIRA)
Vinod Kumar Vavilapalli created YARN-4724: - Summary: [Umbrella] Recognizing services: Special handling of preemption, container reservations etc. Key: YARN-4724 URL:

[jira] [Updated] (YARN-4701) When task logs are not available, port 8041 is referenced instead of port 8042

2016-02-23 Thread Haibo Chen (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4701?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haibo Chen updated YARN-4701: - Attachment: yarn4701.004.patch A unit test added to verify the correct http port is displayed when log

[jira] [Commented] (YARN-4511) Create common scheduling policy for resource over-subscription

2016-02-23 Thread Inigo Goiri (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15159748#comment-15159748 ] Inigo Goiri commented on YARN-4511: --- Thanks [~leftnoteasy], adding dependency on YARN-4718 and once

[jira] [Updated] (YARN-3417) AM to be able to exit with a request saying "restart me with these (possibly updated) resource requirements"

2016-02-23 Thread Vinod Kumar Vavilapalli (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3417?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kumar Vavilapalli updated YARN-3417: -- Issue Type: Sub-task (was: New Feature) Parent: YARN-4692 > AM to be

[jira] [Commented] (YARN-4470) [Umbrella] Application Master in-place upgrade

2016-02-23 Thread Vinod Kumar Vavilapalli (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4470?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15159735#comment-15159735 ] Vinod Kumar Vavilapalli commented on YARN-4470: --- Moved this to be a part of YARN-4692 given

[jira] [Updated] (YARN-4470) [Umbrella] Application Master in-place upgrade

2016-02-23 Thread Vinod Kumar Vavilapalli (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4470?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kumar Vavilapalli updated YARN-4470: -- Summary: [Umbrella] Application Master in-place upgrade (was: Application

[jira] [Updated] (YARN-1040) De-link container life cycle from the process and add ability to execute multiple processes in the same long-lived container

2016-02-23 Thread Vinod Kumar Vavilapalli (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1040?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kumar Vavilapalli updated YARN-1040: -- Parent Issue: YARN-4692 (was: YARN-896) > De-link container life cycle from the

[jira] [Commented] (YARN-4311) Removing nodes from include and exclude lists will not remove them from decommissioned nodes list

2016-02-23 Thread Hadoop QA (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15159729#comment-15159729 ] Hadoop QA commented on YARN-4311: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem ||

[jira] [Commented] (YARN-1040) De-link container life cycle from the process and add ability to execute multiple processes in the same long-lived container

2016-02-23 Thread Vinod Kumar Vavilapalli (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15159730#comment-15159730 ] Vinod Kumar Vavilapalli commented on YARN-1040: --- Moved this to be a sub-task of YARN-4692

[jira] [Commented] (YARN-1039) Add parameter for YARN resource requests to indicate "long lived"

2016-02-23 Thread Vinod Kumar Vavilapalli (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15159727#comment-15159727 ] Vinod Kumar Vavilapalli commented on YARN-1039: --- Moved this to be a sub-task of YARN-4692

[jira] [Updated] (YARN-1039) Add parameter for YARN resource requests to indicate "long lived"

2016-02-23 Thread Vinod Kumar Vavilapalli (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1039?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kumar Vavilapalli updated YARN-1039: -- Parent Issue: YARN-4692 (was: YARN-896) > Add parameter for YARN resource

[jira] [Commented] (YARN-4692) [Umbrella] Simplified and first-class support for services in YARN

2016-02-23 Thread Vinod Kumar Vavilapalli (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15159726#comment-15159726 ] Vinod Kumar Vavilapalli commented on YARN-4692: --- Tx for starting the discussions everyone.

[jira] [Commented] (YARN-4716) TimelineClient to implement Flushable; propagate to writer

2016-02-23 Thread Li Lu (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4716?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15159717#comment-15159717 ] Li Lu commented on YARN-4716: - bq. I'm expecting the reader to follow its normal refresh/load cycle. Cool.

[jira] [Assigned] (YARN-2889) Limit in the number of queueable container requests per AM

2016-02-23 Thread Arun Suresh (JIRA)
[ https://issues.apache.org/jira/browse/YARN-2889?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun Suresh reassigned YARN-2889: - Assignee: Arun Suresh > Limit in the number of queueable container requests per AM >

[jira] [Assigned] (YARN-2886) Estimating waiting time in NM container queues

2016-02-23 Thread Konstantinos Karanasos (JIRA)
[ https://issues.apache.org/jira/browse/YARN-2886?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Konstantinos Karanasos reassigned YARN-2886: Assignee: Konstantinos Karanasos > Estimating waiting time in NM container

[jira] [Commented] (YARN-4511) Create common scheduling policy for resource over-subscription

2016-02-23 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15159705#comment-15159705 ] Wangda Tan commented on YARN-4511: -- Hi [~elgoiri] As you mentioned in

[jira] [Updated] (YARN-4511) Create common scheduling policy for resource over-subscription

2016-02-23 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4511?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-4511: - Assignee: Inigo Goiri (was: Wangda Tan) > Create common scheduling policy for resource over-subscription

[jira] [Updated] (YARN-4108) CapacityScheduler: Improve preemption to preempt only those containers that would satisfy the incoming request

2016-02-23 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-4108: - Attachment: YARN-4108.4.patch Rebased to latest trunk. > CapacityScheduler: Improve preemption to preempt

[jira] [Commented] (YARN-4715) Add support to read resource types from a config file

2016-02-23 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4715?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15159667#comment-15159667 ] Wangda Tan commented on YARN-4715: -- Hi [~vvasudev], Thanks for working on the patch, some comments: -

[jira] [Commented] (YARN-2046) Out of band heartbeats are sent only on container kill and possibly too early

2016-02-23 Thread Ming Ma (JIRA)
[ https://issues.apache.org/jira/browse/YARN-2046?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15159663#comment-15159663 ] Ming Ma commented on YARN-2046: --- Thanks [~jlowe] and [~xgong]! > Out of band heartbeats are sent only on

[jira] [Commented] (YARN-4108) CapacityScheduler: Improve preemption to preempt only those containers that would satisfy the incoming request

2016-02-23 Thread Hadoop QA (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15159648#comment-15159648 ] Hadoop QA commented on YARN-4108: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem ||

[jira] [Commented] (YARN-2046) Out of band heartbeats are sent only on container kill and possibly too early

2016-02-23 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/YARN-2046?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15159646#comment-15159646 ] Hudson commented on YARN-2046: -- FAILURE: Integrated in Hadoop-trunk-Commit #9354 (See

[jira] [Commented] (YARN-2046) Out of band heartbeats are sent only on container kill and possibly too early

2016-02-23 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-2046?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15159624#comment-15159624 ] Jason Lowe commented on YARN-2046: -- +1 for the 2.7 and 2.6 patches as well. Committing this. > Out of

[jira] [Commented] (YARN-4697) NM aggregation thread pool is not bound by limits

2016-02-23 Thread Hadoop QA (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15159563#comment-15159563 ] Hadoop QA commented on YARN-4697: - | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem ||

[jira] [Commented] (YARN-3607) Allow users to choose between failing the daemons vs failing the apps/containers

2016-02-23 Thread Ray Chiang (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15159488#comment-15159488 ] Ray Chiang commented on YARN-3607: -- Two suggestions: 1) Since this is a setting that affects all daemons,

[jira] [Commented] (YARN-4701) When task logs are not available, port 8041 is referenced instead of port 8042

2016-02-23 Thread Robert Kanter (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4701?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15159479#comment-15159479 ] Robert Kanter commented on YARN-4701: - Approach looks good. Though can you add a unit test for this

[jira] [Commented] (YARN-4697) NM aggregation thread pool is not bound by limits

2016-02-23 Thread Robert Kanter (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15159446#comment-15159446 ] Robert Kanter commented on YARN-4697: - I guess I just missed that build and hadn't refreshed. I've

[jira] [Commented] (YARN-4697) NM aggregation thread pool is not bound by limits

2016-02-23 Thread Robert Kanter (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15159437#comment-15159437 ] Robert Kanter commented on YARN-4697: - LGTM. +1 pending Jenkins (which I just kicked off because it

[jira] [Assigned] (YARN-3607) Allow users to choose between failing the daemons vs failing the apps/containers

2016-02-23 Thread Ray Chiang (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3607?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ray Chiang reassigned YARN-3607: Assignee: Ray Chiang > Allow users to choose between failing the daemons vs failing the >

[jira] [Commented] (YARN-4697) NM aggregation thread pool is not bound by limits

2016-02-23 Thread Hadoop QA (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15159411#comment-15159411 ] Hadoop QA commented on YARN-4697: - | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem ||

[jira] [Commented] (YARN-4722) AsyncDispatcher logs redundant event queue sizes

2016-02-23 Thread Sangjin Lee (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15159260#comment-15159260 ] Sangjin Lee commented on YARN-4722: --- +1 pending jenkins. > AsyncDispatcher logs redundant event queue

[jira] [Commented] (YARN-4722) AsyncDispatcher logs redundant event queue sizes

2016-02-23 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15159258#comment-15159258 ] Jason Lowe commented on YARN-4722: -- This is indeed accessed from multiple threads. It just so happens in

[jira] [Updated] (YARN-4697) NM aggregation thread pool is not bound by limits

2016-02-23 Thread Haibo Chen (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4697?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haibo Chen updated YARN-4697: - Attachment: yarn4697.004.patch New unit tests added for invalid values. Other comments addressed as well

[jira] [Commented] (YARN-4723) NodesListManager$UnknownNodeId ClassCastException

2016-02-23 Thread Kuhu Shukla (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15159209#comment-15159209 ] Kuhu Shukla commented on YARN-4723: --- The updatedNodes in NodeReport is picking up the UnknownNodeIds

[jira] [Commented] (YARN-4723) NodesListManager$UnknownNodeId ClassCastException

2016-02-23 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15159174#comment-15159174 ] Jason Lowe commented on YARN-4723: -- This appears to be related to the change in YARN-3102. [~kshukla]

[jira] [Created] (YARN-4723) NodesListManager$UnknownNodeId ClassCastException

2016-02-23 Thread Jason Lowe (JIRA)
Jason Lowe created YARN-4723: Summary: NodesListManager$UnknownNodeId ClassCastException Key: YARN-4723 URL: https://issues.apache.org/jira/browse/YARN-4723 Project: Hadoop YARN Issue Type: Bug

[jira] [Commented] (YARN-4701) When task logs are not available, port 8041 is referenced instead of port 8042

2016-02-23 Thread Haibo Chen (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4701?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15159168#comment-15159168 ] Haibo Chen commented on YARN-4701: -- The asf warning is unrelated to the patch > When task logs are not

[jira] [Commented] (YARN-4705) ATS 1.5 parse pipeline to consider handling open() events recoverably

2016-02-23 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15159130#comment-15159130 ] Jason Lowe commented on YARN-4705: -- Agreed that rawfile:// could prove useful outside of this scenario and

[jira] [Commented] (YARN-4705) ATS 1.5 parse pipeline to consider handling open() events recoverably

2016-02-23 Thread Steve Loughran (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15159115#comment-15159115 ] Steve Loughran commented on YARN-4705: -- yeah, thought of that... tried patch file:// but it didn't

[jira] [Commented] (YARN-4705) ATS 1.5 parse pipeline to consider handling open() events recoverably

2016-02-23 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15159102#comment-15159102 ] Jason Lowe commented on YARN-4705: -- Another workaround could be a scheme to get a raw local filesystem

[jira] [Updated] (YARN-4722) AsyncDispatcher logs redundant event queue sizes

2016-02-23 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4722?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated YARN-4722: - Attachment: YARN-4722.001.patch Patch that avoids logging the event queue size if it hasn't changed since

[jira] [Created] (YARN-4722) AsyncDispatcher logs redundant event queue sizes

2016-02-23 Thread Jason Lowe (JIRA)
Jason Lowe created YARN-4722: Summary: AsyncDispatcher logs redundant event queue sizes Key: YARN-4722 URL: https://issues.apache.org/jira/browse/YARN-4722 Project: Hadoop YARN Issue Type: Bug

[jira] [Updated] (YARN-4634) Scheduler UI/Metrics need to consider cases like non-queue label mappings

2016-02-23 Thread Sunil G (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4634?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sunil G updated YARN-4634: -- Attachment: 0002-YARN-4634.patch Thank you [~leftnoteasy] Addressing this case in this new patch. I tested few

[jira] [Commented] (YARN-4694) Document ATS v1.5

2016-02-23 Thread Steve Loughran (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15159015#comment-15159015 ] Steve Loughran commented on YARN-4694: -- As discussed in YARN-4705, this doc must cover which

[jira] [Commented] (YARN-4705) ATS 1.5 parse pipeline to consider handling open() events recoverably

2016-02-23 Thread Steve Loughran (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15159012#comment-15159012 ] Steve Loughran commented on YARN-4705: -- OK. so HDFS has guaranteed flush but no guarantees on modtime

[jira] [Commented] (YARN-4720) Skip unnecessary NN operations in log aggregation

2016-02-23 Thread Ming Ma (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15158999#comment-15158999 ] Ming Ma commented on YARN-4720: --- Thanks [~hex108] for the patch. It addresses the first scenario of long

[jira] [Updated] (YARN-4311) Removing nodes from include and exclude lists will not remove them from decommissioned nodes list

2016-02-23 Thread Kuhu Shukla (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4311?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kuhu Shukla updated YARN-4311: -- Attachment: YARN-4311-v11.patch Rebasing patch after YARN-3223. Requesting [~jlowe], [~templedf] for

[jira] [Commented] (YARN-4624) NPE in PartitionQueueCapacitiesInfo while accessing Schduler UI

2016-02-23 Thread Brahma Reddy Battula (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4624?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15158996#comment-15158996 ] Brahma Reddy Battula commented on YARN-4624: Uploaded the patch, kindly review. > NPE in

[jira] [Updated] (YARN-4624) NPE in PartitionQueueCapacitiesInfo while accessing Schduler UI

2016-02-23 Thread Brahma Reddy Battula (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4624?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brahma Reddy Battula updated YARN-4624: --- Attachment: YARN-4624-003.patch > NPE in PartitionQueueCapacitiesInfo while accessing

[jira] [Commented] (YARN-4705) ATS 1.5 parse pipeline to consider handling open() events recoverably

2016-02-23 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15158913#comment-15158913 ] Jason Lowe commented on YARN-4705: -- bq. One RPC call to check the file size shouldn't be a big problem in

[jira] [Commented] (YARN-4716) TimelineClient to implement Flushable; propagate to writer

2016-02-23 Thread Steve Loughran (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4716?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15158882#comment-15158882 ] Steve Loughran commented on YARN-4716: -- ..no, I'm expecting the reader to follow its normal

[jira] [Commented] (YARN-4720) Skip unnecessary NN operations in log aggregation

2016-02-23 Thread Jun Gong (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15158869#comment-15158869 ] Jun Gong commented on YARN-4720: Thanks [~mingma] for reporting. I just attached a patch for it. > Skip

[jira] [Updated] (YARN-4720) Skip unnecessary NN operations in log aggregation

2016-02-23 Thread Jun Gong (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4720?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jun Gong updated YARN-4720: --- Attachment: YARN-4720.01.patch > Skip unnecessary NN operations in log aggregation >

[jira] [Assigned] (YARN-4720) Skip unnecessary NN operations in log aggregation

2016-02-23 Thread Jun Gong (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4720?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jun Gong reassigned YARN-4720: -- Assignee: Jun Gong > Skip unnecessary NN operations in log aggregation >

[jira] [Created] (YARN-4721) RM to try to auth with HDFS on startup, retry with max diagnostics on failure

2016-02-23 Thread Steve Loughran (JIRA)
Steve Loughran created YARN-4721: Summary: RM to try to auth with HDFS on startup, retry with max diagnostics on failure Key: YARN-4721 URL: https://issues.apache.org/jira/browse/YARN-4721 Project:

[jira] [Commented] (YARN-3223) Resource update during NM graceful decommission

2016-02-23 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15158754#comment-15158754 ] Hudson commented on YARN-3223: -- FAILURE: Integrated in Hadoop-trunk-Commit #9350 (See

[jira] [Assigned] (YARN-4677) RMNodeResourceUpdateEvent update from scheduler can lead to race condition

2016-02-23 Thread Junping Du (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4677?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du reassigned YARN-4677: Assignee: Junping Du > RMNodeResourceUpdateEvent update from scheduler can lead to race condition >

[jira] [Commented] (YARN-4648) Move preemption related tests from TestFairScheduler to TestFairSchedulerPreemption

2016-02-23 Thread Kai Sasaki (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15158744#comment-15158744 ] Kai Sasaki commented on YARN-4648: -- [~ozawa] Thanks for reviewing! > Move preemption related tests from

[jira] [Commented] (YARN-4648) Move preemption related tests from TestFairScheduler to TestFairSchedulerPreemption

2016-02-23 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15158732#comment-15158732 ] Hudson commented on YARN-4648: -- FAILURE: Integrated in Hadoop-trunk-Commit #9349 (See

[jira] [Commented] (YARN-4648) Move preemption related tests from TestFairScheduler to TestFairSchedulerPreemption

2016-02-23 Thread Tsuyoshi Ozawa (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15158717#comment-15158717 ] Tsuyoshi Ozawa commented on YARN-4648: -- Note: The failures of TestClientRMTokens and

[jira] [Commented] (YARN-4648) Move preemption related tests from TestFairScheduler to TestFairSchedulerPreemption

2016-02-23 Thread Tsuyoshi Ozawa (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15158713#comment-15158713 ] Tsuyoshi Ozawa commented on YARN-4648: -- +1, checking this in. > Move preemption related tests from

[jira] [Commented] (YARN-4705) ATS 1.5 parse pipeline to consider handling open() events recoverably

2016-02-23 Thread Steve Loughran (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15158680#comment-15158680 ] Steve Loughran commented on YARN-4705: -- That's what confuses me. After a scan of an empty file/failed

[jira] [Commented] (YARN-4651) movetoqueue option does not documented in 'YARN Commands'

2016-02-23 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4651?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15158662#comment-15158662 ] Hudson commented on YARN-4651: -- FAILURE: Integrated in Hadoop-trunk-Commit #9347 (See

[jira] [Updated] (YARN-4651) movetoqueue option does not documented in 'YARN Commands'

2016-02-23 Thread Rohith Sharma K S (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4651?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohith Sharma K S updated YARN-4651: Labels: documentation (was: ) > movetoqueue option does not documented in 'YARN Commands' >
