[jira] [Updated] (YARN-9320) ConcurrentModificationException in capacity scheduler (updateQueueStatistics)

2019-02-20 Thread Sergey Shelukhin (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9320?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated YARN-9320:
---
Description: 
We are running a snapshot of the 2.9 branch; unfortunately, I'm not sure off
the top of my head which release it corresponds to. I can look it up if that's
important, but I haven't found an existing report of this bug, so I suspect it
also affects current versions unless it was fixed by accident.

If it helps, the cluster is very large, so we expect node failures/restarts
frequently. I've seen this happen a couple of times (so it's not really
"fatal") among a bunch of audit logging for "OPERATION=replaceLabelsOnNode"
calls:
{noformat}
2019-02-20 13:12:52,785 FATAL [SchedulerEventDispatcher:Event Processor] 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CSQueueUtils: 
queueCapacities.getNodePartitionsSet() changed 
java.util.ConcurrentModificationException
at java.util.HashMap$HashIterator.nextNode(HashMap.java:1437)
at java.util.HashMap$KeyIterator.next(HashMap.java:1461)
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CSQueueUtils.updateQueueStatistics(CSQueueUtils.java:303)
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.updateClusterResource(LeafQueue.java:1879)
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.updateClusterResource(ParentQueue.java:897)
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.updateNodeLabelsAndQueueResource(CapacityScheduler.java:1775)
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:1633)
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:154)
at 
org.apache.hadoop.yarn.event.EventDispatcher$EventProcessor.run(EventDispatcher.java:67)

{noformat}
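For reference, HashMap iterators are fail-fast: any structural modification
made while an iteration is in flight, whether from another thread (as here,
where the scheduler event thread walks the partition set while label updates
mutate it) or from the same thread, trips the modCount check seen in the
HashMap$HashIterator.nextNode frame above. A minimal, single-threaded stand-in
(not the YARN code) that raises the same exception:
{code}
import java.util.HashMap;
import java.util.Map;

public class CmeDemo {
  public static void main(String[] args) {
    Map<String, Integer> partitions = new HashMap<>();
    partitions.put("", 0);    // the default (empty) partition label
    partitions.put("gpu", 0);
    for (String p : partitions.keySet()) {
      // Structural modification while the iterator is live: the next call to
      // Iterator.next() fails fast with ConcurrentModificationException.
      partitions.put("label-" + p, 1);
    }
  }
}
{code}
A common mitigation is to iterate over a snapshot (e.g. new
HashSet<>(queueCapacities.getNodePartitionsSet())) or to hold the queue's lock
on both the read and the update paths; which of the two fits CSQueueUtils is
for the fix to decide.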

  was:
We are running a snapshot of the 2.9 branch; unfortunately, I'm not sure off
the top of my head which release it corresponds to. I can look it up if that's
important, but I haven't found an existing report of this bug, so I suspect it
also affects current versions unless it was fixed by accident.

If it helps, the cluster is very large (1000s of NMs), so we expect node
failures/restarts frequently. I've seen this happen a couple of times (so it's
not really "fatal") among a bunch of audit logging for
"OPERATION=replaceLabelsOnNode" calls:
{noformat}
2019-02-20 13:12:52,785 FATAL [SchedulerEventDispatcher:Event Processor] 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CSQueueUtils: 
queueCapacities.getNodePartitionsSet() changed 
java.util.ConcurrentModificationException
at java.util.HashMap$HashIterator.nextNode(HashMap.java:1437)
at java.util.HashMap$KeyIterator.next(HashMap.java:1461)
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CSQueueUtils.updateQueueStatistics(CSQueueUtils.java:303)
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.updateClusterResource(LeafQueue.java:1879)
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.updateClusterResource(ParentQueue.java:897)
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.updateNodeLabelsAndQueueResource(CapacityScheduler.java:1775)
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:1633)
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:154)
at 
org.apache.hadoop.yarn.event.EventDispatcher$EventProcessor.run(EventDispatcher.java:67)

{noformat}


> ConcurrentModificationException in capacity scheduler (updateQueueStatistics)
> -
>
> Key: YARN-9320
> URL: https://issues.apache.org/jira/browse/YARN-9320
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.9.3
>Reporter: Sergey Shelukhin
>Priority: Critical
>
> We are running a snapshot of the 2.9 branch; unfortunately, I'm not sure off
> the top of my head which release it corresponds to. I can look it up if
> that's important, but I haven't found an existing report of this bug, so I
> suspect it also affects current versions unless it was fixed by accident.
> If it helps, the cluster is very large, so we expect node failures/restarts
> frequently. I've seen this happen a couple of times (so it's not really
> "fatal") among a bunch of audit logging for "OPERATION=replaceLabelsOnNode"
> calls:
> {noformat}
> 2019-02-20 13:12:52,785 FATAL [SchedulerEventDispatcher:Event Processor] 
> 

[jira] [Updated] (YARN-9320) ConcurrentModificationException in capacity scheduler (updateQueueStatistics)

2019-02-20 Thread Sergey Shelukhin (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9320?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated YARN-9320:
---
Description: 
We are running a snapshot of the 2.9 branch; unfortunately, I'm not sure off
the top of my head which release it corresponds to. I can look it up if that's
important, but I haven't found an existing report of this bug, so I suspect it
also affects current versions unless it was fixed by accident.

If it helps, the cluster is very large (1000s of NMs), so we expect node
failures/restarts frequently. I've seen this happen a couple of times (so it's
not really "fatal") among a bunch of audit logging for
"OPERATION=replaceLabelsOnNode" calls:
{noformat}
2019-02-20 13:12:52,785 FATAL [SchedulerEventDispatcher:Event Processor] 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CSQueueUtils: 
queueCapacities.getNodePartitionsSet() changed 
java.util.ConcurrentModificationException
at java.util.HashMap$HashIterator.nextNode(HashMap.java:1437)
at java.util.HashMap$KeyIterator.next(HashMap.java:1461)
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CSQueueUtils.updateQueueStatistics(CSQueueUtils.java:303)
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.updateClusterResource(LeafQueue.java:1879)
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.updateClusterResource(ParentQueue.java:897)
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.updateNodeLabelsAndQueueResource(CapacityScheduler.java:1775)
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:1633)
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:154)
at 
org.apache.hadoop.yarn.event.EventDispatcher$EventProcessor.run(EventDispatcher.java:67)

{noformat}

  was:
We are running a snapshot of the 2.9 branch; unfortunately, I'm not sure off
the top of my head which release it corresponds to. I can look it up if that's
important, but I haven't found an existing report of this bug, so I suspect it
also affects current versions unless it was fixed by accident.

If it helps, the cluster is very large (1000s of NMs), so we expect node
failures/restarts frequently; also, some apps may have misconfigured node
labels specified, so node-label-related code may hit corner cases. Still, this
shouldn't happen because of a user-supplied parameter.

{noformat}
2019-02-20 13:12:52,785 FATAL [SchedulerEventDispatcher:Event Processor] 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CSQueueUtils: 
queueCapacities.getNodePartitionsSet() changed 
java.util.ConcurrentModificationException
at java.util.HashMap$HashIterator.nextNode(HashMap.java:1437)
at java.util.HashMap$KeyIterator.next(HashMap.java:1461)
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CSQueueUtils.updateQueueStatistics(CSQueueUtils.java:303)
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.updateClusterResource(LeafQueue.java:1879)
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.updateClusterResource(ParentQueue.java:897)
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.updateNodeLabelsAndQueueResource(CapacityScheduler.java:1775)
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:1633)
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:154)
at 
org.apache.hadoop.yarn.event.EventDispatcher$EventProcessor.run(EventDispatcher.java:67)

{noformat}


> ConcurrentModificationException in capacity scheduler (updateQueueStatistics)
> -
>
> Key: YARN-9320
> URL: https://issues.apache.org/jira/browse/YARN-9320
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.9.3
>Reporter: Sergey Shelukhin
>Priority: Critical
>
> We are running a snapshot of the 2.9 branch; unfortunately, I'm not sure off
> the top of my head which release it corresponds to. I can look it up if
> that's important, but I haven't found an existing report of this bug, so I
> suspect it also affects current versions unless it was fixed by accident.
> If it helps, the cluster is very large (1000s of NMs), so we expect node
> failures/restarts frequently. I've seen this happen a couple of times (so
> it's not really "fatal") among a bunch of audit logging for
> "OPERATION=replaceLabelsOnNode" calls:
> {noformat}
> 2019-02-20 13:12:52,785 FATAL [SchedulerEventDispatcher:Event 

[jira] [Updated] (YARN-9320) ConcurrentModificationException in capacity scheduler (updateQueueStatistics)

2019-02-20 Thread Sergey Shelukhin (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9320?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated YARN-9320:
---
Description: 
We are running a snapshot of the 2.9 branch; unfortunately, I'm not sure off
the top of my head which release it corresponds to. I can look it up if that's
important, but I haven't found an existing report of this bug, so I suspect it
also affects current versions unless it was fixed by accident.

If it helps, the cluster is very large (1000s of NMs), so we expect node
failures frequently; also, some apps may have misconfigured node labels
specified, so node-label-related code may hit corner cases. Still, this
shouldn't happen because of a user-supplied parameter.

{noformat}
2019-02-20 13:12:52,785 FATAL [SchedulerEventDispatcher:Event Processor] 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CSQueueUtils: 
queueCapacities.getNodePartitionsSet() changed 
java.util.ConcurrentModificationException
at java.util.HashMap$HashIterator.nextNode(HashMap.java:1437)
at java.util.HashMap$KeyIterator.next(HashMap.java:1461)
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CSQueueUtils.updateQueueStatistics(CSQueueUtils.java:303)
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.updateClusterResource(LeafQueue.java:1879)
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.updateClusterResource(ParentQueue.java:897)
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.updateNodeLabelsAndQueueResource(CapacityScheduler.java:1775)
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:1633)
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:154)
at 
org.apache.hadoop.yarn.event.EventDispatcher$EventProcessor.run(EventDispatcher.java:67)

{noformat}

  was:
We are running a snapshot of the 2.9 branch; unfortunately, I'm not sure off
the top of my head which release it corresponds to. I can look it up if that's
important, but I haven't found an existing report of this bug, so I suspect it
also affects current versions unless it was fixed by accident.

If it helps, the cluster is very large (1000s of NMs), so we expect node
failures frequently; also, some apps may have misconfigured node labels
specified, so node-label-related code may hit corner cases. Still, this
shouldn't happen because of a user-supplied parameter.

{noformat}
2019-02-20 13:12:52,785 FATAL [SchedulerEventDispatcher:Event Processor] 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CSQueueUtils: 
queueCapacities.getNodePartitionsSet() changed 
java.util.ConcurrentModificationException
at java.util.HashMap$HashIterator.nextNode(HashMap.java:1437)
at java.util.HashMap$KeyIterator.next(HashMap.java:1461)
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CSQueueUtils.updateQueueStatistics(CSQueueUtils.java:303)
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.updateClusterResource(LeafQueue.java:1879)
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.updateClusterResource(ParentQueue.java:897)
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.updateNodeLabelsAndQueueResource(CapacityScheduler.java:1775)
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:1633)
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:154)
at 
org.apache.hadoop.yarn.event.EventDispatcher$EventProcessor.run(EventDispatcher.java:67)

{noformat}


> ConcurrentModificationException in capacity scheduler (updateQueueStatistics)
> -
>
> Key: YARN-9320
> URL: https://issues.apache.org/jira/browse/YARN-9320
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.9.3
>Reporter: Sergey Shelukhin
>Priority: Critical
>
> We are running a snapshot of the 2.9 branch; unfortunately, I'm not sure off
> the top of my head which release it corresponds to. I can look it up if
> that's important, but I haven't found an existing report of this bug, so I
> suspect it also affects current versions unless it was fixed by accident.
> If it helps, the cluster is very large (1000s of NMs), so we expect node
> failures frequently; also, some apps may have misconfigured node labels
> specified, so node-label-related code may hit corner cases. Still, this
> shouldn't happen because of a user-supplied parameter.
> {noformat}
> 

[jira] [Updated] (YARN-9320) ConcurrentModificationException in capacity scheduler (updateQueueStatistics)

2019-02-20 Thread Sergey Shelukhin (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9320?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated YARN-9320:
---
Description: 
We are running a snapshot of the 2.9 branch; unfortunately, I'm not sure off
the top of my head which release it corresponds to. I can look it up if that's
important, but I haven't found an existing report of this bug, so I suspect it
also affects current versions unless it was fixed by accident.

If it helps, the cluster is very large (1000s of NMs), so we expect node
failures/restarts frequently; also, some apps may have misconfigured node
labels specified, so node-label-related code may hit corner cases. Still, this
shouldn't happen because of a user-supplied parameter.

{noformat}
2019-02-20 13:12:52,785 FATAL [SchedulerEventDispatcher:Event Processor] 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CSQueueUtils: 
queueCapacities.getNodePartitionsSet() changed 
java.util.ConcurrentModificationException
at java.util.HashMap$HashIterator.nextNode(HashMap.java:1437)
at java.util.HashMap$KeyIterator.next(HashMap.java:1461)
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CSQueueUtils.updateQueueStatistics(CSQueueUtils.java:303)
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.updateClusterResource(LeafQueue.java:1879)
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.updateClusterResource(ParentQueue.java:897)
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.updateNodeLabelsAndQueueResource(CapacityScheduler.java:1775)
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:1633)
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:154)
at 
org.apache.hadoop.yarn.event.EventDispatcher$EventProcessor.run(EventDispatcher.java:67)

{noformat}

  was:
We are running a snapshot of the 2.9 branch; unfortunately, I'm not sure off
the top of my head which release it corresponds to. I can look it up if that's
important, but I haven't found an existing report of this bug, so I suspect it
also affects current versions unless it was fixed by accident.

If it helps, the cluster is very large (1000s of NMs), so we expect node
failures frequently; also, some apps may have misconfigured node labels
specified, so node-label-related code may hit corner cases. Still, this
shouldn't happen because of a user-supplied parameter.

{noformat}
2019-02-20 13:12:52,785 FATAL [SchedulerEventDispatcher:Event Processor] 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CSQueueUtils: 
queueCapacities.getNodePartitionsSet() changed 
java.util.ConcurrentModificationException
at java.util.HashMap$HashIterator.nextNode(HashMap.java:1437)
at java.util.HashMap$KeyIterator.next(HashMap.java:1461)
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CSQueueUtils.updateQueueStatistics(CSQueueUtils.java:303)
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.updateClusterResource(LeafQueue.java:1879)
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.updateClusterResource(ParentQueue.java:897)
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.updateNodeLabelsAndQueueResource(CapacityScheduler.java:1775)
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:1633)
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:154)
at 
org.apache.hadoop.yarn.event.EventDispatcher$EventProcessor.run(EventDispatcher.java:67)

{noformat}


> ConcurrentModificationException in capacity scheduler (updateQueueStatistics)
> -
>
> Key: YARN-9320
> URL: https://issues.apache.org/jira/browse/YARN-9320
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.9.3
>Reporter: Sergey Shelukhin
>Priority: Critical
>
> We are running a snapshot of the 2.9 branch; unfortunately, I'm not sure off
> the top of my head which release it corresponds to. I can look it up if
> that's important, but I haven't found an existing report of this bug, so I
> suspect it also affects current versions unless it was fixed by accident.
> If it helps, the cluster is very large (1000s of NMs), so we expect node
> failures/restarts frequently; also, some apps may have misconfigured node
> labels specified, so node-label-related code may hit corner cases. Still,
> this shouldn't happen because of a user-supplied parameter.
> 

[jira] [Updated] (YARN-9320) ConcurrentModificationException in capacity scheduler (updateQueueStatistics)

2019-02-20 Thread Sergey Shelukhin (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9320?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated YARN-9320:
---
Description: 
We are running a snapshot of the 2.9 branch; unfortunately, I'm not sure off
the top of my head which release it corresponds to. I can look it up if that's
important, but I haven't found an existing report of this bug, so I suspect it
also affects current versions unless it was fixed by accident.

If it helps, the cluster is very large (1000s of NMs), so we expect node
failures frequently; also, some apps may have misconfigured node labels
specified, so node-label-related code may hit corner cases. Still, this
shouldn't happen because of a user-supplied parameter.

{noformat}
2019-02-20 13:12:52,785 FATAL [SchedulerEventDispatcher:Event Processor] 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CSQueueUtils: 
queueCapacities.getNodePartitionsSet() changed 
java.util.ConcurrentModificationException
at java.util.HashMap$HashIterator.nextNode(HashMap.java:1437)
at java.util.HashMap$KeyIterator.next(HashMap.java:1461)
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CSQueueUtils.updateQueueStatistics(CSQueueUtils.java:303)
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.updateClusterResource(LeafQueue.java:1879)
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.updateClusterResource(ParentQueue.java:897)
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.updateNodeLabelsAndQueueResource(CapacityScheduler.java:1775)
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:1633)
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:154)
at 
org.apache.hadoop.yarn.event.EventDispatcher$EventProcessor.run(EventDispatcher.java:67)

{noformat}

  was:
We are running a snapshot of the 2.9 branch; unfortunately, I'm not sure off
the top of my head which release it corresponds to. I can look it up if that's
important, but I haven't found an existing report of this bug, so I suspect it
also affects current versions unless it was fixed by accident.

{noformat}
2019-02-20 13:12:52,785 FATAL [SchedulerEventDispatcher:Event Processor] 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CSQueueUtils: 
queueCapacities.getNodePartitionsSet() changed 
java.util.ConcurrentModificationException
at java.util.HashMap$HashIterator.nextNode(HashMap.java:1437)
at java.util.HashMap$KeyIterator.next(HashMap.java:1461)
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CSQueueUtils.updateQueueStatistics(CSQueueUtils.java:303)
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.updateClusterResource(LeafQueue.java:1879)
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.updateClusterResource(ParentQueue.java:897)
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.updateNodeLabelsAndQueueResource(CapacityScheduler.java:1775)
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:1633)
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:154)
at 
org.apache.hadoop.yarn.event.EventDispatcher$EventProcessor.run(EventDispatcher.java:67)

{noformat}


> ConcurrentModificationException in capacity scheduler (updateQueueStatistics)
> -
>
> Key: YARN-9320
> URL: https://issues.apache.org/jira/browse/YARN-9320
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.9.3
>Reporter: Sergey Shelukhin
>Priority: Critical
>
> We are running a snapshot of the 2.9 branch; unfortunately, I'm not sure off
> the top of my head which release it corresponds to. I can look it up if
> that's important, but I haven't found an existing report of this bug, so I
> suspect it also affects current versions unless it was fixed by accident.
> If it helps, the cluster is very large (1000s of NMs), so we expect node
> failures frequently; also, some apps may have misconfigured node labels
> specified, so node-label-related code may hit corner cases. Still, this
> shouldn't happen because of a user-supplied parameter.
> {noformat}
> 2019-02-20 13:12:52,785 FATAL [SchedulerEventDispatcher:Event Processor] 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CSQueueUtils:
>  queueCapacities.getNodePartitionsSet() changed 
> java.util.ConcurrentModificationException
>   at 

[jira] [Created] (YARN-9320) ConcurrentModificationException in capacity scheduler (updateQueueStatistics)

2019-02-20 Thread Sergey Shelukhin (JIRA)
Sergey Shelukhin created YARN-9320:
--

 Summary: ConcurrentModificationException in capacity scheduler 
(updateQueueStatistics)
 Key: YARN-9320
 URL: https://issues.apache.org/jira/browse/YARN-9320
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 2.9.3
Reporter: Sergey Shelukhin


We are running a snapshot of the 2.9 branch; unfortunately, I'm not sure off
the top of my head which release it corresponds to. I can look it up if that's
important, but I haven't found an existing report of this bug, so I suspect it
also affects current versions unless it was fixed by accident.

{noformat}
2019-02-20 13:12:52,785 FATAL [SchedulerEventDispatcher:Event Processor] 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CSQueueUtils: 
queueCapacities.getNodePartitionsSet() changed 
java.util.ConcurrentModificationException
at java.util.HashMap$HashIterator.nextNode(HashMap.java:1437)
at java.util.HashMap$KeyIterator.next(HashMap.java:1461)
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CSQueueUtils.updateQueueStatistics(CSQueueUtils.java:303)
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.updateClusterResource(LeafQueue.java:1879)
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.updateClusterResource(ParentQueue.java:897)
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.updateNodeLabelsAndQueueResource(CapacityScheduler.java:1775)
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:1633)
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:154)
at 
org.apache.hadoop.yarn.event.EventDispatcher$EventProcessor.run(EventDispatcher.java:67)

{noformat}






[jira] [Commented] (YARN-8091) Revisit checkUserAccessToQueue RM REST API

2018-04-02 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-8091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16423217#comment-16423217
 ] 

Sergey Shelukhin commented on YARN-8091:


Thanks for the change! Btw, the output above is XML, but the output is really
JSON; it looks like the XML is specific to Chrome.
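For what it's worth, the RM web services pick the representation from the
request's Accept header (a browser's default Accept tends to prefer XML, which
would explain the Chrome behavior). A sketch of forcing JSON from Java; the
URL below uses the generic /ws/v1/cluster/info endpoint as a stand-in, since
the exact path of the new check is defined by the patch:
{code}
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;

public class AcceptHeaderDemo {
  public static void main(String[] args) throws Exception {
    // Stand-in endpoint; substitute the RM host and the path added by the patch.
    URL url = new URL("http://rm-host:8088/ws/v1/cluster/info");
    HttpURLConnection conn = (HttpURLConnection) url.openConnection();
    conn.setRequestProperty("Accept", "application/json"); // ask for JSON, not XML
    try (InputStream in = conn.getInputStream()) {
      in.transferTo(System.out);
    }
  }
}
{code}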

> Revisit checkUserAccessToQueue RM REST API
> --
>
> Key: YARN-8091
> URL: https://issues.apache.org/jira/browse/YARN-8091
> Project: Hadoop YARN
>  Issue Type: Task
>Reporter: Wangda Tan
>Assignee: Wangda Tan
>Priority: Critical
> Fix For: 3.2.0
>
> Attachments: YARN-8091.001.patch
>
>
> As suggested offline by [~sershe]: the current design of
> checkUserAccessToQueue mixes config-related issues (like the user not having
> access to the URL) and user-facing output (like the requested user not being
> permitted to access the queue) in the same code.






[jira] [Commented] (YARN-8091) Revisit checkUserAccessToQueue RM REST API

2018-03-29 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-8091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16420066#comment-16420066
 ] 

Sergey Shelukhin commented on YARN-8091:


The patch looks good to me, although I was only able to test on a non-secure
cluster; I don't have a kerberized one handy.
I deployed it on a cluster and can see:
{noformat}
<allowed>false</allowed>
<user>hive1</user>
<diagnostics>User=hive1 doesn't have access to queue=foo with acl-type=SUBMIT_APPLICATIONS</diagnostics>
...
<allowed>true</allowed>
<user>hive1</user>
{noformat}

> Revisit checkUserAccessToQueue RM REST API
> --
>
> Key: YARN-8091
> URL: https://issues.apache.org/jira/browse/YARN-8091
> Project: Hadoop YARN
>  Issue Type: Task
>Reporter: Wangda Tan
>Assignee: Wangda Tan
>Priority: Critical
> Attachments: YARN-8091.001.patch
>
>
> As suggested offline by [~sershe]: the current design of
> checkUserAccessToQueue mixes config-related issues (like the user not having
> access to the URL) and user-facing output (like the requested user not being
> permitted to access the queue) in the same code.






[jira] [Commented] (YARN-7523) Introduce description and version field in Service record

2018-03-15 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16401316#comment-16401316
 ] 

Sergey Shelukhin commented on YARN-7523:


I understand YS is not released yet, so technically nobody should be using
it... but for the future, is it possible to make these changes in a
backward-compatible manner, to avoid everyone having to update after every
release (or after every patch in this case, if dogfooding YS)? cc [~gsaha]

> Introduce description and version field in Service record
> -
>
> Key: YARN-7523
> URL: https://issues.apache.org/jira/browse/YARN-7523
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Gour Saha
>Assignee: Chandni Singh
>Priority: Critical
> Fix For: 3.1.0
>
> Attachments: YARN-7523.001.patch, YARN-7523.002.patch, 
> YARN-7523.003.patch, YARN-7523.004.patch
>
>
> YARN-7512 would need a version field in the Service record. It would be good
> to also introduce a description field, so service owners can capture details
> that can be displayed in the Service catalog as well.






[jira] [Created] (YARN-7582) Yarn Services - restore descriptive exception types

2017-11-29 Thread Sergey Shelukhin (JIRA)
Sergey Shelukhin created YARN-7582:
--

 Summary: Yarn Services - restore descriptive exception types
 Key: YARN-7582
 URL: https://issues.apache.org/jira/browse/YARN-7582
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Sergey Shelukhin


Slider used to throw descriptive exceptions (UnknownApp, etc.) from various
commands (e.g. destroy). It looks like YARN Services throws generic exceptions
from these (see the review in HIVE-18037).
It would be good to restore the descriptive exceptions.






[jira] [Comment Edited] (YARN-6000) Make AllocationFileLoaderService.Listener public

2017-01-09 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6000?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15812994#comment-15812994
 ] 

Sergey Shelukhin edited comment on YARN-6000 at 1/9/17 10:06 PM:
-

Thanks! As for Hive's requirements, see the code snippet above. We are using
the listener because that seems to be the only way to get the updated value
out. We just need to get the allocConf, which we use to get the queue policy
and then the queue.


was (Author: sershe):
Thanks! As for Hive requirements, see the code snippet above. We are using the 
listener because that seems to be the only way to get the updated value out. We 
just need to get the allocConf/queue

> Make AllocationFileLoaderService.Listener public
> 
>
> Key: YARN-6000
> URL: https://issues.apache.org/jira/browse/YARN-6000
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: fairscheduler, yarn
>Affects Versions: 3.0.0-alpha1
>Reporter: Tao Jie
>Assignee: Tao Jie
> Fix For: 3.0.0-alpha2
>
> Attachments: YARN-6000.001.patch
>
>
> We removed the public modifier of {{AllocationFileLoaderService.Listener}} in
> YARN-4997 since it triggered a findbugs warning. However, this breaks Hive
> code in {{FairSchedulerShim}}.
> {code}
> AllocationFileLoaderService allocsLoader = new AllocationFileLoaderService();
> allocsLoader.init(conf);
> allocsLoader.setReloadListener(new AllocationFileLoaderService.Listener() 
> {
>   @Override
>   public void onReload(AllocationConfiguration allocs) {
> allocConf.set(allocs);
>   }
> });
> try {
>   allocsLoader.reloadAllocations();
> } catch (Exception ex) {
>   throw new IOException("Failed to load queue allocations", ex);
> }
> if (allocConf.get() == null) {
>   allocConf.set(new AllocationConfiguration(conf));
> }
> QueuePlacementPolicy queuePolicy = allocConf.get().getPlacementPolicy();
> if (queuePolicy != null) {
>   requestedQueue = queuePolicy.assignAppToQueue(requestedQueue, userName);
> }
> {code}
> As a result we should set the modifier back to public.
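For context, the callback in question is small; a sketch of its shape, with
the signature taken from the Hive snippet above (the real type is nested
inside AllocationFileLoaderService, and restoring the public modifier on it is
the whole fix):
{code}
// Nested inside AllocationFileLoaderService in Hadoop; external callers such
// as Hive's FairSchedulerShim can implement it only if it is public.
public interface Listener {
  void onReload(AllocationConfiguration info);
}
{code}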






[jira] [Commented] (YARN-6000) Make AllocationFileLoaderService.Listener public

2017-01-09 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6000?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15812994#comment-15812994
 ] 

Sergey Shelukhin commented on YARN-6000:


Thanks! As for Hive requirements, see the code snippet above. We are using the 
listener because that seems to be the only way to get the updated value out. We 
just need to get the allocConf/queue

> Make AllocationFileLoaderService.Listener public
> 
>
> Key: YARN-6000
> URL: https://issues.apache.org/jira/browse/YARN-6000
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: fairscheduler, yarn
>Affects Versions: 3.0.0-alpha1
>Reporter: Tao Jie
>Assignee: Tao Jie
> Fix For: 3.0.0-alpha2
>
> Attachments: YARN-6000.001.patch
>
>
> We removed the public modifier of {{AllocationFileLoaderService.Listener}} in
> YARN-4997 since it triggered a findbugs warning. However, this breaks Hive
> code in {{FairSchedulerShim}}.
> {code}
> AllocationFileLoaderService allocsLoader = new AllocationFileLoaderService();
> allocsLoader.init(conf);
> allocsLoader.setReloadListener(new AllocationFileLoaderService.Listener() 
> {
>   @Override
>   public void onReload(AllocationConfiguration allocs) {
> allocConf.set(allocs);
>   }
> });
> try {
>   allocsLoader.reloadAllocations();
> } catch (Exception ex) {
>   throw new IOException("Failed to load queue allocations", ex);
> }
> if (allocConf.get() == null) {
>   allocConf.set(new AllocationConfiguration(conf));
> }
> QueuePlacementPolicy queuePolicy = allocConf.get().getPlacementPolicy();
> if (queuePolicy != null) {
>   requestedQueue = queuePolicy.assignAppToQueue(requestedQueue, userName);
> }
> {code}
> As a result we should set the modifier back to public.






[jira] [Commented] (YARN-4997) Update fair scheduler to use pluggable auth provider

2016-12-13 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15746948#comment-15746948
 ] 

Sergey Shelukhin commented on YARN-4997:


We'd be ok with a different way; our existing code, as is, seems very 
convoluted to me. All we need is to get the correct placement policy 
(presumably, based on both config and the xml file that reloadAllocations uses).

> Update fair scheduler to use pluggable auth provider
> 
>
> Key: YARN-4997
> URL: https://issues.apache.org/jira/browse/YARN-4997
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: fairscheduler
>Affects Versions: 2.8.0
>Reporter: Daniel Templeton
>Assignee: Tao Jie
> Fix For: 3.0.0-alpha2
>
> Attachments: YARN-4997-001.patch, YARN-4997-002.patch, 
> YARN-4997-003.patch, YARN-4997-004.patch, YARN-4997-005.patch, 
> YARN-4997-006.patch, YARN-4997-007.patch, YARN-4997-008.patch, 
> YARN-4997-009.patch, YARN-4997-010.patch, YARN-4997-011.patch
>
>
> Now that YARN-3100 has made the authorization pluggable, it should be 
> supported by the fair scheduler.  YARN-3100 only updated the capacity 
> scheduler.






[jira] [Comment Edited] (YARN-4997) Update fair scheduler to use pluggable auth provider

2016-12-13 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15746522#comment-15746522
 ] 

Sergey Shelukhin edited comment on YARN-4997 at 12/13/16 10:58 PM:
---

This breaks the following piece of code in Hive, which seems to determine the
default queue (I am not familiar with all the APIs called here), because the
listener interface was made invisible while the setReloadListener API it is
passed into is still public.
{noformat}
AllocationFileLoaderService allocsLoader = new AllocationFileLoaderService();
allocsLoader.init(conf);
allocsLoader.setReloadListener(new AllocationFileLoaderService.Listener() {
  @Override
  public void onReload(AllocationConfiguration allocs) {
allocConf.set(allocs);
  }
});
try {
  allocsLoader.reloadAllocations();
} catch (Exception ex) {
  throw new IOException("Failed to load queue allocations", ex);
}
if (allocConf.get() == null) {
  allocConf.set(new AllocationConfiguration(conf));
}
QueuePlacementPolicy queuePolicy = allocConf.get().getPlacementPolicy();
if (queuePolicy != null) {
  requestedQueue = queuePolicy.assignAppToQueue(requestedQueue, userName);
}
{noformat}

Can you recommend a way to use reloadAllocations and get the queue (or
placement policy, or allocConf) without invoking this, or some other
workaround? Otherwise, can you please change Listener back to public?

Checking the code in 2.6, it seems we can call getConfig on the loader, but
whatever reloadAllocations might derive from the placement policy element in
the XML file is not accessible.


was (Author: sershe):
This breaks the following piece of code in Hive, which seems to determine the
default queue (I am not familiar with all the APIs called here), because the
listener interface was made invisible while the setReloadListener API it is
passed into is still public.
{noformat}
AllocationFileLoaderService allocsLoader = new AllocationFileLoaderService();
allocsLoader.init(conf);
allocsLoader.setReloadListener(new AllocationFileLoaderService.Listener() {
  @Override
  public void onReload(AllocationConfiguration allocs) {
allocConf.set(allocs);
  }
});
try {
  allocsLoader.reloadAllocations();
} catch (Exception ex) {
  throw new IOException("Failed to load queue allocations", ex);
}
if (allocConf.get() == null) {
  allocConf.set(new AllocationConfiguration(conf));
}
QueuePlacementPolicy queuePolicy = allocConf.get().getPlacementPolicy();
if (queuePolicy != null) {
  requestedQueue = queuePolicy.assignAppToQueue(requestedQueue, userName);
}
{noformat}

Can you recommend a way to use reloadAllocations and get the queue (or
placement policy, or allocConf) without invoking this, or some other
workaround? Otherwise, can you please change Listener back to public?

> Update fair scheduler to use pluggable auth provider
> 
>
> Key: YARN-4997
> URL: https://issues.apache.org/jira/browse/YARN-4997
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: fairscheduler
>Affects Versions: 2.8.0
>Reporter: Daniel Templeton
>Assignee: Tao Jie
> Fix For: 3.0.0-alpha2
>
> Attachments: YARN-4997-001.patch, YARN-4997-002.patch, 
> YARN-4997-003.patch, YARN-4997-004.patch, YARN-4997-005.patch, 
> YARN-4997-006.patch, YARN-4997-007.patch, YARN-4997-008.patch, 
> YARN-4997-009.patch, YARN-4997-010.patch, YARN-4997-011.patch
>
>
> Now that YARN-3100 has made the authorization pluggable, it should be 
> supported by the fair scheduler.  YARN-3100 only updated the capacity 
> scheduler.






[jira] [Comment Edited] (YARN-4997) Update fair scheduler to use pluggable auth provider

2016-12-13 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15746522#comment-15746522
 ] 

Sergey Shelukhin edited comment on YARN-4997 at 12/13/16 10:52 PM:
---

This breaks the following piece of code in Hive, which seems to determine the
default queue (I am not familiar with all the APIs called here), because the
listener interface was made invisible while the setReloadListener API it is
passed into is still public.
{noformat}
AllocationFileLoaderService allocsLoader = new AllocationFileLoaderService();
allocsLoader.init(conf);
allocsLoader.setReloadListener(new AllocationFileLoaderService.Listener() {
  @Override
  public void onReload(AllocationConfiguration allocs) {
allocConf.set(allocs);
  }
});
try {
  allocsLoader.reloadAllocations();
} catch (Exception ex) {
  throw new IOException("Failed to load queue allocations", ex);
}
if (allocConf.get() == null) {
  allocConf.set(new AllocationConfiguration(conf));
}
QueuePlacementPolicy queuePolicy = allocConf.get().getPlacementPolicy();
if (queuePolicy != null) {
  requestedQueue = queuePolicy.assignAppToQueue(requestedQueue, userName);
}
{noformat}

Can you recommend a way to use reloadAllocations and get the queue (or
placement policy, or allocConf) without invoking this, or some other
workaround? Otherwise, can you please change Listener back to public?


was (Author: sershe):
This breaks the following piece of code in Hive, which seems to determine the
default queue (I am not familiar with all the APIs called here), because the
listener interface was made invisible while the setReloadListener API it is
passed into is still public.
{noformat}
AllocationFileLoaderService allocsLoader = new AllocationFileLoaderService();
allocsLoader.init(conf);
allocsLoader.setReloadListener(new AllocationFileLoaderService.Listener() {
  @Override
  public void onReload(AllocationConfiguration allocs) {
allocConf.set(allocs);
  }
});
try {
  allocsLoader.reloadAllocations();
} catch (Exception ex) {
  throw new IOException("Failed to load queue allocations", ex);
}
if (allocConf.get() == null) {
  allocConf.set(new AllocationConfiguration(conf));
}
QueuePlacementPolicy queuePolicy = allocConf.get().getPlacementPolicy();
if (queuePolicy != null) {
  requestedQueue = queuePolicy.assignAppToQueue(requestedQueue, userName);
}
{noformat}

Can you recommend a way to use reloadAllocations and get the queue (or
placement policy, or allocConf) without invoking this, or some other
workaround?

> Update fair scheduler to use pluggable auth provider
> 
>
> Key: YARN-4997
> URL: https://issues.apache.org/jira/browse/YARN-4997
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: fairscheduler
>Affects Versions: 2.8.0
>Reporter: Daniel Templeton
>Assignee: Tao Jie
> Fix For: 3.0.0-alpha2
>
> Attachments: YARN-4997-001.patch, YARN-4997-002.patch, 
> YARN-4997-003.patch, YARN-4997-004.patch, YARN-4997-005.patch, 
> YARN-4997-006.patch, YARN-4997-007.patch, YARN-4997-008.patch, 
> YARN-4997-009.patch, YARN-4997-010.patch, YARN-4997-011.patch
>
>
> Now that YARN-3100 has made the authorization pluggable, it should be 
> supported by the fair scheduler.  YARN-3100 only updated the capacity 
> scheduler.






[jira] [Commented] (YARN-4997) Update fair scheduler to use pluggable auth provider

2016-12-13 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15746522#comment-15746522
 ] 

Sergey Shelukhin commented on YARN-4997:


This breaks the following piece of code in Hive, which seems to determine the
default queue (I am not familiar with all the APIs called here), because the
listener interface was made invisible while the setReloadListener API it is
passed into is still public.
{noformat}
AllocationFileLoaderService allocsLoader = new AllocationFileLoaderService();
allocsLoader.init(conf);
allocsLoader.setReloadListener(new AllocationFileLoaderService.Listener() {
  @Override
  public void onReload(AllocationConfiguration allocs) {
allocConf.set(allocs);
  }
});
try {
  allocsLoader.reloadAllocations();
} catch (Exception ex) {
  throw new IOException("Failed to load queue allocations", ex);
}
if (allocConf.get() == null) {
  allocConf.set(new AllocationConfiguration(conf));
}
QueuePlacementPolicy queuePolicy = allocConf.get().getPlacementPolicy();
if (queuePolicy != null) {
  requestedQueue = queuePolicy.assignAppToQueue(requestedQueue, userName);
}
{noformat}

Can you recommend a way to use reloadAllocations and get the queue (or
placement policy, or allocConf) without invoking this, or some other
workaround?

> Update fair scheduler to use pluggable auth provider
> 
>
> Key: YARN-4997
> URL: https://issues.apache.org/jira/browse/YARN-4997
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: fairscheduler
>Affects Versions: 2.8.0
>Reporter: Daniel Templeton
>Assignee: Tao Jie
> Fix For: 3.0.0-alpha2
>
> Attachments: YARN-4997-001.patch, YARN-4997-002.patch, 
> YARN-4997-003.patch, YARN-4997-004.patch, YARN-4997-005.patch, 
> YARN-4997-006.patch, YARN-4997-007.patch, YARN-4997-008.patch, 
> YARN-4997-009.patch, YARN-4997-010.patch, YARN-4997-011.patch
>
>
> Now that YARN-3100 has made the authorization pluggable, it should be 
> supported by the fair scheduler.  YARN-3100 only updated the capacity 
> scheduler.






[jira] [Commented] (YARN-5659) getPathFromYarnURL should use standard methods

2016-10-10 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5659?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15563211#comment-15563211
 ] 

Sergey Shelukhin commented on YARN-5659:


Thank you for the reviews!

> getPathFromYarnURL should use standard methods
> --
>
> Key: YARN-5659
> URL: https://issues.apache.org/jira/browse/YARN-5659
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Fix For: 2.8.0, 3.0.0-alpha2
>
> Attachments: YARN-5659.01.patch, YARN-5659.02.patch, 
> YARN-5659.03.patch, YARN-5659.04.patch, YARN-5659.04.patch, 
> YARN-5659.05.patch, YARN-5659.05.patch, YARN-5659.patch
>
>
> getPathFromYarnURL does some string shenanigans where standard ctors should
> suffice.
> There are also bugs in it, e.g. passing an empty scheme to the URI ctor is
> invalid; null should be used.
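To illustrate the empty-vs-null scheme point: the multi-argument java.net.URI
constructors prepend "scheme:" whenever the scheme argument is non-null, so an
empty string produces an unparseable leading ":". A minimal sketch (not the
YARN code):
{code}
import java.net.URI;
import java.net.URISyntaxException;

public class UriSchemeDemo {
  public static void main(String[] args) throws URISyntaxException {
    try {
      new URI("", "namenode", "/tmp/app.jar", null); // builds "://..." and throws
    } catch (URISyntaxException e) {
      System.out.println("empty scheme rejected: " + e.getMessage());
    }
    // A null scheme yields a valid scheme-relative URI: //namenode/tmp/app.jar
    System.out.println(new URI(null, "namenode", "/tmp/app.jar", null));
  }
}
{code}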






[jira] [Updated] (YARN-5659) getPathFromYarnURL should use standard methods

2016-10-04 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5659?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated YARN-5659:
---
Attachment: YARN-5659.05.patch

> getPathFromYarnURL should use standard methods
> --
>
> Key: YARN-5659
> URL: https://issues.apache.org/jira/browse/YARN-5659
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: YARN-5659.01.patch, YARN-5659.02.patch, 
> YARN-5659.03.patch, YARN-5659.04.patch, YARN-5659.04.patch, 
> YARN-5659.05.patch, YARN-5659.05.patch, YARN-5659.patch
>
>
> getPathFromYarnURL does some string shenanigans where standard ctors should
> suffice.
> There are also bugs in it, e.g. passing an empty scheme to the URI ctor is
> invalid; null should be used.






[jira] [Updated] (YARN-5659) getPathFromYarnURL should use standard methods

2016-10-04 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5659?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated YARN-5659:
---
Attachment: YARN-5659.05.patch

Added the annotations.
Please feel free to change the patch wrt whitespace, annotations, method names, 
and other such stuff that is easier to change on commit than to have 
back-and-forth on the jira.

> getPathFromYarnURL should use standard methods
> --
>
> Key: YARN-5659
> URL: https://issues.apache.org/jira/browse/YARN-5659
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: YARN-5659.01.patch, YARN-5659.02.patch, 
> YARN-5659.03.patch, YARN-5659.04.patch, YARN-5659.04.patch, 
> YARN-5659.05.patch, YARN-5659.patch
>
>
> getPathFromYarnURL does some string shenanigans where standard ctors should
> suffice.
> There are also bugs in it, e.g. passing an empty scheme to the URI ctor is
> invalid; null should be used.






[jira] [Commented] (YARN-5659) getPathFromYarnURL should use standard methods

2016-09-28 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5659?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15530399#comment-15530399
 ] 

Sergey Shelukhin commented on YARN-5659:


Hmm... which one should I add? I am not very familiar with the approach in
YARN. Can someone add those on commit?

> getPathFromYarnURL should use standard methods
> --
>
> Key: YARN-5659
> URL: https://issues.apache.org/jira/browse/YARN-5659
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: YARN-5659.01.patch, YARN-5659.02.patch, 
> YARN-5659.03.patch, YARN-5659.04.patch, YARN-5659.04.patch, YARN-5659.patch
>
>
> getPathFromYarnURL does some string shenanigans where standard ctors should
> suffice.
> There are also bugs in it, e.g. passing an empty scheme to the URI ctor is
> invalid; null should be used.






[jira] [Commented] (YARN-5659) getPathFromYarnURL should use standard methods

2016-09-27 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5659?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15527033#comment-15527033
 ] 

Sergey Shelukhin commented on YARN-5659:


This is an added overload, so it doesn't break the public API - the old one is 
still there; the config method is required to allow the test to override the 
RecordFactoryProvider configuration.

> getPathFromYarnURL should use standard methods
> --
>
> Key: YARN-5659
> URL: https://issues.apache.org/jira/browse/YARN-5659
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: YARN-5659.01.patch, YARN-5659.02.patch, 
> YARN-5659.03.patch, YARN-5659.04.patch, YARN-5659.04.patch, YARN-5659.patch
>
>
> getPathFromYarnURL does some string shenanigans where standard ctors should
> suffice.
> There are also bugs in it, e.g. passing an empty scheme to the URI ctor is
> invalid; null should be used.






[jira] [Commented] (YARN-5659) getPathFromYarnURL should use standard methods

2016-09-23 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5659?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15517984#comment-15517984
 ] 

Sergey Shelukhin commented on YARN-5659:


[~templedf] does the patch make sense now? Only the whitespace was changed
since the last iteration.
[~hitesh] fyi, this one is ready

> getPathFromYarnURL should use standard methods
> --
>
> Key: YARN-5659
> URL: https://issues.apache.org/jira/browse/YARN-5659
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: YARN-5659.01.patch, YARN-5659.02.patch, 
> YARN-5659.03.patch, YARN-5659.04.patch, YARN-5659.04.patch, YARN-5659.patch
>
>
> getPathFromYarnURL does some string shenanigans where standard ctors should
> suffice.
> There are also bugs in it, e.g. passing an empty scheme to the URI ctor is
> invalid; null should be used.






[jira] [Updated] (YARN-5659) getPathFromYarnURL should use standard methods

2016-09-22 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5659?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated YARN-5659:
---
Attachment: YARN-5659.04.patch

The patch without the spurious change.

> getPathFromYarnURL should use standard methods
> --
>
> Key: YARN-5659
> URL: https://issues.apache.org/jira/browse/YARN-5659
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: YARN-5659.01.patch, YARN-5659.02.patch, 
> YARN-5659.03.patch, YARN-5659.04.patch, YARN-5659.04.patch, YARN-5659.patch
>
>
> getPathFromYarnURL does some string shenanigans where standard ctors should
> suffice.
> There are also bugs in it, e.g. passing an empty scheme to the URI ctor is
> invalid; null should be used.






[jira] [Commented] (YARN-5659) getPathFromYarnURL should use standard methods

2016-09-22 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5659?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15514660#comment-15514660
 ] 

Sergey Shelukhin commented on YARN-5659:


Apparently editing patches directly is not a good idea...

> getPathFromYarnURL should use standard methods
> --
>
> Key: YARN-5659
> URL: https://issues.apache.org/jira/browse/YARN-5659
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: YARN-5659.01.patch, YARN-5659.02.patch, 
> YARN-5659.03.patch, YARN-5659.04.patch, YARN-5659.patch
>
>
> getPathFromYarnURL does some string shenanigans where standard ctors should
> suffice.
> There are also bugs in it, e.g. passing an empty scheme to the URI ctor is
> invalid; null should be used.






[jira] [Updated] (YARN-5659) getPathFromYarnURL should use standard methods

2016-09-22 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5659?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated YARN-5659:
---
Attachment: YARN-5659.04.patch

whitespace fixes...

> getPathFromYarnURL should use standard methods
> --
>
> Key: YARN-5659
> URL: https://issues.apache.org/jira/browse/YARN-5659
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: YARN-5659.01.patch, YARN-5659.02.patch, 
> YARN-5659.03.patch, YARN-5659.04.patch, YARN-5659.patch
>
>
> getPathFromYarnURL does some string shenanigans where standard ctors should
> suffice.
> There are also bugs in it, e.g. passing an empty scheme to the URI ctor is
> invalid; null should be used.






[jira] [Commented] (YARN-5659) getPathFromYarnURL should use standard methods

2016-09-21 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5659?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15511207#comment-15511207
 ] 

Sergey Shelukhin commented on YARN-5659:


[~templedf] the test already has paths without a scheme

> getPathFromYarnURL should use standard methods
> --
>
> Key: YARN-5659
> URL: https://issues.apache.org/jira/browse/YARN-5659
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: YARN-5659.01.patch, YARN-5659.02.patch, 
> YARN-5659.03.patch, YARN-5659.patch
>
>
> getPathFromYarnURL does some string shenanigans where standard ctors should
> suffice.
> There are also bugs in it, e.g. passing an empty scheme to the URI ctor is
> invalid; null should be used.






[jira] [Updated] (YARN-5659) getPathFromYarnURL should use standard methods

2016-09-21 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5659?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated YARN-5659:
---
Attachment: YARN-5659.03.patch

Fixing the warnings

> getPathFromYarnURL should use standard methods
> --
>
> Key: YARN-5659
> URL: https://issues.apache.org/jira/browse/YARN-5659
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: YARN-5659.01.patch, YARN-5659.02.patch, 
> YARN-5659.03.patch, YARN-5659.patch
>
>
> getPathFromYarnURL does some string shenanigans where standard ctors should
> suffice.
> There are also bugs in it, e.g. passing an empty scheme to the URI ctor is
> invalid; null should be used.






[jira] [Updated] (YARN-5659) getPathFromYarnURL should use standard methods

2016-09-21 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5659?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated YARN-5659:
---
Attachment: YARN-5659.02.patch

> getPathFromYarnURL should use standard methods
> --
>
> Key: YARN-5659
> URL: https://issues.apache.org/jira/browse/YARN-5659
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: YARN-5659.01.patch, YARN-5659.02.patch, YARN-5659.patch
>
>
> getPathFromYarnURL does some string shenanigans where standard ctors should
> suffice.
> There are also bugs in it, e.g. passing an empty scheme to the URI ctor is
> invalid; null should be used.






[jira] [Updated] (YARN-5659) getPathFromYarnURL should use standard methods

2016-09-21 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5659?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated YARN-5659:
---
Attachment: HIVE-5659.02.patch

Added the test. It covers only the static methods, so it uses a fake URL
class to sidestep factory visibility issues.

> getPathFromYarnURL should use standard methods
> --
>
> Key: YARN-5659
> URL: https://issues.apache.org/jira/browse/YARN-5659
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: YARN-5659.01.patch, YARN-5659.02.patch, YARN-5659.patch
>
>
> getPathFromYarnURL does some string shenanigans where standard ctors should
> suffice.
> There are also bugs in it, e.g. passing an empty scheme to the URI ctor is
> invalid; null should be used.






[jira] [Updated] (YARN-5659) getPathFromYarnURL should use standard methods

2016-09-21 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5659?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated YARN-5659:
---
Attachment: (was: HIVE-5659.02.patch)

> getPathFromYarnURL should use standard methods
> --
>
> Key: YARN-5659
> URL: https://issues.apache.org/jira/browse/YARN-5659
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: YARN-5659.01.patch, YARN-5659.02.patch, YARN-5659.patch
>
>
> getPathFromYarnURL does some string shenanigans where standard ctors should
> suffice.
> There are also bugs in it, e.g. passing an empty scheme to the URI ctor is
> invalid; null should be used.






[jira] [Updated] (YARN-5659) getPathFromYarnURL should use standard methods

2016-09-20 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5659?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated YARN-5659:
---
Attachment: YARN-5659.01.patch

The patch for trunk, where this method has been moved. What's the difference 
between trunk and master? Do both need to be fixed?

> getPathFromYarnURL should use standard methods
> --
>
> Key: YARN-5659
> URL: https://issues.apache.org/jira/browse/YARN-5659
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: YARN-5659.01.patch, YARN-5659.patch
>
>
> getPathFromYarnURL does some string shenanigans where standard ctors should
> suffice.
> There are also bugs in it, e.g. passing an empty scheme to the URI ctor is
> invalid; null should be used.






[jira] [Updated] (YARN-5659) getPathFromYarnURL should use standard methods

2016-09-20 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5659?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated YARN-5659:
---
Description: 
getPathFromYarnURL does some string shenanigans where standard ctors should
suffice.
There are also bugs in it, e.g. passing an empty scheme to the URI ctor is
invalid; null should be used.

  was:
getPathFromYarnURL does some string shenanigans where standard ctors should
suffice.
There are also bugs in it, e.g. passing an empty string to the URI ctor is
invalid; null should be used.


> getPathFromYarnURL should use standard methods
> --
>
> Key: YARN-5659
> URL: https://issues.apache.org/jira/browse/YARN-5659
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: YARN-5659.patch
>
>
> getPathFromYarnURL does some string shenanigans where standard ctors should
> suffice.
> There are also bugs in it, e.g. passing an empty scheme to the URI ctor is
> invalid; null should be used.






[jira] [Updated] (YARN-5659) getPathFromYarnURL should use standard methods

2016-09-20 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5659?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated YARN-5659:
---
Attachment: YARN-5659.patch

The patch. It is based on reversing getYarnURLFromPath/URI;
normalize is also unneeded since Path already does that.
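
In rough outline, the approach looks like this (a sketch assuming the usual
scheme/userInfo/host/port/file getters on the records.URL class, not the
patch itself):

{noformat}
import java.net.URI;
import java.net.URISyntaxException;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.yarn.api.records.URL;

// Sketch only: build the URI with the standard ctor, passing null
// (never "") for absent components, and let Path do the normalization.
public static Path getPathFromYarnURL(URL url) throws URISyntaxException {
  URI uri = new URI(url.getScheme(), url.getUserInfo(), url.getHost(),
      url.getPort(), url.getFile(), null, null);
  return new Path(uri);
}
{noformat}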

[~hitesh] can you take a look?

> getPathFromYarnURL should use standard methods
> --
>
> Key: YARN-5659
> URL: https://issues.apache.org/jira/browse/YARN-5659
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: YARN-5659.patch
>
>
> getPathFromYarnURL does some string shenanigans where standard ctors should
> suffice.
> There are also bugs in it, e.g. passing an empty string to the URI ctor is
> invalid; null should be used.






[jira] [Assigned] (YARN-5659) getPathFromYarnURL should use standard methods

2016-09-20 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5659?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin reassigned YARN-5659:
--

Assignee: Sergey Shelukhin

> getPathFromYarnURL should use standard methods
> --
>
> Key: YARN-5659
> URL: https://issues.apache.org/jira/browse/YARN-5659
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>
> getPathFromYarnURL does some string shenanigans where standard ctors should
> suffice.
> There are also bugs in it, e.g. passing an empty string to the URI ctor is
> invalid; null should be used.






[jira] [Created] (YARN-5659) getPathFromYarnURL should use standard methods

2016-09-20 Thread Sergey Shelukhin (JIRA)
Sergey Shelukhin created YARN-5659:
--

 Summary: getPathFromYarnURL should use standard methods
 Key: YARN-5659
 URL: https://issues.apache.org/jira/browse/YARN-5659
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Sergey Shelukhin


getPathFromYarnURL does some string shenanigans where standard ctors should
suffice.
There are also bugs in it, e.g. passing an empty string to the URI ctor is
invalid; null should be used.






[jira] [Commented] (YARN-4562) YARN WebApp ignores the configuration passed to it for keystore settings

2016-04-06 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15228924#comment-15228924
 ] 

Sergey Shelukhin commented on YARN-4562:


That is true only if ssl-server.xml is present :) Yes, that works.

> YARN WebApp ignores the configuration passed to it for keystore settings
> 
>
> Key: YARN-4562
> URL: https://issues.apache.org/jira/browse/YARN-4562
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: YARN-4562.patch
>
>
> The conf can be passed to WebApps builder, however the following code in 
> WebApps.java that builds the HttpServer2 object:
> {noformat}
> if (httpScheme.equals(WebAppUtils.HTTPS_PREFIX)) {
>   WebAppUtils.loadSslConfiguration(builder);
> }
> {noformat}
> ...results in loadSslConfiguration creating a new Configuration object; the 
> one that is passed in is ignored, as far as the keystore/etc. settings are 
> concerned.  loadSslConfiguration has another overload with Configuration 
> parameter that should be used instead.
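
The fix, sketched under the assumption that conf is the Configuration handed
to the WebApps builder and that the overload mentioned above takes the
builder plus that Configuration:

{noformat}
if (httpScheme.equals(WebAppUtils.HTTPS_PREFIX)) {
  // Pass the caller-supplied conf through instead of letting
  // loadSslConfiguration() build a fresh Configuration on its own.
  WebAppUtils.loadSslConfiguration(builder, conf);
}
{noformat}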





[jira] [Commented] (YARN-4562) YARN WebApp ignores the configuration passed to it for keystore settings

2016-04-05 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15226854#comment-15226854
 ] 

Sergey Shelukhin commented on YARN-4562:


Hmm... ping?

> YARN WebApp ignores the configuration passed to it for keystore settings
> 
>
> Key: YARN-4562
> URL: https://issues.apache.org/jira/browse/YARN-4562
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
> Attachments: YARN-4562.patch
>
>
> The conf can be passed to WebApps builder, however the following code in 
> WebApps.java that builds the HttpServer2 object:
> {noformat}
> if (httpScheme.equals(WebAppUtils.HTTPS_PREFIX)) {
>   WebAppUtils.loadSslConfiguration(builder);
> }
> {noformat}
> ...results in loadSslConfiguration creating a new Configuration object; the 
> one that is passed in is ignored, as far as the keystore/etc. settings are 
> concerned.  loadSslConfiguration has another overload with Configuration 
> parameter that should be used instead.





[jira] [Commented] (YARN-4558) Yarn client retries on some non-retriable exceptions

2016-01-14 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15098722#comment-15098722
 ] 

Sergey Shelukhin commented on YARN-4558:


In this case, the retry policy is built in YARN code. I don't know if there's
more to it on the Hadoop side than what YARN sets up, but from a cursory
examination it doesn't look like that is the case.

> Yarn client retries on some non-retriable exceptions
> 
>
> Key: YARN-4558
> URL: https://issues.apache.org/jira/browse/YARN-4558
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.7.1
>Reporter: Sergey Shelukhin
>Priority: Minor
>
> Seems the problem is in RMProxy where the policy is built.
> {noformat}
> Thread 23594: (state = BLOCKED)
> - java.lang.Thread.sleep(long) @bci=0 (Interpreted frame)
> - org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(java.lang.Object, 
> java.lang.reflect.Method, java.lang.Object[]) @bci=603, line=155 (Interpreted 
> frame)
> - 
> com.sun.proxy.$Proxy32.getClusterNodes(org.apache.hadoop.yarn.api.protocolrecords.GetClusterNodesRequest)
>  @bci=16 (Interpreted frame)
> - 
> org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.getNodeReports(org.apache.hadoop.yarn.api.records.NodeState[])
>  @bci=66, line=515 (Interpreted frame)
> {noformat}
> produces
> {noformat}
> 2016-01-07 02:50:45,111 [main] WARN  ipc.Client - Exception encountered while 
> connecting to the server : javax.security.sasl.SaslException: GSS initiate 
> failed [Caused by GSSException: No valid credentials provided (Mechanism 
> level: Failed to find any Kerberos tgt)]
> 2016-01-07 02:51:15,126 [main] WARN  ipc.Client - Exception encountered while 
> connecting to the server : javax.security.sasl.SaslException: GSS initiate 
> failed [Caused by GSSException: No valid credentials provided (Mechanism 
> level: Failed to find any Kerberos tgt)]
> ...
> {noformat}
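
For illustration, one way the policy built there could fail fast on known
non-retriable errors, as a sketch (not the actual RMProxy code; whether the
SASL failure reaches the retry layer unwrapped is a separate question):

{noformat}
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.TimeUnit;
import javax.security.sasl.SaslException;
import org.apache.hadoop.io.retry.RetryPolicies;
import org.apache.hadoop.io.retry.RetryPolicy;

// Map known non-retriable exceptions to fail-fast; keep the normal
// fixed-sleep retry for everything else.
Map<Class<? extends Exception>, RetryPolicy> exceptionToPolicy =
    new HashMap<Class<? extends Exception>, RetryPolicy>();
exceptionToPolicy.put(SaslException.class, RetryPolicies.TRY_ONCE_THEN_FAIL);
RetryPolicy policy = RetryPolicies.retryByException(
    RetryPolicies.retryUpToMaximumCountWithFixedSleep(30, 30, TimeUnit.SECONDS),
    exceptionToPolicy);
{noformat}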





[jira] [Updated] (YARN-4562) YARN WebApp ignores the configuration passed to it for keystore settings

2016-01-07 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated YARN-4562:
---
Attachment: YARN-4562.patch

Trivial patch. [~vinodkv] [~sseth] can you take a look?

> YARN WebApp ignores the configuration passed to it for keystore settings
> 
>
> Key: YARN-4562
> URL: https://issues.apache.org/jira/browse/YARN-4562
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
> Attachments: YARN-4562.patch
>
>
> The conf can be passed to WebApps builder, however the following code in 
> WebApps.java that builds the HttpServer2 object:
> {noformat}
> if (httpScheme.equals(WebAppUtils.HTTPS_PREFIX)) {
>   WebAppUtils.loadSslConfiguration(builder);
> }
> {noformat}
> ...results in loadSslConfiguration creating a new Configuration object; the 
> one that is passed in is ignored, as far as the keystore/etc. settings are 
> concerned.  loadSslConfiguration has another overload with Configuration 
> parameter that should be used instead.





[jira] [Created] (YARN-4562) YARN WebApp ignores the configuration passed to it for keystore settings

2016-01-07 Thread Sergey Shelukhin (JIRA)
Sergey Shelukhin created YARN-4562:
--

 Summary: YARN WebApp ignores the configuration passed to it for 
keystore settings
 Key: YARN-4562
 URL: https://issues.apache.org/jira/browse/YARN-4562
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Sergey Shelukhin


The conf can be passed to WebApps builder, however the following code in 
WebApps.java that builds the HttpServer2 object:
{noformat}
if (httpScheme.equals(WebAppUtils.HTTPS_PREFIX)) {
  WebAppUtils.loadSslConfiguration(builder);
}
{noformat}
...results in loadSslConfiguration creating a new Configuration object; the one 
that is passed in is ignored, as far as the keystore/etc. settings are 
concerned.  loadSslConfiguration has another overload with Configuration 
parameter that should be used instead.





[jira] [Commented] (YARN-4562) YARN WebApp ignores the configuration passed to it for keystore settings

2016-01-07 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15088627#comment-15088627
 ] 

Sergey Shelukhin commented on YARN-4562:


Or [~hitesh]. I dunno who owns WebApp :)

> YARN WebApp ignores the configuration passed to it for keystore settings
> 
>
> Key: YARN-4562
> URL: https://issues.apache.org/jira/browse/YARN-4562
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
> Attachments: YARN-4562.patch
>
>
> The conf can be passed to WebApps builder, however the following code in 
> WebApps.java that builds the HttpServer2 object:
> {noformat}
> if (httpScheme.equals(WebAppUtils.HTTPS_PREFIX)) {
>   WebAppUtils.loadSslConfiguration(builder);
> }
> {noformat}
> ...results in loadSslConfiguration creating a new Configuration object; the 
> one that is passed in is ignored, as far as the keystore/etc. settings are 
> concerned.  loadSslConfiguration has another overload with Configuration 
> parameter that should be used instead.





[jira] [Created] (YARN-4558) Yarn client retries on some non-retriable exceptions

2016-01-07 Thread Sergey Shelukhin (JIRA)
Sergey Shelukhin created YARN-4558:
--

 Summary: Yarn client retries on some non-retriable exceptions
 Key: YARN-4558
 URL: https://issues.apache.org/jira/browse/YARN-4558
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 2.7.1
Reporter: Sergey Shelukhin
Priority: Minor


Seems the problem is in RMProxy where the policy is built.
{noformat}
Thread 23594: (state = BLOCKED)
- java.lang.Thread.sleep(long) @bci=0 (Interpreted frame)
- org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(java.lang.Object, 
java.lang.reflect.Method, java.lang.Object[]) @bci=603, line=155 (Interpreted 
frame)
- 
com.sun.proxy.$Proxy32.getClusterNodes(org.apache.hadoop.yarn.api.protocolrecords.GetClusterNodesRequest)
 @bci=16 (Interpreted frame)
- 
org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.getNodeReports(org.apache.hadoop.yarn.api.records.NodeState[])
 @bci=66, line=515 (Interpreted frame)
{noformat}
produces
{noformat}
2016-01-07 02:50:45,111 [main] WARN  ipc.Client - Exception encountered while 
connecting to the server : javax.security.sasl.SaslException: GSS initiate 
failed [Caused by GSSException: No valid credentials provided (Mechanism level: 
Failed to find any Kerberos tgt)]
2016-01-07 02:51:15,126 [main] WARN  ipc.Client - Exception encountered while 
connecting to the server : javax.security.sasl.SaslException: GSS initiate 
failed [Caused by GSSException: No valid credentials provided (Mechanism level: 
Failed to find any Kerberos tgt)]
...
{noformat}





[jira] [Commented] (YARN-1197) Support changing resources of an allocated container

2015-12-07 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15045442#comment-15045442
 ] 

Sergey Shelukhin commented on YARN-1197:


Will this feature be usable in YARN/Hadoop 2.8? I see most subtasks are
resolved, but this JIRA is not resolved, nor is there a release note.

> Support changing resources of an allocated container
> 
>
> Key: YARN-1197
> URL: https://issues.apache.org/jira/browse/YARN-1197
> Project: Hadoop YARN
>  Issue Type: Task
>  Components: api, graceful, nodemanager, resourcemanager
>Affects Versions: 2.1.0-beta
>Reporter: Wangda Tan
> Attachments: YARN-1197 old-design-docs-patches-for-reference.zip, 
> YARN-1197_Design.2015.06.24.pdf, YARN-1197_Design.2015.07.07.pdf, 
> YARN-1197_Design.2015.08.21.pdf, YARN-1197_Design.pdf
>
>
> The current YARN resource management logic assumes resource allocated to a 
> container is fixed during the lifetime of it. When users want to change a 
> resource 
> of an allocated container the only way is releasing it and allocating a new 
> container with expected size.
> Allowing run-time changing resources of an allocated container will give us 
> better control of resource usage in application side





[jira] [Created] (YARN-4242) add analyze command to explicitly cache file metadata in HBase metastore

2015-10-08 Thread Sergey Shelukhin (JIRA)
Sergey Shelukhin created YARN-4242:
--

 Summary: add analyze command to explicitly cache file metadata in 
HBase metastore
 Key: YARN-4242
 URL: https://issues.apache.org/jira/browse/YARN-4242
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Sergey Shelukhin


ANALYZE TABLE (spec as usual) CACHE METADATA





[jira] [Resolved] (YARN-4242) add analyze command to explicitly cache file metadata in HBase metastore

2015-10-08 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4242?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin resolved YARN-4242.

Resolution: Invalid

Wrong project

> add analyze command to explicitly cache file metadata in HBase metastore
> ---
>
> Key: YARN-4242
> URL: https://issues.apache.org/jira/browse/YARN-4242
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>
> ANALYZE TABLE (spec as usual) CACHE METADATA





[jira] [Commented] (YARN-4207) Add a non-judgemental YARN app completion status

2015-10-08 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14949644#comment-14949644
 ] 

Sergey Shelukhin commented on YARN-4207:


It's unassigned, so I gather no one is working on it. This plan sounds good to
me (non-binding :))

> Add a non-judgemental YARN app completion status
> 
>
> Key: YARN-4207
> URL: https://issues.apache.org/jira/browse/YARN-4207
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Sergey Shelukhin
>
> For certain applications, it doesn't make sense to have a SUCCEEDED or FAILED
> end state. For example, Tez sessions may include multiple DAGs, some of which
> have succeeded and some have failed; there's no clear status for the session,
> either logically or from the user's perspective (users are confused either
> way). There needs to be a status not implying success or failure, such as
> "done"/"ended"/"finished".





[jira] [Updated] (YARN-4207) Add a non-judgemental YARN app completion status

2015-09-24 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4207?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated YARN-4207:
---
Summary: Add a non-judgemental YARN app completion status  (was: Add a more 
ambiguous YARN app completion status)

> Add a non-judgemental YARN app completion status
> 
>
> Key: YARN-4207
> URL: https://issues.apache.org/jira/browse/YARN-4207
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>
> For certain applications, it doesn't make sense to have a SUCCEEDED or FAILED
> end state. For example, Tez sessions may include multiple DAGs, some of which
> have succeeded and some have failed; there's no clear status for the session,
> either logically or from the user's perspective (users are confused either
> way). There needs to be a status not implying success or failure, such as
> "done"/"ended"/"finished".





[jira] [Created] (YARN-4207) Add a more ambiguous YARN app completion status

2015-09-24 Thread Sergey Shelukhin (JIRA)
Sergey Shelukhin created YARN-4207:
--

 Summary: Add a more ambiguous YARN app completion status
 Key: YARN-4207
 URL: https://issues.apache.org/jira/browse/YARN-4207
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Sergey Shelukhin


For certain applications, it doesn't make sense to have a SUCCEEDED or FAILED
end state. For example, Tez sessions may include multiple DAGs, some of which
have succeeded and some have failed; there's no clear status for the session,
either logically or from the user's perspective (users are confused either
way). There needs to be a status not implying success or failure, such as
"done"/"ended"/"finished".





[jira] [Created] (YARN-4042) YARN registry should handle the absence of ZK node

2015-08-10 Thread Sergey Shelukhin (JIRA)
Sergey Shelukhin created YARN-4042:
--

 Summary: YARN registry should handle the absence of ZK node
 Key: YARN-4042
 URL: https://issues.apache.org/jira/browse/YARN-4042
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Sergey Shelukhin


{noformat}
2015-08-10 11:33:46,931 WARN [LlapSchedulerNodeEnabler] 
rm.LlapTaskSchedulerService: Could not refresh list of active instances
org.apache.hadoop.fs.PathNotFoundException: 
`/registry/users/huzheng/services/org-apache-hive/llap0/components/workers/worker-25':
 No such file or directory: KeeperErrorCode = NoNode for 
/registry/users/huzheng/services/org-apache-hive/llap0/components/workers/worker-25
at 
org.apache.hadoop.registry.client.impl.zk.CuratorService.operationFailure(CuratorService.java:377)
at 
org.apache.hadoop.registry.client.impl.zk.CuratorService.operationFailure(CuratorService.java:360)
at 
org.apache.hadoop.registry.client.impl.zk.CuratorService.zkRead(CuratorService.java:720)
at 
org.apache.hadoop.registry.client.impl.zk.RegistryOperationsService.resolve(RegistryOperationsService.java:120)
at 
org.apache.hadoop.registry.client.binding.RegistryUtils.extractServiceRecords(RegistryUtils.java:321)
at 
org.apache.hadoop.registry.client.binding.RegistryUtils.listServiceRecords(RegistryUtils.java:177)
at 
org.apache.hadoop.hive.llap.daemon.registry.impl.LlapYarnRegistryImpl$DynamicServiceInstanceSet.refresh(LlapYarnRegistryImpl.java:278)
at 
org.apache.tez.dag.app.rm.LlapTaskSchedulerService.refreshInstances(LlapTaskSchedulerService.java:584)
at 
org.apache.tez.dag.app.rm.LlapTaskSchedulerService.access$900(LlapTaskSchedulerService.java:79)
at 
org.apache.tez.dag.app.rm.LlapTaskSchedulerService$NodeEnablerCallable.call(LlapTaskSchedulerService.java:887)
at 
org.apache.tez.dag.app.rm.LlapTaskSchedulerService$NodeEnablerCallable.call(LlapTaskSchedulerService.java:855)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.zookeeper.KeeperException$NoNodeException: 
KeeperErrorCode = NoNode for 
/registry/users/huzheng/services/org-apache-hive/llap0/components/workers/worker-25
at org.apache.zookeeper.KeeperException.create(KeeperException.java:111)
at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1155)
at 
org.apache.curator.framework.imps.GetDataBuilderImpl$4.call(GetDataBuilderImpl.java:302)
at 
org.apache.curator.framework.imps.GetDataBuilderImpl$4.call(GetDataBuilderImpl.java:291)
at org.apache.curator.RetryLoop.callWithRetry(RetryLoop.java:107)
at 
org.apache.curator.framework.imps.GetDataBuilderImpl.pathInForeground(GetDataBuilderImpl.java:288)
at 
org.apache.curator.framework.imps.GetDataBuilderImpl.forPath(GetDataBuilderImpl.java:279)
at 
org.apache.curator.framework.imps.GetDataBuilderImpl.forPath(GetDataBuilderImpl.java:41)
at 
org.apache.hadoop.registry.client.impl.zk.CuratorService.zkRead(CuratorService.java:718)
... 12 more
{noformat}

ZK nodes can disappear after listing; for example, an ephemeral node can be
cleaned up. The YARN registry should handle that.
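
A sketch of the tolerant pattern (parseServiceRecord is a placeholder, not
the registry's real method name; the Curator calls are the standard ones):

{noformat}
import java.util.ArrayList;
import java.util.List;
import org.apache.curator.framework.CuratorFramework;
import org.apache.hadoop.registry.client.types.ServiceRecord;
import org.apache.zookeeper.KeeperException;

// Nodes listed a moment ago may already be gone by the time they are
// read, so treat NoNode during the read as "skip", not as a failure.
static List<ServiceRecord> listLive(CuratorFramework curator, String parentPath)
    throws Exception {
  List<ServiceRecord> records = new ArrayList<ServiceRecord>();
  for (String child : curator.getChildren().forPath(parentPath)) {
    try {
      byte[] data = curator.getData().forPath(parentPath + "/" + child);
      records.add(parseServiceRecord(data)); // placeholder parse step
    } catch (KeeperException.NoNodeException gone) {
      // Ephemeral node expired between the listing and the read; skip it.
    }
  }
  return records;
}
{noformat}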





[jira] [Commented] (YARN-1462) AHS API and other AHS changes to handle tags for completed MR jobs

2015-06-05 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1462?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14574996#comment-14574996
 ] 

Sergey Shelukhin commented on YARN-1462:


Patch looks like it won't break Tez

 AHS API and other AHS changes to handle tags for completed MR jobs
 --

 Key: YARN-1462
 URL: https://issues.apache.org/jira/browse/YARN-1462
 Project: Hadoop YARN
  Issue Type: Sub-task
Affects Versions: 2.2.0
Reporter: Karthik Kambatla
Assignee: Xuan Gong
 Fix For: 2.8.0

 Attachments: YARN-1462-branch-2.7-1.2.patch, 
 YARN-1462-branch-2.7-1.patch, YARN-1462.1.patch, YARN-1462.2.patch, 
 YARN-1462.3.patch, YARN-1462.4.patch


 AHS related work for tags. 





[jira] [Commented] (YARN-1942) Many of ConverterUtils methods need to have public interfaces

2015-06-03 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14571761#comment-14571761
 ] 

Sergey Shelukhin commented on YARN-1942:


No, it's used in production code as far as I can tell

 Many of ConverterUtils methods need to have public interfaces
 -

 Key: YARN-1942
 URL: https://issues.apache.org/jira/browse/YARN-1942
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: api
Affects Versions: 2.4.0
Reporter: Thomas Graves
Assignee: Wangda Tan
Priority: Critical
 Attachments: YARN-1942.1.patch, YARN-1942.2.patch


 ConverterUtils has a bunch of functions that are useful to application 
 masters. It should either be made public, or some of its utilities should be 
 made public, or other external APIs should be provided for application 
 masters to use. Note that distributedshell and MR are both using these 
 interfaces. For instance, the main use case I see right now is getting the 
 application attempt id within the appmaster:
 String containerIdStr =
   System.getenv(Environment.CONTAINER_ID.name());
 ContainerId containerId = ConverterUtils.toContainerId(containerIdStr);
 ApplicationAttemptId applicationAttemptId =
   containerId.getApplicationAttemptId();
 I don't see any other way for the application master to get this 
 information. If there is, please let me know.





[jira] [Commented] (YARN-1462) AHS API and other AHS changes to handle tags for completed MR jobs

2015-06-02 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1462?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14569548#comment-14569548
 ] 

Sergey Shelukhin commented on YARN-1462:


[~sseth] can you please comment on the above (use of Private API)?

 AHS API and other AHS changes to handle tags for completed MR jobs
 --

 Key: YARN-1462
 URL: https://issues.apache.org/jira/browse/YARN-1462
 Project: Hadoop YARN
  Issue Type: Sub-task
Affects Versions: 2.2.0
Reporter: Karthik Kambatla
Assignee: Xuan Gong
 Fix For: 2.8.0

 Attachments: YARN-1462-branch-2.7-1.2.patch, 
 YARN-1462-branch-2.7-1.patch, YARN-1462.1.patch, YARN-1462.2.patch, 
 YARN-1462.3.patch


 AHS related work for tags. 





[jira] [Commented] (YARN-1462) AHS API and other AHS changes to handle tags for completed MR jobs

2015-06-01 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1462?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14568189#comment-14568189
 ] 

Sergey Shelukhin commented on YARN-1462:


This commit changes the newInstance API, breaking the Tez build. It is hard
to make Tez compatible with both pre-2.8 and 2.8... is it possible to
preserve both versions of the method?
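
For what it's worth, the usual compatibility pattern, in purely illustrative
form with made-up names (the actual newInstance signature is not spelled out
here), is to keep the old overload delegating to the new one:

{noformat}
// Illustration only, not the actual signature: the pre-2.8 overload
// stays and forwards to the new method, so existing callers still compile.
@Deprecated
public static Thing newInstance(Foo foo, Bar bar) {
  return newInstance(foo, bar, /* new argument */ null);
}
{noformat}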

 AHS API and other AHS changes to handle tags for completed MR jobs
 --

 Key: YARN-1462
 URL: https://issues.apache.org/jira/browse/YARN-1462
 Project: Hadoop YARN
  Issue Type: Sub-task
Affects Versions: 2.2.0
Reporter: Karthik Kambatla
Assignee: Xuan Gong
 Fix For: 2.8.0

 Attachments: YARN-1462-branch-2.7-1.2.patch, 
 YARN-1462-branch-2.7-1.patch, YARN-1462.1.patch, YARN-1462.2.patch, 
 YARN-1462.3.patch


 AHS related work for tags. 





[jira] [Commented] (YARN-3674) YARN application disappears from view

2015-05-19 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14550896#comment-14550896
 ] 

Sergey Shelukhin commented on YARN-3674:


I don't think so, unless filtering sticks even if you go and explicitly 
deselect it.
Maybe showing the current filter on the page would be a good start...

 YARN application disappears from view
 -

 Key: YARN-3674
 URL: https://issues.apache.org/jira/browse/YARN-3674
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 2.8.0
Reporter: Sergey Shelukhin

 I have 2 tabs open at the exact same URL with the RUNNING applications view. 
 There is an application that is, in fact, running, that is visible in one 
 tab but not the other. This persists across refreshes. If I open a new tab 
 from the tab where the application is not visible, the application shows up 
 fine in that new tab.
 I didn't change scheduler/queue settings before this behavior happened; on 
 [~sseth]'s advice I went and tried to click the root node of the scheduler 
 on the scheduler page; the app still does not become visible.
 Something got stuck somewhere...





[jira] [Created] (YARN-3674) YARN application disappears from view

2015-05-18 Thread Sergey Shelukhin (JIRA)
Sergey Shelukhin created YARN-3674:
--

 Summary: YARN application disappears from view
 Key: YARN-3674
 URL: https://issues.apache.org/jira/browse/YARN-3674
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 2.8.0
Reporter: Sergey Shelukhin


I have 2 tabs open at the exact same URL with the RUNNING applications view. 
There is an application that is, in fact, running, that is visible in one tab 
but not the other. This persists across refreshes. If I open a new tab from 
the tab where the application is not visible, the application shows up fine 
in that new tab.
I didn't change scheduler/queue settings before this behavior happened; on 
[~sseth]'s advice I went and tried to click the root node of the scheduler on 
the scheduler page; the app still does not become visible.
Something got stuck somewhere...





[jira] [Commented] (YARN-3600) AM container link is broken (on a killed application, at least)

2015-05-07 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14533571#comment-14533571
 ] 

Sergey Shelukhin commented on YARN-3600:


[~vinodkv] ran into this recently... can you guys take a look

 AM container link is broken (on a killed application, at least)
 ---

 Key: YARN-3600
 URL: https://issues.apache.org/jira/browse/YARN-3600
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 2.8.0
Reporter: Sergey Shelukhin

 Running some fairly recent (couple weeks ago) version of 2.8.0-SNAPSHOT. 
 I have an application that ran fine for a while and then I yarn kill-ed it. 
 Now when I go to the only app attempt URL (like so: http://(snip RM host 
 name):8088/cluster/appattempt/appattempt_1429683757595_0795_01)
 I see:
 AM Container: container_1429683757595_0795_01_01
 Node: N/A 
 and the container link goes to {noformat}http://(snip RM host 
 name):8088/cluster/N/A
 {noformat}
 which obviously doesn't work





[jira] [Updated] (YARN-3600) AM container link is broken (on a killed application, at least)

2015-05-07 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3600?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated YARN-3600:
---
Affects Version/s: 2.8.0

 AM container link is broken (on a killed application, at least)
 ---

 Key: YARN-3600
 URL: https://issues.apache.org/jira/browse/YARN-3600
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 2.8.0
Reporter: Sergey Shelukhin

 Running some fairly recent (couple weeks ago) version of 2.8.0-SNAPSHOT. 
 I have an application that ran fine for a while and then I yarn kill-ed it. 
 Now when I go to the only app attempt URL (like so: http://(snip RM host 
 name):8088/cluster/appattempt/appattempt_1429683757595_0795_01)
 I see:
 AM Container: container_1429683757595_0795_01_01
 Node: N/A 
 and the container link goes to {noformat}http://(snip RM host 
 name):8088/cluster/N/A
 {noformat}
 which obviously doesn't work





[jira] [Updated] (YARN-3600) AM container link is broken (on a killed application, at least)

2015-05-07 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3600?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated YARN-3600:
---
Description: 
Running some fairly recent (couple weeks ago) version of 2.8.0-SNAPSHOT. 
I have an application that ran fine for a while and then I yarn kill-ed it. Now 
when I go to the only app attempt URL (like so: http://(snip RM host 
name):8088/cluster/appattempt/appattempt_1429683757595_0795_01)

I see:
AM Container:   container_1429683757595_0795_01_01
Node:   N/A 

and the container URL is {noformat}http://(snip RM host name):8088/cluster/N/A
{noformat}
which obviously doesn't work


  was:
Running some fairly recent (couple weeks ago) version of 2.8.0-SNAPSHOT. 
I have an application that ran fine for a while and then I yarn kill-ed it. Now 
when I go to the only app attempt URL (like so: 
http://(snip):8088/cluster/appattempt/appattempt_1429683757595_0795_01)

I see:
AM Container:   container_1429683757595_0795_01_01
Node:   N/A 

and the container URL is 
{noformat}http://cn042-10.l42scl.hortonworks.com:8088/cluster/N/A
{noformat}
which obviously doesn't work



 AM container link is broken (on a killed application, at least)
 ---

 Key: YARN-3600
 URL: https://issues.apache.org/jira/browse/YARN-3600
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Sergey Shelukhin

 Running some fairly recent (couple weeks ago) version of 2.8.0-SNAPSHOT. 
 I have an application that ran fine for a while and then I yarn kill-ed it. 
 Now when I go to the only app attempt URL (like so: http://(snip RM host 
 name):8088/cluster/appattempt/appattempt_1429683757595_0795_01)
 I see:
 AM Container: container_1429683757595_0795_01_01
 Node: N/A 
 and the container URL is {noformat}http://(snip RM host name):8088/cluster/N/A
 {noformat}
 which obviously doesn't work





[jira] [Updated] (YARN-3600) AM container link is broken (on a killed application, at least)

2015-05-07 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3600?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated YARN-3600:
---
Description: 
Running some fairly recent (couple weeks ago) version of 2.8.0-SNAPSHOT. 
I have an application that ran fine for a while and then I yarn kill-ed it. Now 
when I go to the only app attempt URL (like so: http://(snip RM host 
name):8088/cluster/appattempt/appattempt_1429683757595_0795_01)

I see:
AM Container:   container_1429683757595_0795_01_01
Node:   N/A 

and the container link goes to {noformat}http://(snip RM host 
name):8088/cluster/N/A
{noformat}
which obviously doesn't work


  was:
Running some fairly recent (couple weeks ago) version of 2.8.0-SNAPSHOT. 
I have an application that ran fine for a while and then I yarn kill-ed it. Now 
when I go to the only app attempt URL (like so: http://(snip RM host 
name):8088/cluster/appattempt/appattempt_1429683757595_0795_01)

I see:
AM Container:   container_1429683757595_0795_01_01
Node:   N/A 

and the container URL is {noformat}http://(snip RM host name):8088/cluster/N/A
{noformat}
which obviously doesn't work



 AM container link is broken (on a killed application, at least)
 ---

 Key: YARN-3600
 URL: https://issues.apache.org/jira/browse/YARN-3600
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Sergey Shelukhin

 Running some fairly recent (couple weeks ago) version of 2.8.0-SNAPSHOT. 
 I have an application that ran fine for a while and then I yarn kill-ed it. 
 Now when I go to the only app attempt URL (like so: http://(snip RM host 
 name):8088/cluster/appattempt/appattempt_1429683757595_0795_01)
 I see:
 AM Container: container_1429683757595_0795_01_01
 Node: N/A 
 and the container link goes to {noformat}http://(snip RM host 
 name):8088/cluster/N/A
 {noformat}
 which obviously doesn't work





[jira] [Moved] (YARN-3600) AM container link is broken (on a killed application, at least)

2015-05-07 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3600?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin moved HIVE-10654 to YARN-3600:
---

 Component/s: (was: Web UI)
Target Version/s: 2.8.0
 Key: YARN-3600  (was: HIVE-10654)
 Project: Hadoop YARN  (was: Hive)

 AM container link is broken (on a killed application, at least)
 ---

 Key: YARN-3600
 URL: https://issues.apache.org/jira/browse/YARN-3600
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Sergey Shelukhin

 Running some fairly recent (couple weeks ago) version of 2.8.0-SNAPSHOT. 
 I have an application that ran fine for a while and then I yarn kill-ed it. 
 Now when I go to the only app attempt URL (like so: 
 http://(snip):8088/cluster/appattempt/appattempt_1429683757595_0795_01)
 I see:
 AM Container: container_1429683757595_0795_01_01
 Node: N/A 
 and the container URL is 
 {noformat}http://cn042-10.l42scl.hortonworks.com:8088/cluster/N/A
 {noformat}
 which obviously doesn't work





[jira] [Commented] (YARN-3370) don't show the exception message before showing container logs in UI

2015-03-18 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14368193#comment-14368193
 ] 

Sergey Shelukhin commented on YARN-3370:


[~vinodkv] fyi

 don't show the exception message before showing container logs in UI
 

 Key: YARN-3370
 URL: https://issues.apache.org/jira/browse/YARN-3370
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Sergey Shelukhin

 When you click on e.g. AM attempt logs, an "Exception: Unknown container ..." 
 message is shown, and then the page refreshes to the logs. The message should 
 not be shown by default.





[jira] [Created] (YARN-3370) don't show the exception message before showing container logs in UI

2015-03-18 Thread Sergey Shelukhin (JIRA)
Sergey Shelukhin created YARN-3370:
--

 Summary: don't show the exception message before showing container 
logs in UI
 Key: YARN-3370
 URL: https://issues.apache.org/jira/browse/YARN-3370
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Sergey Shelukhin


When you click on e.g. AM attempt logs, an "Exception: Unknown container ..." 
message is shown, and then the page refreshes to the logs. The message should 
not be shown by default.


