[jira] [Commented] (YARN-3427) Remove deprecated methods from ResourceCalculatorProcessTree

2017-03-28 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15946414#comment-15946414
 ] 

Hitesh Shah commented on YARN-3427:
---

Thanks for the heads up [~dan...@cloudera.com]. \cc [~sseth] 
[~rajesh.balamohan] 

> Remove deprecated methods from ResourceCalculatorProcessTree
> 
>
> Key: YARN-3427
> URL: https://issues.apache.org/jira/browse/YARN-3427
> Project: Hadoop YARN
>  Issue Type: Improvement
>Affects Versions: 2.7.0
>Reporter: Karthik Kambatla
>Assignee: Miklos Szegedi
>Priority: Blocker
> Attachments: YARN-3427.000.patch, YARN-3427.001.patch
>
>
> In 2.7, we made ResourceCalculatorProcessTree Public and exposed some 
> existing ill-formed methods as deprecated ones for use by Tez.
> We should remove them in 3.0.0, considering that the methods have been 
> deprecated for all of the 2.x.y releases in which the class is marked Public. 
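For illustration, the deprecation pattern being removed can be sketched as below. The method names mirror the deprecated getters on ResourceCalculatorProcessTree, but the bodies are stubs invented for this sketch, not the real implementation.

```java
// Sketch of the 2.x deprecation pattern: the old getter survives as an
// @Deprecated forward for all of 2.x, then is removed in 3.0.0.
public class ProcessTreeSketch {

    /**
     * @deprecated retained through 2.x only for external callers such as Tez;
     *             removed in 3.0.0. Use {@link #getVirtualMemorySize()}.
     */
    @Deprecated
    public long getCumulativeVmem() {
        return getVirtualMemorySize(); // delegate to the replacement
    }

    /** Replacement API that callers should migrate to before 3.0.0. */
    public long getVirtualMemorySize() {
        return 0L; // stub value for the sketch
    }

    public static void main(String[] args) {
        ProcessTreeSketch t = new ProcessTreeSketch();
        // old and new accessors agree while the deprecated forward exists
        System.out.println(t.getCumulativeVmem() == t.getVirtualMemorySize()); // true
    }
}
```

Removing the deprecated forward in 3.0.0 then breaks only callers that ignored the warning for the entire 2.x line.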



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-1593) support out-of-proc AuxiliaryServices

2016-11-14 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15665094#comment-15665094
 ] 

Hitesh Shah commented on YARN-1593:
---

Thanks [~vvasudev]. It does so partially. 

My concern is around the feedback loop in terms of failure handling by the apps 
when the system container dies at any of the following points: 
  - system container dies before an allocated container is launched on that node
  - it dies while a container is running
  - it dies after a container has completed

Would applications that define affinity to these system services now be getting 
updates (notifications) when system service containers go down or come back up? 
 

In addition to the feedback loop, is there any behavior change as a result of 
this? i.e. if the system container is not alive, will the app container still 
get launched given that its dependent service is down? For shuffle, this might 
be ok if the system container eventually comes up, but there might be other 
services that provide more synchronous functionality, such as a caching layer.

> support out-of-proc AuxiliaryServices
> -
>
> Key: YARN-1593
> URL: https://issues.apache.org/jira/browse/YARN-1593
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: nodemanager, rolling upgrade
>Reporter: Ming Ma
>Assignee: Varun Vasudev
> Attachments: SystemContainersandSystemServices.pdf
>
>
> AuxiliaryServices such as ShuffleHandler currently run in the same process as 
> the NM. There are some benefits to hosting them in dedicated processes.
> 1. NM rolling restart. If we want to upgrade YARN, an NM restart will force a 
> ShuffleHandler restart. If ShuffleHandler runs as a separate process, it can 
> continue to run during the NM restart, and the NM can reconnect to the 
> running ShuffleHandler after restarting.
> 2. Resource management. It is possible that other types of AuxiliaryServices 
> will be implemented. AuxiliaryServices are considered YARN-application 
> specific and could consume lots of resources. Running AuxiliaryServices in 
> separate processes allows easier resource management; the NM could 
> potentially stop a specific AuxiliaryService process if it consumes resources 
> way above its allocation.
> Here are some high-level ideas:
> 1. NM provides a hosting process for each AuxiliaryService. The existing 
> AuxiliaryService API doesn't change.
> 2. The hosting process provides an RPC server for the AuxiliaryService proxy 
> object inside the NM to connect to.
> 3. When we rolling-restart the NM, the existing AuxiliaryService processes 
> will continue to run. The NM could reconnect to the running AuxiliaryService 
> processes upon restart.
> 4. Policy and resource management of AuxiliaryServices. So far we don't have 
> an immediate need for this. An AuxiliaryService could run inside a container, 
> its resource utilization could be taken into account by the RM, and the RM 
> could determine that a specific type of application overutilizes cluster 
> resources.
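Idea 2 above (an in-NM proxy talking to an out-of-proc hosting process over RPC) can be sketched as follows. The interface and class names here are invented for illustration; the real API is org.apache.hadoop.yarn.server.api.AuxiliaryService, and the transport would be Hadoop RPC rather than the in-process stub used in this demo.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical narrow protocol that both the NM-side proxy and the
// out-of-proc hosting process would implement.
interface AuxServiceProtocol {
    void initializeApplication(String appId);
    void stopApplication(String appId);
}

/** Thin proxy kept inside the NM; the real service runs in the hosting process. */
public class AuxServiceProxy implements AuxServiceProtocol {
    private final AuxServiceProtocol remote; // in reality, an RPC client stub

    public AuxServiceProxy(AuxServiceProtocol remote) { this.remote = remote; }

    @Override public void initializeApplication(String appId) {
        remote.initializeApplication(appId); // would cross the process boundary
    }

    @Override public void stopApplication(String appId) {
        remote.stopApplication(appId);
    }

    public static void main(String[] args) {
        List<String> calls = new ArrayList<>();
        // in-process stand-in for the out-of-proc hosting process
        AuxServiceProxy proxy = new AuxServiceProxy(new AuxServiceProtocol() {
            @Override public void initializeApplication(String a) { calls.add("init:" + a); }
            @Override public void stopApplication(String a) { calls.add("stop:" + a); }
        });
        proxy.initializeApplication("application_1");
        proxy.stopApplication("application_1");
        System.out.println(calls); // [init:application_1, stop:application_1]
    }
}
```

Because the NM only holds the proxy, the hosting process can outlive an NM restart and the proxy can simply reconnect, which is the rolling-restart benefit described in point 3.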






[jira] [Comment Edited] (YARN-1593) support out-of-proc AuxiliaryServices

2016-11-09 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15651723#comment-15651723
 ] 

Hitesh Shah edited comment on YARN-1593 at 11/9/16 6:55 PM:


[~vvasudev] One question on the design doc. The doc does not seem to cover how 
user applications can define dependencies on these system services. For 
example, how do we ensure that an MR/Tez/xyz container that requires the 
shuffle service does not get launched on a node where the system service is 
not running? This has two aspects: first, how to ensure container allocations 
happen on the correct nodes where these services are running; and second, the 
service might be down when the container actually gets launched, so how will 
the behavior change as a result (does the container eventually fail, does the 
NM itself stop the launch of the container and send an error back, etc.)?

Is this something that will be looked at later, or should it be designed for 
up front to simplify the use of system services for user applications? 


was (Author: hitesh):
[~vvasudev] One question on the design doc. The doc does not seem to cover how 
user applications can define dependencies on these system services. For 
example, how to ensure that an MR/Tez/xyz container that requires the shuffle 
service does not get launched on a node where the system service is not 
running. This has 2 aspects - firstly how to ensure container allocations 
happen on correct nodes where these services are running and secondly, the 
service might be down when the container actually gets launched and therefore 
how the behavior will change as a result ( does the container eventually fail, 
does the NM itself stop the launch of the container and send an error back, 
etc).

> support out-of-proc AuxiliaryServices
> -
>
> Key: YARN-1593
> URL: https://issues.apache.org/jira/browse/YARN-1593
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: nodemanager, rolling upgrade
>Reporter: Ming Ma
>Assignee: Varun Vasudev
> Attachments: SystemContainersandSystemServices.pdf
>
>
> AuxiliaryServices such as ShuffleHandler currently run in the same process as 
> the NM. There are some benefits to hosting them in dedicated processes.
> 1. NM rolling restart. If we want to upgrade YARN, an NM restart will force a 
> ShuffleHandler restart. If ShuffleHandler runs as a separate process, it can 
> continue to run during the NM restart, and the NM can reconnect to the 
> running ShuffleHandler after restarting.
> 2. Resource management. It is possible that other types of AuxiliaryServices 
> will be implemented. AuxiliaryServices are considered YARN-application 
> specific and could consume lots of resources. Running AuxiliaryServices in 
> separate processes allows easier resource management; the NM could 
> potentially stop a specific AuxiliaryService process if it consumes resources 
> way above its allocation.
> Here are some high-level ideas:
> 1. NM provides a hosting process for each AuxiliaryService. The existing 
> AuxiliaryService API doesn't change.
> 2. The hosting process provides an RPC server for the AuxiliaryService proxy 
> object inside the NM to connect to.
> 3. When we rolling-restart the NM, the existing AuxiliaryService processes 
> will continue to run. The NM could reconnect to the running AuxiliaryService 
> processes upon restart.
> 4. Policy and resource management of AuxiliaryServices. So far we don't have 
> an immediate need for this. An AuxiliaryService could run inside a container, 
> its resource utilization could be taken into account by the RM, and the RM 
> could determine that a specific type of application overutilizes cluster 
> resources.






[jira] [Commented] (YARN-1593) support out-of-proc AuxiliaryServices

2016-11-09 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15651723#comment-15651723
 ] 

Hitesh Shah commented on YARN-1593:
---

[~vvasudev] One question on the design doc. The doc does not seem to cover how 
user applications can define dependencies on these system services. For 
example, how do we ensure that an MR/Tez/xyz container that requires the 
shuffle service does not get launched on a node where the system service is 
not running? This has two aspects: first, how to ensure container allocations 
happen on the correct nodes where these services are running; and second, the 
service might be down when the container actually gets launched, so how will 
the behavior change as a result (does the container eventually fail, does the 
NM itself stop the launch of the container and send an error back, etc.)?

> support out-of-proc AuxiliaryServices
> -
>
> Key: YARN-1593
> URL: https://issues.apache.org/jira/browse/YARN-1593
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: nodemanager, rolling upgrade
>Reporter: Ming Ma
>Assignee: Varun Vasudev
> Attachments: SystemContainersandSystemServices.pdf
>
>
> AuxiliaryServices such as ShuffleHandler currently run in the same process as 
> the NM. There are some benefits to hosting them in dedicated processes.
> 1. NM rolling restart. If we want to upgrade YARN, an NM restart will force a 
> ShuffleHandler restart. If ShuffleHandler runs as a separate process, it can 
> continue to run during the NM restart, and the NM can reconnect to the 
> running ShuffleHandler after restarting.
> 2. Resource management. It is possible that other types of AuxiliaryServices 
> will be implemented. AuxiliaryServices are considered YARN-application 
> specific and could consume lots of resources. Running AuxiliaryServices in 
> separate processes allows easier resource management; the NM could 
> potentially stop a specific AuxiliaryService process if it consumes resources 
> way above its allocation.
> Here are some high-level ideas:
> 1. NM provides a hosting process for each AuxiliaryService. The existing 
> AuxiliaryService API doesn't change.
> 2. The hosting process provides an RPC server for the AuxiliaryService proxy 
> object inside the NM to connect to.
> 3. When we rolling-restart the NM, the existing AuxiliaryService processes 
> will continue to run. The NM could reconnect to the running AuxiliaryService 
> processes upon restart.
> 4. Policy and resource management of AuxiliaryServices. So far we don't have 
> an immediate need for this. An AuxiliaryService could run inside a container, 
> its resource utilization could be taken into account by the RM, and the RM 
> could determine that a specific type of application overutilizes cluster 
> resources.






[jira] [Commented] (YARN-5759) Capability to register for a notification/callback on the expiry of timeouts

2016-10-20 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5759?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15593093#comment-15593093
 ] 

Hitesh Shah commented on YARN-5759:
---

Will this add support for a post-app action executed by YARN after the 
application reaches an end state, i.e. somewhat like a finally block for a 
YARN app? 

> Capability to register for a notification/callback on the expiry of timeouts
> 
>
> Key: YARN-5759
> URL: https://issues.apache.org/jira/browse/YARN-5759
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Reporter: Gour Saha
>
> There is a need for the YARN native services REST-API service to take 
> certain actions once the timeout of an application expires. For example, an 
> immediate requirement is to destroy a Slider application, once its lifetime 
> timeout expires and YARN has stopped the application. Destroying a Slider 
> application means cleanup of Slider HDFS state store and ZK paths for that 
> application. 
> Potentially, there will be advanced requirements from the REST-API service 
> and other services in the future, which will make this feature very handy.






[jira] [Commented] (YARN-5659) getPathFromYarnURL should use standard methods

2016-09-28 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5659?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15530407#comment-15530407
 ] 

Hitesh Shah commented on YARN-5659:
---

[~sershe] Just annotate the functions that are only required by unit tests with 
@Private and @VisibleForTesting 
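The suggested annotation pattern could look like the sketch below. The real annotations are org.apache.hadoop.classification.InterfaceAudience.Private and Guava's com.google.common.annotations.VisibleForTesting; local stand-ins are declared here only so the snippet compiles on its own, and the helper method is hypothetical, not the actual YARN-5659 code.

```java
// Stand-ins for the real Hadoop/Guava annotations, declared locally so this
// sketch is self-contained.
@interface Private {}
@interface VisibleForTesting {}

public class YarnUrlHelper {

    /**
     * Test-only helper: the annotations mark it as outside the public API
     * surface even though it has package/default visibility.
     */
    @Private
    @VisibleForTesting
    static String schemeOrNull(String scheme) {
        // an empty scheme is not a valid URI scheme; normalize it to null
        return (scheme == null || scheme.isEmpty()) ? null : scheme;
    }

    public static void main(String[] args) {
        System.out.println(schemeOrNull(""));     // null
        System.out.println(schemeOrNull("hdfs")); // hdfs
    }
}
```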

> getPathFromYarnURL should use standard methods
> --
>
> Key: YARN-5659
> URL: https://issues.apache.org/jira/browse/YARN-5659
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: YARN-5659.01.patch, YARN-5659.02.patch, 
> YARN-5659.03.patch, YARN-5659.04.patch, YARN-5659.04.patch, YARN-5659.patch
>
>
> getPathFromYarnURL does some string shenanigans where standard ctors should 
> suffice.
> There are also bugs in it, e.g. passing an empty scheme to the URI ctor is 
> invalid; null should be used. 
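The empty-scheme bug class called out above can be demonstrated with the multi-argument java.net.URI constructor. This is a standalone demo of the JDK behavior, not the actual getPathFromYarnURL code, and the sample path is made up.

```java
import java.net.URI;
import java.net.URISyntaxException;

public class UriSchemeDemo {
    public static void main(String[] args) throws URISyntaxException {
        // null scheme: accepted, yields a scheme-less (relative) URI
        URI ok = new URI(null, null, "/user/data/part-0000", null);
        System.out.println(ok); // /user/data/part-0000

        // empty scheme: invalid, the constructor throws URISyntaxException
        try {
            new URI("", null, "/user/data/part-0000", null);
            System.out.println("unexpectedly accepted");
        } catch (URISyntaxException e) {
            System.out.println("rejected empty scheme");
        }
    }
}
```

So when no scheme is present, the code should pass null to the constructor rather than an empty string.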






[jira] [Comment Edited] (YARN-5659) getPathFromYarnURL should use standard methods

2016-09-28 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5659?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15530407#comment-15530407
 ] 

Hitesh Shah edited comment on YARN-5659 at 9/28/16 6:06 PM:


[~sershe] Just annotate the functions that are only required by unit tests with 
@Private and @VisibleForTesting. In this case, the simplest approach would be 
to use the above annotations for all the new functions that are added as part 
of this patch. 


was (Author: hitesh):
[~sershe] Just annotate the functions that are only required by unit tests with 
@Private and @VisibleForTesting 

> getPathFromYarnURL should use standard methods
> --
>
> Key: YARN-5659
> URL: https://issues.apache.org/jira/browse/YARN-5659
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: YARN-5659.01.patch, YARN-5659.02.patch, 
> YARN-5659.03.patch, YARN-5659.04.patch, YARN-5659.04.patch, YARN-5659.patch
>
>
> getPathFromYarnURL does some string shenanigans where standard ctors should 
> suffice.
> There are also bugs in it, e.g. passing an empty scheme to the URI ctor is 
> invalid; null should be used. 






[jira] [Updated] (YARN-3877) YarnClientImpl.submitApplication swallows exceptions

2016-09-21 Thread Hitesh Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3877?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hitesh Shah updated YARN-3877:
--
Target Version/s: 2.7.3  (was: 2.8.0)

> YarnClientImpl.submitApplication swallows exceptions
> 
>
> Key: YARN-3877
> URL: https://issues.apache.org/jira/browse/YARN-3877
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: client
>Affects Versions: 2.7.2
>Reporter: Steve Loughran
>Assignee: Varun Saxena
>Priority: Minor
> Attachments: YARN-3877.01.patch, YARN-3877.02.patch, 
> YARN-3877.03.patch
>
>
> When {{YarnClientImpl.submitApplication}} spins waiting for the application 
> to be accepted, any interruption during its sleep() calls is logged and 
> swallowed.
> This makes it hard to interrupt the thread during shutdown. Really it should 
> throw some form of exception and let the caller deal with it.
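The fix direction could look like the following sketch: instead of logging and swallowing the InterruptedException inside the wait loop, let it propagate (restoring the interrupt flag where it must be caught). Method names and the polling stub are illustrative, not the actual YarnClientImpl code.

```java
public class SubmitWaitSketch {

    /** Spin-wait for acceptance, letting interruption reach the caller. */
    static void waitForAccepted(long pollMillis) throws InterruptedException {
        while (!isAccepted()) {
            // Do NOT catch-and-log here: propagating InterruptedException
            // lets the caller decide how to shut down.
            Thread.sleep(pollMillis);
        }
    }

    static boolean isAccepted() {
        return false; // stub: a real client would query the RM here
    }

    public static void main(String[] args) {
        Thread.currentThread().interrupt(); // simulate a shutdown interrupt
        try {
            waitForAccepted(10);
        } catch (InterruptedException e) {
            // restore the flag for any code further up the stack
            Thread.currentThread().interrupt();
            System.out.println("interrupted, aborting submit");
        }
    }
}
```

With the interrupt propagated, a shutdown hook can stop the submitting thread promptly instead of waiting out the polling loop.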






[jira] [Commented] (YARN-5659) getPathFromYarnURL should use standard methods

2016-09-20 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5659?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15507971#comment-15507971
 ] 

Hitesh Shah commented on YARN-5659:
---

\cc [~leftnoteasy] [~vvasudev] [~djp]

> getPathFromYarnURL should use standard methods
> --
>
> Key: YARN-5659
> URL: https://issues.apache.org/jira/browse/YARN-5659
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: YARN-5659.patch
>
>
> getPathFromYarnURL does some string shenanigans where standard ctors should 
> suffice.
> There are also bugs in it, e.g. passing an empty scheme to the URI ctor is 
> invalid; null should be used. 






[jira] [Updated] (YARN-5219) When an export var command fails in launch_container.sh, the full container launch should fail

2016-06-08 Thread Hitesh Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5219?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hitesh Shah updated YARN-5219:
--
Description: 
Today, a container fails if certain files fail to localize. However, if certain 
env vars fail to get set up properly, either due to bugs in the YARN 
application or misconfiguration, the actual process launch still gets 
triggered. This results in either confusing error messages if the process 
fails to launch or, worse yet, the process launching but then behaving 
incorrectly if the env var is used to control some behavioral aspects. 

In this scenario, the issue was reproduced by trying to do export 
abc="$\{foo.bar}", which is invalid as var names cannot contain "." in bash. 

  was:
Today, a container fails if certain files fail to localize. However, if certain 
env vars fail to get setup properly either due to bugs in the yarn application 
or misconfiguration, the actual process launch still gets triggered. This 
results in either confusing error messages if the process fails to launch or 
worse yet the process launches but then starts behaving wrongly if the env var 
is used to control some behavioral aspects. 

In this scenario, the issue was reproduced by trying to do export 
abc="$\X{foo.bar}" which is invalid as var names cannot contain "." in bash. 


> When an export var command fails in launch_container.sh, the full container 
> launch should fail
> --
>
> Key: YARN-5219
> URL: https://issues.apache.org/jira/browse/YARN-5219
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Hitesh Shah
>
> Today, a container fails if certain files fail to localize. However, if 
> certain env vars fail to get set up properly, either due to bugs in the YARN 
> application or misconfiguration, the actual process launch still gets 
> triggered. This results in either confusing error messages if the process 
> fails to launch or, worse yet, the process launching but then behaving 
> incorrectly if the env var is used to control some behavioral aspects. 
> In this scenario, the issue was reproduced by trying to do export 
> abc="$\{foo.bar}", which is invalid as var names cannot contain "." in bash. 
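One way the NM could fail the launch up front, as this issue requests, is to validate env var names against bash identifier rules before emitting "export NAME=..." lines into launch_container.sh. The class and method below are a hedged sketch of that check, not actual NodeManager code.

```java
import java.util.regex.Pattern;

public class EnvVarNameCheck {
    // Valid bash identifier: letters, digits, underscore; no leading digit,
    // and no dots, which is exactly what this issue's repro violates.
    private static final Pattern BASH_NAME =
            Pattern.compile("[A-Za-z_][A-Za-z0-9_]*");

    public static boolean isValidName(String name) {
        return name != null && BASH_NAME.matcher(name).matches();
    }

    public static void main(String[] args) {
        System.out.println(isValidName("HADOOP_HOME")); // true
        System.out.println(isValidName("foo.bar"));     // false: '.' not allowed
    }
}
```

Rejecting the bad name before the script runs would turn the confusing shell-time failure into a clear launch error.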






[jira] [Updated] (YARN-5219) When an export var command fails in launch_container.sh, the full container launch should fail

2016-06-08 Thread Hitesh Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5219?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hitesh Shah updated YARN-5219:
--
Description: 
Today, a container fails if certain files fail to localize. However, if certain 
env vars fail to get set up properly, either due to bugs in the YARN 
application or misconfiguration, the actual process launch still gets 
triggered. This results in either confusing error messages if the process 
fails to launch or, worse yet, the process launching but then behaving 
incorrectly if the env var is used to control some behavioral aspects. 

In this scenario, the issue was reproduced by trying to do export 
abc="$\X{foo.bar}", which is invalid as var names cannot contain "." in bash. 

  was:
Today, a container fails if certain files fail to localize. However, if certain 
env vars fail to get setup properly either due to bugs in the yarn application 
or misconfiguration, the actual process launch still gets triggered. This 
results in either confusing error messages if the process fails to launch or 
worse yet the process launches but then starts behaving wrongly if the env var 
is used to control some behavioral aspects. 

In this scenario, the issue was reproduced by trying to do export 
abc="${foo.bar}" which is invalid as var names cannot contain "." in bash. 


> When an export var command fails in launch_container.sh, the full container 
> launch should fail
> --
>
> Key: YARN-5219
> URL: https://issues.apache.org/jira/browse/YARN-5219
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Hitesh Shah
>
> Today, a container fails if certain files fail to localize. However, if 
> certain env vars fail to get set up properly, either due to bugs in the YARN 
> application or misconfiguration, the actual process launch still gets 
> triggered. This results in either confusing error messages if the process 
> fails to launch or, worse yet, the process launching but then behaving 
> incorrectly if the env var is used to control some behavioral aspects. 
> In this scenario, the issue was reproduced by trying to do export 
> abc="$\X{foo.bar}", which is invalid as var names cannot contain "." in bash. 






[jira] [Created] (YARN-5219) When an export var command fails in launch_container.sh, the full container launch should fail

2016-06-08 Thread Hitesh Shah (JIRA)
Hitesh Shah created YARN-5219:
-

 Summary: When an export var command fails in launch_container.sh, 
the full container launch should fail
 Key: YARN-5219
 URL: https://issues.apache.org/jira/browse/YARN-5219
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Hitesh Shah


Today, a container fails if certain files fail to localize. However, if certain 
env vars fail to get set up properly, either due to bugs in the YARN 
application or misconfiguration, the actual process launch still gets 
triggered. This results in either confusing error messages if the process 
fails to launch or, worse yet, the process launching but then behaving 
incorrectly if the env var is used to control some behavioral aspects. 

In this scenario, the issue was reproduced by trying to do export 
abc="${foo.bar}", which is invalid as var names cannot contain "." in bash. 






[jira] [Comment Edited] (YARN-5131) Distributed shell AM fails because of InterruptedException

2016-05-23 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15297174#comment-15297174
 ] 

Hitesh Shah edited comment on YARN-5131 at 5/23/16 10:04 PM:
-

The error in the description is not really an error; the thread was simply 
interrupted. It also does not match the title, which relates to an NPE. 


was (Author: hitesh):
The error in the description is not really an error. The thread was 
interrupted. 

> Distributed shell AM fails because of InterruptedException
> --
>
> Key: YARN-5131
> URL: https://issues.apache.org/jira/browse/YARN-5131
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Sumana Sathish
>Assignee: Wangda Tan
>
> DShell AM fails with the following exception
> {code}
> INFO impl.AMRMClientAsyncImpl: Interrupted while waiting for queue
> java.lang.InterruptedException
>   at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.reportInterruptAfterWait(AbstractQueuedSynchronizer.java:2017)
>   at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2052)
>   at 
> java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
>   at 
> org.apache.hadoop.yarn.client.api.async.impl.AMRMClientAsyncImpl$CallbackHandlerThread.run(AMRMClientAsyncImpl.java:287)
> End of LogType:AppMaster.stderr
> {code}






[jira] [Commented] (YARN-5131) Distributed shell AM fails because of InterruptedException

2016-05-23 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15297174#comment-15297174
 ] 

Hitesh Shah commented on YARN-5131:
---

The error in the description is not really an error. The thread was 
interrupted. 

> Distributed shell AM fails because of InterruptedException
> --
>
> Key: YARN-5131
> URL: https://issues.apache.org/jira/browse/YARN-5131
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Sumana Sathish
>Assignee: Wangda Tan
>
> DShell AM fails with the following exception
> {code}
> INFO impl.AMRMClientAsyncImpl: Interrupted while waiting for queue
> java.lang.InterruptedException
>   at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.reportInterruptAfterWait(AbstractQueuedSynchronizer.java:2017)
>   at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2052)
>   at 
> java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
>   at 
> org.apache.hadoop.yarn.client.api.async.impl.AMRMClientAsyncImpl$CallbackHandlerThread.run(AMRMClientAsyncImpl.java:287)
> End of LogType:AppMaster.stderr
> {code}






[jira] [Commented] (YARN-1151) Ability to configure auxiliary services from HDFS-based JAR files

2016-05-17 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1151?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15287966#comment-15287966
 ] 

Hitesh Shah commented on YARN-1151:
---

For paths, the impl should do auto-resolving, i.e. use the default fs if no fs 
is specified, and support both file:// and hdfs://. A mix is likely to happen, 
especially if we have some jars coming in via rpms and others via an 
hdfs-based cache. 
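The auto-resolving rule can be sketched with plain java.net.URI as below. In Hadoop itself this would use Path and FileSystem qualification; the class, method, and sample paths here are invented for the sketch.

```java
import java.net.URI;

public class AuxJarPathResolver {

    /**
     * If the configured jar path carries no scheme, qualify it against the
     * default filesystem; otherwise honor the explicit file:// or hdfs://.
     */
    static URI resolve(String configured, URI defaultFs) {
        URI u = URI.create(configured);
        return (u.getScheme() == null) ? defaultFs.resolve(u) : u;
    }

    public static void main(String[] args) {
        URI defaultFs = URI.create("hdfs://nn:8020/");
        // scheme-less path picks up the default fs
        System.out.println(resolve("/cache/aux-service.jar", defaultFs));
        // hdfs://nn:8020/cache/aux-service.jar

        // explicit scheme (e.g. a jar installed via rpm) is left alone
        System.out.println(resolve("file:///opt/aux/aux-service.jar", defaultFs));
        // file:///opt/aux/aux-service.jar
    }
}
```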

> Ability to configure auxiliary services from HDFS-based JAR files
> -
>
> Key: YARN-1151
> URL: https://issues.apache.org/jira/browse/YARN-1151
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: nodemanager
>Affects Versions: 2.1.0-beta, 2.9.0
>Reporter: john lilley
>Assignee: Xuan Gong
>  Labels: auxiliary-service, yarn
> Attachments: YARN-1151.1.patch
>
>
> I would like to install an auxiliary service in Hadoop YARN without actually 
> installing files/services on every node in the system.  Discussions on the 
> user@ list indicate that this is not easily done.  The reason we want an 
> auxiliary service is that our application has some persistent-data components 
> that are not appropriate for HDFS.  In fact, they are somewhat analogous to 
> the mapper output of MapReduce's shuffle, which is what led me to 
> auxiliary-services in the first place.  It would be much easier if we could 
> just place our service's JARs in HDFS.






[jira] [Commented] (YARN-1151) Ability to configure auxiliary services from HDFS-based JAR files

2016-05-17 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1151?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15287963#comment-15287963
 ] 

Hitesh Shah commented on YARN-1151:
---

How can one specify an archive, or would only simple files be supported? 
Additionally, would a fat jar (jar-with-dependencies) work out of the box?  

> Ability to configure auxiliary services from HDFS-based JAR files
> -
>
> Key: YARN-1151
> URL: https://issues.apache.org/jira/browse/YARN-1151
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: nodemanager
>Affects Versions: 2.1.0-beta, 2.9.0
>Reporter: john lilley
>Assignee: Xuan Gong
>  Labels: auxiliary-service, yarn
> Attachments: YARN-1151.1.patch
>
>
> I would like to install an auxiliary service in Hadoop YARN without actually 
> installing files/services on every node in the system.  Discussions on the 
> user@ list indicate that this is not easily done.  The reason we want an 
> auxiliary service is that our application has some persistent-data components 
> that are not appropriate for HDFS.  In fact, they are somewhat analogous to 
> the mapper output of MapReduce's shuffle, which is what led me to 
> auxiliary-services in the first place.  It would be much easier if we could 
> just place our service's JARs in HDFS.






[jira] [Commented] (YARN-5079) [Umbrella] Native YARN framework layer for services and beyond

2016-05-13 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15282902#comment-15282902
 ] 

Hitesh Shah commented on YARN-5079:
---

It might be better to have this discussion on the mailing lists instead of 
JIRA. 

> [Umbrella] Native YARN framework layer for services and beyond
> --
>
> Key: YARN-5079
> URL: https://issues.apache.org/jira/browse/YARN-5079
> Project: Hadoop YARN
>  Issue Type: New Feature
>Reporter: Vinod Kumar Vavilapalli
>Assignee: Vinod Kumar Vavilapalli
>
> (See the overview doc at YARN-4692; modifying and copy-pasting some of the 
> relevant pieces and sub-section 3.3.1 to track the specific sub-item.)
> (This is a companion to YARN-4793 in our effort to simplify the entire story, 
> but focusing on APIs.)
> So far, YARN by design has restricted itself to having a very low-level API 
> that can support any type of application. Frameworks like Apache Hadoop 
> MapReduce, Apache Tez, Apache Spark, Apache REEF, Apache Twill, Apache Helix 
> and others ended up exposing higher-level APIs that end-users can directly 
> leverage to build their applications on top of YARN. On the services side, 
> Apache Slider has done something similar.
> With our current attention on making services first-class and simplified, 
> it's time to take a fresh look at how we can make Apache Hadoop YARN support 
> services well out of the box. Beyond the functionality that I outlined in the 
> previous sections in the doc on how NodeManagers can be enhanced to help 
> services, the biggest missing piece is the framework itself. There is a lot 
> of very important functionality that a services framework can own together 
> with YARN in executing services end-to-end.
> In this JIRA I propose we look at having a native Apache Hadoop framework for 
> running services natively on YARN.






[jira] [Commented] (YARN-2506) TimelineClient should NOT be in yarn-common project

2016-05-11 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15280992#comment-15280992
 ] 

Hitesh Shah commented on YARN-2506:
---

Why not do this in trunk? 

> TimelineClient should NOT be in yarn-common project
> ---
>
> Key: YARN-2506
> URL: https://issues.apache.org/jira/browse/YARN-2506
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Vinod Kumar Vavilapalli
>Assignee: Zhijie Shen
>Priority: Critical
>
> YARN-2298 incorrectly moved TimelineClient to yarn-common project. It doesn't 
> belong there, we should move it back to yarn-client module.






[jira] [Commented] (YARN-2506) TimelineClient should NOT be in yarn-common project

2016-05-11 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15280994#comment-15280994
 ] 

Hitesh Shah commented on YARN-2506:
---

Seems like something useful as part of 3.x, given that the client library is 
meant to be part of yarn-client-api.

> TimelineClient should NOT be in yarn-common project
> ---
>
> Key: YARN-2506
> URL: https://issues.apache.org/jira/browse/YARN-2506
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Vinod Kumar Vavilapalli
>Assignee: Zhijie Shen
>Priority: Critical
>
> YARN-2298 incorrectly moved TimelineClient to yarn-common project. It doesn't 
> belong there, we should move it back to yarn-client module.






[jira] [Commented] (YARN-5068) The AM does not know the queue from which it is launched.

2016-05-10 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5068?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15279479#comment-15279479
 ] 

Hitesh Shah commented on YARN-5068:
---

The main use-case is this: 
  - Tez runs multiple DAGs in a single YARN application.
  - For each DAG, we publish data to YARN Timeline and support searching for 
these DAGs based on the published data.
  - Timeline does not support searches that require a join between 
app-specific data and data published by the YARN framework.
  - Even though AHS has the queue data, a single webservice call to ATS cannot 
be used to retrieve DAGs that were submitted to a particular queue.
  - To do this, we need to access the queue information in the AM, publish it 
along with the DAG data as app-specific data, and then use it for filtering. 
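The workaround described above — publishing the queue alongside the DAG data so a single filtered query suffices — can be sketched with a toy in-memory model. This is illustrative only: `DagEntity`, `publishDag`, and the map-based store are hypothetical stand-ins, not YARN or Tez APIs.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model: the AM tags every DAG entity it publishes with the queue name,
// so one filtered lookup answers "which DAGs ran in queue X" without joining
// app-specific data against data published by the YARN framework.
public class QueueFilterSketch {
    // Hypothetical stand-in for a timeline entity with primary filters.
    static final class DagEntity {
        final String dagId;
        final Map<String, String> primaryFilters = new HashMap<>();
        DagEntity(String dagId) { this.dagId = dagId; }
    }

    static final List<DagEntity> store = new ArrayList<>();

    static void publishDag(String dagId, String queue) {
        DagEntity e = new DagEntity(dagId);
        e.primaryFilters.put("queueName", queue); // queue published as app-specific data
        store.add(e);
    }

    static List<String> dagsForQueue(String queue) {
        List<String> out = new ArrayList<>();
        for (DagEntity e : store) {
            if (queue.equals(e.primaryFilters.get("queueName"))) {
                out.add(e.dagId);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        publishDag("dag_1", "etl");
        publishDag("dag_2", "adhoc");
        publishDag("dag_3", "etl");
        System.out.println(dagsForQueue("etl")); // [dag_1, dag_3]
    }
}
```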



> The AM does not know the queue from which it is launched.
> -
>
> Key: YARN-5068
> URL: https://issues.apache.org/jira/browse/YARN-5068
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Harish Jaiprakash
>Assignee: Harish Jaiprakash
> Attachments: MAPREDUCE-6692.patch
>
>
> The AM needs to know the queue name in which it was launched.






[jira] [Commented] (YARN-4851) Metric improvements for ATS v1.5 storage components

2016-04-28 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4851?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15263118#comment-15263118
 ] 

Hitesh Shah commented on YARN-4851:
---

New entries look fine, and I agree that the others can be done in follow-up JIRAs. 

> Metric improvements for ATS v1.5 storage components
> ---
>
> Key: YARN-4851
> URL: https://issues.apache.org/jira/browse/YARN-4851
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Li Lu
>Assignee: Li Lu
> Attachments: YARN-4851-trunk.001.patch, YARN-4851-trunk.002.patch, 
> YARN-4851-trunk.003.patch
>
>
> We can add more metrics to the ATS v1.5 storage systems, including purging, 
> cache hit/misses, read latency, etc. 





[jira] [Commented] (YARN-4844) Upgrade fields of o.a.h.y.api.records.Resource from int32 to int64

2016-04-27 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15261187#comment-15261187
 ] 

Hitesh Shah commented on YARN-4844:
---

bq. Per my understanding, changing from int to long won't affect downstream 
project a lot, it's an error which can be captured by compiler directly. And 
getMemory/getVCores should not be used intensively by downstream project. For 
example, MR uses only ~20 times of getMemory()/VCores for non-testing code. 
Which can be easily fixed.

If you are going to force downstream apps to change, I don't understand why you 
are not forcing them to do this in the first 3.0.0 release. What benefit does 
this provide anyone by delaying it to some later 3.x.y release? It just means 
that you have to do the production stability verification of upstream apps all 
over again. 
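For context on why the int32 fields overflow, here is a minimal sketch of the arithmetic from the issue description (10k nodes at 210 GB each, expressed in MB). `totalMemoryInt`/`totalMemoryLong` are illustrative helpers, not the actual `Resource` API.

```java
public class ResourceOverflowSketch {
    // Total cluster memory with an int32 field: wraps around for large clusters.
    static int totalMemoryInt(int nodes, int memPerNodeMb) {
        return nodes * memPerNodeMb;        // 32-bit multiply, silently overflows
    }

    // The proposed int64 field: the same total fits comfortably in a long.
    static long totalMemoryLong(int nodes, int memPerNodeMb) {
        return (long) nodes * memPerNodeMb; // widen before multiplying
    }

    public static void main(String[] args) {
        int nodes = 10_000;                 // cluster size from the issue
        int memPerNodeMb = 210 * 1024;      // 210 GB per node, in MB
        // True total is 2,150,400,000 MB, above Integer.MAX_VALUE (2,147,483,647).
        System.out.println(totalMemoryInt(nodes, memPerNodeMb));  // -2144567296 (overflow)
        System.out.println(totalMemoryLong(nodes, memPerNodeMb)); // 2150400000
    }
}
```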

> Upgrade fields of o.a.h.y.api.records.Resource from int32 to int64
> --
>
> Key: YARN-4844
> URL: https://issues.apache.org/jira/browse/YARN-4844
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: api
>Reporter: Wangda Tan
>Assignee: Wangda Tan
>Priority: Blocker
> Attachments: YARN-4844.1.patch, YARN-4844.2.patch, YARN-4844.3.patch
>
>
> We use int32 for memory now, if a cluster has 10k nodes, each node has 210G 
> memory, we will get a negative total cluster memory.
> And another case that easier overflows int32 is: we added all pending 
> resources of running apps to cluster's total pending resources. If a 
> problematic app requires too much resources (let's say 1M+ containers, each 
> of them has 3G containers), int32 will be not enough.
> Even if we can cap each app's pending request, we cannot handle the case that 
> there're many running apps, each of them has capped but still significant 
> numbers of pending resources.
> So we may possibly need to upgrade int32 memory field (could include v-cores 
> as well) to int64 to avoid integer overflow. 





[jira] [Commented] (YARN-4844) Upgrade fields of o.a.h.y.api.records.Resource from int32 to int64

2016-04-22 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15254970#comment-15254970
 ] 

Hitesh Shah commented on YARN-4844:
---

Additionally, we are not talking about use in production, but rather about 
making upstream apps change as needed to work with 3.x and, over time, 
stabilizing 3.x. Making an API change earlier rather than later is actually 
better, as the API changes in this case have no relevance to production 
stability. 

> Upgrade fields of o.a.h.y.api.records.Resource from int32 to int64
> --
>
> Key: YARN-4844
> URL: https://issues.apache.org/jira/browse/YARN-4844
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: api
>Reporter: Wangda Tan
>Assignee: Wangda Tan
>Priority: Blocker
> Attachments: YARN-4844.1.patch, YARN-4844.2.patch
>
>
> We use int32 for memory now, if a cluster has 10k nodes, each node has 210G 
> memory, we will get a negative total cluster memory.
> And another case that easier overflows int32 is: we added all pending 
> resources of running apps to cluster's total pending resources. If a 
> problematic app requires too much resources (let's say 1M+ containers, each 
> of them has 3G containers), int32 will be not enough.
> Even if we can cap each app's pending request, we cannot handle the case that 
> there're many running apps, each of them has capped but still significant 
> numbers of pending resources.
> So we may possibly need to upgrade int32 memory field (could include v-cores 
> as well) to int64 to avoid integer overflow. 





[jira] [Commented] (YARN-4844) Upgrade fields of o.a.h.y.api.records.Resource from int32 to int64

2016-04-22 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15254967#comment-15254967
 ] 

Hitesh Shah commented on YARN-4844:
---

bq.  considering there are hundreds of blockers and criticals of 3.0.0 release, 
nobody will actually use the new release in production even if 3.0-alpha can be 
released. We can mark Resource API of trunk to be unstable and update it in 
future 3.x releases.

So the plan is to force users to change their usage of these APIs in some 
version of 3.x, but not in 3.0.0? 

> Upgrade fields of o.a.h.y.api.records.Resource from int32 to int64
> --
>
> Key: YARN-4844
> URL: https://issues.apache.org/jira/browse/YARN-4844
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: api
>Reporter: Wangda Tan
>Assignee: Wangda Tan
>Priority: Blocker
> Attachments: YARN-4844.1.patch, YARN-4844.2.patch
>
>
> We use int32 for memory now, if a cluster has 10k nodes, each node has 210G 
> memory, we will get a negative total cluster memory.
> And another case that easier overflows int32 is: we added all pending 
> resources of running apps to cluster's total pending resources. If a 
> problematic app requires too much resources (let's say 1M+ containers, each 
> of them has 3G containers), int32 will be not enough.
> Even if we can cap each app's pending request, we cannot handle the case that 
> there're many running apps, each of them has capped but still significant 
> numbers of pending resources.
> So we may possibly need to upgrade int32 memory field (could include v-cores 
> as well) to int64 to avoid integer overflow. 





[jira] [Commented] (YARN-4851) Metric improvements for ATS v1.5 storage components

2016-04-22 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4851?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15254750#comment-15254750
 ] 

Hitesh Shah commented on YARN-4851:
---

Some general comments on usability (I have not reviewed the patch in detail):
   - The names need a bit of work, e.g. SummaryDataReadTimeNumOps and 
SummaryDataReadTimeAvgTime - it is not clear why NumOps relates to ReadTime, 
and "Time" in ReadTimeAvgTime seems redundant. 
   - It would be good to have the scale in there, i.e. is time in millis or 
seconds? 
   - Updates to the timeline server docs for these metrics seem to be missing. 
   - What is the difference between CacheRefreshTimeNumOps and CacheRefreshOps? 
   - Likewise for LogCleanTimeNumOps vs LogsDirsCleaned, or PutDomainTimeNumOps 
vs PutDomainOps.
   - Are cache eviction rates needed? 
   - How do we get a count of how many cache refreshes were due to stale data 
vs never cached/evicted earlier? Do we need this?
   - Should there be two levels of metrics - one group enabled by default and a 
second group for more detailed monitoring - to reduce load on the metrics system?
   - It would be good to understand the request count at the ATS v1.5 level 
itself, to understand which calls end up going to summary vs cache vs fs-based 
lookups (i.e. across all gets).
   - At the overall ATS level, an overall average latency across all requests 
might be useful for a general health check.
 



> Metric improvements for ATS v1.5 storage components
> ---
>
> Key: YARN-4851
> URL: https://issues.apache.org/jira/browse/YARN-4851
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Li Lu
>Assignee: Li Lu
> Attachments: YARN-4851-trunk.001.patch, YARN-4851-trunk.002.patch
>
>
> We can add more metrics to the ATS v1.5 storage systems, including purging, 
> cache hit/misses, read latency, etc. 





[jira] [Created] (YARN-4990) Re-direction of a particular log file within a container in NM UI does not redirect properly to Log Server (history) on container completion

2016-04-22 Thread Hitesh Shah (JIRA)
Hitesh Shah created YARN-4990:
-

 Summary: Re-direction of a particular log file within a 
container in NM UI does not redirect properly to Log Server (history) on 
container completion
 Key: YARN-4990
 URL: https://issues.apache.org/jira/browse/YARN-4990
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Hitesh Shah


The NM does the redirection to the history server correctly. However, if the 
user is viewing or has a link to a specific file, the redirect ends up going 
to the top-level page for the container instead of to that file. Additionally, 
the start param that shows logs from offset 0 also goes missing. 





[jira] [Commented] (YARN-4844) Upgrade fields of o.a.h.y.api.records.Resource from int32 to int64

2016-04-21 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15253084#comment-15253084
 ] 

Hitesh Shah commented on YARN-4844:
---

bq. It is not a very hard thing to drop it, we'd better to do it close to first 
branch-3 release.

I believe a recent comment on the mailing list was trying to target a 3.0 
release within the next few weeks, so I guess that means we should make this 
change now? 

> Upgrade fields of o.a.h.y.api.records.Resource from int32 to int64
> --
>
> Key: YARN-4844
> URL: https://issues.apache.org/jira/browse/YARN-4844
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: api
>Reporter: Wangda Tan
>Assignee: Wangda Tan
>Priority: Blocker
> Attachments: YARN-4844.1.patch, YARN-4844.2.patch
>
>
> We use int32 for memory now, if a cluster has 10k nodes, each node has 210G 
> memory, we will get a negative total cluster memory.
> And another case that easier overflows int32 is: we added all pending 
> resources of running apps to cluster's total pending resources. If a 
> problematic app requires too much resources (let's say 1M+ containers, each 
> of them has 3G containers), int32 will be not enough.
> Even if we can cap each app's pending request, we cannot handle the case that 
> there're many running apps, each of them has capped but still significant 
> numbers of pending resources.
> So we may possibly need to upgrade int32 memory field (could include v-cores 
> as well) to int64 to avoid integer overflow. 





[jira] [Commented] (YARN-4844) Upgrade fields of o.a.h.y.api.records.Resource from int32 to int64

2016-04-21 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15253028#comment-15253028
 ] 

Hitesh Shah commented on YARN-4844:
---

getMemoryLong(), etc. just seems messy. I can understand why this is needed on 
branch-2 if we need to support long, but for trunk it seems better to change 
getMemory() to return a long. 

> Upgrade fields of o.a.h.y.api.records.Resource from int32 to int64
> --
>
> Key: YARN-4844
> URL: https://issues.apache.org/jira/browse/YARN-4844
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: api
>Reporter: Wangda Tan
>Assignee: Wangda Tan
>Priority: Blocker
> Attachments: YARN-4844.1.patch
>
>
> We use int32 for memory now, if a cluster has 10k nodes, each node has 210G 
> memory, we will get a negative total cluster memory.
> And another case that easier overflows int32 is: we added all pending 
> resources of running apps to cluster's total pending resources. If a 
> problematic app requires too much resources (let's say 1M+ containers, each 
> of them has 3G containers), int32 will be not enough.
> Even if we can cap each app's pending request, we cannot handle the case that 
> there're many running apps, each of them has capped but still significant 
> numbers of pending resources.
> So we may possibly need to upgrade int32 memory field (could include v-cores 
> as well) to int64 to avoid integer overflow. 





[jira] [Updated] (YARN-868) YarnClient should set the service address in tokens returned by getRMDelegationToken()

2016-03-08 Thread Hitesh Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-868?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hitesh Shah updated YARN-868:
-
Attachment: (was: YARN-868.0003.patch)

> YarnClient should set the service address in tokens returned by 
> getRMDelegationToken()
> --
>
> Key: YARN-868
> URL: https://issues.apache.org/jira/browse/YARN-868
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: client, resourcemanager
>Affects Versions: 2.7.0
>Reporter: Hitesh Shah
>Assignee: Varun Saxena
>  Labels: BB2015-05-RFC
> Attachments: YARN-868.02.patch, YARN-868.patch
>
>
> Either the client should set this information into the token or the client 
> layer should expose an api that returns the service address.





[jira] [Commented] (YARN-868) YarnClient should set the service address in tokens returned by getRMDelegationToken()

2016-03-08 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15186366#comment-15186366
 ] 

Hitesh Shah commented on YARN-868:
--

Removed patch 3 for now - discovered basic errors/wrong assumptions on the 
serviceAddr/renewer. 

> YarnClient should set the service address in tokens returned by 
> getRMDelegationToken()
> --
>
> Key: YARN-868
> URL: https://issues.apache.org/jira/browse/YARN-868
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: client, resourcemanager
>Affects Versions: 2.7.0
>Reporter: Hitesh Shah
>Assignee: Varun Saxena
>  Labels: BB2015-05-RFC
> Attachments: YARN-868.02.patch, YARN-868.patch
>
>
> Either the client should set this information into the token or the client 
> layer should expose an api that returns the service address.





[jira] [Commented] (YARN-868) YarnClient should set the service address in tokens returned by getRMDelegationToken()

2016-03-08 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15185685#comment-15185685
 ] 

Hitesh Shah commented on YARN-868:
--

Sorry - mixed content in the previous comments. There are 2 issues:
   - the service address should be set in tokens handed back to user code by 
the framework itself
   - the user code should not need to pass in the renewer string when 
requesting a token 

Patch 2 tries to address the former, and patch 3 is my attempt to fix the 
latter. 

\cc [~vvasudev] [~vinodkv] for comments 
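The first of the two issues above can be sketched with a toy model: the client layer, which already knows the RM address, stamps the service field onto the delegation token before handing it back, so user code never has to. `Token` and `getRMDelegationToken` here are stand-ins, not the Hadoop classes.

```java
public class TokenServiceSketch {
    // Hypothetical stand-in for a delegation token; not the Hadoop Token class.
    static final class Token {
        final String kind = "RM_DELEGATION_TOKEN";
        String service = "";                // empty unless someone sets it
    }

    // What this issue asks the client layer to do: fill in the service
    // address itself instead of returning a token with an empty service.
    static Token getRMDelegationToken(String rmHost, int rmPort) {
        Token t = new Token();
        t.service = rmHost + ":" + rmPort;  // service address set by the framework
        return t;
    }

    public static void main(String[] args) {
        Token t = getRMDelegationToken("rm.example.com", 8032);
        System.out.println(t.service);      // rm.example.com:8032
    }
}
```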


> YarnClient should set the service address in tokens returned by 
> getRMDelegationToken()
> --
>
> Key: YARN-868
> URL: https://issues.apache.org/jira/browse/YARN-868
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: client, resourcemanager
>Affects Versions: 2.7.0
>Reporter: Hitesh Shah
>Assignee: Varun Saxena
>  Labels: BB2015-05-RFC
> Attachments: YARN-868.0003.patch, YARN-868.02.patch, YARN-868.patch
>
>
> Either the client should set this information into the token or the client 
> layer should expose an api that returns the service address.





[jira] [Commented] (YARN-868) YarnClient should set the service address in tokens returned by getRMDelegationToken()

2016-03-08 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15185618#comment-15185618
 ] 

Hitesh Shah commented on YARN-868:
--

[~varun_saxena] to add to your previous comment, yes and yes. I also made the 
ClientRMProxy api public given that it is currently unstable. 

> YarnClient should set the service address in tokens returned by 
> getRMDelegationToken()
> --
>
> Key: YARN-868
> URL: https://issues.apache.org/jira/browse/YARN-868
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: client, resourcemanager
>Affects Versions: 2.7.0
>Reporter: Hitesh Shah
>Assignee: Varun Saxena
>  Labels: BB2015-05-RFC
> Attachments: YARN-868.0003.patch, YARN-868.02.patch, YARN-868.patch
>
>
> Either the client should set this information into the token or the client 
> layer should expose an api that returns the service address.





[jira] [Updated] (YARN-868) YarnClient should set the service address in tokens returned by getRMDelegationToken()

2016-03-08 Thread Hitesh Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-868?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hitesh Shah updated YARN-868:
-
Attachment: YARN-868.0003.patch

[~varun_saxena] Sorry - this completely dropped off my radar. Looking at the 
patch and the current code, I tried to take a different approach and came up 
with a patch. 

[~vinodkv] Mind taking a look at patches 02 and 03 and giving your comments? 

> YarnClient should set the service address in tokens returned by 
> getRMDelegationToken()
> --
>
> Key: YARN-868
> URL: https://issues.apache.org/jira/browse/YARN-868
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: client, resourcemanager
>Affects Versions: 2.7.0
>Reporter: Hitesh Shah
>Assignee: Varun Saxena
>  Labels: BB2015-05-RFC
> Attachments: YARN-868.0003.patch, YARN-868.02.patch, YARN-868.patch
>
>
> Either the client should set this information into the token or the client 
> layer should expose an api that returns the service address.





[jira] [Commented] (YARN-3996) YARN-789 (Support for zero capabilities in fairscheduler) is broken after YARN-3305

2015-12-07 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15046055#comment-15046055
 ] 

Hitesh Shah commented on YARN-3996:
---

Why should a minimum resource of 0 be ever supported? 

> YARN-789 (Support for zero capabilities in fairscheduler) is broken after 
> YARN-3305
> ---
>
> Key: YARN-3996
> URL: https://issues.apache.org/jira/browse/YARN-3996
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacityscheduler, fairscheduler
>Reporter: Anubhav Dhoot
>Assignee: Neelesh Srinivas Salian
>Priority: Critical
> Attachments: YARN-3996.001.patch, YARN-3996.002.patch, 
> YARN-3996.003.patch, YARN-3996.prelim.patch
>
>
> RMAppManager#validateAndCreateResourceRequest calls into normalizeRequest 
> with mininumResource for the incrementResource. This causes normalize to 
> return zero if minimum is set to zero as per YARN-789





[jira] [Commented] (YARN-2513) Host framework UIs in YARN for use with the ATS

2015-11-02 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14986731#comment-14986731
 ] 

Hitesh Shah commented on YARN-2513:
---

[~yeshavora] and I are trying to use this on a secure cluster. I can kinit and 
make a curl call to "/ws/v1/timeline/TEZ_DAG_ID?limit=1", and it works 
correctly, but when trying to make a curl call to the hosted UI, it fails. 

I see that KerberosAuthenticationHandler.java:init(214) is being invoked twice. 

The error being thrown is: 

{code}
2015-11-03 05:01:45,864 WARN  server.AuthenticationFilter (AuthenticationFilter.java:doFilter(551)) - Authentication exception: GSSException: Failure unspecified at GSS-API level (Mechanism level: Request is a replay (34))
org.apache.hadoop.security.authentication.client.AuthenticationException: GSSException: Failure unspecified at GSS-API level (Mechanism level: Request is a replay (34))
    at org.apache.hadoop.security.authentication.server.KerberosAuthenticationHandler.authenticate(KerberosAuthenticationHandler.java:399)
    at org.apache.hadoop.security.token.delegation.web.DelegationTokenAuthenticationHandler.authenticate(DelegationTokenAuthenticationHandler.java:347)
    at org.apache.hadoop.security.authentication.server.AuthenticationFilter.doFilter(AuthenticationFilter.java:507)
    at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
    at org.apache.hadoop.http.HttpServer2$QuotingInputFilter.doFilter(HttpServer2.java:1225)
    at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
    at org.apache.hadoop.http.NoCacheFilter.doFilter(NoCacheFilter.java:45)
    at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
    at org.apache.hadoop.http.NoCacheFilter.doFilter(NoCacheFilter.java:45)
    at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
    at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:399)
    at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
    at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
    at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:767)
    at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:450)
{code}

[~jeagles]  [~vinodkv] [~steve_l] Any suggestions? 

> Host framework UIs in YARN for use with the ATS
> ---
>
> Key: YARN-2513
> URL: https://issues.apache.org/jira/browse/YARN-2513
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Jonathan Eagles
>Assignee: Jonathan Eagles
> Fix For: 3.0.0, 2.8.0, 2.7.2
>
> Attachments: YARN-2513-v1.patch, YARN-2513-v2.patch, 
> YARN-2513.v3.patch, YARN-2513.v4.patch, YARN-2513.v5.patch
>
>
> Allow for pluggable UIs as described by TEZ-8. Yarn can provide the 
> infrastructure to host java script and possible java UIs.





[jira] [Commented] (YARN-2513) Host framework UIs in YARN for use with the ATS

2015-11-02 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14986732#comment-14986732
 ] 

Hitesh Shah commented on YARN-2513:
---

The patch below seems to work, but I am not sure what else I may be breaking:

{code}
--- a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/applicationhistoryservice/Applicatio
+++ b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/applicationhistoryservice/Applicatio
@@ -305,16 +305,16 @@ private void startWebApp() {
  WebAppContext uiWebAppContext = new WebAppContext();
  uiWebAppContext.setContextPath(webPath);
  uiWebAppContext.setWar(onDiskPath);
- final String[] ALL_URLS = { "/*" };
- FilterHolder[] filterHolders =
-   webAppContext.getServletHandler().getFilters();
- for (FilterHolder filterHolder: filterHolders) {
-   if (!"guice".equals(filterHolder.getName())) {
- HttpServer2.defineFilter(uiWebAppContext, filterHolder.getName(),
- filterHolder.getClassName(), filterHolder.getInitParameters(),
- ALL_URLS);
-   }
- }
+ //final String[] ALL_URLS = { "/*" };
+ //FilterHolder[] filterHolders =
+ //  webAppContext.getServletHandler().getFilters();
+ //for (FilterHolder filterHolder: filterHolders) {
+ //  if (!"guice".equals(filterHolder.getName())) {
+ //HttpServer2.defineFilter(uiWebAppContext, filterHolder.getName(),
+ //filterHolder.getClassName(), filterHolder.getInitParameters(),
+ //ALL_URLS);
+ //  }
+ //}
{code}

> Host framework UIs in YARN for use with the ATS
> ---
>
> Key: YARN-2513
> URL: https://issues.apache.org/jira/browse/YARN-2513
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Jonathan Eagles
>Assignee: Jonathan Eagles
> Fix For: 3.0.0, 2.8.0, 2.7.2
>
> Attachments: YARN-2513-v1.patch, YARN-2513-v2.patch, 
> YARN-2513.v3.patch, YARN-2513.v4.patch, YARN-2513.v5.patch
>
>
> Allow for pluggable UIs as described by TEZ-8. Yarn can provide the 
> infrastructure to host java script and possible java UIs.





[jira] [Updated] (YARN-4323) AMRMClient does not respect SchedulerResourceTypes post YARN-2448

2015-10-30 Thread Hitesh Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4323?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hitesh Shah updated YARN-4323:
--
Affects Version/s: 2.6.0

> AMRMClient does not respect SchedulerResourceTypes post YARN-2448
> -
>
> Key: YARN-4323
> URL: https://issues.apache.org/jira/browse/YARN-4323
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.6.0
>Reporter: Hitesh Shah
>
> Given that the RM now informs the AM of the resources it supports, AMRMClient 
> should be changed to match correctly by normalizing the invalid resource 
> types.
> i.e. AMRMClient::getMatchingRequests() should correctly return back matches 
> by only looking at the resource types that are valid. 
> \cc [~vvasudev] [~bikassaha]





[jira] [Created] (YARN-4323) AMRMClient does not respect SchedulerResourceTypes post YARN-2448

2015-10-30 Thread Hitesh Shah (JIRA)
Hitesh Shah created YARN-4323:
-

 Summary: AMRMClient does not respect SchedulerResourceTypes post 
YARN-2448
 Key: YARN-4323
 URL: https://issues.apache.org/jira/browse/YARN-4323
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Hitesh Shah


Given that the RM now informs the AM of the resources it supports, AMRMClient 
should be changed to match correctly by normalizing the invalid resource types.

i.e. AMRMClient::getMatchingRequests() should correctly return back matches by 
only looking at the resource types that are valid. 

\cc [~vvasudev] [~bikassaha]
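The matching fix described above can be illustrated with a stdlib-only toy: when the scheduler reports that it only accounts for memory, vcores should be normalized away before comparing requests. This mirrors the idea only; `matches` and its parameters are hypothetical, not the AMRMClient API.

```java
public class MatchingSketch {
    // Compare a request against an allocation, honoring only the resource
    // types the scheduler actually enforces (memory is always enforced here;
    // CPU only when the scheduler advertises it).
    static boolean matches(int askMemMb, int askVcores,
                           int offerMemMb, int offerVcores,
                           boolean cpuSupported) {
        if (askMemMb != offerMemMb) {
            return false;                   // memory is always a valid type
        }
        // Only compare vcores when the scheduler enforces them.
        return !cpuSupported || askVcores == offerVcores;
    }

    public static void main(String[] args) {
        // Memory-only scheduler: differing vcores should still match.
        System.out.println(matches(1024, 1, 1024, 4, false)); // true
        // Scheduler enforcing CPU: the same pair no longer matches.
        System.out.println(matches(1024, 1, 1024, 4, true));  // false
    }
}
```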





[jira] [Commented] (YARN-2513) Host framework UIs in YARN for use with the ATS

2015-10-21 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14967596#comment-14967596
 ] 

Hitesh Shah commented on YARN-2513:
---

Tested the latest patch with multiple UIs being hosted. Works fine now. 

> Host framework UIs in YARN for use with the ATS
> ---
>
> Key: YARN-2513
> URL: https://issues.apache.org/jira/browse/YARN-2513
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Jonathan Eagles
>Assignee: Jonathan Eagles
> Attachments: YARN-2513-v1.patch, YARN-2513-v2.patch, 
> YARN-2513.v3.patch, YARN-2513.v4.patch, YARN-2513.v5.patch
>
>
> Allow for pluggable UIs as described by TEZ-8. YARN can provide the 
> infrastructure to host JavaScript and possibly Java UIs.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2513) Host framework UIs in YARN for use with the ATS

2015-10-21 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14967598#comment-14967598
 ] 

Hitesh Shah commented on YARN-2513:
---

+1

> Host framework UIs in YARN for use with the ATS
> ---
>
> Key: YARN-2513
> URL: https://issues.apache.org/jira/browse/YARN-2513
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Jonathan Eagles
>Assignee: Jonathan Eagles
> Attachments: YARN-2513-v1.patch, YARN-2513-v2.patch, 
> YARN-2513.v3.patch, YARN-2513.v4.patch, YARN-2513.v5.patch
>
>
> Allow for pluggable UIs as described by TEZ-8. YARN can provide the 
> infrastructure to host JavaScript and possibly Java UIs.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4009) CORS support for ResourceManager REST API

2015-10-02 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14941935#comment-14941935
 ] 

Hitesh Shah commented on YARN-4009:
---

[~jeagles] Did you get a chance to look at the latest patch? 

> CORS support for ResourceManager REST API
> -
>
> Key: YARN-4009
> URL: https://issues.apache.org/jira/browse/YARN-4009
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Prakash Ramachandran
>Assignee: Varun Vasudev
> Attachments: YARN-4009.001.patch, YARN-4009.002.patch, 
> YARN-4009.003.patch, YARN-4009.004.patch, YARN-4009.005.patch
>
>
> Currently the REST APIs do not have CORS support. This means any UI (running 
> in a browser) cannot consume the REST APIs. For example, the Tez UI would like 
> to use the REST API for getting application and application attempt information 
> exposed by the APIs. 
> It would be very useful if CORS were enabled for the REST APIs.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4009) CORS support for ResourceManager REST API

2015-09-21 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14901169#comment-14901169
 ] 

Hitesh Shah commented on YARN-4009:
---

Thinking more on this, a global config might be okay to start with (we already 
have a huge proliferation of configs which users do not set). If concerns are 
raised down the line, it should be easy enough to add YARN- and HDFS-specific 
configs that override the global one in a compatible manner. [~jeagles], 
comments? 



> CORS support for ResourceManager REST API
> -
>
> Key: YARN-4009
> URL: https://issues.apache.org/jira/browse/YARN-4009
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Prakash Ramachandran
>Assignee: Varun Vasudev
> Attachments: YARN-4009.001.patch, YARN-4009.002.patch, 
> YARN-4009.003.patch, YARN-4009.004.patch
>
>
> Currently the REST APIs do not have CORS support. This means any UI (running 
> in a browser) cannot consume the REST APIs. For example, the Tez UI would like 
> to use the REST API for getting application and application attempt information 
> exposed by the APIs. 
> It would be very useful if CORS were enabled for the REST APIs.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4009) CORS support for ResourceManager REST API

2015-09-16 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14791225#comment-14791225
 ] 

Hitesh Shah commented on YARN-4009:
---

Couple of questions: 

{code}
if (!initializers.contains(CrossOriginFilterInitializer.class.getName())) {
  if (conf.getBoolean(
      YarnConfiguration.TIMELINE_SERVICE_HTTP_CROSS_ORIGIN_ENABLED,
      YarnConfiguration.TIMELINE_SERVICE_HTTP_CROSS_ORIGIN_ENABLED_DEFAULT)) {
    initializers = CrossOriginFilterInitializer.class.getName() + ","
        + initializers;
    modifiedInitializers = true;
  }
}
{code}

I see this code in Timeline which makes it easier to enable cross-origin 
support just for Timeline. I am assuming Timeline also looks at the Hadoop 
filters defined in core-site? What happens when both of these are enabled at 
the same time with different settings?

Not sure if there is a question of selectively enabling CORS support for 
different services such as NN webservices vs. RM webservices.

Apart from the above, if a global config is good enough, the patch looks good.
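The prepend-if-absent pattern in the quoted snippet can be generalized roughly like this (a sketch with a hypothetical helper name; the real logic lives in the timeline server's web app setup and checks a configured initializer string):

```java
/** Sketch: prepend a filter initializer class to a comma-separated list,
 *  unless it is already present. (Hypothetical helper, for illustration.) */
public class FilterConfig {
    public static String prependIfAbsent(String initializers, String filter) {
        if (initializers == null || initializers.isEmpty()) {
            return filter;
        }
        for (String existing : initializers.split(",")) {
            if (existing.trim().equals(filter)) {
                return initializers;  // already configured; leave list as-is
            }
        }
        // Prepend so the cross-origin filter runs before the others.
        return filter + "," + initializers;
    }
}
```

The ordering question raised above remains: when both a timeline-specific toggle and a global core-site filter list name the same filter with different settings, whichever wins depends on which config the filter itself reads at init time.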
 



> CORS support for ResourceManager REST API
> -
>
> Key: YARN-4009
> URL: https://issues.apache.org/jira/browse/YARN-4009
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Prakash Ramachandran
>Assignee: Varun Vasudev
> Attachments: YARN-4009.001.patch, YARN-4009.002.patch, 
> YARN-4009.003.patch, YARN-4009.004.patch
>
>
> Currently the REST APIs do not have CORS support. This means any UI (running 
> in a browser) cannot consume the REST APIs. For example, the Tez UI would like 
> to use the REST API for getting application and application attempt information 
> exposed by the APIs. 
> It would be very useful if CORS were enabled for the REST APIs.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4009) CORS support for ResourceManager REST API

2015-09-16 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14791228#comment-14791228
 ] 

Hitesh Shah commented on YARN-4009:
---

[~jeagles] Any comments on the patch?

> CORS support for ResourceManager REST API
> -
>
> Key: YARN-4009
> URL: https://issues.apache.org/jira/browse/YARN-4009
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Prakash Ramachandran
>Assignee: Varun Vasudev
> Attachments: YARN-4009.001.patch, YARN-4009.002.patch, 
> YARN-4009.003.patch, YARN-4009.004.patch
>
>
> Currently the REST APIs do not have CORS support. This means any UI (running 
> in a browser) cannot consume the REST APIs. For example, the Tez UI would like 
> to use the REST API for getting application and application attempt information 
> exposed by the APIs. 
> It would be very useful if CORS were enabled for the REST APIs.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4009) CORS support for ResourceManager REST API

2015-09-16 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14791226#comment-14791226
 ] 

Hitesh Shah commented on YARN-4009:
---

[~jeagles] ? 

> CORS support for ResourceManager REST API
> -
>
> Key: YARN-4009
> URL: https://issues.apache.org/jira/browse/YARN-4009
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Prakash Ramachandran
>Assignee: Varun Vasudev
> Attachments: YARN-4009.001.patch, YARN-4009.002.patch, 
> YARN-4009.003.patch, YARN-4009.004.patch
>
>
> Currently the REST APIs do not have CORS support. This means any UI (running 
> in a browser) cannot consume the REST APIs. For example, the Tez UI would like 
> to use the REST API for getting application and application attempt information 
> exposed by the APIs. 
> It would be very useful if CORS were enabled for the REST APIs.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3942) Timeline store to read events from HDFS

2015-09-12 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14742159#comment-14742159
 ] 

Hitesh Shah commented on YARN-3942:
---

Thanks [~gss2002]. One point to note - if you use long-running Hive sessions, 
this will cause an OOM in the timeline server, as the data is cached on a per 
"session" basis. I am not sure if there is a simple way to disable Hive 
session re-use in HiveServer. \cc [~vikram.dixit]

> Timeline store to read events from HDFS
> ---
>
> Key: YARN-3942
> URL: https://issues.apache.org/jira/browse/YARN-3942
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: timelineserver
>Reporter: Jason Lowe
>Assignee: Jason Lowe
> Attachments: YARN-3942.001.patch
>
>
> This adds a new timeline store plugin that is intended as a stop-gap measure 
> to mitigate some of the issues we've seen with ATS v1 while waiting for ATS 
> v2.  The intent of this plugin is to provide a workable solution for running 
> the Tez UI against the timeline server on large-scale clusters running many 
> thousands of jobs per day.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3942) Timeline store to read events from HDFS

2015-09-03 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14729815#comment-14729815
 ] 

Hitesh Shah commented on YARN-3942:
---

Some ideas from an offline discussion with [~bikassaha] and [~vinodkv]:

- Option 1) Could we just use leveldb as an LRU cache instead of a memory-based 
cache to handle the OOM issue?
- Option 2) Could we just take the data from HDFS, write it out to leveldb, and 
use leveldb to serve the data out? This would address the OOM issue too. 

\cc [~jlowe] [~jeagles]
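Option 1 could be sketched roughly as follows — a minimal count-capped LRU cache built on `LinkedHashMap`, as a stand-in for the per-application timeline cache discussed above (the class name and count-based eviction policy are illustrative assumptions, not the actual EntityFileStore code):

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Minimal LRU cache that evicts by entry count, so the heap stops
 *  growing until OOM. (Hypothetical sketch, not the real store code.) */
public class TimelineLruCache<K, V> extends LinkedHashMap<K, V> {
    private final int maxEntries;

    public TimelineLruCache(int maxEntries) {
        super(16, 0.75f, true);  // accessOrder=true gives LRU ordering
        this.maxEntries = maxEntries;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        // Called after each put: drop the least-recently-used entry
        // once the cap is exceeded.
        return size() > maxEntries;
    }
}
```

A size-in-bytes cap rather than an entry-count cap would fit the "one huge app" scenario better, at the cost of tracking each entry's serialized size.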


> Timeline store to read events from HDFS
> ---
>
> Key: YARN-3942
> URL: https://issues.apache.org/jira/browse/YARN-3942
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: timelineserver
>Reporter: Jason Lowe
>Assignee: Jason Lowe
> Attachments: YARN-3942.001.patch
>
>
> This adds a new timeline store plugin that is intended as a stop-gap measure 
> to mitigate some of the issues we've seen with ATS v1 while waiting for ATS 
> v2.  The intent of this plugin is to provide a workable solution for running 
> the Tez UI against the timeline server on large-scale clusters running many 
> thousands of jobs per day.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3942) Timeline store to read events from HDFS

2015-08-28 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14720574#comment-14720574
 ] 

Hitesh Shah commented on YARN-3942:
---

[~jlowe], [~rajesh.balamohan] observed that the timeline server was running out 
of memory in a certain scenario. In this scenario, we are using Hive-on-Tez, but 
Hive re-uses the application to run 100s of DAGs/queries (doAs=false with 
perimeter security using, say, Ranger or Sentry). The EntityFileStore sizes its 
cache based on the no. of applications it can cache, but in the above scenario 
even a single app could be very large. Ideally, if each DAG were in a separate 
file and all of its entries were treated as a single cache entity, that would 
probably work better, but making this generic enough may be a bit tricky.

Any suggestions here? 



 Timeline store to read events from HDFS
 ---

 Key: YARN-3942
 URL: https://issues.apache.org/jira/browse/YARN-3942
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: timelineserver
Reporter: Jason Lowe
Assignee: Jason Lowe
 Attachments: YARN-3942.001.patch


 This adds a new timeline store plugin that is intended as a stop-gap measure 
 to mitigate some of the issues we've seen with ATS v1 while waiting for ATS 
 v2.  The intent of this plugin is to provide a workable solution for running 
 the Tez UI against the timeline server on large-scale clusters running many 
 thousands of jobs per day.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4085) Generate file with container resource limits in the container work dir

2015-08-27 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14717524#comment-14717524
 ] 

Hitesh Shah commented on YARN-4085:
---

Should the values be set in the environment as compared to a file? If a file, 
should that be a properties file with all useful information written into it, 
not just the resource size info? 

 Generate file with container resource limits in the container work dir
 --

 Key: YARN-4085
 URL: https://issues.apache.org/jira/browse/YARN-4085
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: nodemanager
Reporter: Varun Vasudev
Assignee: Varun Vasudev
Priority: Minor

 Currently, a container doesn't know what resource limits are being imposed on 
 it. It would be helpful if the NM generated a simple file in the container 
 work dir with the resource limits specified.
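As a concrete illustration of the proposal, a container-side reader for such a file might look like this. The file contents and key names below are hypothetical — YARN defines no such format here; only the Java properties syntax itself is assumed:

```java
import java.io.StringReader;
import java.util.Properties;

/** Sketch: parse a hypothetical limits file the NM might drop into the
 *  container work dir. Key names are illustrative, not a YARN format. */
public class ContainerLimits {
    public static Properties parse(String contents) {
        Properties p = new Properties();
        try {
            // Standard java.util.Properties "key=value" parsing.
            p.load(new StringReader(contents));
        } catch (java.io.IOException e) {
            throw new RuntimeException(e);
        }
        return p;
    }
}
```

Using the properties format would let containers in any language read the limits with stock parsers, which is one argument for a file over environment variables.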



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4087) Set YARN_FAIL_FAST to be false by default

2015-08-27 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14717526#comment-14717526
 ] 

Hitesh Shah commented on YARN-4087:
---

It would be good to rename the config property to something that provides a bit 
more clarity on what the config knob is meant to control. 

 Set YARN_FAIL_FAST to be false by default
 -

 Key: YARN-4087
 URL: https://issues.apache.org/jira/browse/YARN-4087
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Jian He
Assignee: Jian He
 Attachments: YARN-4087.1.patch


 Increasingly, I feel setting this property to false makes more sense, 
 especially in production environments.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3944) Connection refused to nodemanagers are retried at multiple levels

2015-08-19 Thread Hitesh Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3944?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hitesh Shah updated YARN-3944:
--
Labels: 2.6.1-candidate  (was: )

 Connection refused to nodemanagers are retried at multiple levels
 -

 Key: YARN-3944
 URL: https://issues.apache.org/jira/browse/YARN-3944
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 2.6.0
Reporter: Siqi Li
Assignee: Siqi Li
Priority: Critical
  Labels: 2.6.1-candidate
 Attachments: YARN-3944.v1.patch


 This is related to YARN-3238. When NM is down, ipc client will get 
 ConnectException.
 Caused by: java.net.ConnectException: Connection refused
   at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
   at 
 sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:739)
   at 
 org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
   at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:530)
   at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:494)
   at 
 org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:607)
   at 
 org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:705)
   at org.apache.hadoop.ipc.Client$Connection.access$2800(Client.java:368)
   at org.apache.hadoop.ipc.Client.getConnection(Client.java:1521)
   at org.apache.hadoop.ipc.Client.call(Client.java:1438)
 However, retries happen at two layers (IPC retrying 40 times and serverProxy 
 retrying 91 times); this could end up with an ~1 hour retry interval.
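The compounding of the two retry layers can be shown with rough arithmetic — 40 inner IPC attempts per outer proxy retry gives 40 × 91 = 3640 attempts, so at roughly one second per attempt the client is stuck for about an hour (the per-attempt sleep is an illustrative assumption, not an exact Hadoop default):

```java
/** Rough illustration of how nested retry layers multiply.
 *  Numbers are illustrative, not exact Hadoop retry-policy defaults. */
public class NestedRetries {
    // Total connection attempts when an outer proxy retry loop wraps an
    // inner IPC retry loop.
    public static int totalAttempts(int ipcRetries, int proxyRetries) {
        return ipcRetries * proxyRetries;
    }

    // Worst-case total wait in seconds, given a fixed delay per attempt.
    public static int totalWaitSeconds(int attempts, int sleepPerAttemptSec) {
        return attempts * sleepPerAttemptSec;
    }
}
```

This is why retry budgets should be enforced at one layer only: the layers do not know about each other, and their counts multiply rather than add.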



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-4047) ClientRMService getApplications has high scheduler lock contention

2015-08-11 Thread Hitesh Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4047?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hitesh Shah updated YARN-4047:
--
Labels: 2.6.1-candidate  (was: )

 ClientRMService getApplications has high scheduler lock contention
 --

 Key: YARN-4047
 URL: https://issues.apache.org/jira/browse/YARN-4047
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Reporter: Jason Lowe
Assignee: Jason Lowe
  Labels: 2.6.1-candidate
 Attachments: YARN-4047.001.patch


 The getApplications call can be particularly expensive because the code can 
 call checkAccess on every application being tracked by the RM.  checkAccess 
 will often call scheduler.checkAccess which will grab the big scheduler lock. 
  This can cause a lot of contention with the scheduler thread which is busy 
 trying to process node heartbeats, app allocation requests, etc.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3978) Configurably turn off the saving of container info in Generic AHS

2015-08-11 Thread Hitesh Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3978?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hitesh Shah updated YARN-3978:
--
Labels: 2.6.1-candidate  (was: )

 Configurably turn off the saving of container info in Generic AHS
 -

 Key: YARN-3978
 URL: https://issues.apache.org/jira/browse/YARN-3978
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: timelineserver, yarn
Affects Versions: 2.8.0, 2.7.1
Reporter: Eric Payne
Assignee: Eric Payne
  Labels: 2.6.1-candidate
 Fix For: 3.0.0, 2.8.0, 2.7.2

 Attachments: YARN-3978.001.patch, YARN-3978.002.patch, 
 YARN-3978.003.patch, YARN-3978.004.patch


 Depending on how each application's metadata is stored, one week's worth of 
 data stored in the Generic Application History Server's database can grow to 
 be almost a terabyte of local disk space. In order to alleviate this, I 
 suggest that there is a need for a configuration option to turn off saving of 
 non-AM container metadata in the GAHS data store.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-4032) Corrupted state from a previous version can still cause RM to fail with NPE due to same reasons as YARN-2834

2015-08-08 Thread Hitesh Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4032?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hitesh Shah updated YARN-4032:
--
Labels: 2.6.1-candidate  (was: )

 Corrupted state from a previous version can still cause RM to fail with NPE 
 due to same reasons as YARN-2834
 

 Key: YARN-4032
 URL: https://issues.apache.org/jira/browse/YARN-4032
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 2.7.1
Reporter: Anubhav Dhoot
Assignee: Anubhav Dhoot
Priority: Critical
  Labels: 2.6.1-candidate

 YARN-2834 ensures that in 2.6.0 there will not be any inconsistent state. But 
 if someone is upgrading from a previous version, the state can still be 
 inconsistent, and the RM will still fail with an NPE after upgrading to 2.6.0.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-1809) Synchronize RM and Generic History Service Web-UIs

2015-07-31 Thread Hitesh Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1809?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hitesh Shah updated YARN-1809:
--
Labels: 2.6.1-candidate  (was: )

 Synchronize RM and Generic History Service Web-UIs
 --

 Key: YARN-1809
 URL: https://issues.apache.org/jira/browse/YARN-1809
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Zhijie Shen
Assignee: Xuan Gong
  Labels: 2.6.1-candidate
 Fix For: 2.7.0

 Attachments: YARN-1809.1.patch, YARN-1809.10.patch, 
 YARN-1809.11.patch, YARN-1809.12.patch, YARN-1809.13.patch, 
 YARN-1809.14.patch, YARN-1809.15-rebase.patch, YARN-1809.15.patch, 
 YARN-1809.16.patch, YARN-1809.17.patch, YARN-1809.17.rebase.patch, 
 YARN-1809.17.rebase.patch, YARN-1809.2.patch, YARN-1809.3.patch, 
 YARN-1809.4.patch, YARN-1809.5.patch, YARN-1809.5.patch, YARN-1809.6.patch, 
 YARN-1809.7.patch, YARN-1809.8.patch, YARN-1809.9.patch


 After YARN-953, the web-UI of the generic history service provides more 
 information than that of the RM: the details about app attempts and containers. 
 It's good to provide similar web-UIs, but retrieve the data from separate 
 sources, i.e., the RM cache and the history store respectively.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3287) TimelineClient kerberos authentication failure uses wrong login context.

2015-07-31 Thread Hitesh Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hitesh Shah updated YARN-3287:
--
Labels: 2.6.1-candidate  (was: )

 TimelineClient kerberos authentication failure uses wrong login context.
 

 Key: YARN-3287
 URL: https://issues.apache.org/jira/browse/YARN-3287
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Jonathan Eagles
Assignee: Daryn Sharp
  Labels: 2.6.1-candidate
 Fix For: 2.7.0

 Attachments: YARN-3287.1.patch, YARN-3287.2.patch, YARN-3287.3.patch, 
 timeline.patch


 TimelineClientImpl:doPosting is not wrapped in a doAs, which can cause 
 failure for yarn clients to create timeline domains during job submission.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3725) App submission via REST API is broken in secure mode due to Timeline DT service address is empty

2015-07-31 Thread Hitesh Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3725?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hitesh Shah updated YARN-3725:
--
Labels: 2.6.1-candidate  (was: )

 App submission via REST API is broken in secure mode due to Timeline DT 
 service address is empty
 

 Key: YARN-3725
 URL: https://issues.apache.org/jira/browse/YARN-3725
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager, timelineserver
Affects Versions: 2.7.0
Reporter: Zhijie Shen
Assignee: Zhijie Shen
Priority: Blocker
  Labels: 2.6.1-candidate
 Fix For: 2.7.1

 Attachments: YARN-3725.1.patch


 YARN-2971 changes TimelineClient to use the service address from the Timeline 
 DT to renew the DT instead of the configured address. This breaks the procedure 
 of submitting a YARN app via the REST API in secure mode.
 The problem is that the service address is set by the client instead of the 
 server in Java code. The REST API response is an encoded token String, so it is 
 inconvenient to deserialize it, set the service address, and serialize it 
 again. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3493) RM fails to come up with error Failed to load/recover state when mem settings are changed

2015-07-31 Thread Hitesh Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3493?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hitesh Shah updated YARN-3493:
--
Labels: 2.6.1-candidate  (was: )

 RM fails to come up with error Failed to load/recover state when  mem 
 settings are changed
 

 Key: YARN-3493
 URL: https://issues.apache.org/jira/browse/YARN-3493
 Project: Hadoop YARN
  Issue Type: Bug
  Components: yarn
Affects Versions: 2.7.0
Reporter: Sumana Sathish
Assignee: Jian He
Priority: Critical
  Labels: 2.6.1-candidate
 Fix For: 2.8.0, 2.7.1

 Attachments: YARN-3493.1.patch, YARN-3493.2.patch, YARN-3493.3.patch, 
 YARN-3493.4.patch, YARN-3493.5.patch, yarn-yarn-resourcemanager.log.zip


 RM fails to come up for the following case:
 1. Change yarn.nodemanager.resource.memory-mb and 
 yarn.scheduler.maximum-allocation-mb to 4000 in yarn-site.xml
 2. Start a randomtextwriter job with mapreduce.map.memory.mb=4000 in 
 background and wait for the job to reach running state
 3. Restore yarn-site.xml to have yarn.scheduler.maximum-allocation-mb to 2048 
 before the above job completes
 4. Restart RM
 5. RM fails to come up with the below error
 {code:title= RM error for Mem settings changed}
  - RM app submission failed in validating AM resource request for application 
 application_1429094976272_0008
 org.apache.hadoop.yarn.exceptions.InvalidResourceRequestException: Invalid 
 resource request, requested memory < 0, or requested memory > max configured, 
 requestedMemory=3072, maxMemory=2048
 at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerUtils.validateResourceRequest(SchedulerUtils.java:204)
 at 
 org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.validateAndCreateResourceRequest(RMAppManager.java:385)
 at 
 org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.createAndPopulateNewRMApp(RMAppManager.java:328)
 at 
 org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.recoverApplication(RMAppManager.java:317)
 at 
 org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.recover(RMAppManager.java:422)
 at 
 org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.recover(ResourceManager.java:1187)
 at 
 org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMActiveServices.serviceStart(ResourceManager.java:574)
 at 
 org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
 at 
 org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.startActiveServices(ResourceManager.java:994)
 at 
 org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$1.run(ResourceManager.java:1035)
 at 
 org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$1.run(ResourceManager.java:1031)
 at java.security.AccessController.doPrivileged(Native Method)
 at javax.security.auth.Subject.doAs(Subject.java:415)
 at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
 at 
 org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.transitionToActive(ResourceManager.java:1031)
 at 
 org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceStart(ResourceManager.java:1071)
 at 
 org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
 at 
 org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.main(ResourceManager.java:1208)
 2015-04-15 13:19:18,623 ERROR resourcemanager.ResourceManager 
 (ResourceManager.java:serviceStart(579)) - Failed to load/recover state
 org.apache.hadoop.yarn.exceptions.InvalidResourceRequestException: Invalid 
 resource request, requested memory < 0, or requested memory > max configured, 
 requestedMemory=3072, maxMemory=2048
 at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerUtils.validateResourceRequest(SchedulerUtils.java:204)
 at 
 org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.validateAndCreateResourceRequest(RMAppManager.java:385)
 at 
 org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.createAndPopulateNewRMApp(RMAppManager.java:328)
 at 
 org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.recoverApplication(RMAppManager.java:317)
 at 
 org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.recover(RMAppManager.java:422)
 at 
 org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.recover(ResourceManager.java:1187)
 at 
 org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMActiveServices.serviceStart(ResourceManager.java:574)
 at 
 org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
 at 
 

[jira] [Updated] (YARN-2900) Application (Attempt and Container) Not Found in AHS results in Internal Server Error (500)

2015-07-31 Thread Hitesh Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2900?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hitesh Shah updated YARN-2900:
--
Labels: 2.6.1-candidate  (was: )

 Application (Attempt and Container) Not Found in AHS results in Internal 
 Server Error (500)
 ---

 Key: YARN-2900
 URL: https://issues.apache.org/jira/browse/YARN-2900
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: timelineserver
Reporter: Jonathan Eagles
Assignee: Mit Desai
  Labels: 2.6.1-candidate
 Fix For: 2.7.1

 Attachments: YARN-2900-b2-2.patch, YARN-2900-b2.patch, 
 YARN-2900-branch-2.7.20150530.patch, YARN-2900.20150529.patch, 
 YARN-2900.20150530.patch, YARN-2900.20150530.patch, YARN-2900.patch, 
 YARN-2900.patch, YARN-2900.patch, YARN-2900.patch, YARN-2900.patch, 
 YARN-2900.patch, YARN-2900.patch, YARN-2900.patch


 Caused by: java.lang.NullPointerException
   at 
 org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryManagerImpl.convertToApplicationReport(ApplicationHistoryManagerImpl.java:128)
   at 
 org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryManagerImpl.getApplication(ApplicationHistoryManagerImpl.java:118)
   at 
 org.apache.hadoop.yarn.server.webapp.WebServices$2.run(WebServices.java:222)
   at 
 org.apache.hadoop.yarn.server.webapp.WebServices$2.run(WebServices.java:219)
   at java.security.AccessController.doPrivileged(Native Method)
   at javax.security.auth.Subject.doAs(Subject.java:415)
   at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1679)
   at 
 org.apache.hadoop.yarn.server.webapp.WebServices.getApp(WebServices.java:218)
   ... 59 more



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3267) Timelineserver applies the ACL rules after applying the limit on the number of records

2015-07-31 Thread Hitesh Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3267?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hitesh Shah updated YARN-3267:
--
Labels: 2.6.1-candidate  (was: )

 Timelineserver applies the ACL rules after applying the limit on the number 
 of records
 --

 Key: YARN-3267
 URL: https://issues.apache.org/jira/browse/YARN-3267
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 2.6.0
Reporter: Prakash Ramachandran
Assignee: Chang Li
  Labels: 2.6.1-candidate
 Fix For: 2.7.0

 Attachments: YARN-3267.3.patch, YARN-3267.4.patch, YARN-3267.5.patch, 
 YARN_3267_V1.patch, YARN_3267_V2.patch, YARN_3267_WIP.patch, 
 YARN_3267_WIP1.patch, YARN_3267_WIP2.patch, YARN_3267_WIP3.patch


 While fetching the entities from the timeline server, the limit is applied to 
 the entities to be fetched from leveldb, and the ACL filters are applied after 
 this (TimelineDataManager.java::getEntities). 
 This could mean that even if there are entities available which match the 
 query criteria, we could end up not getting any results.
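The ordering bug described above can be demonstrated in miniature: truncating to the limit before ACL filtering can return nothing even when matching, readable entities exist. The class and method names are hypothetical, not the TimelineDataManager code:

```java
import java.util.List;
import java.util.function.Predicate;
import java.util.stream.Collectors;

/** Sketch of limit-before-ACL vs. ACL-before-limit (hypothetical names). */
public class LimitVsAcl {
    // Buggy order: truncate to 'limit' first, then drop unreadable entities.
    public static List<String> limitThenFilter(List<String> entities,
            Predicate<String> readable, int limit) {
        return entities.stream().limit(limit)
                .filter(readable).collect(Collectors.toList());
    }

    // Correct order: filter by ACL first, then apply the limit.
    public static List<String> filterThenLimit(List<String> entities,
            Predicate<String> readable, int limit) {
        return entities.stream().filter(readable)
                .limit(limit).collect(Collectors.toList());
    }
}
```

If the first `limit` entities in leveldb all belong to other users, the buggy order returns an empty page while readable entities sit just past the cutoff.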



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2513) Host framework UIs in YARN for use with the ATS

2015-07-22 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14638192#comment-14638192
 ] 

Hitesh Shah commented on YARN-2513:
---

I am not sure if multiple UIs work. 

Tried the following configs:

{code}
  <property>
    <name>yarn.timeline-service.ui-names</name>
    <value>tezui,tezui2</value>
  </property>

  <property>
    <name>yarn.timeline-service.ui-web-path.tezui</name>
    <value>/tezui</value>
  </property>

  <property>
    <name>yarn.timeline-service.ui-on-disk-path.tezui</name>
    <value>/install//tez/ui/</value>
  </property>

  <property>
    <name>yarn.timeline-service.ui-web-path.tezui2</name>
    <value>/tezui2</value>
  </property>

  <property>
    <name>yarn.timeline-service.ui-on-disk-path.tezui2</name>
    <value>/install/tez/tez-ui-0.8.0-SNAPSHOT.war</value>
  </property>
{code}

Logs:

{code}
2015-07-22 22:43:09,643 ERROR 
applicationhistoryservice.ApplicationHistoryServer - AHSWebApp failed to start.
java.lang.NullPointerException
at 
org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryServer.startWebApp(ApplicationHistoryServer.java:295)
at 
org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryServer.serviceStart(ApplicationHistoryServer.java:114)
at 
org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
at 
org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryServer.launchAppHistoryServer(ApplicationHistoryServer.java:162)
at 
org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryServer.main(ApplicationHistoryServer.java:171)
2015-07-22 22:43:09,644 INFO  service.AbstractService - Service 
org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryServer
 failed in state STARTED; cause: 
org.apache.hadoop.yarn.exceptions.YarnRuntimeException: AHSWebApp failed to 
start.
org.apache.hadoop.yarn.exceptions.YarnRuntimeException: AHSWebApp failed to 
start.
at 
org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryServer.startWebApp(ApplicationHistoryServer.java:305)
at 
org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryServer.serviceStart(ApplicationHistoryServer.java:114)
at 
org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
at 
org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryServer.launchAppHistoryServer(ApplicationHistoryServer.java:162)
at 
org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryServer.main(ApplicationHistoryServer.java:171)
Caused by: java.lang.NullPointerException
at 
org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryServer.startWebApp(ApplicationHistoryServer.java:295)
... 4 more
{code}


 Host framework UIs in YARN for use with the ATS
 ---

 Key: YARN-2513
 URL: https://issues.apache.org/jira/browse/YARN-2513
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: timelineserver
Reporter: Jonathan Eagles
Assignee: Jonathan Eagles
  Labels: 2.6.1-candidate
 Attachments: YARN-2513-v1.patch, YARN-2513-v2.patch, 
 YARN-2513.v3.patch


 Allow for pluggable UIs as described by TEZ-8. YARN can provide the 
 infrastructure to host JavaScript and possibly Java UIs.





[jira] [Updated] (YARN-2513) Host framework UIs in YARN for use with the ATS

2015-07-18 Thread Hitesh Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2513?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hitesh Shah updated YARN-2513:
--
Labels: 2.6.1-candidate  (was: )

 Host framework UIs in YARN for use with the ATS
 ---

 Key: YARN-2513
 URL: https://issues.apache.org/jira/browse/YARN-2513
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: timelineserver
Reporter: Jonathan Eagles
Assignee: Jonathan Eagles
  Labels: 2.6.1-candidate
 Attachments: YARN-2513-v1.patch, YARN-2513-v2.patch, 
 YARN-2513.v3.patch


 Allow for pluggable UIs as described by TEZ-8. YARN can provide the 
 infrastructure to host JavaScript and possibly Java UIs.





[jira] [Updated] (YARN-2890) MiniYarnCluster should turn on timeline service if configured to do so

2015-07-17 Thread Hitesh Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2890?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hitesh Shah updated YARN-2890:
--
Labels: 2.6.1-candidate  (was: )

 MiniYarnCluster should turn on timeline service if configured to do so
 --

 Key: YARN-2890
 URL: https://issues.apache.org/jira/browse/YARN-2890
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 2.6.0
Reporter: Mit Desai
Assignee: Mit Desai
  Labels: 2.6.1-candidate
 Fix For: 2.8.0

 Attachments: YARN-2890.1.patch, YARN-2890.2.patch, YARN-2890.3.patch, 
 YARN-2890.4.patch, YARN-2890.patch, YARN-2890.patch, YARN-2890.patch, 
 YARN-2890.patch, YARN-2890.patch


 Currently the MiniMRYarnCluster does not consider the configuration value for 
 enabling timeline service before starting. The MiniYarnCluster should only 
 start the timeline service if it is configured to do so.
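A minimal sketch of the guard this ticket asks for. Only the config key is real (it corresponds to YarnConfiguration.TIMELINE_SERVICE_ENABLED); the `Conf` interface stands in for Hadoop's Configuration so the example is self-contained, and the method name is hypothetical.

```java
// Illustrative sketch of the requested guard. Only the config key is real;
// the Conf interface stands in for Hadoop's Configuration so the example is
// self-contained.
public class TimelineGuardDemo {
    static final String TIMELINE_SERVICE_ENABLED = "yarn.timeline-service.enabled";

    interface Conf {
        boolean getBoolean(String key, boolean defaultValue);
    }

    static boolean shouldStartTimelineService(Conf conf) {
        // Default false: the mini cluster only starts ATS when asked to.
        return conf.getBoolean(TIMELINE_SERVICE_ENABLED, false);
    }

    public static void main(String[] args) {
        Conf disabled = (key, dflt) -> dflt;
        Conf enabled = (key, dflt) -> key.equals(TIMELINE_SERVICE_ENABLED) || dflt;
        System.out.println(shouldStartTimelineService(disabled)); // prints false
        System.out.println(shouldStartTimelineService(enabled));  // prints true
    }
}
```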





[jira] [Updated] (YARN-2859) ApplicationHistoryServer binds to default port 8188 in MiniYARNCluster

2015-07-17 Thread Hitesh Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2859?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hitesh Shah updated YARN-2859:
--
Labels: 2.6.1-candidate  (was: )

 ApplicationHistoryServer binds to default port 8188 in MiniYARNCluster
 --

 Key: YARN-2859
 URL: https://issues.apache.org/jira/browse/YARN-2859
 Project: Hadoop YARN
  Issue Type: Bug
  Components: timelineserver
Reporter: Hitesh Shah
Assignee: Zhijie Shen
Priority: Critical
  Labels: 2.6.1-candidate

 In mini cluster, a random port should be used. 
 Also, the config is not updated to the host that the process got bound to.
 {code}
 2014-11-13 13:07:01,905 INFO  [main] server.MiniYARNCluster 
 (MiniYARNCluster.java:serviceStart(722)) - MiniYARN ApplicationHistoryServer 
 address: localhost:10200
 2014-11-13 13:07:01,905 INFO  [main] server.MiniYARNCluster 
 (MiniYARNCluster.java:serviceStart(724)) - MiniYARN ApplicationHistoryServer 
 web address: 0.0.0.0:8188
 {code}
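A sketch of the behavior the report asks for, using plain `java.net` APIs (this is not the MiniYARNCluster patch): bind to port 0 so the OS assigns a free ephemeral port, then publish the actual bound address back into the config instead of advertising the static 0.0.0.0:8188 default.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;

// Illustrative sketch, not the MiniYARNCluster patch: binding to port 0 lets
// the OS assign a free ephemeral port, which can then be written back into
// the config instead of advertising the static 0.0.0.0:8188 default.
public class EphemeralPortDemo {
    static int allocateEphemeralPort() {
        try (ServerSocket socket = new ServerSocket(0, 0,
                InetAddress.getLoopbackAddress())) {
            return socket.getLocalPort();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        int port = allocateEphemeralPort();
        // In MiniYARNCluster the bound value would then be published, e.g.:
        // conf.set("yarn.timeline-service.webapp.address", "localhost:" + port);
        System.out.println(port > 0); // prints true: a real port was assigned
    }
}
```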





[jira] [Commented] (YARN-867) Isolation of failures in aux services

2015-07-10 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14622691#comment-14622691
 ] 

Hitesh Shah commented on YARN-867:
--

[~vinodkv] [~xgong] Is this still open or addressed elsewhere? 

 Isolation of failures in aux services 
 --

 Key: YARN-867
 URL: https://issues.apache.org/jira/browse/YARN-867
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Reporter: Hitesh Shah
Assignee: Xuan Gong
Priority: Critical
 Attachments: YARN-867.1.sampleCode.patch, YARN-867.3.patch, 
 YARN-867.4.patch, YARN-867.5.patch, YARN-867.6.patch, 
 YARN-867.sampleCode.2.patch


 Today, a malicious application can bring down the NM by sending bad data to a 
 service. For example, sending data to the ShuffleService such that it results 
 in any non-IOException will cause the NM's async dispatcher to exit, as the 
 service's INIT APP event is not handled properly. 
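A minimal sketch of the isolation idea (hypothetical types, not the NM's AuxServices code): catching Throwable around each service's event handler keeps one misbehaving aux service from killing the shared async dispatcher thread.

```java
import java.util.List;
import java.util.function.Consumer;

// Illustrative sketch (hypothetical types, not the NM's AuxServices code):
// catching Throwable around each service's handler keeps one bad aux service
// from killing the shared async dispatcher thread.
public class AuxServiceIsolationDemo {
    static int deliver(List<Consumer<String>> services, String event) {
        int delivered = 0;
        for (Consumer<String> service : services) {
            try {
                service.accept(event);
                delivered++;
            } catch (Throwable t) {
                // Log and mark only this service as failed; do not let the
                // exception escape into the dispatcher.
            }
        }
        return delivered;
    }

    public static void main(String[] args) {
        List<Consumer<String>> services = List.of(
                event -> { throw new IllegalStateException("bad payload: " + event); },
                event -> { /* healthy shuffle service */ });
        System.out.println(deliver(services, "INIT_APP")); // prints 1
    }
}
```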





[jira] [Commented] (YARN-2513) Host framework UIs in YARN for use with the ATS

2015-06-03 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14571822#comment-14571822
 ] 

Hitesh Shah commented on YARN-2513:
---

+1 to making this available for ATS v1. Would be useful in various deployments.

 Host framework UIs in YARN for use with the ATS
 ---

 Key: YARN-2513
 URL: https://issues.apache.org/jira/browse/YARN-2513
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: timelineserver
Reporter: Jonathan Eagles
Assignee: Jonathan Eagles
 Attachments: YARN-2513-v1.patch, YARN-2513-v2.patch, 
 YARN-2513.v3.patch


 Allow for pluggable UIs as described by TEZ-8. YARN can provide the 
 infrastructure to host JavaScript and possibly Java UIs.





[jira] [Commented] (YARN-900) YarnClientApplication uses composition to hold GetNewApplicationResponse instead of having a simpler flattened structure

2015-05-01 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-900?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14523409#comment-14523409
 ] 

Hitesh Shah commented on YARN-900:
--

Probably too late to make this change now due to compatibility issues.  

 YarnClientApplication uses composition to hold GetNewApplicationResponse 
 instead of having a simpler flattened structure
 

 Key: YARN-900
 URL: https://issues.apache.org/jira/browse/YARN-900
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Hitesh Shah

 Instead of YarnClientApplication have apis like getApplicationId, 
 getMaximumResourceCapability, etc - it currently holds a 
 GetNewApplicationResponse object. It might be simpler to get rid of 
 GetNewApplicationResponse and return a more well-suited object both at the 
 client as well from over the rpc layer.
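The flattened shape proposed here might look like the following sketch. The class and accessor names are hypothetical; the real YarnClientApplication wraps a GetNewApplicationResponse instead of exposing these values directly.

```java
// Illustrative sketch of the flattened API proposed in this ticket. The class
// and accessor names are hypothetical; the real YarnClientApplication wraps a
// GetNewApplicationResponse instead of exposing these fields directly.
public class FlattenedNewApplicationDemo {
    static final class NewApplication {
        private final String applicationId;
        private final long maxMemoryMb;

        NewApplication(String applicationId, long maxMemoryMb) {
            this.applicationId = applicationId;
            this.maxMemoryMb = maxMemoryMb;
        }

        String getApplicationId() { return applicationId; }
        long getMaximumResourceCapabilityMb() { return maxMemoryMb; }
    }

    public static void main(String[] args) {
        // Callers read values directly instead of unwrapping a response object.
        NewApplication app = new NewApplication("application_1_0001", 8192);
        System.out.println(app.getApplicationId() + " max=" + app.getMaximumResourceCapabilityMb());
    }
}
```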





[jira] [Updated] (YARN-867) Isolation of failures in aux services

2015-05-01 Thread Hitesh Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-867?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hitesh Shah updated YARN-867:
-
Target Version/s: 2.8.0  (was: 2.3.0)

 Isolation of failures in aux services 
 --

 Key: YARN-867
 URL: https://issues.apache.org/jira/browse/YARN-867
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Reporter: Hitesh Shah
Assignee: Xuan Gong
Priority: Critical
 Attachments: YARN-867.1.sampleCode.patch, YARN-867.3.patch, 
 YARN-867.4.patch, YARN-867.5.patch, YARN-867.6.patch, 
 YARN-867.sampleCode.2.patch


 Today, a malicious application can bring down the NM by sending bad data to a 
 service. For example, sending data to the ShuffleService such that it results 
 in any non-IOException will cause the NM's async dispatcher to exit, as the 
 service's INIT APP event is not handled properly. 





[jira] [Resolved] (YARN-900) YarnClientApplication uses composition to hold GetNewApplicationResponse instead of having a simpler flattened structure

2015-05-01 Thread Hitesh Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-900?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hitesh Shah resolved YARN-900.
--
Resolution: Not A Problem

 YarnClientApplication uses composition to hold GetNewApplicationResponse 
 instead of having a simpler flattened structure
 

 Key: YARN-900
 URL: https://issues.apache.org/jira/browse/YARN-900
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Hitesh Shah

 Instead of YarnClientApplication have apis like getApplicationId, 
 getMaximumResourceCapability, etc - it currently holds a 
 GetNewApplicationResponse object. It might be simpler to get rid of 
 GetNewApplicationResponse and return a more well-suited object both at the 
 client as well from over the rpc layer.





[jira] [Resolved] (YARN-2916) MiniYARNCluster should support enabling Timeline and new services via config

2015-05-01 Thread Hitesh Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2916?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hitesh Shah resolved YARN-2916.
---
Resolution: Duplicate

 MiniYARNCluster should support enabling Timeline and new services via config
 

 Key: YARN-2916
 URL: https://issues.apache.org/jira/browse/YARN-2916
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 2.2.0, 2.3.0, 2.4.0, 2.5.0, 2.6.0
Reporter: Hitesh Shah

 For any application to use the MiniYARNCluster without a shim, supporting new 
 components/services within the MiniYARNCluster should be done via config 
 based flags instead of additional params to the constructor. 
 Currently, for the same code to compile against 2.2/2.3/2.4/2.5/2.6, one 
 needs different invocations to MiniYARNCluster if timeline needs to be 
 enabled for versions of hadoop that support it. 





[jira] [Commented] (YARN-2840) Timeline should support creation of Domains where domainId is not provided by the user

2015-05-01 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2840?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14523603#comment-14523603
 ] 

Hitesh Shah commented on YARN-2840:
---

[~zjshen] Is there any thinking of how acls will work with v2? Can you either 
move this into the v2 sub-tasks or create a new jira and close this out as a 
wont-fix assuming v1 will not be enhanced.

 Timeline should support creation of Domains where domainId is not provided by 
 the user
 --

 Key: YARN-2840
 URL: https://issues.apache.org/jira/browse/YARN-2840
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: timelineserver
Reporter: Hitesh Shah

 Current expectation is that the user has to come up with a unique domain id. 
 When using this with applications such as Pig/Hive/Oozie, these applications 
 will need to come up with a cluster-wide unique id to be able to create a 
 domain as domainIds need to be unique. 





[jira] [Commented] (YARN-2833) Timeline Domains should be immutable by default

2015-05-01 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14523602#comment-14523602
 ] 

Hitesh Shah commented on YARN-2833:
---

[~zjshen] Is there any thinking of how acls will work with v2? Can you either 
move this into the v2 sub-tasks or create a new jira and close this out as a 
wont-fix assuming v1 will not be enhanced.

 Timeline Domains should be immutable by default
 ---

 Key: YARN-2833
 URL: https://issues.apache.org/jira/browse/YARN-2833
 Project: Hadoop YARN
  Issue Type: Bug
  Components: timelineserver
Reporter: Hitesh Shah

 In a general sense, when ACLs are defined for applications in various orgs 
 deploying Hadoop clusters, the ratio of unique ACLs to no. of jobs run on the 
 cluster should likely be a low value. In such a situation, it makes sense to 
 have a way to normalize the ACL set to generate an immutable domain id. 
 This should likely have performance and storage benefits. 
 There may be some cases where domains may need to be mutable. For that, I 
 propose a flag to be set when the domain is being created ( flag's default 
 value being immutable ).
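The normalization idea above can be sketched as hashing the sorted ACL set into a deterministic id, so identical ACL sets always map to the same immutable domain id. The names, the `acl_` prefix, and the SHA-256 choice are assumptions for illustration, not part of the timeline API.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Collection;
import java.util.List;
import java.util.TreeSet;

// Illustrative sketch: derive an immutable domain id from the ACL set itself.
// The "acl_" prefix and SHA-256 choice are assumptions for this example.
public class AclDomainIdDemo {
    static String domainIdFor(Collection<String> aclEntries) {
        // Sort the entries so ordering differences do not change the id.
        String canonical = String.join(",", new TreeSet<>(aclEntries));
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            byte[] digest = md.digest(canonical.getBytes(StandardCharsets.UTF_8));
            StringBuilder sb = new StringBuilder("acl_");
            for (byte b : digest) {
                sb.append(String.format("%02x", b));
            }
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 unavailable", e);
        }
    }

    public static void main(String[] args) {
        String a = domainIdFor(List.of("user:alice", "group:etl"));
        String b = domainIdFor(List.of("group:etl", "user:alice"));
        System.out.println(a.equals(b)); // prints true: same ACL set, same domain id
    }
}
```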





[jira] [Updated] (YARN-857) Localization failures should be available in container diagnostics

2015-05-01 Thread Hitesh Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-857?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hitesh Shah updated YARN-857:
-
Summary: Localization failures should be available in container diagnostics 
 (was: Errors when localizing end up with the localization failure not being 
seen by the NM)

 Localization failures should be available in container diagnostics
 --

 Key: YARN-857
 URL: https://issues.apache.org/jira/browse/YARN-857
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Hitesh Shah
Assignee: Vinod Kumar Vavilapalli
Priority: Critical
 Attachments: YARN-857.1.patch, YARN-857.2.patch


 at 
 org.apache.hadoop.yarn.server.nodemanager.api.impl.pb.client.LocalizationProtocolPBClientImpl.heartbeat(LocalizationProtocolPBClientImpl.java:62)
 at 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.localizeFiles(ContainerLocalizer.java:235)
 at 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.runLocalization(ContainerLocalizer.java:169)
 at 
 org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.startLocalizer(DefaultContainerExecutor.java:106)
 at 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerRunner.run(ResourceLocalizationService.java:978)
 Traced this down to DefaultExecutor which does not look at the exit code for 
 the localizer.
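A sketch of the missing check the description points at (hypothetical method and message format, not the actual DefaultContainerExecutor code): inspect the localizer's exit code and carry its stderr into the container diagnostics instead of discarding it.

```java
// Illustrative sketch (hypothetical method, not DefaultContainerExecutor):
// check the localizer's exit code and surface its stderr so the failure
// reaches the container diagnostics instead of being silently dropped.
public class LocalizerExitCheckDemo {
    static String runLocalizer(int exitCode, String stderr) {
        if (exitCode != 0) {
            // Propagate the failure so it reaches container diagnostics.
            return "Localization failed (exit=" + exitCode + "): " + stderr;
        }
        return "OK";
    }

    public static void main(String[] args) {
        System.out.println(runLocalizer(0, ""));                         // prints OK
        System.out.println(runLocalizer(56, "No space left on device")); // failure is surfaced
    }
}
```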





[jira] [Updated] (YARN-857) Errors when localizing end up with the localization failure not being seen by the NM

2015-05-01 Thread Hitesh Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-857?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hitesh Shah updated YARN-857:
-
Priority: Critical  (was: Major)

 Errors when localizing end up with the localization failure not being seen by 
 the NM
 

 Key: YARN-857
 URL: https://issues.apache.org/jira/browse/YARN-857
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Hitesh Shah
Assignee: Vinod Kumar Vavilapalli
Priority: Critical
 Attachments: YARN-857.1.patch, YARN-857.2.patch


 at 
 org.apache.hadoop.yarn.server.nodemanager.api.impl.pb.client.LocalizationProtocolPBClientImpl.heartbeat(LocalizationProtocolPBClientImpl.java:62)
 at 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.localizeFiles(ContainerLocalizer.java:235)
 at 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.runLocalization(ContainerLocalizer.java:169)
 at 
 org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.startLocalizer(DefaultContainerExecutor.java:106)
 at 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerRunner.run(ResourceLocalizationService.java:978)
 Traced this down to DefaultExecutor which does not look at the exit code for 
 the localizer.





[jira] [Updated] (YARN-857) Errors when localizing end up with the localization failure not being seen by the NM

2015-05-01 Thread Hitesh Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-857?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hitesh Shah updated YARN-857:
-
Target Version/s: 2.8.0  (was: 2.1.0-beta)

 Errors when localizing end up with the localization failure not being seen by 
 the NM
 

 Key: YARN-857
 URL: https://issues.apache.org/jira/browse/YARN-857
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Hitesh Shah
Assignee: Vinod Kumar Vavilapalli
 Attachments: YARN-857.1.patch, YARN-857.2.patch


 at 
 org.apache.hadoop.yarn.server.nodemanager.api.impl.pb.client.LocalizationProtocolPBClientImpl.heartbeat(LocalizationProtocolPBClientImpl.java:62)
 at 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.localizeFiles(ContainerLocalizer.java:235)
 at 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.runLocalization(ContainerLocalizer.java:169)
 at 
 org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.startLocalizer(DefaultContainerExecutor.java:106)
 at 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerRunner.run(ResourceLocalizationService.java:978)
 Traced this down to DefaultExecutor which does not look at the exit code for 
 the localizer.





[jira] [Resolved] (YARN-971) hadoop-yarn-api pom does not define a dependencies tag

2015-05-01 Thread Hitesh Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hitesh Shah resolved YARN-971.
--
Resolution: Duplicate

 hadoop-yarn-api pom does not define a dependencies tag
 --

 Key: YARN-971
 URL: https://issues.apache.org/jira/browse/YARN-971
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Hitesh Shah
 Attachments: yarn-971-v1.patch


 As there is no dependencies tag defined in the pom, it inherits all the 
 dependencies defined in hadoop-yarn-project/pom.xml which contains a huge 
 list with dependencies like guice, netty, hdfs, jersey etc.





[jira] [Updated] (YARN-931) Overlapping classes across hadoop-yarn-api and hadoop-yarn-common

2015-05-01 Thread Hitesh Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-931?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hitesh Shah updated YARN-931:
-
Target Version/s:   (was: 2.1.0-beta)

 Overlapping classes across hadoop-yarn-api and hadoop-yarn-common
 -

 Key: YARN-931
 URL: https://issues.apache.org/jira/browse/YARN-931
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Hitesh Shah

 hadoop-yarn-common-3.0.0-SNAPSHOT.jar, hadoop-yarn-api-3.0.0-SNAPSHOT.jar 
 define 3 overlappping classes: 
 [WARNING]   - org.apache.hadoop.yarn.factories.package-info
 [WARNING]   - org.apache.hadoop.yarn.util.package-info
 [WARNING]   - org.apache.hadoop.yarn.factory.providers.package-info





[jira] [Commented] (YARN-2833) Timeline Domains should be immutable by default

2015-05-01 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14523622#comment-14523622
 ] 

Hitesh Shah commented on YARN-2833:
---

[~zjshen] Please create a dup of this with the relevant info so that it is 
considered for the v2 design. I will close this one out. 

 Timeline Domains should be immutable by default
 ---

 Key: YARN-2833
 URL: https://issues.apache.org/jira/browse/YARN-2833
 Project: Hadoop YARN
  Issue Type: Bug
  Components: timelineserver
Reporter: Hitesh Shah

 In a general sense, when ACLs are defined for applications in various orgs 
 deploying Hadoop clusters, the ratio of unique ACLs to no. of jobs run on the 
 cluster should likely be a low value. In such a situation, it makes sense to 
 have a way to normalize the ACL set to generate an immutable domain id. 
 This should likely have performance and storage benefits. 
 There may be some cases where domains may need to be mutable. For that, I 
 propose a flag to be set when the domain is being created ( flag's default 
 value being immutable ).





[jira] [Commented] (YARN-2840) Timeline should support creation of Domains where domainId is not provided by the user

2015-05-01 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2840?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14523623#comment-14523623
 ] 

Hitesh Shah commented on YARN-2840:
---

[~zjshen] Please create a dup of this with the relevant info so that it is 
considered for the v2 design. I will close this one out.

 Timeline should support creation of Domains where domainId is not provided by 
 the user
 --

 Key: YARN-2840
 URL: https://issues.apache.org/jira/browse/YARN-2840
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: timelineserver
Reporter: Hitesh Shah

 Current expectation is that the user has to come up with a unique domain id. 
 When using this with applications such as Pig/Hive/Oozie, these applications 
 will need to come up with a cluster-wide unique id to be able to create a 
 domain as domainIds need to be unique. 





[jira] [Resolved] (YARN-2833) Timeline Domains should be immutable by default

2015-05-01 Thread Hitesh Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2833?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hitesh Shah resolved YARN-2833.
---
Resolution: Won't Fix

 Timeline Domains should be immutable by default
 ---

 Key: YARN-2833
 URL: https://issues.apache.org/jira/browse/YARN-2833
 Project: Hadoop YARN
  Issue Type: Bug
  Components: timelineserver
Reporter: Hitesh Shah

 In a general sense, when ACLs are defined for applications in various orgs 
 deploying Hadoop clusters, the ratio of unique ACLs to no. of jobs run on the 
 cluster should likely be a low value. In such a situation, it makes sense to 
 have a way to normalize the ACL set to generate an immutable domain id. 
 This should likely have performance and storage benefits. 
 There may be some cases where domains may need to be mutable. For that, I 
 propose a flag to be set when the domain is being created ( flag's default 
 value being immutable ).





[jira] [Resolved] (YARN-2840) Timeline should support creation of Domains where domainId is not provided by the user

2015-05-01 Thread Hitesh Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2840?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hitesh Shah resolved YARN-2840.
---
Resolution: Won't Fix

 Timeline should support creation of Domains where domainId is not provided by 
 the user
 --

 Key: YARN-2840
 URL: https://issues.apache.org/jira/browse/YARN-2840
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: timelineserver
Reporter: Hitesh Shah

 Current expectation is that the user has to come up with a unique domain id. 
 When using this with applications such as Pig/Hive/Oozie, these applications 
 will need to come up with a cluster-wide unique id to be able to create a 
 domain as domainIds need to be unique. 





[jira] [Commented] (YARN-3544) AM logs link missing in the RM UI for a completed app

2015-04-29 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14520378#comment-14520378
 ] 

Hitesh Shah commented on YARN-3544:
---

Doesn't the NM log link redirect the log server after the logs have been 
aggregated? 

 AM logs link missing in the RM UI for a completed app 
 --

 Key: YARN-3544
 URL: https://issues.apache.org/jira/browse/YARN-3544
 Project: Hadoop YARN
  Issue Type: Sub-task
Affects Versions: 2.7.0
Reporter: Hitesh Shah
Assignee: Xuan Gong
Priority: Blocker
 Attachments: Screen Shot 2015-04-27 at 6.24.05 PM.png, 
 YARN-3544.1.patch


 AM log links should always be present ( for both running and completed apps).
 Likewise node info is also empty. This is usually quite crucial when trying 
 to debug where an AM was launched and a pointer to which NM's logs to look at 
 if the AM failed to launch. 





[jira] [Commented] (YARN-3544) AM logs link missing in the RM UI for a completed app

2015-04-29 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14520379#comment-14520379
 ] 

Hitesh Shah commented on YARN-3544:
---

I meant redirect to the log server 

 AM logs link missing in the RM UI for a completed app 
 --

 Key: YARN-3544
 URL: https://issues.apache.org/jira/browse/YARN-3544
 Project: Hadoop YARN
  Issue Type: Sub-task
Affects Versions: 2.7.0
Reporter: Hitesh Shah
Assignee: Xuan Gong
Priority: Blocker
 Attachments: Screen Shot 2015-04-27 at 6.24.05 PM.png, 
 YARN-3544.1.patch


 AM log links should always be present ( for both running and completed apps).
 Likewise node info is also empty. This is usually quite crucial when trying 
 to debug where an AM was launched and a pointer to which NM's logs to look at 
 if the AM failed to launch. 





[jira] [Commented] (YARN-2859) ApplicationHistoryServer binds to default port 8188 in MiniYARNCluster

2015-04-27 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14515016#comment-14515016
 ] 

Hitesh Shah commented on YARN-2859:
---

[~zjshen] Are you planning to look at this? 

[~vinodkv] this will be a good candidate for 2.6.1 

 ApplicationHistoryServer binds to default port 8188 in MiniYARNCluster
 --

 Key: YARN-2859
 URL: https://issues.apache.org/jira/browse/YARN-2859
 Project: Hadoop YARN
  Issue Type: Bug
  Components: timelineserver
Reporter: Hitesh Shah
Assignee: Zhijie Shen
Priority: Critical

 In mini cluster, a random port should be used. 
 Also, the config is not updated to the host that the process got bound to.
 {code}
 2014-11-13 13:07:01,905 INFO  [main] server.MiniYARNCluster 
 (MiniYARNCluster.java:serviceStart(722)) - MiniYARN ApplicationHistoryServer 
 address: localhost:10200
 2014-11-13 13:07:01,905 INFO  [main] server.MiniYARNCluster 
 (MiniYARNCluster.java:serviceStart(724)) - MiniYARN ApplicationHistoryServer 
 web address: 0.0.0.0:8188
 {code}





[jira] [Commented] (YARN-2092) Incompatible org.codehaus.jackson* dependencies when moving from 2.4.0 to 2.5.0-SNAPSHOT

2015-04-27 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14515828#comment-14515828
 ] 

Hitesh Shah commented on YARN-2092:
---

Closing this out as it is no longer an issue for Tez. 

 Incompatible org.codehaus.jackson* dependencies when moving from 2.4.0 to 
 2.5.0-SNAPSHOT
 

 Key: YARN-2092
 URL: https://issues.apache.org/jira/browse/YARN-2092
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Hitesh Shah

 Came across this when trying to integrate with the timeline server. Using a 
 1.8.8 dependency of jackson works fine against 2.4.0 but fails against 
 2.5.0-SNAPSHOT which needs 1.9.13. This is in the scenario where the user 
 jars are first in the classpath.  





[jira] [Resolved] (YARN-2092) Incompatible org.codehaus.jackson* dependencies when moving from 2.4.0 to 2.5.0-SNAPSHOT

2015-04-27 Thread Hitesh Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2092?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hitesh Shah resolved YARN-2092.
---
Resolution: Not A Problem

 Incompatible org.codehaus.jackson* dependencies when moving from 2.4.0 to 
 2.5.0-SNAPSHOT
 

 Key: YARN-2092
 URL: https://issues.apache.org/jira/browse/YARN-2092
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Hitesh Shah

 Came across this when trying to integrate with the timeline server. Using a 
 1.8.8 dependency of jackson works fine against 2.4.0 but fails against 
 2.5.0-SNAPSHOT which needs 1.9.13. This is in the scenario where the user 
 jars are first in the classpath.  





[jira] [Updated] (YARN-2859) ApplicationHistoryServer binds to default port 8188 in MiniYARNCluster

2015-04-27 Thread Hitesh Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2859?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hitesh Shah updated YARN-2859:
--
Target Version/s: 2.6.1, 2.8.0  (was: 2.8.0)

 ApplicationHistoryServer binds to default port 8188 in MiniYARNCluster
 --

 Key: YARN-2859
 URL: https://issues.apache.org/jira/browse/YARN-2859
 Project: Hadoop YARN
  Issue Type: Bug
  Components: timelineserver
Reporter: Hitesh Shah
Assignee: Zhijie Shen
Priority: Critical

 In mini cluster, a random port should be used. 
 Also, the config is not updated to the host that the process got bound to.
 {code}
 2014-11-13 13:07:01,905 INFO  [main] server.MiniYARNCluster 
 (MiniYARNCluster.java:serviceStart(722)) - MiniYARN ApplicationHistoryServer 
 address: localhost:10200
 2014-11-13 13:07:01,905 INFO  [main] server.MiniYARNCluster 
 (MiniYARNCluster.java:serviceStart(724)) - MiniYARN ApplicationHistoryServer 
 web address: 0.0.0.0:8188
 {code}





[jira] [Created] (YARN-3544) AM logs link missing in the RM UI for a completed app

2015-04-24 Thread Hitesh Shah (JIRA)
Hitesh Shah created YARN-3544:
-

 Summary: AM logs link missing in the RM UI for a completed app 
 Key: YARN-3544
 URL: https://issues.apache.org/jira/browse/YARN-3544
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 2.7.0
Reporter: Hitesh Shah


AM log links should always be present ( for both running and completed apps).

Likewise node info is also empty. This is usually quite crucial when trying to 
debug where an AM was launched and a pointer to which NM's logs to look at if 
the AM failed to launch. 





[jira] [Commented] (YARN-2976) Invalid docs for specifying yarn.nodemanager.docker-container-executor.exec-name

2015-04-16 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14498470#comment-14498470
 ] 

Hitesh Shah commented on YARN-2976:
---

Typo: meant to say that it *does* clash with the current config property name. 

 Invalid docs for specifying 
 yarn.nodemanager.docker-container-executor.exec-name
 

 Key: YARN-2976
 URL: https://issues.apache.org/jira/browse/YARN-2976
 Project: Hadoop YARN
  Issue Type: Bug
  Components: documentation
Affects Versions: 2.6.0
Reporter: Hitesh Shah
Assignee: Vijay Bhat
Priority: Minor

 Docs on 
 http://hadoop.apache.org/docs/stable/hadoop-yarn/hadoop-yarn-site/DockerContainerExecutor.html
  mention setting docker -H=tcp://0.0.0.0:4243 for 
 yarn.nodemanager.docker-container-executor.exec-name. 
 However, the actual implementation does a fileExists for the specified value. 
 Either the docs need to be fixed or the impl changed to allow relative paths 
 or commands with additional args





[jira] [Commented] (YARN-2976) Invalid docs for specifying yarn.nodemanager.docker-container-executor.exec-name

2015-04-16 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14498466#comment-14498466
 ] 

Hitesh Shah commented on YARN-2976:
---

The latter definitely makes more sense, but it does not clash with the config 
property name. Maybe we can deprecate the old one in favor of a newer config 
property that supports a flexible command (relative path, args, etc.)? For the 
old/current one, we can fix the docs to say that it does a file-exists check 
and does not support additional args? 
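A rough sketch of the precedence being proposed (the newer property name below is a made-up placeholder, not an actual YARN key): read the newer, more flexible property first and fall back to the deprecated one only when the new one is unset.

```java
import java.util.Map;

public class DockerExecConfig {
    // Existing property (the implementation does a file-exists check today).
    static final String OLD_KEY =
        "yarn.nodemanager.docker-container-executor.exec-name";
    // Hypothetical newer property that would accept a full command line.
    static final String NEW_KEY =
        "yarn.nodemanager.docker-container-executor.exec-command";

    // The newer property takes precedence when both are set.
    public static String resolveExec(Map<String, String> conf) {
        String cmd = conf.get(NEW_KEY);
        return (cmd != null && !cmd.isEmpty()) ? cmd : conf.get(OLD_KEY);
    }
}
```

Hadoop's `Configuration` deprecation machinery could express the same fallback, but the plain lookup above shows the intended precedence.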

 Invalid docs for specifying 
 yarn.nodemanager.docker-container-executor.exec-name
 

 Key: YARN-2976
 URL: https://issues.apache.org/jira/browse/YARN-2976
 Project: Hadoop YARN
  Issue Type: Bug
  Components: documentation
Affects Versions: 2.6.0
Reporter: Hitesh Shah
Assignee: Vijay Bhat
Priority: Minor

 Docs on 
 http://hadoop.apache.org/docs/stable/hadoop-yarn/hadoop-yarn-site/DockerContainerExecutor.html
  mention setting docker -H=tcp://0.0.0.0:4243 for 
 yarn.nodemanager.docker-container-executor.exec-name. 
 However, the actual implementation does a fileExists for the specified value. 
 Either the docs need to be fixed or the impl changed to allow relative paths 
 or commands with additional args





[jira] [Commented] (YARN-2976) Invalid docs for specifying yarn.nodemanager.docker-container-executor.exec-name

2015-04-16 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14498493#comment-14498493
 ] 

Hitesh Shah commented on YARN-2976:
---

Agreed. The newer one would take precedence.

 Invalid docs for specifying 
 yarn.nodemanager.docker-container-executor.exec-name
 

 Key: YARN-2976
 URL: https://issues.apache.org/jira/browse/YARN-2976
 Project: Hadoop YARN
  Issue Type: Bug
  Components: documentation
Affects Versions: 2.6.0
Reporter: Hitesh Shah
Assignee: Vijay Bhat
Priority: Minor

 Docs on 
 http://hadoop.apache.org/docs/stable/hadoop-yarn/hadoop-yarn-site/DockerContainerExecutor.html
  mention setting docker -H=tcp://0.0.0.0:4243 for 
 yarn.nodemanager.docker-container-executor.exec-name. 
 However, the actual implementation does a fileExists for the specified value. 
 Either the docs need to be fixed or the impl changed to allow relative paths 
 or commands with additional args





[jira] [Updated] (YARN-2890) MiniYarnCluster should turn on timeline service if configured to do so

2015-04-08 Thread Hitesh Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2890?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hitesh Shah updated YARN-2890:
--
Summary: MiniYarnCluster should turn on timeline service if configured to 
do so  (was: MiniMRYarnCluster should turn on timeline service if configured to 
do so)

 MiniYarnCluster should turn on timeline service if configured to do so
 --

 Key: YARN-2890
 URL: https://issues.apache.org/jira/browse/YARN-2890
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 2.6.0
Reporter: Mit Desai
Assignee: Mit Desai
 Attachments: YARN-2890.1.patch, YARN-2890.2.patch, YARN-2890.3.patch, 
 YARN-2890.4.patch, YARN-2890.patch, YARN-2890.patch, YARN-2890.patch, 
 YARN-2890.patch, YARN-2890.patch


 Currently the MiniMRYarnCluster does not consider the configuration value for 
 enabling timeline service before starting. The MiniYarnCluster should only 
 start the timeline service if it is configured to do so.





[jira] [Commented] (YARN-2890) MiniMRYarnCluster should turn on timeline service if configured to do so

2015-04-08 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14486033#comment-14486033
 ] 

Hitesh Shah commented on YARN-2890:
---

+1. Thanks for patiently addressing review comments. Committing shortly. 

 MiniMRYarnCluster should turn on timeline service if configured to do so
 

 Key: YARN-2890
 URL: https://issues.apache.org/jira/browse/YARN-2890
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 2.6.0
Reporter: Mit Desai
Assignee: Mit Desai
 Attachments: YARN-2890.1.patch, YARN-2890.2.patch, YARN-2890.3.patch, 
 YARN-2890.4.patch, YARN-2890.patch, YARN-2890.patch, YARN-2890.patch, 
 YARN-2890.patch, YARN-2890.patch


 Currently the MiniMRYarnCluster does not consider the configuration value for 
 enabling timeline service before starting. The MiniYarnCluster should only 
 start the timeline service if it is configured to do so.





[jira] [Commented] (YARN-2890) MiniMRYarnCluster should turn on timeline service if configured to do so

2015-04-02 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14393445#comment-14393445
 ] 

Hitesh Shah commented on YARN-2890:
---

Sorry, I did not check the last update. 

Minor nit: some of the test changes in TestMRTimelineEventHandling probably 
belong in TestMiniYarnCluster, if that exists, as YARN timeline flag behaviour 
checks should ideally be tested in YARN code and not MR code. 





 MiniMRYarnCluster should turn on timeline service if configured to do so
 

 Key: YARN-2890
 URL: https://issues.apache.org/jira/browse/YARN-2890
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 2.6.0
Reporter: Mit Desai
Assignee: Mit Desai
 Attachments: YARN-2890.1.patch, YARN-2890.2.patch, YARN-2890.3.patch, 
 YARN-2890.patch, YARN-2890.patch, YARN-2890.patch, YARN-2890.patch, 
 YARN-2890.patch


 Currently the MiniMRYarnCluster does not consider the configuration value for 
 enabling timeline service before starting. The MiniYarnCluster should only 
 start the timeline service if it is configured to do so.





[jira] [Commented] (YARN-3304) ResourceCalculatorProcessTree#getCpuUsagePercent default return value is inconsistent with other getters

2015-03-31 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14388817#comment-14388817
 ] 

Hitesh Shah commented on YARN-3304:
---

[~kasha] 3.0.0 is a major release. I would assume all deprecated APIs should be 
removed. Given the length of time between major releases, there would be no 
point in deprecating APIs if they are not removed in the next major release. 



 ResourceCalculatorProcessTree#getCpuUsagePercent default return value is 
 inconsistent with other getters
 

 Key: YARN-3304
 URL: https://issues.apache.org/jira/browse/YARN-3304
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Reporter: Junping Du
Assignee: Junping Du
Priority: Blocker
 Fix For: 2.7.0

 Attachments: YARN-3304-appendix-v2.patch, YARN-3304-appendix.patch, 
 YARN-3304-v2.patch, YARN-3304-v3.patch, YARN-3304-v4-boolean-way.patch, 
 YARN-3304-v4-negative-way-MR.patch, YARN-3304-v4-negtive-value-way.patch, 
 YARN-3304-v6-no-rename.patch, YARN-3304-v6-with-rename.patch, 
 YARN-3304-v7.patch, YARN-3304-v8.patch, YARN-3304.patch, yarn-3304-5.patch


 Per discussions in YARN-3296, getCpuUsagePercent() will return -1 in the 
 unavailable case while other resource metrics return 0 in the same case, 
 which is inconsistent.
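One consistent alternative, sketched here as a hypothetical shape (the actual Hadoop patch may differ): have every getter share a single "unavailable" sentinel instead of mixing -1 and 0.

```java
public abstract class ProcessTreeMetricsSketch {
    // Single sentinel shared by all getters for "metric not available".
    public static final long UNAVAILABLE = -1L;

    // Subclasses override these with real measurements when supported;
    // the defaults make the unavailable case uniform across getters.
    public long getRssMemorySize() { return UNAVAILABLE; }
    public long getCumulativeCpuTime() { return UNAVAILABLE; }
    public float getCpuUsagePercent() { return UNAVAILABLE; }
}
```

Callers can then test against one constant rather than remembering which getter uses which default.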





[jira] [Commented] (YARN-3304) ResourceCalculatorProcessTree#getCpuUsagePercent default return value is inconsistent with other getters

2015-03-30 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14387345#comment-14387345
 ] 

Hitesh Shah commented on YARN-3304:
---

Sigh. This broke compatibility again. Was there a reason the APIs were simply 
removed/renamed, instead of supporting both APIs with a way to check at runtime 
whether the plugin supports the old or the new ones (and the old ones being 
deprecated)?

{code}
// Sketch: keep the old method (deprecated) alongside the new one, and
// let callers probe at runtime which one the plugin supports.
@Deprecated
public int getValueFromOldAPI();
public int getValueFromNewAPI();
// Defaults to false so existing plugins keep working unchanged.
public boolean supportsNewAPI() { return false; }

// Caller side:
if (supportsNewAPI()) {
    getValueFromNewAPI();
} else {
    getValueFromOldAPI();
}
{code}

 ResourceCalculatorProcessTree#getCpuUsagePercent default return value is 
 inconsistent with other getters
 

 Key: YARN-3304
 URL: https://issues.apache.org/jira/browse/YARN-3304
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Reporter: Junping Du
Assignee: Junping Du
Priority: Blocker
 Fix For: 2.7.0

 Attachments: YARN-3304-v2.patch, YARN-3304-v3.patch, 
 YARN-3304-v4-boolean-way.patch, YARN-3304-v4-negative-way-MR.patch, 
 YARN-3304-v4-negtive-value-way.patch, YARN-3304-v6-no-rename.patch, 
 YARN-3304-v6-with-rename.patch, YARN-3304-v7.patch, YARN-3304-v8.patch, 
 YARN-3304.patch, yarn-3304-5.patch


 Per discussions in YARN-3296, getCpuUsagePercent() will return -1 in the 
 unavailable case while other resource metrics return 0 in the same case, 
 which is inconsistent.





[jira] [Reopened] (YARN-3304) ResourceCalculatorProcessTree#getCpuUsagePercent default return value is inconsistent with other getters

2015-03-30 Thread Hitesh Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hitesh Shah reopened YARN-3304:
---

Also, FWIW, ResourceCalculatorProcessTree is a public API. Re-opening as this 
breaks Tez (again).

 ResourceCalculatorProcessTree#getCpuUsagePercent default return value is 
 inconsistent with other getters
 

 Key: YARN-3304
 URL: https://issues.apache.org/jira/browse/YARN-3304
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Reporter: Junping Du
Assignee: Junping Du
Priority: Blocker
 Fix For: 2.7.0

 Attachments: YARN-3304-v2.patch, YARN-3304-v3.patch, 
 YARN-3304-v4-boolean-way.patch, YARN-3304-v4-negative-way-MR.patch, 
 YARN-3304-v4-negtive-value-way.patch, YARN-3304-v6-no-rename.patch, 
 YARN-3304-v6-with-rename.patch, YARN-3304-v7.patch, YARN-3304-v8.patch, 
 YARN-3304.patch, yarn-3304-5.patch


 Per discussions in YARN-3296, getCpuUsagePercent() will return -1 in the 
 unavailable case while other resource metrics return 0 in the same case, 
 which is inconsistent.





[jira] [Commented] (YARN-3304) ResourceCalculatorProcessTree#getCpuUsagePercent default return value is inconsistent with other getters

2015-03-30 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14387810#comment-14387810
 ] 

Hitesh Shah commented on YARN-3304:
---

https://issues.apache.org/jira/browse/YARN-3297 is probably relevant too. 

 ResourceCalculatorProcessTree#getCpuUsagePercent default return value is 
 inconsistent with other getters
 

 Key: YARN-3304
 URL: https://issues.apache.org/jira/browse/YARN-3304
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Reporter: Junping Du
Assignee: Junping Du
Priority: Blocker
 Fix For: 2.7.0

 Attachments: YARN-3304-v2.patch, YARN-3304-v3.patch, 
 YARN-3304-v4-boolean-way.patch, YARN-3304-v4-negative-way-MR.patch, 
 YARN-3304-v4-negtive-value-way.patch, YARN-3304-v6-no-rename.patch, 
 YARN-3304-v6-with-rename.patch, YARN-3304-v7.patch, YARN-3304-v8.patch, 
 YARN-3304.patch, yarn-3304-5.patch


 Per discussions in YARN-3296, getCpuUsagePercent() will return -1 in the 
 unavailable case while other resource metrics return 0 in the same case, 
 which is inconsistent.





[jira] [Commented] (YARN-3304) ResourceCalculatorProcessTree#getCpuUsagePercent default return value is inconsistent with other getters

2015-03-30 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14387808#comment-14387808
 ] 

Hitesh Shah commented on YARN-3304:
---

[~aw] Thanks for putting it so bluntly. You may wish to look at the related 
jiras such as https://issues.apache.org/jira/browse/YARN-3296.

 ResourceCalculatorProcessTree#getCpuUsagePercent default return value is 
 inconsistent with other getters
 

 Key: YARN-3304
 URL: https://issues.apache.org/jira/browse/YARN-3304
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Reporter: Junping Du
Assignee: Junping Du
Priority: Blocker
 Fix For: 2.7.0

 Attachments: YARN-3304-v2.patch, YARN-3304-v3.patch, 
 YARN-3304-v4-boolean-way.patch, YARN-3304-v4-negative-way-MR.patch, 
 YARN-3304-v4-negtive-value-way.patch, YARN-3304-v6-no-rename.patch, 
 YARN-3304-v6-with-rename.patch, YARN-3304-v7.patch, YARN-3304-v8.patch, 
 YARN-3304.patch, yarn-3304-5.patch


 Per discussions in YARN-3296, getCpuUsagePercent() will return -1 in the 
 unavailable case while other resource metrics return 0 in the same case, 
 which is inconsistent.





[jira] [Commented] (YARN-3304) ResourceCalculatorProcessTree#getCpuUsagePercent default return value is inconsistent with other getters

2015-03-30 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14387656#comment-14387656
 ] 

Hitesh Shah commented on YARN-3304:
---

Forgot to add: we use this for resource monitoring of a task within a 
container. Given that we run multiple tasks within the same container, the 
stability of this API becomes even more important, as YARN cannot provide 
resource monitoring at the granularity that we need.

 ResourceCalculatorProcessTree#getCpuUsagePercent default return value is 
 inconsistent with other getters
 

 Key: YARN-3304
 URL: https://issues.apache.org/jira/browse/YARN-3304
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Reporter: Junping Du
Assignee: Junping Du
Priority: Blocker
 Fix For: 2.7.0

 Attachments: YARN-3304-v2.patch, YARN-3304-v3.patch, 
 YARN-3304-v4-boolean-way.patch, YARN-3304-v4-negative-way-MR.patch, 
 YARN-3304-v4-negtive-value-way.patch, YARN-3304-v6-no-rename.patch, 
 YARN-3304-v6-with-rename.patch, YARN-3304-v7.patch, YARN-3304-v8.patch, 
 YARN-3304.patch, yarn-3304-5.patch


 Per discussions in YARN-3296, getCpuUsagePercent() will return -1 in the 
 unavailable case while other resource metrics return 0 in the same case, 
 which is inconsistent.




