[jira] [Commented] (YARN-1758) MiniYARNCluster broken post YARN-1666

2014-03-02 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13917511#comment-13917511
 ] 

Hitesh Shah commented on YARN-1758:
---

[~xgong] Have you given thought to YARN-1759 as part of this fix?

 MiniYARNCluster broken post YARN-1666
 -

 Key: YARN-1758
 URL: https://issues.apache.org/jira/browse/YARN-1758
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Hitesh Shah
Assignee: Xuan Gong
Priority: Blocker
 Attachments: YARN-1758.1.patch, YARN-1758.2.patch


 NPE seen when trying to use MiniYARNCluster



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-1021) Yarn Scheduler Load Simulator

2014-03-02 Thread Qi Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1021?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13917585#comment-13917585
 ] 

Qi Zhang commented on YARN-1021:


Hi @Wei Yan. I am trying to use SLS but always hit the following 
exception. Can you tell me what the cause is? Thank you!

-bash-3.2$ sudo sh share/hadoop/tools/sls/bin/slsrun.sh --input-rumen=share/hadoop/tools/sls/sample-data/2jobs2min-rumen-jh.json --output-dir=share/hadoop/tools/sls/sample_output
log4j:WARN No appenders could be found for logger (org.apache.hadoop.util.Shell).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
Java HotSpot(TM) 64-Bit Server VM warning: You have loaded library /usr/local/hadoop-2.3.0/lib/native/libhadoop.so.1.0.0 which might have disabled stack guard. The VM will try to fix the stack guard now.
It's highly recommended that you fix the library with 'execstack -c libfile', or link it with '-z noexecstack'.
java.lang.NullPointerException
    at org.apache.hadoop.yarn.sls.web.SLSWebApp.init(SLSWebApp.java:82)
    at org.apache.hadoop.yarn.sls.scheduler.ResourceSchedulerWrapper.initMetrics(ResourceSchedulerWrapper.java:463)
    at org.apache.hadoop.yarn.sls.scheduler.ResourceSchedulerWrapper.setConf(ResourceSchedulerWrapper.java:162)
    at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:73)
    at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133)
    at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.createScheduler(ResourceManager.java:230)
    at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMActiveServices.serviceInit(ResourceManager.java:355)
    at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
    at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.createAndInitActiveServices(ResourceManager.java:775)
    at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceInit(ResourceManager.java:197)
    at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
    at org.apache.hadoop.yarn.sls.SLSRunner.startRM(SLSRunner.java:163)
    at org.apache.hadoop.yarn.sls.SLSRunner.start(SLSRunner.java:137)
    at org.apache.hadoop.yarn.sls.SLSRunner.main(SLSRunner.java:524)
Exception in thread pool-2-thread-72 java.lang.NullPointerException
    at org.apache.hadoop.yarn.sls.scheduler.ResourceSchedulerWrapper.addAMRuntime(ResourceSchedulerWrapper.java:721)
    at org.apache.hadoop.yarn.sls.appmaster.AMSimulator.lastStep(AMSimulator.java:196)
    at org.apache.hadoop.yarn.sls.appmaster.MRAMSimulator.lastStep(MRAMSimulator.java:390)
    at org.apache.hadoop.yarn.sls.scheduler.TaskRunner$Task.run(TaskRunner.java:94)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:744)
Exception in thread pool-2-thread-98 java.lang.NullPointerException
    at org.apache.hadoop.yarn.sls.scheduler.ResourceSchedulerWrapper.addAMRuntime(ResourceSchedulerWrapper.java:721)
    at org.apache.hadoop.yarn.sls.appmaster.AMSimulator.lastStep(AMSimulator.java:196)
    at org.apache.hadoop.yarn.sls.appmaster.MRAMSimulator.lastStep(MRAMSimulator.java:390)
    at org.apache.hadoop.yarn.sls.scheduler.TaskRunner$Task.run(TaskRunner.java:94)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:744)

 Yarn Scheduler Load Simulator
 -

 Key: YARN-1021
 URL: https://issues.apache.org/jira/browse/YARN-1021
 Project: Hadoop YARN
  Issue Type: New Feature
  Components: scheduler
Reporter: Wei Yan
Assignee: Wei Yan
 Fix For: 2.3.0

 Attachments: YARN-1021-demo.tar.gz, YARN-1021-images.tar.gz, 
 YARN-1021.patch, YARN-1021.patch, YARN-1021.patch, YARN-1021.patch, 
 YARN-1021.patch, YARN-1021.patch, YARN-1021.patch, YARN-1021.patch, 
 YARN-1021.patch, YARN-1021.patch, YARN-1021.patch, YARN-1021.patch, 
 YARN-1021.patch, YARN-1021.patch, YARN-1021.patch, YARN-1021.pdf


 The Yarn Scheduler is a fertile area of interest with different 
 implementations, e.g., Fifo, Capacity and Fair schedulers. Meanwhile, 
 several optimizations have also been made to improve scheduler performance for 
 different scenarios and workloads. Each scheduler algorithm has its own set of 
 features and drives scheduling decisions by many factors, such as fairness, 
 capacity guarantee, resource 

[jira] [Commented] (YARN-1021) Yarn Scheduler Load Simulator

2014-03-02 Thread Wei Yan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1021?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13917634#comment-13917634
 ] 

Wei Yan commented on YARN-1021:
---

[~qzhang90] Check the resource simulate.info.html.template. It looks like SLS 
cannot find it.
Change into the sls directory and try again:
cd share/hadoop/tools/sls; bin/slsrun.sh --input-rumen=sample-data/2jobs2min-rumen-jh.json --output-dir=sample_output


 Yarn Scheduler Load Simulator
 -

 Key: YARN-1021
 URL: https://issues.apache.org/jira/browse/YARN-1021
 Project: Hadoop YARN
  Issue Type: New Feature
  Components: scheduler
Reporter: Wei Yan
Assignee: Wei Yan
 Fix For: 2.3.0

 Attachments: YARN-1021-demo.tar.gz, YARN-1021-images.tar.gz, 
 YARN-1021.patch, YARN-1021.patch, YARN-1021.patch, YARN-1021.patch, 
 YARN-1021.patch, YARN-1021.patch, YARN-1021.patch, YARN-1021.patch, 
 YARN-1021.patch, YARN-1021.patch, YARN-1021.patch, YARN-1021.patch, 
 YARN-1021.patch, YARN-1021.patch, YARN-1021.patch, YARN-1021.pdf


 The Yarn Scheduler is a fertile area of interest with different 
 implementations, e.g., Fifo, Capacity and Fair schedulers. Meanwhile, 
 several optimizations have also been made to improve scheduler performance for 
 different scenarios and workloads. Each scheduler algorithm has its own set of 
 features and drives scheduling decisions by many factors, such as fairness, 
 capacity guarantee, resource availability, etc. It is very important to 
 evaluate a scheduler algorithm thoroughly before we deploy it in a production 
 cluster. Unfortunately, it is currently non-trivial to evaluate a scheduling 
 algorithm. Evaluating in a real cluster is always costly and time-consuming, 
 and it is also very hard to find a large enough cluster. Hence, a simulator 
 which can predict how well a scheduler algorithm performs for a specific 
 workload would be quite useful.
 We want to build a Scheduler Load Simulator to simulate large-scale Yarn 
 clusters and application loads on a single machine. This would be invaluable 
 in furthering Yarn by providing a tool for researchers and developers to 
 prototype new scheduler features and predict their behavior and performance 
 with a reasonable amount of confidence, thereby aiding rapid innovation.
 The simulator will exercise the real Yarn ResourceManager, removing the 
 network factor by simulating NodeManagers and ApplicationMasters via handling 
 and dispatching NM/AM heartbeat events from within the same JVM.
 To keep track of scheduler behavior and performance, a scheduler wrapper 
 will wrap the real scheduler.
 The simulator will produce real-time metrics while executing, including:
 * Resource usage for the whole cluster and each queue, which can be used to 
 configure cluster and queue capacity.
 * The detailed application execution trace (recorded in relation to simulated 
 time), which can be analyzed to understand/validate the scheduler behavior 
 (individual job turnaround time, throughput, fairness, capacity guarantee, 
 etc.).
 * Several key metrics of the scheduler algorithm, such as the time cost of each 
 scheduler operation (allocate, handle, etc.), which can be used by Hadoop 
 developers to find hot spots in the code and scalability limits.
 The simulator will provide real-time charts showing the behavior of the 
 scheduler and its performance.
 A short demo is available at http://www.youtube.com/watch?v=6thLi8q0qLE, showing 
 how to use the simulator to simulate the Fair Scheduler and Capacity Scheduler.





[jira] [Commented] (YARN-1021) Yarn Scheduler Load Simulator

2014-03-02 Thread Qi Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1021?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13917641#comment-13917641
 ] 

Qi Zhang commented on YARN-1021:


Wei Yan, thank you for your suggestion. It solves the problem! 
I had tried to run slsrun.sh from many other directories, but not from 
share/hadoop/tools/sls. I think it would be more straightforward if slsrun.sh 
could be executed from any path.



[jira] [Updated] (YARN-1761) RMAdminCLI should check whether HA is enabled before executes transitionToActive/transitionToStandby

2014-03-02 Thread Xuan Gong (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1761?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuan Gong updated YARN-1761:


Attachment: YARN-1761.1.patch

 RMAdminCLI should check whether HA is enabled before executes 
 transitionToActive/transitionToStandby
 

 Key: YARN-1761
 URL: https://issues.apache.org/jira/browse/YARN-1761
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Xuan Gong
Assignee: Xuan Gong
 Attachments: YARN-1761.1.patch








[jira] [Commented] (YARN-1761) RMAdminCLI should check whether HA is enabled before executes transitionToActive/transitionToStandby

2014-03-02 Thread Xuan Gong (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1761?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13917648#comment-13917648
 ] 

Xuan Gong commented on YARN-1761:
-

Created a patch that uses ConfigurationProvider to load the Configuration and 
checks whether RM HA is enabled.
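
A guard of this kind might look roughly like the sketch below. It is not the code in YARN-1761.1.patch: the configuration is modeled as a plain Map instead of Hadoop's Configuration, and the class name, method names, and return codes are hypothetical. Only the property key yarn.resourcemanager.ha.enabled is taken from YARN.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the admin CLI; this is not the real RMAdminCLI.
final class HaGuardSketch {
    // Real YARN property key; everything else in this class is illustrative.
    static final String RM_HA_ENABLED = "yarn.resourcemanager.ha.enabled";

    // Returns true only when the key is present and set to "true".
    static boolean isHaEnabled(Map<String, String> conf) {
        return Boolean.parseBoolean(conf.getOrDefault(RM_HA_ENABLED, "false"));
    }

    // Refuses the transition (returns -1) when HA is off; returns 0 otherwise.
    static int transitionToActive(Map<String, String> conf, String rmId) {
        if (!isHaEnabled(conf)) {
            System.err.println("Cannot run transitionToActive: ResourceManager HA is not enabled.");
            return -1;
        }
        // A real CLI would issue the RPC to the ResourceManager here.
        System.out.println("Transitioning " + rmId + " to active");
        return 0;
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        System.out.println(transitionToActive(conf, "rm1")); // HA off: rejected
        conf.put(RM_HA_ENABLED, "true");
        System.out.println(transitionToActive(conf, "rm1")); // HA on: accepted
    }
}
```

Checking the flag before any RPC is attempted lets the command fail with a clear message instead of an error from the ResourceManager side.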



[jira] [Commented] (YARN-1761) RMAdminCLI should check whether HA is enabled before executes transitionToActive/transitionToStandby

2014-03-02 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1761?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13917660#comment-13917660
 ] 

Hadoop QA commented on YARN-1761:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12632168/YARN-1761.1.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/3227//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3227//console

This message is automatically generated.



[jira] [Commented] (YARN-1759) Configuration settings can potentially disappear post YARN-1666

2014-03-02 Thread Xuan Gong (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1759?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13917755#comment-13917755
 ] 

Xuan Gong commented on YARN-1759:
-

Could you explain why you think this will cause issues? I do not think it 
will.

We have two ConfigurationProvider implementations right now. 
The first one is LocalConfigurationProvider. With it, we load the local 
core-site.xml and the local yarn-site.xml twice. I think this should be fine: 
it will not change any property values.

The other one is FileSystemBasedConfigurationProvider. We load the local 
core-site.xml and the local yarn-site.xml first as the bootstrap configuration, 
then load the remote configuration, which overwrites everything. So if we 
choose FileSystemBasedConfigurationProvider, we should upload the 
configuration we want to use to the remote FileSystem.
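
The load order described above can be sketched as follows. This is a toy model, not Hadoop's Configuration class: each file is represented as a Map, resources are applied in order, and a later resource overrides an earlier one for the same key. The class name and the sample values are hypothetical.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy model of resource layering; not Hadoop's Configuration class.
final class ConfigLayeringSketch {
    // Applies resources in order; a later resource wins for a duplicate key.
    static Map<String, String> load(List<Map<String, String>> resources) {
        Map<String, String> merged = new LinkedHashMap<>();
        for (Map<String, String> resource : resources) {
            merged.putAll(resource);
        }
        return merged;
    }

    public static void main(String[] args) {
        // Hypothetical file contents.
        Map<String, String> localCore =
            Collections.singletonMap("fs.defaultFS", "file:///");
        Map<String, String> localYarn =
            Collections.singletonMap("yarn.resourcemanager.hostname", "localhost");
        Map<String, String> remoteYarn =
            Collections.singletonMap("yarn.resourcemanager.hostname", "rm.example.com");

        // Local provider case: the same local files are loaded twice.
        System.out.println(load(Arrays.asList(localCore, localYarn, localCore, localYarn)));
        // File-system-based case: local bootstrap first, then the remote copy.
        System.out.println(load(Arrays.asList(localCore, localYarn, remoteYarn)));
    }
}
```

In this model, loading the same files a second time is idempotent, while a remote resource replaces any bootstrap value it also defines.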


 Configuration settings can potentially disappear post YARN-1666
 ---

 Key: YARN-1759
 URL: https://issues.apache.org/jira/browse/YARN-1759
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Hitesh Shah

 By implicitly loading core-site and yarn-site again in the RM::serviceInit(), 
 some configs may be unintentionally overridden.





[jira] [Commented] (YARN-1758) MiniYARNCluster broken post YARN-1666

2014-03-02 Thread Xuan Gong (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13917757#comment-13917757
 ] 

Xuan Gong commented on YARN-1758:
-

bq. Xuan Gong Have you given thought to YARN-1759 as part of this fix?

I have some comments in YARN-1759. We can start the discussion there. These two 
tickets are not closely related.



[jira] [Created] (YARN-1775) Create SMAPBasedProcessTree to get PSS information

2014-03-02 Thread Rajesh Balamohan (JIRA)
Rajesh Balamohan created YARN-1775:
--

 Summary: Create SMAPBasedProcessTree to get PSS information
 Key: YARN-1775
 URL: https://issues.apache.org/jira/browse/YARN-1775
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Rajesh Balamohan
Priority: Minor


Create SMAPBasedProcessTree (by extending ProcfsBasedProcessTree), which will 
make use of PSS for computing the memory usage. 
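
As a rough illustration of the idea, the sketch below sums the Pss fields found in smaps-formatted text. It is not the proposed SMAPBasedProcessTree: the class and method names are hypothetical, and it parses a string rather than reading /proc/<pid>/smaps so that it runs on any platform. A process tree would repeat this for every process in the tree and add up the results.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; not the SMAPBasedProcessTree proposed in this issue.
final class SmapsPssSketch {
    // Matches lines such as "Pss:                  12 kB". Anchoring at the
    // start of the line keeps fields like "SwapPss:" from being counted.
    private static final Pattern PSS_LINE =
        Pattern.compile("^Pss:\\s+(\\d+)\\s+kB\\s*$", Pattern.MULTILINE);

    // Sums the Pss values (in kB) of all mappings in the given smaps text.
    static long totalPssKb(String smapsText) {
        long total = 0;
        Matcher m = PSS_LINE.matcher(smapsText);
        while (m.find()) {
            total += Long.parseLong(m.group(1));
        }
        return total;
    }

    public static void main(String[] args) {
        // Abbreviated, made-up sample with two mappings.
        String sample =
              "00400000-0040b000 r-xp 00000000 08:01 1234 /bin/cat\n"
            + "Size:                 44 kB\n"
            + "Rss:                  40 kB\n"
            + "Pss:                  12 kB\n"
            + "7f0000000000-7f0000021000 rw-p 00000000 00:00 0\n"
            + "Size:                132 kB\n"
            + "Rss:                  20 kB\n"
            + "Pss:                  18 kB\n";
        System.out.println(totalPssKb(sample) + " kB");
    }
}
```

PSS divides each shared page among the processes that map it, so summing it across a process tree avoids counting shared pages once per process the way RSS does.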





[jira] [Updated] (YARN-1389) ApplicationClientProtocol and ApplicationHistoryProtocol should expose analog APIs

2014-03-02 Thread Mayank Bansal (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1389?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mayank Bansal updated YARN-1389:


Attachment: YARN-1389-2.patch

Attaching patch with compilation fix.

Thanks,
Mayank

 ApplicationClientProtocol and ApplicationHistoryProtocol should expose analog 
 APIs
 --

 Key: YARN-1389
 URL: https://issues.apache.org/jira/browse/YARN-1389
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Mayank Bansal
Assignee: Mayank Bansal
 Attachments: YARN-1389-1.patch, YARN-1389-2.patch


 As we plan to have the APIs in ApplicationHistoryProtocol to expose the 
 reports of *finished* application attempts and containers, we should do the 
 same for ApplicationClientProtocol, which will return the reports of 
 *running* attempts and containers.
 Later on, we can improve YarnClient to direct queries for running instances 
 to ApplicationClientProtocol and queries for finished instances to 
 ApplicationHistoryProtocol, making this transparent to users.
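
 One way such routing could work is sketched below, under stated assumptions. ReportSource is a hypothetical interface standing in for both protocols, reports are plain strings, and the lookup order (running first, then history) is an assumption rather than something this issue specifies. It is not the YarnClient implementation.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical routing sketch; not the real YarnClient.
final class ReportRoutingSketch {
    // Stand-in for ApplicationClientProtocol and ApplicationHistoryProtocol.
    interface ReportSource {
        Optional<String> getContainerReport(String containerId);
    }

    // Builds a source backed by a map, for demonstration only.
    static ReportSource fromMap(Map<String, String> reports) {
        return id -> Optional.ofNullable(reports.get(id));
    }

    // Asks the source for running instances first, then falls back to history.
    static Optional<String> getContainerReport(
            ReportSource running, ReportSource history, String containerId) {
        Optional<String> report = running.getContainerReport(containerId);
        return report.isPresent() ? report : history.getContainerReport(containerId);
    }

    public static void main(String[] args) {
        Map<String, String> runningReports = new HashMap<>();
        runningReports.put("container_1", "RUNNING");
        Map<String, String> finishedReports = new HashMap<>();
        finishedReports.put("container_2", "COMPLETE");

        ReportSource running = fromMap(runningReports);
        ReportSource history = fromMap(finishedReports);
        System.out.println(getContainerReport(running, history, "container_1"));
        System.out.println(getContainerReport(running, history, "container_2"));
        System.out.println(getContainerReport(running, history, "container_3"));
    }
}
```

 The caller sees a single lookup method and does not need to know which protocol answered.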


