[jira] [Updated] (YARN-3874) Combine FS Reader and Writer Implementations
[ https://issues.apache.org/jira/browse/YARN-3874?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Varun Saxena updated YARN-3874: --- Issue Type: Sub-task (was: Bug) Parent: YARN-2928 Combine FS Reader and Writer Implementations Key: YARN-3874 URL: https://issues.apache.org/jira/browse/YARN-3874 Project: Hadoop YARN Issue Type: Sub-task Reporter: Varun Saxena Combine FS Reader and Writer Implementations and make them consistent with each other. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (YARN-3874) Combine FS Reader and Writer Implementations
Varun Saxena created YARN-3874: -- Summary: Combine FS Reader and Writer Implementations Key: YARN-3874 URL: https://issues.apache.org/jira/browse/YARN-3874 Project: Hadoop YARN Issue Type: Bug Reporter: Varun Saxena Combine FS Reader and Writer Implementations and make them consistent with each other. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3798) ZKRMStateStore shouldn't create new session without occurrence of SESSIONEXPIRED
[ https://issues.apache.org/jira/browse/YARN-3798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14609769#comment-14609769 ] zhihai xu commented on YARN-3798: - Thanks for the new patch, [~ozawa]! sync() is an asynchronous operation; the result is returned via an AsyncCallback. Should we wait for the result from the AsyncCallback to make sure the sync operation is done at the ZooKeeper server? Should we also {{createConnection}} for SessionMovedException, similarly to SessionExpiredException, to avoid a regression, since ZOOKEEPER-2219 is not fixed yet? Should we sync the RM ZK root path {{zkRootNodePath}} for safety purposes? ZKRMStateStore shouldn't create new session without occurrence of SESSIONEXPIRED --- Key: YARN-3798 URL: https://issues.apache.org/jira/browse/YARN-3798 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Affects Versions: 2.7.0 Environment: Suse 11 Sp3 Reporter: Bibin A Chundatt Assignee: Varun Saxena Priority: Blocker Attachments: RM.log, YARN-3798-2.7.002.patch, YARN-3798-branch-2.7.002.patch, YARN-3798-branch-2.7.003.patch, YARN-3798-branch-2.7.patch RM going down with NoNode exception during create of znode for appattempt *Please find the exception logs* {code} 2015-06-09 10:09:44,732 INFO org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore: ZKRMStateStore Session connected 2015-06-09 10:09:44,732 INFO org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore: ZKRMStateStore Session restored 2015-06-09 10:09:44,886 INFO org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore: Exception while executing a ZK operation. org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode at org.apache.zookeeper.KeeperException.create(KeeperException.java:115) at org.apache.zookeeper.ZooKeeper.multiInternal(ZooKeeper.java:1405) at org.apache.zookeeper.ZooKeeper.multi(ZooKeeper.java:1310) at org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore$4.run(ZKRMStateStore.java:926) at org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore$4.run(ZKRMStateStore.java:923) at org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore$ZKAction.runWithCheck(ZKRMStateStore.java:1101) at org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore$ZKAction.runWithRetries(ZKRMStateStore.java:1122) at org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore.doStoreMultiWithRetries(ZKRMStateStore.java:923) at org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore.doStoreMultiWithRetries(ZKRMStateStore.java:937) at org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore.createWithRetries(ZKRMStateStore.java:970) at org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore.updateApplicationAttemptStateInternal(ZKRMStateStore.java:671) at org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore$UpdateAppAttemptTransition.transition(RMStateStore.java:275) at org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore$UpdateAppAttemptTransition.transition(RMStateStore.java:260) at org.apache.hadoop.yarn.state.StateMachineFactory$SingleInternalArc.doTransition(StateMachineFactory.java:362) at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302) at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46) at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448) at
org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore.handleStoreEvent(RMStateStore.java:837) at org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore$ForwardingEventHandler.handle(RMStateStore.java:900) at org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore$ForwardingEventHandler.handle(RMStateStore.java:895) at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:175) at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:108) at java.lang.Thread.run(Thread.java:745) 2015-06-09 10:09:44,887 INFO org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore: Maxed out ZK retries. Giving up! 2015-06-09 10:09:44,887 ERROR org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore: Error updating appAttempt: appattempt_1433764310492_7152_01 org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode at
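Since ZooKeeper's sync() only schedules the operation, a caller has to block on the callback to know the server-side sync actually finished. A minimal sketch of that wait, assuming direct access to a ZooKeeper handle; the helper name and timeout handling are illustrative, not taken from the patch.
{code}
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import org.apache.zookeeper.AsyncCallback.VoidCallback;
import org.apache.zookeeper.KeeperException;
import org.apache.zookeeper.ZooKeeper;

public class SyncBarrier {
  /**
   * Issues an asynchronous sync() and blocks until the server-side sync
   * completes, so subsequent reads are guaranteed to see the latest state.
   */
  static void syncAndWait(ZooKeeper zk, String path, long timeoutMs)
      throws KeeperException, InterruptedException {
    final CountDownLatch latch = new CountDownLatch(1);
    final int[] resultCode = new int[1];
    zk.sync(path, new VoidCallback() {
      @Override
      public void processResult(int rc, String p, Object ctx) {
        resultCode[0] = rc;   // return code from the server
        latch.countDown();    // wake up the waiting thread
      }
    }, null);
    if (!latch.await(timeoutMs, TimeUnit.MILLISECONDS)) {
      throw new RuntimeException("Timed out waiting for sync on " + path);
    }
    KeeperException.Code code = KeeperException.Code.get(resultCode[0]);
    if (code != KeeperException.Code.OK) {
      throw KeeperException.create(code, path);
    }
  }
}
{code}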
[jira] [Assigned] (YARN-2953) TestWorkPreservingRMRestart fails on trunk
[ https://issues.apache.org/jira/browse/YARN-2953?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] nijel reassigned YARN-2953: --- Assignee: nijel TestWorkPreservingRMRestart fails on trunk -- Key: YARN-2953 URL: https://issues.apache.org/jira/browse/YARN-2953 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Reporter: Rohith Sharma K S Assignee: nijel Running org.apache.hadoop.yarn.server.resourcemanager.TestWorkPreservingRMRestart Tests run: 36, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 337.034 sec FAILURE! - in org.apache.hadoop.yarn.server.resourcemanager.TestWorkPreservingRMRestart testReleasedContainerNotRecovered[0](org.apache.hadoop.yarn.server.resourcemanager.TestWorkPreservingRMRestart) Time elapsed: 30.031 sec ERROR! java.lang.Exception: test timed out after 30000 milliseconds at java.lang.Thread.sleep(Native Method) at org.apache.hadoop.yarn.server.resourcemanager.MockRM.waitForState(MockRM.java:131) at org.apache.hadoop.yarn.server.resourcemanager.MockRM.launchAndRegisterAM(MockRM.java:670) at org.apache.hadoop.yarn.server.resourcemanager.TestWorkPreservingRMRestart.testReleasedContainerNotRecovered(TestWorkPreservingRMRestart.java:850) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (YARN-3875) FSSchedulerNode#reserveResource() not printing applicationID
Bibin A Chundatt created YARN-3875: -- Summary: FSSchedulerNode#reserveResource() not printing applicationID Key: YARN-3875 URL: https://issues.apache.org/jira/browse/YARN-3875 Project: Hadoop YARN Issue Type: Bug Reporter: Bibin A Chundatt Assignee: Bibin A Chundatt Priority: Minor FSSchedulerNode#reserveResource() {code} LOG.info("Updated reserved container " + container.getContainer().getId() + " on node " + this + " for application " + application); } else { LOG.info("Reserved container " + container.getContainer().getId() + " on node " + this + " for application " + application); } {code} Update the last operand to application.getApplicationId(). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
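For reference, a sketch of the suggested change; only the two log statements are shown, and the surrounding guard ({{alreadyReserved}}) is assumed from context rather than copied from the source.
{code}
// Log the application id rather than the FSAppAttempt object, whose
// default toString() prints an unreadable class@hashcode string.
if (alreadyReserved) {   // hypothetical guard; the real method branches here
  LOG.info("Updated reserved container " + container.getContainer().getId()
      + " on node " + this + " for application "
      + application.getApplicationId());
} else {
  LOG.info("Reserved container " + container.getContainer().getId()
      + " on node " + this + " for application "
      + application.getApplicationId());
}
{code}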
[jira] [Assigned] (YARN-3874) Combine FS Reader and Writer Implementations
[ https://issues.apache.org/jira/browse/YARN-3874?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Varun Saxena reassigned YARN-3874: -- Assignee: Varun Saxena Combine FS Reader and Writer Implementations Key: YARN-3874 URL: https://issues.apache.org/jira/browse/YARN-3874 Project: Hadoop YARN Issue Type: Sub-task Reporter: Varun Saxena Assignee: Varun Saxena Combine FS Reader and Writer Implementations and make them consistent with each other. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2953) TestWorkPreservingRMRestart fails on trunk
[ https://issues.apache.org/jira/browse/YARN-2953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14609771#comment-14609771 ] nijel commented on YARN-2953: - Hi [~rohithsharma] This test case is passing in recent code, and I see the timeout was increased ( @Test (timeout = 5)). This happened with the following check-in: {code} Revision: 5f57b904f550515693d93a2959e663b0d0260696 Author: Jian He jia...@apache.org Date: 31-12-2014 05:05:45 Message: YARN-2492. Added node-labels page on RM web UI. Contributed by Wangda Tan {code} Could you please validate this issue? TestWorkPreservingRMRestart fails on trunk -- Key: YARN-2953 URL: https://issues.apache.org/jira/browse/YARN-2953 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Reporter: Rohith Sharma K S Running org.apache.hadoop.yarn.server.resourcemanager.TestWorkPreservingRMRestart Tests run: 36, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 337.034 sec FAILURE! - in org.apache.hadoop.yarn.server.resourcemanager.TestWorkPreservingRMRestart testReleasedContainerNotRecovered[0](org.apache.hadoop.yarn.server.resourcemanager.TestWorkPreservingRMRestart) Time elapsed: 30.031 sec ERROR! java.lang.Exception: test timed out after 30000 milliseconds at java.lang.Thread.sleep(Native Method) at org.apache.hadoop.yarn.server.resourcemanager.MockRM.waitForState(MockRM.java:131) at org.apache.hadoop.yarn.server.resourcemanager.MockRM.launchAndRegisterAM(MockRM.java:670) at org.apache.hadoop.yarn.server.resourcemanager.TestWorkPreservingRMRestart.testReleasedContainerNotRecovered(TestWorkPreservingRMRestart.java:850) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3830) AbstractYarnScheduler.createReleaseCache may try to clean a null attempt
[ https://issues.apache.org/jira/browse/YARN-3830?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] nijel updated YARN-3830: Attachment: YARN-3830_4.patch Thanks [~devaraj.k] for the suggestion. Updated the patch with a test case; please review. AbstractYarnScheduler.createReleaseCache may try to clean a null attempt Key: YARN-3830 URL: https://issues.apache.org/jira/browse/YARN-3830 Project: Hadoop YARN Issue Type: Bug Reporter: nijel Assignee: nijel Attachments: YARN-3830_1.patch, YARN-3830_2.patch, YARN-3830_3.patch, YARN-3830_4.patch org.apache.hadoop.yarn.server.resourcemanager.scheduler.AbstractYarnScheduler.createReleaseCache() {code} protected void createReleaseCache() { // Cleanup the cache after nm expire interval. new Timer().schedule(new TimerTask() { @Override public void run() { for (SchedulerApplication<T> app : applications.values()) { T attempt = app.getCurrentAppAttempt(); synchronized (attempt) { for (ContainerId containerId : attempt.getPendingRelease()) { RMAuditLogger.logFailure( {code} Here the attempt can be null since the attempt is created later, so a NullPointerException will occur: {code} 2015-06-19 09:29:16,195 | ERROR | Timer-3 | Thread Thread[Timer-3,5,main] threw an Exception. | YarnUncaughtExceptionHandler.java:68 java.lang.NullPointerException at org.apache.hadoop.yarn.server.resourcemanager.scheduler.AbstractYarnScheduler$1.run(AbstractYarnScheduler.java:457) at java.util.TimerThread.mainLoop(Timer.java:555) at java.util.TimerThread.run(Timer.java:505) {code} This will skip the other applications in this run. We can add a null check and continue with the other applications. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
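A sketch of the suggested guard, reusing the names from the snippet above; the logFailure arguments are elided in the original quote and stay elided here.
{code}
for (SchedulerApplication<T> app : applications.values()) {
  T attempt = app.getCurrentAppAttempt();
  if (attempt == null) {
    // The attempt is created later, so it may not exist yet; skip this
    // app so the timer task still cleans up the remaining applications.
    continue;
  }
  synchronized (attempt) {
    for (ContainerId containerId : attempt.getPendingRelease()) {
      RMAuditLogger.logFailure( // arguments unchanged from the original
    }
  }
}
{code}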
[jira] [Updated] (YARN-3875) FSSchedulerNode#reserveResource() not printing applicationID
[ https://issues.apache.org/jira/browse/YARN-3875?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bibin A Chundatt updated YARN-3875: --- Attachment: 0001-YARN-3875.patch Currently logs are shown as below: {code} Reserved container container_e08_1435660809935_0008_01_000670 on node host: host-10-19-92-117:64318 #containers=6 available=<memory:0, vCores:10> used=<memory:3072, vCores:6> for application org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSAppAttempt@1c1a7108 {code} Patch uploaded, please review. FSSchedulerNode#reserveResource() not printing applicationID Key: YARN-3875 URL: https://issues.apache.org/jira/browse/YARN-3875 Project: Hadoop YARN Issue Type: Bug Reporter: Bibin A Chundatt Assignee: Bibin A Chundatt Priority: Minor Attachments: 0001-YARN-3875.patch FSSchedulerNode#reserveResource() {code} LOG.info("Updated reserved container " + container.getContainer().getId() + " on node " + this + " for application " + application); } else { LOG.info("Reserved container " + container.getContainer().getId() + " on node " + this + " for application " + application); } {code} Update the last operand to application.getApplicationId(). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-2681) Support bandwidth enforcement for containers while reading from HDFS
[ https://issues.apache.org/jira/browse/YARN-2681?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] cntic updated YARN-2681: Attachment: YARN-2681.patch Support bandwidth enforcement for containers while reading from HDFS Key: YARN-2681 URL: https://issues.apache.org/jira/browse/YARN-2681 Project: Hadoop YARN Issue Type: New Feature Components: nodemanager Affects Versions: 2.5.1 Environment: Linux Reporter: cntic Labels: BB2015-05-TBR Fix For: 2.7.0 Attachments: HADOOP-2681.02.patch, HADOOP-2681.patch, HADOOP-2681.patch, HdfsTrafficControl_UML.png, Traffic Control Design.png, YARN-2681.patch To read/write data from HDFS on a data node, applications establish TCP/IP connections with the datanode. The HDFS read can be controlled by configuring the Linux Traffic Control (TC) subsystem on the data node to apply filters to the appropriate connections. The current cgroups net_cls concept cannot be applied on the node where the container is launched, nor on the data node, since: - TC handles outgoing bandwidth only, so it cannot be set on the container node (HDFS read = incoming data for the container) - Since the HDFS data node is handled by only one process, it is not possible to use net_cls to separate connections from different containers to the datanode. Tasks: 1) Extend the Resource model to define a bandwidth enforcement rate 2) Monitor TCP/IP connections established by the container-handling process and its child processes 3) Set Linux Traffic Control rules on the data node based on address:port pairs in order to enforce the bandwidth of outgoing data Concept: http://www.hit.bme.hu/~do/papers/EnforcementDesign.pdf Implementation: http://www.hit.bme.hu/~dohoai/documents/HdfsTrafficControl.pdf -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-2681) Support bandwidth enforcement for containers while reading from HDFS
[ https://issues.apache.org/jira/browse/YARN-2681?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] cntic updated YARN-2681: Attachment: (was: HADOOP-2681.02.patch) Support bandwidth enforcement for containers while reading from HDFS Key: YARN-2681 URL: https://issues.apache.org/jira/browse/YARN-2681 Project: Hadoop YARN Issue Type: New Feature Components: nodemanager Affects Versions: 2.5.1 Environment: Linux Reporter: cntic Labels: BB2015-05-TBR Fix For: 2.7.0 Attachments: HADOOP-2681.patch, HADOOP-2681.patch, HdfsTrafficControl_UML.png, Traffic Control Design.png, YARN-2681.patch To read/write data from HDFS on a data node, applications establish TCP/IP connections with the datanode. The HDFS read can be controlled by configuring the Linux Traffic Control (TC) subsystem on the data node to apply filters to the appropriate connections. The current cgroups net_cls concept cannot be applied on the node where the container is launched, nor on the data node, since: - TC handles outgoing bandwidth only, so it cannot be set on the container node (HDFS read = incoming data for the container) - Since the HDFS data node is handled by only one process, it is not possible to use net_cls to separate connections from different containers to the datanode. Tasks: 1) Extend the Resource model to define a bandwidth enforcement rate 2) Monitor TCP/IP connections established by the container-handling process and its child processes 3) Set Linux Traffic Control rules on the data node based on address:port pairs in order to enforce the bandwidth of outgoing data Concept: http://www.hit.bme.hu/~do/papers/EnforcementDesign.pdf Implementation: http://www.hit.bme.hu/~dohoai/documents/HdfsTrafficControl.pdf -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2194) Cgroups cease to work in RHEL7
[ https://issues.apache.org/jira/browse/YARN-2194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14610773#comment-14610773 ] Varun Vasudev commented on YARN-2194: - I tested it with multiple local dirs as well. Any chance you can attach the yarn-site.xml you used (or send it to me offline)? Cgroups cease to work in RHEL7 -- Key: YARN-2194 URL: https://issues.apache.org/jira/browse/YARN-2194 Project: Hadoop YARN Issue Type: Bug Components: nodemanager Affects Versions: 2.7.0 Reporter: Wei Yan Assignee: Wei Yan Priority: Critical Attachments: YARN-2194-1.patch, YARN-2194-2.patch, YARN-2194-3.patch, YARN-2194-4.patch, YARN-2194-5.patch, YARN-2194-6.patch In RHEL7, the CPU controller is named cpu,cpuacct. The comma in the controller name leads to container launch failure. RHEL7 deprecates libcgroup and recommends the use of systemd. However, systemd has certain shortcomings as identified in this JIRA (see comments). This JIRA only fixes the failure, and doesn't try to use systemd. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2194) Cgroups cease to work in RHEL7
[ https://issues.apache.org/jira/browse/YARN-2194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14610821#comment-14610821 ] Sidharta Seethana commented on YARN-2194: - [~kasha], I have run into such issues when I forgot to rebuild container-executor (it requires a different maven profile to be used). So, a shot in the dark: did you rebuild the container-executor binary? :) Cgroups cease to work in RHEL7 -- Key: YARN-2194 URL: https://issues.apache.org/jira/browse/YARN-2194 Project: Hadoop YARN Issue Type: Bug Components: nodemanager Affects Versions: 2.7.0 Reporter: Wei Yan Assignee: Wei Yan Priority: Critical Attachments: YARN-2194-1.patch, YARN-2194-2.patch, YARN-2194-3.patch, YARN-2194-4.patch, YARN-2194-5.patch, YARN-2194-6.patch In RHEL7, the CPU controller is named cpu,cpuacct. The comma in the controller name leads to container launch failure. RHEL7 deprecates libcgroup and recommends the use of systemd. However, systemd has certain shortcomings as identified in this JIRA (see comments). This JIRA only fixes the failure, and doesn't try to use systemd. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3827) Migrate YARN native build to new CMake framework
[ https://issues.apache.org/jira/browse/YARN-3827?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14610565#comment-14610565 ] Hudson commented on YARN-3827: -- SUCCESS: Integrated in Hadoop-Mapreduce-trunk-Java8 #243 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/243/]) YARN-3827. Migrate YARN native build to new CMake framework (Alan Burlison via Colin P. McCabe) (cmccabe: rev d0cc0380b57db5fdeb41775bb9ca42dac65928b8) * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/CMakeLists.txt Migrate YARN native build to new CMake framework Key: YARN-3827 URL: https://issues.apache.org/jira/browse/YARN-3827 Project: Hadoop YARN Issue Type: Sub-task Components: build Affects Versions: 2.7.0 Reporter: Alan Burlison Assignee: Alan Burlison Fix For: 2.8.0 Attachments: YARN-3827.001.patch As per HADOOP-12036, the CMake infrastructure should be refactored and made common across all Hadoop components. This bug covers the migration of YARN to the new CMake infrastructure. This change will also add support for building YARN Native components on Solaris. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3841) [Storage implementation] Create HDFS backing storage implementation for ATS writes
[ https://issues.apache.org/jira/browse/YARN-3841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsuyoshi Ozawa updated YARN-3841: - Attachment: YARN-3841.001.patch Attaching a first patch that converts the implementation into a FileSystem-based one. [~sjlee0] [~zjshen] could you take a look? [Storage implementation] Create HDFS backing storage implementation for ATS writes -- Key: YARN-3841 URL: https://issues.apache.org/jira/browse/YARN-3841 Project: Hadoop YARN Issue Type: Sub-task Components: timelineserver Reporter: Tsuyoshi Ozawa Assignee: Tsuyoshi Ozawa Attachments: YARN-3841.001.patch HDFS backing storage is useful for the following scenarios. 1. For Hadoop clusters which don't run HBase. 2. For fallback from HBase when the HBase cluster is temporarily unavailable. Quoting the ATS design document of YARN-2928: {quote} In the case the HBase storage is not available, the plugin should buffer the writes temporarily (e.g. HDFS), and flush them once the storage comes back online. Reading and writing to hdfs as the backup storage could potentially use the HDFS writer plugin unless the complexity of generalizing the HDFS writer plugin for this purpose exceeds the benefits of reusing it here. {quote} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
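As a rough illustration of the approach (not the attached patch), a FileSystem-based writer appends entity records through the Hadoop FileSystem API, which resolves to HDFS or a local filesystem depending on the configuration; the class name, path layout, and record format below are assumptions.
{code}
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class FsEntityWriter {
  // Hypothetical root directory for per-application entity files.
  private static final String ROOT = "/tmp/timeline";

  public static void writeEntity(Configuration conf, String appId,
      String entityJson) throws IOException {
    FileSystem fs = FileSystem.get(conf);  // HDFS or local, per fs.defaultFS
    Path file = new Path(ROOT, appId + ".entities");
    // Create the file on the first write, append afterwards; note that
    // append is not supported by every FileSystem implementation.
    try (FSDataOutputStream out = fs.exists(file)
        ? fs.append(file) : fs.create(file, false)) {
      out.writeBytes(entityJson + "\n");
    }
  }
}
{code}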
[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers
[ https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14610626#comment-14610626 ] Varun Saxena commented on YARN-3051: Any reason metrics and events in TimelineEntity are stored in a set? A map would make some operations easier and more efficient in the case of the FS implementation. [Storage abstraction] Create backing storage read interface for ATS readers --- Key: YARN-3051 URL: https://issues.apache.org/jira/browse/YARN-3051 Project: Hadoop YARN Issue Type: Sub-task Components: timelineserver Affects Versions: YARN-2928 Reporter: Sangjin Lee Assignee: Varun Saxena Attachments: YARN-3051-YARN-2928.003.patch, YARN-3051-YARN-2928.03.patch, YARN-3051-YARN-2928.04.patch, YARN-3051-YARN-2928.05.patch, YARN-3051-YARN-2928.06.patch, YARN-3051.Reader_API.patch, YARN-3051.Reader_API_1.patch, YARN-3051.Reader_API_2.patch, YARN-3051.Reader_API_3.patch, YARN-3051.Reader_API_4.patch, YARN-3051.wip.02.YARN-2928.patch, YARN-3051.wip.patch, YARN-3051_temp.patch Per design in YARN-2928, create backing storage read interface that can be implemented by multiple backing storage implementations. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
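To make the set-vs-map point concrete, a small self-contained sketch; {{Metric}} here is a stand-in class, not the actual timeline type. With a Set, merging an update into an existing metric requires a scan, while keying by metric id makes the lookup direct.
{code}
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Illustrative only: "Metric" stands in for the timeline metric type.
class Metric {
  private final String id;
  Metric(String id) { this.id = id; }
  String getId() { return id; }
}

public class SetVsMap {
  public static void main(String[] args) {
    // With a Set, locating the metric to merge requires a full scan.
    Set<Metric> metricSet = new HashSet<>();
    metricSet.add(new Metric("MEMORY"));
    Metric found = null;
    for (Metric m : metricSet) {
      if (m.getId().equals("MEMORY")) { found = m; break; }
    }

    // With a Map keyed by metric id, the lookup is direct.
    Map<String, Metric> metricMap = new HashMap<>();
    metricMap.put("MEMORY", new Metric("MEMORY"));
    Metric direct = metricMap.get("MEMORY");
    System.out.println(found.getId() + " / " + direct.getId());
  }
}
{code}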
[jira] [Commented] (YARN-3823) Fix mismatch in default values for yarn.scheduler.maximum-allocation-vcores property
[ https://issues.apache.org/jira/browse/YARN-3823?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14610571#comment-14610571 ] Hudson commented on YARN-3823: -- SUCCESS: Integrated in Hadoop-Mapreduce-trunk-Java8 #243 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/243/]) YARN-3823. Fix mismatch in default values for (devaraj: rev 7405c59799ed1b8ad1a7c6f1b18fabf49d0b92b2) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/yarn-default.xml * hadoop-yarn-project/CHANGES.txt Fix mismatch in default values for yarn.scheduler.maximum-allocation-vcores property Key: YARN-3823 URL: https://issues.apache.org/jira/browse/YARN-3823 Project: Hadoop YARN Issue Type: Bug Affects Versions: 2.7.0 Reporter: Ray Chiang Assignee: Ray Chiang Priority: Minor Labels: supportability Fix For: 2.8.0 Attachments: YARN-3823.001.patch, YARN-3823.002.patch In yarn-default.xml, the property is defined as: XML Property: yarn.scheduler.maximum-allocation-vcores XML Value: 32 In YarnConfiguration.java the corresponding member variable is defined as: Config Name: DEFAULT_RM_SCHEDULER_MAXIMUM_ALLOCATION_VCORES Config Value: 4 The Config value comes from YARN-193 and the default xml property comes from YARN-2. Should we keep it this way or should one of the values get updated? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3768) ArrayIndexOutOfBoundsException with empty environment variables
[ https://issues.apache.org/jira/browse/YARN-3768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14610569#comment-14610569 ] Hudson commented on YARN-3768: -- SUCCESS: Integrated in Hadoop-Mapreduce-trunk-Java8 #243 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/243/]) YARN-3768. ArrayIndexOutOfBoundsException with empty environment variables. (Zhihai Xu via gera) (gera: rev 6f2a41e37d0b36cdafcfff75125165f212c612a6) * hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/Shell.java * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/util/TestApps.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/util/Apps.java ArrayIndexOutOfBoundsException with empty environment variables --- Key: YARN-3768 URL: https://issues.apache.org/jira/browse/YARN-3768 Project: Hadoop YARN Issue Type: Bug Components: yarn Affects Versions: 2.5.0 Reporter: Joe Ferner Assignee: zhihai xu Fix For: 2.8.0 Attachments: YARN-3768.000.patch, YARN-3768.001.patch, YARN-3768.002.patch, YARN-3768.003.patch, YARN-3768.004.patch Looking at line 80 of org.apache.hadoop.yarn.util.Apps an index out of range exception occurs if an environment variable is encountered without a value. {code} java.lang.ArrayIndexOutOfBoundsException: 1 at org.apache.hadoop.yarn.util.Apps.setEnvFromInputString(Apps.java:80) {code} I believe this occurs because java will not return empty strings from the split method. Similar to this http://stackoverflow.com/questions/14602062/java-string-split-removed-empty-values -- This message was sent by Atlassian JIRA (v6.3.4#6332)
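The split behavior described above is easy to reproduce: by default Java drops trailing empty strings, so an environment variable with no value yields a one-element array, while a negative limit preserves the empty value. A minimal demonstration:
{code}
public class SplitDemo {
  public static void main(String[] args) {
    // "FOO=".split("=") yields ["FOO"]: the trailing empty value is
    // dropped, so accessing parts[1] throws ArrayIndexOutOfBoundsException.
    String[] parts = "FOO=".split("=");
    System.out.println(parts.length);        // 1

    // A negative limit keeps trailing empty strings: ["FOO", ""].
    String[] safe = "FOO=".split("=", -1);
    System.out.println(safe.length);         // 2
    System.out.println("value='" + safe[1] + "'");
  }
}
{code}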
[jira] [Updated] (YARN-2681) Support bandwidth enforcement for containers while reading from HDFS
[ https://issues.apache.org/jira/browse/YARN-2681?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] cntic updated YARN-2681: Target Version/s: 2.7.2 Fix Version/s: (was: 2.7.2) 2.7.0 Support bandwidth enforcement for containers while reading from HDFS Key: YARN-2681 URL: https://issues.apache.org/jira/browse/YARN-2681 Project: Hadoop YARN Issue Type: New Feature Components: nodemanager Affects Versions: 2.5.1 Environment: Linux Reporter: cntic Labels: BB2015-05-TBR Fix For: 2.7.0 Attachments: HADOOP-2681.patch, HADOOP-2681.patch, Traffic Control Design.png To read/write data from HDFS on a data node, applications establish TCP/IP connections with the datanode. The HDFS read can be controlled by configuring the Linux Traffic Control (TC) subsystem on the data node to apply filters to the appropriate connections. The current cgroups net_cls concept cannot be applied on the node where the container is launched, nor on the data node, since: - TC handles outgoing bandwidth only, so it cannot be set on the container node (HDFS read = incoming data for the container) - Since the HDFS data node is handled by only one process, it is not possible to use net_cls to separate connections from different containers to the datanode. Tasks: 1) Extend the Resource model to define a bandwidth enforcement rate 2) Monitor TCP/IP connections established by the container-handling process and its child processes 3) Set Linux Traffic Control rules on the data node based on address:port pairs in order to enforce the bandwidth of outgoing data Concept: http://www.hit.bme.hu/~do/papers/EnforcementDesign.pdf -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (YARN-3877) YarnClientImpl.submitApplication swallows exceptions
[ https://issues.apache.org/jira/browse/YARN-3877?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Varun Saxena reassigned YARN-3877: -- Assignee: Varun Saxena YarnClientImpl.submitApplication swallows exceptions Key: YARN-3877 URL: https://issues.apache.org/jira/browse/YARN-3877 Project: Hadoop YARN Issue Type: Improvement Components: client Affects Versions: 2.7.2 Reporter: Steve Loughran Assignee: Varun Saxena Priority: Minor When {{YarnClientImpl.submitApplication}} spins waiting for the application to be accepted, any interruption during its sleep() calls is logged and swallowed. This makes it hard to interrupt the thread during shutdown. Really it should throw some form of exception and let the caller deal with it. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3508) Preemption processing occurring on the main RM dispatcher
[ https://issues.apache.org/jira/browse/YARN-3508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14610668#comment-14610668 ] Varun Saxena commented on YARN-3508: [~leftnoteasy], the timed-out test {{TestNodeLabelContainerAllocation}} is unrelated and will be handled by YARN-3848. Maybe you can review that JIRA too :) Preemption processing occurring on the main RM dispatcher Key: YARN-3508 URL: https://issues.apache.org/jira/browse/YARN-3508 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager, scheduler Affects Versions: 2.6.0 Reporter: Jason Lowe Assignee: Varun Saxena Attachments: YARN-3508.002.patch, YARN-3508.01.patch, YARN-3508.03.patch, YARN-3508.04.patch, YARN-3508.05.patch, YARN-3508.06.patch We recently saw the RM for a large cluster lag far behind on the AsyncDispatcher event queue. The AsyncDispatcher thread was consistently blocked on the highly-contended CapacityScheduler lock trying to dispatch preemption-related events for RMContainerPreemptEventDispatcher. Preemption processing should occur on the scheduler event dispatcher thread or a separate thread to avoid delaying the processing of other events in the primary dispatcher queue. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3528) Tests with 12345 as hard-coded port break jenkins
[ https://issues.apache.org/jira/browse/YARN-3528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14610715#comment-14610715 ] Varun Saxena commented on YARN-3528: [~brahmareddy], I looked at your code. It's somewhat repetitive. IIUC, what you are trying to achieve here is to first try the passed port and then randomize. You can change the code as follows. Disclaimer: I haven't tested it, but it should work. {code} public static int getPort(int port, int retries) throws IOException { Random rand = new Random(); int tryPort = port; int tries = 0; while (true) { if (tries > 0) { tryPort = port + rand.nextInt(65535 - port); } LOG.info("Using port " + tryPort); try (ServerSocket s = new ServerSocket(tryPort)) { return tryPort; } catch (IOException e) { tries++; if (tries >= retries) { LOG.info("Port is already in use; giving up"); throw e; } else { LOG.info("Port is already in use; trying again"); } } } } {code} Tests with 12345 as hard-coded port break jenkins - Key: YARN-3528 URL: https://issues.apache.org/jira/browse/YARN-3528 Project: Hadoop YARN Issue Type: Bug Affects Versions: 3.0.0 Environment: ASF Jenkins Reporter: Steve Loughran Assignee: Brahma Reddy Battula Priority: Blocker Labels: test Attachments: YARN-3528-002.patch, YARN-3528.patch A lot of the YARN tests have hard-coded the port 12345 for their services to come up on. This makes it impossible for scheduled or precommit tests to run consistently on the ASF jenkins hosts. Instead the tests fail regularly and appear to get ignored completely. A quick grep of 12345 shows up many places in the test suite where this practice has developed. * All {{BaseContainerManagerTest}} subclasses * {{TestNodeManagerShutdown}} * {{TestContainerManager}} + others This needs to be addressed through port scanning and dynamic port allocation. Please can someone do this. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
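A hypothetical caller, just to show the intended contract of the sketch above; {{PortUtil}} is an assumed holder class for getPort, not part of the patch.
{code}
import java.io.IOException;
import java.net.ServerSocket;

public class GetPortDemo {
  public static void main(String[] args) throws IOException {
    // Try port 12345 first, then up to 4 randomized retries.
    int port = PortUtil.getPort(12345, 5);
    try (ServerSocket server = new ServerSocket(port)) {
      System.out.println("Bound to " + server.getLocalPort());
    }
  }
}
{code}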
[jira] [Updated] (YARN-2004) Priority scheduling support in Capacity scheduler
[ https://issues.apache.org/jira/browse/YARN-2004?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sunil G updated YARN-2004: -- Attachment: 0010-YARN-2004.patch Hi [~leftnoteasy] Thank you very much for sharing the comments. I have updated the patch addressing the comments. - applicationComparator is still kept here. I raised a ticket to remove it; once that is done, I will rebase this patch. - {{FairScheduler#getAppWeight}} I understood the idea. I feel we can have this later as an improvement once the base version is done. How do you feel? Priority scheduling support in Capacity scheduler - Key: YARN-2004 URL: https://issues.apache.org/jira/browse/YARN-2004 Project: Hadoop YARN Issue Type: Sub-task Components: capacityscheduler Reporter: Sunil G Assignee: Sunil G Attachments: 0001-YARN-2004.patch, 0002-YARN-2004.patch, 0003-YARN-2004.patch, 0004-YARN-2004.patch, 0005-YARN-2004.patch, 0006-YARN-2004.patch, 0007-YARN-2004.patch, 0008-YARN-2004.patch, 0009-YARN-2004.patch, 0010-YARN-2004.patch Based on the priority of the application, Capacity Scheduler should be able to give preference to applications while scheduling. Comparator<FiCaSchedulerApp> applicationComparator can be changed as below. 1. Check for application priority. If priority is available, then return the highest priority job. 2. Otherwise continue with the existing logic, such as App ID comparison and then TimeStamp comparison. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
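A sketch of the two-step comparison the description outlines, under the assumption that a higher integer value means higher priority; the class and field names are illustrative, not the patch (the real code compares FiCaSchedulerApp objects).
{code}
import java.util.Comparator;

// Illustrative app handle standing in for FiCaSchedulerApp.
class App {
  final int priority;    // higher value = higher priority (assumed)
  final long appId;      // monotonically increasing application id
  final long submitTime;
  App(int priority, long appId, long submitTime) {
    this.priority = priority; this.appId = appId; this.submitTime = submitTime;
  }
}

public class PriorityComparator implements Comparator<App> {
  @Override
  public int compare(App a, App b) {
    // 1. Check application priority: higher priority orders first.
    if (a.priority != b.priority) {
      return Integer.compare(b.priority, a.priority);
    }
    // 2. Otherwise fall back to the existing ordering:
    //    app id comparison, then timestamp comparison.
    if (a.appId != b.appId) {
      return Long.compare(a.appId, b.appId);
    }
    return Long.compare(a.submitTime, b.submitTime);
  }
}
{code}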
[jira] [Updated] (YARN-2681) Support bandwidth enforcement for containers while reading from HDFS
[ https://issues.apache.org/jira/browse/YARN-2681?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] cntic updated YARN-2681: Attachment: (was: HDFS-2681.02.patch) Support bandwidth enforcement for containers while reading from HDFS Key: YARN-2681 URL: https://issues.apache.org/jira/browse/YARN-2681 Project: Hadoop YARN Issue Type: New Feature Components: nodemanager Affects Versions: 2.5.1 Environment: Linux Reporter: cntic Labels: BB2015-05-TBR Fix For: 2.7.0 Attachments: HADOOP-2681.patch, HADOOP-2681.patch, HDFS-2681.02.patch, HdfsTrafficControl_UML.png, Traffic Control Design.png To read/write data from HDFS on a data node, applications establish TCP/IP connections with the datanode. The HDFS read can be controlled by configuring the Linux Traffic Control (TC) subsystem on the data node to apply filters to the appropriate connections. The current cgroups net_cls concept cannot be applied on the node where the container is launched, nor on the data node, since: - TC handles outgoing bandwidth only, so it cannot be set on the container node (HDFS read = incoming data for the container) - Since the HDFS data node is handled by only one process, it is not possible to use net_cls to separate connections from different containers to the datanode. Tasks: 1) Extend the Resource model to define a bandwidth enforcement rate 2) Monitor TCP/IP connections established by the container-handling process and its child processes 3) Set Linux Traffic Control rules on the data node based on address:port pairs in order to enforce the bandwidth of outgoing data Concept: http://www.hit.bme.hu/~do/papers/EnforcementDesign.pdf Implementation: http://www.hit.bme.hu/~dohoai/documents/HdfsTrafficControl.pdf -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-2681) Support bandwidth enforcement for containers while reading from HDFS
[ https://issues.apache.org/jira/browse/YARN-2681?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] cntic updated YARN-2681: Attachment: HDFS-2681.02.patch Merged with the latest trunk. Support bandwidth enforcement for containers while reading from HDFS Key: YARN-2681 URL: https://issues.apache.org/jira/browse/YARN-2681 Project: Hadoop YARN Issue Type: New Feature Components: nodemanager Affects Versions: 2.5.1 Environment: Linux Reporter: cntic Labels: BB2015-05-TBR Fix For: 2.7.0 Attachments: HADOOP-2681.patch, HADOOP-2681.patch, HDFS-2681.02.patch, HdfsTrafficControl_UML.png, Traffic Control Design.png To read/write data from HDFS on a data node, applications establish TCP/IP connections with the datanode. The HDFS read can be controlled by configuring the Linux Traffic Control (TC) subsystem on the data node to apply filters to the appropriate connections. The current cgroups net_cls concept cannot be applied on the node where the container is launched, nor on the data node, since: - TC handles outgoing bandwidth only, so it cannot be set on the container node (HDFS read = incoming data for the container) - Since the HDFS data node is handled by only one process, it is not possible to use net_cls to separate connections from different containers to the datanode. Tasks: 1) Extend the Resource model to define a bandwidth enforcement rate 2) Monitor TCP/IP connections established by the container-handling process and its child processes 3) Set Linux Traffic Control rules on the data node based on address:port pairs in order to enforce the bandwidth of outgoing data Concept: http://www.hit.bme.hu/~do/papers/EnforcementDesign.pdf Implementation: http://www.hit.bme.hu/~dohoai/documents/HdfsTrafficControl.pdf -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3848) TestNodeLabelContainerAllocation is timing out
[ https://issues.apache.org/jira/browse/YARN-3848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14610789#comment-14610789 ] Wangda Tan commented on YARN-3848: -- [~varun_saxena], could you take a look at my previous comment? I want to understand if this is the correct fix. Thanks, TestNodeLabelContainerAllocation is timing out -- Key: YARN-3848 URL: https://issues.apache.org/jira/browse/YARN-3848 Project: Hadoop YARN Issue Type: Bug Components: test Reporter: Jason Lowe Assignee: Varun Saxena Attachments: YARN-3848.01.patch, test_output.txt A number of builds, pre-commit and otherwise, have been failing recently because TestNodeLabelContainerAllocation has timed out. See https://builds.apache.org/job/Hadoop-Yarn-trunk/969/, YARN-3830, YARN-3802, or YARN-3826 for examples. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-2681) Support bandwidth enforcement for containers while reading from HDFS
[ https://issues.apache.org/jira/browse/YARN-2681?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] cntic updated YARN-2681: Attachment: HADOOP-2681.02.patch Support bandwidth enforcement for containers while reading from HDFS Key: YARN-2681 URL: https://issues.apache.org/jira/browse/YARN-2681 Project: Hadoop YARN Issue Type: New Feature Components: nodemanager Affects Versions: 2.5.1 Environment: Linux Reporter: cntic Labels: BB2015-05-TBR Fix For: 2.7.0 Attachments: HADOOP-2681.02.patch, HADOOP-2681.patch, HADOOP-2681.patch, HdfsTrafficControl_UML.png, Traffic Control Design.png To read/write data from HDFS on a data node, applications establish TCP/IP connections with the datanode. The HDFS read can be controlled by configuring the Linux Traffic Control (TC) subsystem on the data node to apply filters to the appropriate connections. The current cgroups net_cls concept cannot be applied on the node where the container is launched, nor on the data node, since: - TC handles outgoing bandwidth only, so it cannot be set on the container node (HDFS read = incoming data for the container) - Since the HDFS data node is handled by only one process, it is not possible to use net_cls to separate connections from different containers to the datanode. Tasks: 1) Extend the Resource model to define a bandwidth enforcement rate 2) Monitor TCP/IP connections established by the container-handling process and its child processes 3) Set Linux Traffic Control rules on the data node based on address:port pairs in order to enforce the bandwidth of outgoing data Concept: http://www.hit.bme.hu/~do/papers/EnforcementDesign.pdf Implementation: http://www.hit.bme.hu/~dohoai/documents/HdfsTrafficControl.pdf -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-2681) Support bandwidth enforcement for containers while reading from HDFS
[ https://issues.apache.org/jira/browse/YARN-2681?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] cntic updated YARN-2681: Attachment: (was: HDFS-2681.02.patch) Support bandwidth enforcement for containers while reading from HDFS Key: YARN-2681 URL: https://issues.apache.org/jira/browse/YARN-2681 Project: Hadoop YARN Issue Type: New Feature Components: nodemanager Affects Versions: 2.5.1 Environment: Linux Reporter: cntic Labels: BB2015-05-TBR Fix For: 2.7.0 Attachments: HADOOP-2681.02.patch, HADOOP-2681.patch, HADOOP-2681.patch, HdfsTrafficControl_UML.png, Traffic Control Design.png To read/write data from HDFS on a data node, applications establish TCP/IP connections with the datanode. The HDFS read can be controlled by configuring the Linux Traffic Control (TC) subsystem on the data node to apply filters to the appropriate connections. The current cgroups net_cls concept cannot be applied on the node where the container is launched, nor on the data node, since: - TC handles outgoing bandwidth only, so it cannot be set on the container node (HDFS read = incoming data for the container) - Since the HDFS data node is handled by only one process, it is not possible to use net_cls to separate connections from different containers to the datanode. Tasks: 1) Extend the Resource model to define a bandwidth enforcement rate 2) Monitor TCP/IP connections established by the container-handling process and its child processes 3) Set Linux Traffic Control rules on the data node based on address:port pairs in order to enforce the bandwidth of outgoing data Concept: http://www.hit.bme.hu/~do/papers/EnforcementDesign.pdf Implementation: http://www.hit.bme.hu/~dohoai/documents/HdfsTrafficControl.pdf -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-2681) Support bandwidth enforcement for containers while reading from HDFS
[ https://issues.apache.org/jira/browse/YARN-2681?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] cntic updated YARN-2681: Attachment: (was: HADOOP-2681.patch) Support bandwidth enforcement for containers while reading from HDFS Key: YARN-2681 URL: https://issues.apache.org/jira/browse/YARN-2681 Project: Hadoop YARN Issue Type: New Feature Components: nodemanager Affects Versions: 2.5.1 Environment: Linux Reporter: cntic Labels: BB2015-05-TBR Fix For: 2.7.0 Attachments: HdfsTrafficControl_UML.png, Traffic Control Design.png, YARN-2681.patch To read/write data from HDFS on a data node, applications establish TCP/IP connections with the datanode. The HDFS read can be controlled by configuring the Linux Traffic Control (TC) subsystem on the data node to apply filters to the appropriate connections. The current cgroups net_cls concept cannot be applied on the node where the container is launched, nor on the data node, since: - TC handles outgoing bandwidth only, so it cannot be set on the container node (HDFS read = incoming data for the container) - Since the HDFS data node is handled by only one process, it is not possible to use net_cls to separate connections from different containers to the datanode. Tasks: 1) Extend the Resource model to define a bandwidth enforcement rate 2) Monitor TCP/IP connections established by the container-handling process and its child processes 3) Set Linux Traffic Control rules on the data node based on address:port pairs in order to enforce the bandwidth of outgoing data Concept: http://www.hit.bme.hu/~do/papers/EnforcementDesign.pdf Implementation: http://www.hit.bme.hu/~dohoai/documents/HdfsTrafficControl.pdf -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-2681) Support bandwidth enforcement for containers while reading from HDFS
[ https://issues.apache.org/jira/browse/YARN-2681?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] cntic updated YARN-2681: Attachment: (was: HADOOP-2681.patch) Support bandwidth enforcement for containers while reading from HDFS Key: YARN-2681 URL: https://issues.apache.org/jira/browse/YARN-2681 Project: Hadoop YARN Issue Type: New Feature Components: nodemanager Affects Versions: 2.5.1 Environment: Linux Reporter: cntic Labels: BB2015-05-TBR Fix For: 2.7.0 Attachments: HdfsTrafficControl_UML.png, Traffic Control Design.png, YARN-2681.patch To read/write data from HDFS on a data node, applications establish TCP/IP connections with the datanode. The HDFS read can be controlled by configuring the Linux Traffic Control (TC) subsystem on the data node to apply filters to the appropriate connections. The current cgroups net_cls concept cannot be applied on the node where the container is launched, nor on the data node, since: - TC handles outgoing bandwidth only, so it cannot be set on the container node (HDFS read = incoming data for the container) - Since the HDFS data node is handled by only one process, it is not possible to use net_cls to separate connections from different containers to the datanode. Tasks: 1) Extend the Resource model to define a bandwidth enforcement rate 2) Monitor TCP/IP connections established by the container-handling process and its child processes 3) Set Linux Traffic Control rules on the data node based on address:port pairs in order to enforce the bandwidth of outgoing data Concept: http://www.hit.bme.hu/~do/papers/EnforcementDesign.pdf Implementation: http://www.hit.bme.hu/~dohoai/documents/HdfsTrafficControl.pdf -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3508) Preemption processing occurring on the main RM dispatcher
[ https://issues.apache.org/jira/browse/YARN-3508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14610788#comment-14610788 ] Wangda Tan commented on YARN-3508: -- Thanks for the update, [~varun_saxena]. Checkstyle is fine to me; the patch generally looks good. [~jianhe]/[~jlowe], could you take a look also? Preemption processing occurring on the main RM dispatcher Key: YARN-3508 URL: https://issues.apache.org/jira/browse/YARN-3508 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager, scheduler Affects Versions: 2.6.0 Reporter: Jason Lowe Assignee: Varun Saxena Attachments: YARN-3508.002.patch, YARN-3508.01.patch, YARN-3508.03.patch, YARN-3508.04.patch, YARN-3508.05.patch, YARN-3508.06.patch We recently saw the RM for a large cluster lag far behind on the AsyncDispatcher event queue. The AsyncDispatcher thread was consistently blocked on the highly-contended CapacityScheduler lock trying to dispatch preemption-related events for RMContainerPreemptEventDispatcher. Preemption processing should occur on the scheduler event dispatcher thread or a separate thread to avoid delaying the processing of other events in the primary dispatcher queue. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3849) Too much of preemption activity causing continuous killing of containers across queues
[ https://issues.apache.org/jira/browse/YARN-3849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14610815#comment-14610815 ] Wangda Tan commented on YARN-3849: -- Thanks [~sunilg], Some comments: 1) It seems we don't need useDominantResourceCalculator/rcDefault/rcDominant in TestP..Policy; passing a boolean parameter to buildPolicy should be enough, and you can also overload buildPolicy to avoid too many changes. 2) testPreemptionWithVCoreResource seems not correct: root.used != A.used + B.used 3) TestP..PolicyForNodePartitions: One comment is wrong: {code} + (1,1:2,n1,x,20,false); + // 80 * x in n1 b\t // app4 in b + (1,1:2,n2,,80,false); // 20 default in n2 {code} It should be 20 * x and 80 default. 4) It seems the TestP..PolicyForNodePartitions setting for DRC is missing; could you check? Too much of preemption activity causing continuous killing of containers across queues - Key: YARN-3849 URL: https://issues.apache.org/jira/browse/YARN-3849 Project: Hadoop YARN Issue Type: Bug Components: capacityscheduler Affects Versions: 2.7.0 Reporter: Sunil G Assignee: Sunil G Priority: Critical Attachments: 0001-YARN-3849.patch, 0002-YARN-3849.patch Two queues are used. Each queue has been given a capacity of 0.5. The Dominant Resource policy is used. 1. An app is submitted in QueueA which is consuming the full cluster capacity 2. After submitting an app in QueueB, there is some demand, and preemption is invoked in QueueA 3. Instead of killing only the excess over the 0.5 guaranteed capacity, we observed that all containers other than the AM are getting killed in QueueA 4. Now the app in QueueB tries to take over the cluster with the current free space. But there is some updated demand from the app in QueueA which lost its containers earlier, and preemption is kicked in QueueB now. The scenario in steps 3 and 4 keeps happening in a loop; thus none of the apps complete. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3840) Resource Manager web ui issue when sorting application by id (with application having id 9999)
[ https://issues.apache.org/jira/browse/YARN-3840?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14609868#comment-14609868 ] Mohammad Shahid Khan commented on YARN-3840: Hi Devaraj K, please ignore the first patch; in the current patch I have addressed the nodemanager web UI issue as well. The current patch uses the natural sort algorithm of natural.js, a plugin used with DataTables to sort the data. The natural sort plugin https://github.com/DataTables/Plugins/blob/1.10.7/sorting/natural.js has the MIT license. As per the MIT license we can redistribute the code, but we have to keep the license header. The Hadoop patch verification tool does not allow the author info, and as of now I have not removed the @author tag from the patch file. Please help me to address this issue. Resource Manager web ui issue when sorting application by id (with application having id 9999) Key: YARN-3840 URL: https://issues.apache.org/jira/browse/YARN-3840 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Affects Versions: 2.7.0 Reporter: LINTE Assignee: Mohammad Shahid Khan Attachments: RMApps.png, YARN-3840-1.patch, YARN-3840-2.patch On the WEBUI, the global main view page : http://resourcemanager:8088/cluster/apps doesn't display applications over 9999. With command line it works (# yarn application -list). Regards, Alexandre -- This message was sent by Atlassian JIRA (v6.3.4#6332)
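For reference, the idea behind the plugin is a natural-order comparison, where numeric runs compare as numbers rather than character by character. A rough Java equivalent (illustrative only, not the plugin's code):
{code}
import java.util.Comparator;

// Rough natural-order comparator: numeric runs compare as numbers, so
// "application_..._10000" sorts after "application_..._9999".
public class NaturalOrder implements Comparator<String> {
  @Override
  public int compare(String a, String b) {
    int i = 0, j = 0;
    while (i < a.length() && j < b.length()) {
      char ca = a.charAt(i), cb = b.charAt(j);
      if (Character.isDigit(ca) && Character.isDigit(cb)) {
        int si = i, sj = j;
        while (i < a.length() && Character.isDigit(a.charAt(i))) i++;
        while (j < b.length() && Character.isDigit(b.charAt(j))) j++;
        // A longer digit run is the larger number; equal-length runs
        // compare lexicographically (digit by digit).
        int cmp = (i - si) != (j - sj)
            ? Integer.compare(i - si, j - sj)
            : a.substring(si, i).compareTo(b.substring(sj, j));
        if (cmp != 0) return cmp;
      } else {
        if (ca != cb) return Character.compare(ca, cb);
        i++; j++;
      }
    }
    return Integer.compare(a.length() - i, b.length() - j);
  }
}
{code}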
[jira] [Commented] (YARN-3830) AbstractYarnScheduler.createReleaseCache may try to clean a null attempt
[ https://issues.apache.org/jira/browse/YARN-3830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14609965#comment-14609965 ] Hadoop QA commented on YARN-3830: - \\ \\ | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 15m 47s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 1 new or modified test files. | | {color:green}+1{color} | javac | 7m 34s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 9m 34s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 23s | The applied patch does not increase the total number of release audit warnings. | | {color:green}+1{color} | checkstyle | 0m 45s | There were no new checkstyle issues. | | {color:green}+1{color} | whitespace | 0m 1s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 36s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 33s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 1m 26s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | yarn tests | 50m 56s | Tests passed in hadoop-yarn-server-resourcemanager. | | | | 88m 38s | | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12743024/YARN-3830_4.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / 7405c59 | | hadoop-yarn-server-resourcemanager test log | https://builds.apache.org/job/PreCommit-YARN-Build/8403/artifact/patchprocess/testrun_hadoop-yarn-server-resourcemanager.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/8403/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf906.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/8403/console | This message was automatically generated. AbstractYarnScheduler.createReleaseCache may try to clean a null attempt Key: YARN-3830 URL: https://issues.apache.org/jira/browse/YARN-3830 Project: Hadoop YARN Issue Type: Bug Components: scheduler Reporter: nijel Assignee: nijel Attachments: YARN-3830_1.patch, YARN-3830_2.patch, YARN-3830_3.patch, YARN-3830_4.patch org.apache.hadoop.yarn.server.resourcemanager.scheduler.AbstractYarnScheduler.createReleaseCache() {code} protected void createReleaseCache() { // Cleanup the cache after nm expire interval. new Timer().schedule(new TimerTask() { @Override public void run() { for (SchedulerApplication<T> app : applications.values()) { T attempt = app.getCurrentAppAttempt(); synchronized (attempt) { for (ContainerId containerId : attempt.getPendingRelease()) { RMAuditLogger.logFailure( {code} Here the attempt can be null since the attempt is created later, so a NullPointerException will occur: {code} 2015-06-19 09:29:16,195 | ERROR | Timer-3 | Thread Thread[Timer-3,5,main] threw an Exception.
| YarnUncaughtExceptionHandler.java:68 java.lang.NullPointerException at org.apache.hadoop.yarn.server.resourcemanager.scheduler.AbstractYarnScheduler$1.run(AbstractYarnScheduler.java:457) at java.util.TimerThread.mainLoop(Timer.java:555) at java.util.TimerThread.run(Timer.java:505) {code} This will skip the other applications in this run. We can add a null check and continue with the other applications. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3823) Fix mismatch in default values for yarn.scheduler.maximum-allocation-vcores property
[ https://issues.apache.org/jira/browse/YARN-3823?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14609992#comment-14609992 ] Hudson commented on YARN-3823: -- FAILURE: Integrated in Hadoop-Yarn-trunk #975 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/975/]) YARN-3823. Fix mismatch in default values for (devaraj: rev 7405c59799ed1b8ad1a7c6f1b18fabf49d0b92b2) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/yarn-default.xml * hadoop-yarn-project/CHANGES.txt Fix mismatch in default values for yarn.scheduler.maximum-allocation-vcores property Key: YARN-3823 URL: https://issues.apache.org/jira/browse/YARN-3823 Project: Hadoop YARN Issue Type: Bug Affects Versions: 2.7.0 Reporter: Ray Chiang Assignee: Ray Chiang Priority: Minor Labels: supportability Fix For: 2.8.0 Attachments: YARN-3823.001.patch, YARN-3823.002.patch In yarn-default.xml, the property is defined as: XML Property: yarn.scheduler.maximum-allocation-vcores XML Value: 32 In YarnConfiguration.java the corresponding member variable is defined as: Config Name: DEFAULT_RM_SCHEDULER_MAXIMUM_ALLOCATION_VCORES Config Value: 4 The Config value comes from YARN-193 and the default xml property comes from YARN-2. Should we keep it this way or should one of the values get updated? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3768) ArrayIndexOutOfBoundsException with empty environment variables
[ https://issues.apache.org/jira/browse/YARN-3768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14609990#comment-14609990 ] Hudson commented on YARN-3768: -- FAILURE: Integrated in Hadoop-Yarn-trunk #975 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/975/]) YARN-3768. ArrayIndexOutOfBoundsException with empty environment variables. (Zhihai Xu via gera) (gera: rev 6f2a41e37d0b36cdafcfff75125165f212c612a6) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/util/TestApps.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/util/Apps.java * hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/Shell.java * hadoop-yarn-project/CHANGES.txt ArrayIndexOutOfBoundsException with empty environment variables --- Key: YARN-3768 URL: https://issues.apache.org/jira/browse/YARN-3768 Project: Hadoop YARN Issue Type: Bug Components: yarn Affects Versions: 2.5.0 Reporter: Joe Ferner Assignee: zhihai xu Fix For: 2.8.0 Attachments: YARN-3768.000.patch, YARN-3768.001.patch, YARN-3768.002.patch, YARN-3768.003.patch, YARN-3768.004.patch Looking at line 80 of org.apache.hadoop.yarn.util.Apps an index out of range exception occurs if an environment variable is encountered without a value. {code} java.lang.ArrayIndexOutOfBoundsException: 1 at org.apache.hadoop.yarn.util.Apps.setEnvFromInputString(Apps.java:80) {code} I believe this occurs because java will not return empty strings from the split method. Similar to this http://stackoverflow.com/questions/14602062/java-string-split-removed-empty-values -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3827) Migrate YARN native build to new CMake framework
[ https://issues.apache.org/jira/browse/YARN-3827?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14609985#comment-14609985 ] Hudson commented on YARN-3827: -- FAILURE: Integrated in Hadoop-Yarn-trunk #975 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/975/]) YARN-3827. Migrate YARN native build to new CMake framework (Alan Burlison via Colin P. McCabe) (cmccabe: rev d0cc0380b57db5fdeb41775bb9ca42dac65928b8) * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/CMakeLists.txt Migrate YARN native build to new CMake framework Key: YARN-3827 URL: https://issues.apache.org/jira/browse/YARN-3827 Project: Hadoop YARN Issue Type: Sub-task Components: build Affects Versions: 2.7.0 Reporter: Alan Burlison Assignee: Alan Burlison Fix For: 2.8.0 Attachments: YARN-3827.001.patch As per HADOOP-12036, the CMake infrastructure should be refactored and made common across all Hadoop components. This bug covers the migration of YARN to the new CMake infrastructure. This change will also add support for building YARN Native components on Solaris. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3768) ArrayIndexOutOfBoundsException with empty environment variables
[ https://issues.apache.org/jira/browse/YARN-3768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14610373#comment-14610373 ] Hudson commented on YARN-3768: -- FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #233 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/233/]) YARN-3768. ArrayIndexOutOfBoundsException with empty environment variables. (Zhihai Xu via gera) (gera: rev 6f2a41e37d0b36cdafcfff75125165f212c612a6) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/util/Apps.java * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/util/TestApps.java * hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/Shell.java ArrayIndexOutOfBoundsException with empty environment variables --- Key: YARN-3768 URL: https://issues.apache.org/jira/browse/YARN-3768 Project: Hadoop YARN Issue Type: Bug Components: yarn Affects Versions: 2.5.0 Reporter: Joe Ferner Assignee: zhihai xu Fix For: 2.8.0 Attachments: YARN-3768.000.patch, YARN-3768.001.patch, YARN-3768.002.patch, YARN-3768.003.patch, YARN-3768.004.patch Looking at line 80 of org.apache.hadoop.yarn.util.Apps an index out of range exception occurs if an environment variable is encountered without a value. {code} java.lang.ArrayIndexOutOfBoundsException: 1 at org.apache.hadoop.yarn.util.Apps.setEnvFromInputString(Apps.java:80) {code} I believe this occurs because java will not return empty strings from the split method. Similar to this http://stackoverflow.com/questions/14602062/java-string-split-removed-empty-values -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3823) Fix mismatch in default values for yarn.scheduler.maximum-allocation-vcores property
[ https://issues.apache.org/jira/browse/YARN-3823?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14610356#comment-14610356 ] Hudson commented on YARN-3823: -- FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #233 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/233/]) YARN-3823. Fix mismatch in default values for (devaraj: rev 7405c59799ed1b8ad1a7c6f1b18fabf49d0b92b2) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/yarn-default.xml * hadoop-yarn-project/CHANGES.txt Fix mismatch in default values for yarn.scheduler.maximum-allocation-vcores property Key: YARN-3823 URL: https://issues.apache.org/jira/browse/YARN-3823 Project: Hadoop YARN Issue Type: Bug Affects Versions: 2.7.0 Reporter: Ray Chiang Assignee: Ray Chiang Priority: Minor Labels: supportability Fix For: 2.8.0 Attachments: YARN-3823.001.patch, YARN-3823.002.patch In yarn-default.xml, the property is defined as: XML Property: yarn.scheduler.maximum-allocation-vcores XML Value: 32 In YarnConfiguration.java the corresponding member variable is defined as: Config Name: DEFAULT_RM_SCHEDULER_MAXIMUM_ALLOCATION_VCORES Config Value: 4 The Config value comes from YARN-193 and the default xml property comes from YARN-2. Should we keep it this way or should one of the values get updated? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3695) ServerProxy (NMProxy, etc.) shouldn't retry forever for non network exception.
[ https://issues.apache.org/jira/browse/YARN-3695?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14610352#comment-14610352 ] Hudson commented on YARN-3695: -- FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #233 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/233/]) YARN-3695. ServerProxy (NMProxy, etc.) shouldn't retry forever for non network exception. Contributed by Raju Bairishetti (jianhe: rev 62e583c7dcbb30d95d8b32a4978fbdb3b98d67cc) * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/client/ServerProxy.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/TestNMProxy.java ServerProxy (NMProxy, etc.) shouldn't retry forever for non network exception. -- Key: YARN-3695 URL: https://issues.apache.org/jira/browse/YARN-3695 Project: Hadoop YARN Issue Type: Bug Reporter: Junping Du Assignee: Raju Bairishetti Fix For: 2.8.0 Attachments: YARN-3695.01.patch, YARN-3695.patch YARN-3646 fixed the retry-forever policy in RMProxy so that it applies only to a limited set of exceptions rather than all exceptions. Here, we may need the same fix for ServerProxy (NMProxy). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
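For reference, a sketch of the pattern YARN-3646 applied (the maxRetries and retryIntervalMs values are illustrative assumptions, not from the patch): restrict retries to connectivity failures via RetryPolicies.retryByException and fail fast on everything else.
{code}
import java.net.ConnectException;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.TimeUnit;
import org.apache.hadoop.io.retry.RetryPolicies;
import org.apache.hadoop.io.retry.RetryPolicy;

// Assumed example values; real ones would come from configuration.
int maxRetries = 5;
long retryIntervalMs = 2000L;

// Retry only on ConnectException; any other exception fails immediately.
Map<Class<? extends Exception>, RetryPolicy> exceptionToPolicyMap = new HashMap<>();
exceptionToPolicyMap.put(ConnectException.class,
    RetryPolicies.retryUpToMaximumCountWithFixedSleep(
        maxRetries, retryIntervalMs, TimeUnit.MILLISECONDS));
RetryPolicy policy = RetryPolicies.retryByException(
    RetryPolicies.TRY_ONCE_THEN_FAIL, exceptionToPolicyMap);
{code}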
[jira] [Commented] (YARN-3827) Migrate YARN native build to new CMake framework
[ https://issues.apache.org/jira/browse/YARN-3827?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14610350#comment-14610350 ] Hudson commented on YARN-3827: -- FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #233 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/233/]) YARN-3827. Migrate YARN native build to new CMake framework (Alan Burlison via Colin P. McCabe) (cmccabe: rev d0cc0380b57db5fdeb41775bb9ca42dac65928b8) * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/CMakeLists.txt Migrate YARN native build to new CMake framework Key: YARN-3827 URL: https://issues.apache.org/jira/browse/YARN-3827 Project: Hadoop YARN Issue Type: Sub-task Components: build Affects Versions: 2.7.0 Reporter: Alan Burlison Assignee: Alan Burlison Fix For: 2.8.0 Attachments: YARN-3827.001.patch As per HADOOP-12036, the CMake infrastructure should be refactored and made common across all Hadoop components. This bug covers the migration of YARN to the new CMake infrastructure. This change will also add support for building YARN Native components on Solaris. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3849) Too much of preemption activity causing continuous killing of containers across queues
[ https://issues.apache.org/jira/browse/YARN-3849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14610250#comment-14610250 ] Sunil G commented on YARN-3849: --- Test case failures are not related to this patch. TestNodeLabelContainerAllocation is passing locally in trunk. Too much of preemption activity causing continuous killing of containers across queues - Key: YARN-3849 URL: https://issues.apache.org/jira/browse/YARN-3849 Project: Hadoop YARN Issue Type: Bug Components: capacityscheduler Affects Versions: 2.7.0 Reporter: Sunil G Assignee: Sunil G Priority: Critical Attachments: 0001-YARN-3849.patch, 0002-YARN-3849.patch Two queues are used. Each queue is given a capacity of 0.5. The Dominant Resource policy is used. 1. An app is submitted in QueueA, which consumes the full cluster capacity 2. After submitting an app in QueueB, there is some demand, invoking preemption in QueueA 3. Instead of killing only the excess over the 0.5 guaranteed capacity, we observed that all containers other than the AM are getting killed in QueueA 4. Now the app in QueueB tries to take over the cluster with the current free space. But there is some updated demand from the app in QueueA, which lost its containers earlier, and preemption is kicked in QueueB now. The scenario in steps 3 and 4 keeps happening in a loop; thus none of the apps complete. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3768) ArrayIndexOutOfBoundsException with empty environment variables
[ https://issues.apache.org/jira/browse/YARN-3768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14610331#comment-14610331 ] Hudson commented on YARN-3768: -- FAILURE: Integrated in Hadoop-Hdfs-trunk #2172 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/2172/]) YARN-3768. ArrayIndexOutOfBoundsException with empty environment variables. (Zhihai Xu via gera) (gera: rev 6f2a41e37d0b36cdafcfff75125165f212c612a6) * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/util/Apps.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/util/TestApps.java * hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/Shell.java ArrayIndexOutOfBoundsException with empty environment variables --- Key: YARN-3768 URL: https://issues.apache.org/jira/browse/YARN-3768 Project: Hadoop YARN Issue Type: Bug Components: yarn Affects Versions: 2.5.0 Reporter: Joe Ferner Assignee: zhihai xu Fix For: 2.8.0 Attachments: YARN-3768.000.patch, YARN-3768.001.patch, YARN-3768.002.patch, YARN-3768.003.patch, YARN-3768.004.patch Looking at line 80 of org.apache.hadoop.yarn.util.Apps an index out of range exception occurs if an environment variable is encountered without a value. {code} java.lang.ArrayIndexOutOfBoundsException: 1 at org.apache.hadoop.yarn.util.Apps.setEnvFromInputString(Apps.java:80) {code} I believe this occurs because java will not return empty strings from the split method. Similar to this http://stackoverflow.com/questions/14602062/java-string-split-removed-empty-values -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3830) AbstractYarnScheduler.createReleaseCache may try to clean a null attempt
[ https://issues.apache.org/jira/browse/YARN-3830?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Devaraj K updated YARN-3830: Hadoop Flags: Reviewed +1, latest patch looks good to me, will commit it shortly. AbstractYarnScheduler.createReleaseCache may try to clean a null attempt Key: YARN-3830 URL: https://issues.apache.org/jira/browse/YARN-3830 Project: Hadoop YARN Issue Type: Bug Components: scheduler Reporter: nijel Assignee: nijel Attachments: YARN-3830_1.patch, YARN-3830_2.patch, YARN-3830_3.patch, YARN-3830_4.patch org.apache.hadoop.yarn.server.resourcemanager.scheduler.AbstractYarnScheduler.createReleaseCache() {code} protected void createReleaseCache() { // Cleanup the cache after nm expire interval. new Timer().schedule(new TimerTask() { @Override public void run() { for (SchedulerApplication<T> app : applications.values()) { T attempt = app.getCurrentAppAttempt(); synchronized (attempt) { for (ContainerId containerId : attempt.getPendingRelease()) { RMAuditLogger.logFailure( {code} Here the attempt can be null, since the attempt is created later, so a NullPointerException will occur {code} 2015-06-19 09:29:16,195 | ERROR | Timer-3 | Thread Thread[Timer-3,5,main] threw an Exception. | YarnUncaughtExceptionHandler.java:68 java.lang.NullPointerException at org.apache.hadoop.yarn.server.resourcemanager.scheduler.AbstractYarnScheduler$1.run(AbstractYarnScheduler.java:457) at java.util.TimerThread.mainLoop(Timer.java:555) at java.util.TimerThread.run(Timer.java:505) {code} This will skip the other applications in this run. We can add a null check and continue with the other applications. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
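A sketch of the null check suggested in the description, assuming it is safe to skip an application whose attempt has not been created yet (only the guard is new; the surrounding loop is as quoted above):
{code}
for (SchedulerApplication<T> app : applications.values()) {
  T attempt = app.getCurrentAppAttempt();
  if (attempt == null) {
    continue;  // attempt is created later; skip it and keep cleaning the rest
  }
  synchronized (attempt) {
    // ... pending-release cleanup exactly as quoted above ...
  }
}
{code}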
[jira] [Commented] (YARN-3823) Fix mismatch in default values for yarn.scheduler.maximum-allocation-vcores property
[ https://issues.apache.org/jira/browse/YARN-3823?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14610314#comment-14610314 ] Hudson commented on YARN-3823: -- FAILURE: Integrated in Hadoop-Hdfs-trunk #2172 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/2172/]) YARN-3823. Fix mismatch in default values for (devaraj: rev 7405c59799ed1b8ad1a7c6f1b18fabf49d0b92b2) * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/yarn-default.xml Fix mismatch in default values for yarn.scheduler.maximum-allocation-vcores property Key: YARN-3823 URL: https://issues.apache.org/jira/browse/YARN-3823 Project: Hadoop YARN Issue Type: Bug Affects Versions: 2.7.0 Reporter: Ray Chiang Assignee: Ray Chiang Priority: Minor Labels: supportability Fix For: 2.8.0 Attachments: YARN-3823.001.patch, YARN-3823.002.patch In yarn-default.xml, the property is defined as: XML Property: yarn.scheduler.maximum-allocation-vcores XML Value: 32 In YarnConfiguration.java the corresponding member variable is defined as: Config Name: DEFAULT_RM_SCHEDULER_MAXIMUM_ALLOCATION_VCORES Config Value: 4 The Config value comes from YARN-193 and the default xml property comes from YARN-2. Should we keep it this way or should one of the values get updated? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3695) ServerProxy (NMProxy, etc.) shouldn't retry forever for non network exception.
[ https://issues.apache.org/jira/browse/YARN-3695?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14610310#comment-14610310 ] Hudson commented on YARN-3695: -- FAILURE: Integrated in Hadoop-Hdfs-trunk #2172 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/2172/]) YARN-3695. ServerProxy (NMProxy, etc.) shouldn't retry forever for non network exception. Contributed by Raju Bairishetti (jianhe: rev 62e583c7dcbb30d95d8b32a4978fbdb3b98d67cc) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/TestNMProxy.java * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/client/ServerProxy.java ServerProxy (NMProxy, etc.) shouldn't retry forever for non network exception. -- Key: YARN-3695 URL: https://issues.apache.org/jira/browse/YARN-3695 Project: Hadoop YARN Issue Type: Bug Reporter: Junping Du Assignee: Raju Bairishetti Fix For: 2.8.0 Attachments: YARN-3695.01.patch, YARN-3695.patch YARN-3646 fixed the retry-forever policy in RMProxy so that it applies only to a limited set of exceptions rather than all exceptions. Here, we may need the same fix for ServerProxy (NMProxy). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3827) Migrate YARN native build to new CMake framework
[ https://issues.apache.org/jira/browse/YARN-3827?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14610308#comment-14610308 ] Hudson commented on YARN-3827: -- FAILURE: Integrated in Hadoop-Hdfs-trunk #2172 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/2172/]) YARN-3827. Migrate YARN native build to new CMake framework (Alan Burlison via Colin P. McCabe) (cmccabe: rev d0cc0380b57db5fdeb41775bb9ca42dac65928b8) * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/CMakeLists.txt Migrate YARN native build to new CMake framework Key: YARN-3827 URL: https://issues.apache.org/jira/browse/YARN-3827 Project: Hadoop YARN Issue Type: Sub-task Components: build Affects Versions: 2.7.0 Reporter: Alan Burlison Assignee: Alan Burlison Fix For: 2.8.0 Attachments: YARN-3827.001.patch As per HADOOP-12036, the CMake infrastructure should be refactored and made common across all Hadoop components. This bug covers the migration of YARN to the new CMake infrastructure. This change will also add support for building YARN Native components on Solaris. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3770) SerializedException should also handle java.lang.Error
[ https://issues.apache.org/jira/browse/YARN-3770?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14610320#comment-14610320 ] Hudson commented on YARN-3770: -- FAILURE: Integrated in Hadoop-Hdfs-trunk #2172 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/2172/]) YARN-3770. SerializedException should also handle java.lang.Error on de-serialization. Contributed by Lavkesh Lahngir (jianhe: rev 4672315e2d6abe1cee0210cf7d3e8ab114ba933c) * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/api/records/impl/pb/SerializedExceptionPBImpl.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/api/records/impl/pb/TestSerializedExceptionPBImpl.java SerializedException should also handle java.lang.Error --- Key: YARN-3770 URL: https://issues.apache.org/jira/browse/YARN-3770 Project: Hadoop YARN Issue Type: Bug Reporter: Lavkesh Lahngir Assignee: Lavkesh Lahngir Fix For: 2.8.0 Attachments: YARN-3770.1.patch, YARN-3770.patch In the SerializedExceptionPBImpl deserialize() method {code} Class classType = null; if (YarnException.class.isAssignableFrom(realClass)) { classType = YarnException.class; } else if (IOException.class.isAssignableFrom(realClass)) { classType = IOException.class; } else if (RuntimeException.class.isAssignableFrom(realClass)) { classType = RuntimeException.class; } else { classType = Exception.class; } return instantiateException(realClass.asSubclass(classType), getMessage(), cause == null ? null : cause.deSerialize()); } {code} If realClass is a subclass of java.lang.Error, deSerialize() throws a ClassCastException. In the last else statement, classType should be Throwable.class instead of Exception.class. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
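A sketch of the proposed fix: widening the fallback to Throwable.class lets realClass.asSubclass(classType) accept java.lang.Error subclasses as well. Only the last branch changes relative to the quoted snippet:
{code}
Class<? extends Throwable> classType = null;
if (YarnException.class.isAssignableFrom(realClass)) {
  classType = YarnException.class;
} else if (IOException.class.isAssignableFrom(realClass)) {
  classType = IOException.class;
} else if (RuntimeException.class.isAssignableFrom(realClass)) {
  classType = RuntimeException.class;
} else {
  classType = Throwable.class;  // was Exception.class, which broke Error subclasses
}
// realClass.asSubclass(classType) now succeeds for Errors too; the rest of
// the method can proceed as in the quoted snippet.
{code}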
[jira] [Commented] (YARN-3770) SerializedException should also handle java.lang.Error
[ https://issues.apache.org/jira/browse/YARN-3770?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14610362#comment-14610362 ] Hudson commented on YARN-3770: -- FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #233 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/233/]) YARN-3770. SerializedException should also handle java.lang.Error on de-serialization. Contributed by Lavkesh Lahngir (jianhe: rev 4672315e2d6abe1cee0210cf7d3e8ab114ba933c) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/api/records/impl/pb/SerializedExceptionPBImpl.java * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/api/records/impl/pb/TestSerializedExceptionPBImpl.java SerializedException should also handle java.lang.Error --- Key: YARN-3770 URL: https://issues.apache.org/jira/browse/YARN-3770 Project: Hadoop YARN Issue Type: Bug Reporter: Lavkesh Lahngir Assignee: Lavkesh Lahngir Fix For: 2.8.0 Attachments: YARN-3770.1.patch, YARN-3770.patch In the SerializedExceptionPBImpl deserialize() method {code} Class classType = null; if (YarnException.class.isAssignableFrom(realClass)) { classType = YarnException.class; } else if (IOException.class.isAssignableFrom(realClass)) { classType = IOException.class; } else if (RuntimeException.class.isAssignableFrom(realClass)) { classType = RuntimeException.class; } else { classType = Exception.class; } return instantiateException(realClass.asSubclass(classType), getMessage(), cause == null ? null : cause.deSerialize()); } {code} If realClass is a subclass of java.lang.Error, deSerialize() throws a ClassCastException. In the last else statement, classType should be Throwable.class instead of Exception.class. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3844) Make hadoop-yarn-project Native code -Wall-clean
[ https://issues.apache.org/jira/browse/YARN-3844?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alan Burlison updated YARN-3844: Attachment: (was: YARN-3844.004.patch) Make hadoop-yarn-project Native code -Wall-clean Key: YARN-3844 URL: https://issues.apache.org/jira/browse/YARN-3844 Project: Hadoop YARN Issue Type: Sub-task Components: build Affects Versions: 2.7.0 Environment: As we specify -Wall as a default compilation flag, it would be helpful if the Native code was -Wall-clean Reporter: Alan Burlison Assignee: Alan Burlison Attachments: YARN-3844.001.patch, YARN-3844.002.patch As we specify -Wall as a default compilation flag, it would be helpful if the Native code was -Wall-clean -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3844) Make hadoop-yarn-project Native code -Wall-clean
[ https://issues.apache.org/jira/browse/YARN-3844?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alan Burlison updated YARN-3844: Attachment: YARN-3844.005.patch Updated patch with ILP32/LP64-independent casts and printf formats Make hadoop-yarn-project Native code -Wall-clean Key: YARN-3844 URL: https://issues.apache.org/jira/browse/YARN-3844 Project: Hadoop YARN Issue Type: Sub-task Components: build Affects Versions: 2.7.0 Environment: As we specify -Wall as a default compilation flag, it would be helpful if the Native code was -Wall-clean Reporter: Alan Burlison Assignee: Alan Burlison Attachments: YARN-3844.001.patch, YARN-3844.002.patch, YARN-3844.005.patch As we specify -Wall as a default compilation flag, it would be helpful if the Native code was -Wall-clean -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3875) FSSchedulerNode#reserveResource() not printing applicationID
[ https://issues.apache.org/jira/browse/YARN-3875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14610269#comment-14610269 ] Hadoop QA commented on YARN-3875: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 17m 12s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:red}-1{color} | tests included | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. | | {color:green}+1{color} | javac | 7m 56s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 10m 4s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 23s | The applied patch does not increase the total number of release audit warnings. | | {color:green}+1{color} | checkstyle | 0m 47s | There were no new checkstyle issues. | | {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 36s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 34s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 1m 27s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. | | {color:red}-1{color} | yarn tests | 51m 16s | Tests failed in hadoop-yarn-server-resourcemanager. | | | | 91m 18s | | \\ \\ || Reason || Tests || | Failed unit tests | hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesNodes | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12743057/0002-YARN-3875.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / 7405c59 | | hadoop-yarn-server-resourcemanager test log | https://builds.apache.org/job/PreCommit-YARN-Build/8406/artifact/patchprocess/testrun_hadoop-yarn-server-resourcemanager.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/8406/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf904.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/8406/console | This message was automatically generated. FSSchedulerNode#reserveResource() not printing applicationID Key: YARN-3875 URL: https://issues.apache.org/jira/browse/YARN-3875 Project: Hadoop YARN Issue Type: Bug Reporter: Bibin A Chundatt Assignee: Bibin A Chundatt Priority: Minor Attachments: 0001-YARN-3875.patch, 0002-YARN-3875.patch FSSchedulerNode#reserveResource() {code} LOG.info("Updated reserved container " + container.getContainer().getId() + " on node " + this + " for application " + application); } else { LOG.info("Reserved container " + container.getContainer().getId() + " on node " + this + " for application " + application); } {code} update to application.getApplicationId() -- This message was sent by Atlassian JIRA (v6.3.4#6332)
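The one-line change under discussion, sketched from the snippet above (the other log call changes the same way):
{code}
LOG.info("Reserved container " + container.getContainer().getId()
    + " on node " + this + " for application "
    + application.getApplicationId());  // was: + application
{code}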
[jira] [Updated] (YARN-3875) FSSchedulerNode#reserveResource() not printing applicationID
[ https://issues.apache.org/jira/browse/YARN-3875?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Devaraj K updated YARN-3875: Target Version/s: 2.8.0 FSSchedulerNode#reserveResource() not printing applicationID Key: YARN-3875 URL: https://issues.apache.org/jira/browse/YARN-3875 Project: Hadoop YARN Issue Type: Bug Reporter: Bibin A Chundatt Assignee: Bibin A Chundatt Priority: Minor Attachments: 0001-YARN-3875.patch, 0002-YARN-3875.patch FSSchedulerNode#reserveResource() {code} LOG.info("Updated reserved container " + container.getContainer().getId() + " on node " + this + " for application " + application); } else { LOG.info("Reserved container " + container.getContainer().getId() + " on node " + this + " for application " + application); } {code} update to application.getApplicationId() -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3875) FSSchedulerNode#reserveResource() not printing applicationID
[ https://issues.apache.org/jira/browse/YARN-3875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14610296#comment-14610296 ] Devaraj K commented on YARN-3875: - The failed test doesn't seem to be related to the patch. +1 for the trivial change. FSSchedulerNode#reserveResource() not printing applicationID Key: YARN-3875 URL: https://issues.apache.org/jira/browse/YARN-3875 Project: Hadoop YARN Issue Type: Bug Reporter: Bibin A Chundatt Assignee: Bibin A Chundatt Priority: Minor Attachments: 0001-YARN-3875.patch, 0002-YARN-3875.patch FSSchedulerNode#reserveResource() {code} LOG.info("Updated reserved container " + container.getContainer().getId() + " on node " + this + " for application " + application); } else { LOG.info("Reserved container " + container.getContainer().getId() + " on node " + this + " for application " + application); } {code} update to application.getApplicationId() -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3869) Add app name to RM audit log
[ https://issues.apache.org/jira/browse/YARN-3869?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14609812#comment-14609812 ] nijel commented on YARN-3869: - Hi [~roji], I would like to work on this improvement. Please let me know if you have already started the work. Add app name to RM audit log Key: YARN-3869 URL: https://issues.apache.org/jira/browse/YARN-3869 Project: Hadoop YARN Issue Type: Improvement Reporter: Shay Rojansky Priority: Minor The YARN resource manager audit log currently includes useful info such as APPID, USER, etc. One crucial piece of information missing is the user-supplied application name. Users are familiar with their application name as shown in the YARN UI, etc. It's vital for something like logstash to be able to associated logs with the application name for later searching in something like kibana. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (YARN-3869) Add app name to RM audit log
[ https://issues.apache.org/jira/browse/YARN-3869?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] nijel reassigned YARN-3869: --- Assignee: nijel Add app name to RM audit log Key: YARN-3869 URL: https://issues.apache.org/jira/browse/YARN-3869 Project: Hadoop YARN Issue Type: Improvement Reporter: Shay Rojansky Assignee: nijel Priority: Minor The YARN resource manager audit log currently includes useful info such as APPID, USER, etc. One crucial piece of information missing is the user-supplied application name. Users are familiar with their application name as shown in the YARN UI, etc. It's vital for something like logstash to be able to associate logs with the application name for later searching in something like kibana. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3846) RM Web UI queue filter is not working
[ https://issues.apache.org/jira/browse/YARN-3846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14609937#comment-14609937 ] Mohammad Shahid Khan commented on YARN-3846: You are right, 2.7 does not have the issue. My mistake, I raised it against the wrong branch; the affected version should be changed. The issue is there in trunk. The trunk has the Queue: change, and only this change has induced the filter issue; you have already handled https://issues.apache.org/jira/browse/YARN-3707. But if we have to keep the label Queue:, the current issue should be fixed in trunk. RM Web UI queue filter is not working - Key: YARN-3846 URL: https://issues.apache.org/jira/browse/YARN-3846 Project: Hadoop YARN Issue Type: Bug Components: yarn Affects Versions: 2.7.0 Reporter: Mohammad Shahid Khan Assignee: Mohammad Shahid Khan Attachments: scheduler queue issue.png, scheduler queue positive behavior.png Clicking on the root queue will show all applications, but clicking on a leaf queue does not filter the applications related to the clicked queue. The regular expression seems to be wrong {code} q = '^' + q.substr(q.lastIndexOf(':') + 2) + '$';, {code} For example: 1. Suppose the queue name is b; then the above expression will substr at index 1 (q.lastIndexOf(':') = -1, and -1 + 2 = 1), which is wrong. It should look at index 0. 2. If the queue name is ab.x, then it will parse it to .x, but it should be x. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
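An illustration of case 1 in Java, whose lastIndexOf/substring semantics match the JavaScript above: with no ':' in the queue name, the -1 + 2 arithmetic starts the match one character too far in.
{code}
String q = "b";                        // top-level queue, no ':' separator
int idx = q.lastIndexOf(':');          // -1
String name = q.substring(idx + 2);    // substring(1) -> "" instead of "b"
System.out.println("^" + name + "$");  // filter becomes "^$" and matches nothing
{code}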
[jira] [Updated] (YARN-3869) Add app name to RM audit log
[ https://issues.apache.org/jira/browse/YARN-3869?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shay Rojansky updated YARN-3869: Description: The YARN resource manager audit log currently includes useful info such as APPID, USER, etc. One crucial piece of information missing is the user-supplied application name. Users are familiar with their application name as shown in the YARN UI, etc. It's vital for something like logstash to be able to associate logs with the application name for later searching in something like kibana. was: The YARN resource manager audit log currently includes useful info such as APPID, USER, etc. One crucial piece of information missing is the user-supplied application name. Users are familiar with their application name as shown in the YARN UI, etc. It's vital for something like logstash to be able to associated logs with the application name for later searching in something like kibana. Add app name to RM audit log Key: YARN-3869 URL: https://issues.apache.org/jira/browse/YARN-3869 Project: Hadoop YARN Issue Type: Improvement Reporter: Shay Rojansky Priority: Minor The YARN resource manager audit log currently includes useful info such as APPID, USER, etc. One crucial piece of information missing is the user-supplied application name. Users are familiar with their application name as shown in the YARN UI, etc. It's vital for something like logstash to be able to associate logs with the application name for later searching in something like kibana. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3827) Migrate YARN native build to new CMake framework
[ https://issues.apache.org/jira/browse/YARN-3827?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14610001#comment-14610001 ] Hudson commented on YARN-3827: -- SUCCESS: Integrated in Hadoop-Yarn-trunk-Java8 #245 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/245/]) YARN-3827. Migrate YARN native build to new CMake framework (Alan Burlison via Colin P. McCabe) (cmccabe: rev d0cc0380b57db5fdeb41775bb9ca42dac65928b8) * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/CMakeLists.txt Migrate YARN native build to new CMake framework Key: YARN-3827 URL: https://issues.apache.org/jira/browse/YARN-3827 Project: Hadoop YARN Issue Type: Sub-task Components: build Affects Versions: 2.7.0 Reporter: Alan Burlison Assignee: Alan Burlison Fix For: 2.8.0 Attachments: YARN-3827.001.patch As per HADOOP-12036, the CMake infrastructure should be refactored and made common across all Hadoop components. This bug covers the migration of YARN to the new CMake infrastructure. This change will also add support for building YARN Native components on Solaris. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3823) Fix mismatch in default values for yarn.scheduler.maximum-allocation-vcores property
[ https://issues.apache.org/jira/browse/YARN-3823?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14610008#comment-14610008 ] Hudson commented on YARN-3823: -- SUCCESS: Integrated in Hadoop-Yarn-trunk-Java8 #245 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/245/]) YARN-3823. Fix mismatch in default values for (devaraj: rev 7405c59799ed1b8ad1a7c6f1b18fabf49d0b92b2) * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/yarn-default.xml Fix mismatch in default values for yarn.scheduler.maximum-allocation-vcores property Key: YARN-3823 URL: https://issues.apache.org/jira/browse/YARN-3823 Project: Hadoop YARN Issue Type: Bug Affects Versions: 2.7.0 Reporter: Ray Chiang Assignee: Ray Chiang Priority: Minor Labels: supportability Fix For: 2.8.0 Attachments: YARN-3823.001.patch, YARN-3823.002.patch In yarn-default.xml, the property is defined as: XML Property: yarn.scheduler.maximum-allocation-vcores XML Value: 32 In YarnConfiguration.java the corresponding member variable is defined as: Config Name: DEFAULT_RM_SCHEDULER_MAXIMUM_ALLOCATION_VCORES Config Value: 4 The Config value comes from YARN-193 and the default xml property comes from YARN-2. Should we keep it this way or should one of the values get updated? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3768) ArrayIndexOutOfBoundsException with empty environment variables
[ https://issues.apache.org/jira/browse/YARN-3768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14610006#comment-14610006 ] Hudson commented on YARN-3768: -- SUCCESS: Integrated in Hadoop-Yarn-trunk-Java8 #245 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/245/]) YARN-3768. ArrayIndexOutOfBoundsException with empty environment variables. (Zhihai Xu via gera) (gera: rev 6f2a41e37d0b36cdafcfff75125165f212c612a6) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/util/TestApps.java * hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/Shell.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/util/Apps.java * hadoop-yarn-project/CHANGES.txt ArrayIndexOutOfBoundsException with empty environment variables --- Key: YARN-3768 URL: https://issues.apache.org/jira/browse/YARN-3768 Project: Hadoop YARN Issue Type: Bug Components: yarn Affects Versions: 2.5.0 Reporter: Joe Ferner Assignee: zhihai xu Fix For: 2.8.0 Attachments: YARN-3768.000.patch, YARN-3768.001.patch, YARN-3768.002.patch, YARN-3768.003.patch, YARN-3768.004.patch Looking at line 80 of org.apache.hadoop.yarn.util.Apps an index out of range exception occurs if an environment variable is encountered without a value. {code} java.lang.ArrayIndexOutOfBoundsException: 1 at org.apache.hadoop.yarn.util.Apps.setEnvFromInputString(Apps.java:80) {code} I believe this occurs because java will not return empty strings from the split method. Similar to this http://stackoverflow.com/questions/14602062/java-string-split-removed-empty-values -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3840) Resource Manager web ui issue when sorting application by id (with application having id 9999)
[ https://issues.apache.org/jira/browse/YARN-3840?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mohammad Shahid Khan updated YARN-3840: --- Attachment: YARN-3840-2.patch Resource Manager web ui issue when sorting application by id (with application having id 9999) Key: YARN-3840 URL: https://issues.apache.org/jira/browse/YARN-3840 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Affects Versions: 2.7.0 Reporter: LINTE Assignee: Mohammad Shahid Khan Attachments: RMApps.png, YARN-3840-1.patch, YARN-3840-2.patch On the WEBUI, the global main view page : http://resourcemanager:8088/cluster/apps doesn't display applications over 9999. With command line it works (# yarn application -list). Regards, Alexandre -- This message was sent by Atlassian JIRA (v6.3.4#6332)
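An illustration of why plain string sorting breaks past 9999 (the IDs below are made up; only the digit-length transition matters): once the sequence number gains a fifth digit, lexicographic order no longer matches numeric order.
{code}
String a = "application_1435000000000_9999";
String b = "application_1435000000000_10000";
System.out.println(a.compareTo(b) < 0);  // false: "9..." sorts after "1..." as strings
{code}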
[jira] [Commented] (YARN-2194) Cgroups cease to work in RHEL7
[ https://issues.apache.org/jira/browse/YARN-2194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14609883#comment-14609883 ] Varun Vasudev commented on YARN-2194: - +1 for the latest patch. Tested it on my machine and it handles the comma issue. I'll commit it tomorrow if there are no objections. Cgroups cease to work in RHEL7 -- Key: YARN-2194 URL: https://issues.apache.org/jira/browse/YARN-2194 Project: Hadoop YARN Issue Type: Bug Components: nodemanager Affects Versions: 2.7.0 Reporter: Wei Yan Assignee: Wei Yan Priority: Critical Attachments: YARN-2194-1.patch, YARN-2194-2.patch, YARN-2194-3.patch, YARN-2194-4.patch, YARN-2194-5.patch, YARN-2194-6.patch In RHEL7, the CPU controller is named cpu,cpuacct. The comma in the controller name leads to container launch failure. RHEL7 deprecates libcgroup and recommends the use of systemd. However, systemd has certain shortcomings as identified in this JIRA (see comments). This JIRA only fixes the failure, and doesn't try to use systemd. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3840) Resource Manager web ui issue when sorting application by id (with application having id 9999)
[ https://issues.apache.org/jira/browse/YARN-3840?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14609944#comment-14609944 ] Hadoop QA commented on YARN-3840: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:red}-1{color} | pre-patch | 19m 20s | Pre-patch trunk has 3 extant Findbugs (version 3.0.0) warnings. | | {color:red}-1{color} | @author | 0m 0s | The patch appears to contain 1 @author tags which the Hadoop community has agreed to not allow in code contributions. | | {color:red}-1{color} | tests included | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. | | {color:green}+1{color} | javac | 7m 44s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 9m 48s | There were no new javadoc warning messages. | | {color:red}-1{color} | release audit | 0m 17s | The applied patch generated 1 release audit warnings. | | {color:red}-1{color} | checkstyle | 2m 36s | The applied patch generated 1 new checkstyle issues (total was 8, now 9). | | {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 33s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 33s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 4m 41s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | yarn tests | 1m 57s | Tests passed in hadoop-yarn-common. | | {color:green}+1{color} | yarn tests | 3m 10s | Tests passed in hadoop-yarn-server-applicationhistoryservice. | | {color:green}+1{color} | yarn tests | 0m 23s | Tests passed in hadoop-yarn-server-common. | | {color:green}+1{color} | yarn tests | 6m 4s | Tests passed in hadoop-yarn-server-nodemanager. 
| | | | 58m 9s | | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12743037/YARN-3840-2.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / 7405c59 | | Pre-patch Findbugs warnings | https://builds.apache.org/job/PreCommit-YARN-Build/8404/artifact/patchprocess/trunkFindbugsWarningshadoop-yarn-server-common.html | | Release Audit | https://builds.apache.org/job/PreCommit-YARN-Build/8404/artifact/patchprocess/patchReleaseAuditProblems.txt | | checkstyle | https://builds.apache.org/job/PreCommit-YARN-Build/8404/artifact/patchprocess/diffcheckstylehadoop-yarn-server-nodemanager.txt | | hadoop-yarn-common test log | https://builds.apache.org/job/PreCommit-YARN-Build/8404/artifact/patchprocess/testrun_hadoop-yarn-common.txt | | hadoop-yarn-server-applicationhistoryservice test log | https://builds.apache.org/job/PreCommit-YARN-Build/8404/artifact/patchprocess/testrun_hadoop-yarn-server-applicationhistoryservice.txt | | hadoop-yarn-server-common test log | https://builds.apache.org/job/PreCommit-YARN-Build/8404/artifact/patchprocess/testrun_hadoop-yarn-server-common.txt | | hadoop-yarn-server-nodemanager test log | https://builds.apache.org/job/PreCommit-YARN-Build/8404/artifact/patchprocess/testrun_hadoop-yarn-server-nodemanager.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/8404/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf907.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/8404/console | This message was automatically generated. Resource Manager web ui issue when sorting application by id (with application having id 9999) Key: YARN-3840 URL: https://issues.apache.org/jira/browse/YARN-3840 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Affects Versions: 2.7.0 Reporter: LINTE Assignee: Mohammad Shahid Khan Attachments: RMApps.png, YARN-3840-1.patch, YARN-3840-2.patch On the WEBUI, the global main view page : http://resourcemanager:8088/cluster/apps doesn't display applications over 9999. With command line it works (# yarn application -list). Regards, Alexandre -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3846) RM Web UI queue filter is not working
[ https://issues.apache.org/jira/browse/YARN-3846?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mohammad Shahid Khan updated YARN-3846: --- Affects Version/s: (was: 2.7.0) 2.8.0 3.0.0 RM Web UI queue filter is not working - Key: YARN-3846 URL: https://issues.apache.org/jira/browse/YARN-3846 Project: Hadoop YARN Issue Type: Bug Components: yarn Affects Versions: 3.0.0, 2.8.0 Reporter: Mohammad Shahid Khan Assignee: Mohammad Shahid Khan Attachments: scheduler queue issue.png, scheduler queue positive behavior.png Clicking on the root queue will show all applications, but clicking on a leaf queue does not filter the applications related to the clicked queue. The regular expression seems to be wrong {code} q = '^' + q.substr(q.lastIndexOf(':') + 2) + '$';, {code} For example: 1. Suppose the queue name is b; then the above expression will substr at index 1 (q.lastIndexOf(':') = -1, and -1 + 2 = 1), which is wrong. It should look at index 0. 2. If the queue name is ab.x, then it will parse it to .x, but it should be x. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3830) AbstractYarnScheduler.createReleaseCache may try to clean a null attempt
[ https://issues.apache.org/jira/browse/YARN-3830?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Devaraj K updated YARN-3830: Target Version/s: 2.8.0 Component/s: scheduler AbstractYarnScheduler.createReleaseCache may try to clean a null attempt Key: YARN-3830 URL: https://issues.apache.org/jira/browse/YARN-3830 Project: Hadoop YARN Issue Type: Bug Components: scheduler Reporter: nijel Assignee: nijel Attachments: YARN-3830_1.patch, YARN-3830_2.patch, YARN-3830_3.patch, YARN-3830_4.patch org.apache.hadoop.yarn.server.resourcemanager.scheduler.AbstractYarnScheduler.createReleaseCache() {code} protected void createReleaseCache() { // Cleanup the cache after nm expire interval. new Timer().schedule(new TimerTask() { @Override public void run() { for (SchedulerApplication<T> app : applications.values()) { T attempt = app.getCurrentAppAttempt(); synchronized (attempt) { for (ContainerId containerId : attempt.getPendingRelease()) { RMAuditLogger.logFailure( {code} Here the attempt can be null, since the attempt is created later, so a NullPointerException will occur {code} 2015-06-19 09:29:16,195 | ERROR | Timer-3 | Thread Thread[Timer-3,5,main] threw an Exception. | YarnUncaughtExceptionHandler.java:68 java.lang.NullPointerException at org.apache.hadoop.yarn.server.resourcemanager.scheduler.AbstractYarnScheduler$1.run(AbstractYarnScheduler.java:457) at java.util.TimerThread.mainLoop(Timer.java:555) at java.util.TimerThread.run(Timer.java:505) {code} This will skip the other applications in this run. We can add a null check and continue with the other applications. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3875) FSSchedulerNode#reserveResource() not printing applicationID
[ https://issues.apache.org/jira/browse/YARN-3875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14609913#comment-14609913 ] Hadoop QA commented on YARN-3875: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 16m 0s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:red}-1{color} | tests included | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. | | {color:green}+1{color} | javac | 7m 36s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 9m 37s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 23s | The applied patch does not increase the total number of release audit warnings. | | {color:red}-1{color} | checkstyle | 0m 44s | The applied patch generated 2 new checkstyle issues (total was 2, now 4). | | {color:red}-1{color} | whitespace | 0m 0s | The patch has 1 line(s) that end in whitespace. Use git apply --whitespace=fix. | | {color:green}+1{color} | install | 1m 33s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 33s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 1m 26s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | yarn tests | 50m 49s | Tests passed in hadoop-yarn-server-resourcemanager. | | | | 88m 46s | | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12743029/0001-YARN-3875.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / 7405c59 | | checkstyle | https://builds.apache.org/job/PreCommit-YARN-Build/8402/artifact/patchprocess/diffcheckstylehadoop-yarn-server-resourcemanager.txt | | whitespace | https://builds.apache.org/job/PreCommit-YARN-Build/8402/artifact/patchprocess/whitespace.txt | | hadoop-yarn-server-resourcemanager test log | https://builds.apache.org/job/PreCommit-YARN-Build/8402/artifact/patchprocess/testrun_hadoop-yarn-server-resourcemanager.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/8402/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf905.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/8402/console | This message was automatically generated. FSSchedulerNode#reserveResource() not printing applicationID Key: YARN-3875 URL: https://issues.apache.org/jira/browse/YARN-3875 Project: Hadoop YARN Issue Type: Bug Reporter: Bibin A Chundatt Assignee: Bibin A Chundatt Priority: Minor Attachments: 0001-YARN-3875.patch FSSchedulerNode#reserveResource() {code} LOG.info("Updated reserved container " + container.getContainer().getId() + " on node " + this + " for application " + application); } else { LOG.info("Reserved container " + container.getContainer().getId() + " on node " + this + " for application " + application); } {code} update to application.getApplicationId() -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3875) FSSchedulerNode#reserveResource() not printing applicationID
[ https://issues.apache.org/jira/browse/YARN-3875?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bibin A Chundatt updated YARN-3875: --- Attachment: 0002-YARN-3875.patch Updated patch after formatting. FSSchedulerNode#reserveResource() not printing applicationID Key: YARN-3875 URL: https://issues.apache.org/jira/browse/YARN-3875 Project: Hadoop YARN Issue Type: Bug Reporter: Bibin A Chundatt Assignee: Bibin A Chundatt Priority: Minor Attachments: 0001-YARN-3875.patch, 0002-YARN-3875.patch FSSchedulerNode#reserveResource() {code} LOG.info("Updated reserved container " + container.getContainer().getId() + " on node " + this + " for application " + application); } else { LOG.info("Reserved container " + container.getContainer().getId() + " on node " + this + " for application " + application); } {code} update to application.getApplicationId() -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3528) Tests with 12345 as hard-coded port break jenkins
[ https://issues.apache.org/jira/browse/YARN-3528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14610034#comment-14610034 ] Varun Saxena commented on YARN-3528: [~brahmareddy], kindly use spaces instead of tabs Tests with 12345 as hard-coded port break jenkins - Key: YARN-3528 URL: https://issues.apache.org/jira/browse/YARN-3528 Project: Hadoop YARN Issue Type: Bug Affects Versions: 3.0.0 Environment: ASF Jenkins Reporter: Steve Loughran Assignee: Brahma Reddy Battula Priority: Blocker Labels: test Attachments: YARN-3528-002.patch, YARN-3528.patch A lot of the YARN tests have hard-coded the port 12345 for their services to come up on. This makes it impossible to have scheduled or precommit tests to run consistently on the ASF jenkins hosts. Instead the tests fail regularly and appear to get ignored completely. A quick grep of 12345 shows up many places in the test suite where this practise has developed. * All {{BaseContainerManagerTest}} subclasses * {{TestNodeManagerShutdown}} * {{TestContainerManager}} + others This needs to be addressed through portscanning and dynamic port allocation. Please can someone do this. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
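A common replacement sketch, assuming tests can ask the OS for a free ephemeral port instead of hard-coding one (the helper name is illustrative):
{code}
import java.io.IOException;
import java.net.ServerSocket;

final class Ports {
  // Bind to port 0 so the kernel assigns a free ephemeral port. Note the
  // port may be reused by another process between this call and the
  // service's own bind, so tests should still tolerate bind failures.
  static int pickFreePort() throws IOException {
    try (ServerSocket socket = new ServerSocket(0)) {
      return socket.getLocalPort();
    }
  }
}
{code}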
[jira] [Created] (YARN-3876) get_executable() assumes everything is Linux
Alan Burlison created YARN-3876: --- Summary: get_executable() assumes everything is Linux Key: YARN-3876 URL: https://issues.apache.org/jira/browse/YARN-3876 Project: Hadoop YARN Issue Type: Sub-task Affects Versions: 2.7.0 Reporter: Alan Burlison get_executable() in container-executor.c is non-portable and is hard-coded to assume Linux's /proc. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3846) RM Web UI queue filter is not working
[ https://issues.apache.org/jira/browse/YARN-3846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14609939#comment-14609939 ] Mohammad Shahid Khan commented on YARN-3846: This is also in branch-2. RM Web UI queue filter is not working - Key: YARN-3846 URL: https://issues.apache.org/jira/browse/YARN-3846 Project: Hadoop YARN Issue Type: Bug Components: yarn Affects Versions: 2.7.0 Reporter: Mohammad Shahid Khan Assignee: Mohammad Shahid Khan Attachments: scheduler queue issue.png, scheduler queue positive behavior.png Clicking on the root queue will show all applications, but clicking on a leaf queue does not filter the applications related to the clicked queue. The regular expression seems to be wrong {code} q = '^' + q.substr(q.lastIndexOf(':') + 2) + '$';, {code} For example: 1. Suppose the queue name is b; then the above expression will substr at index 1 (q.lastIndexOf(':') = -1, and -1 + 2 = 1), which is wrong. It should look at index 0. 2. If the queue name is ab.x, then it will parse it to .x, but it should be x. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3846) RM Web UI queue filter is not working
[ https://issues.apache.org/jira/browse/YARN-3846?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mohammad Shahid Khan updated YARN-3846: --- Target Version/s: 3.0.0, 2.8.0 (was: 2.7.2) RM Web UI queue filter is not working - Key: YARN-3846 URL: https://issues.apache.org/jira/browse/YARN-3846 Project: Hadoop YARN Issue Type: Bug Components: yarn Affects Versions: 3.0.0, 2.8.0 Reporter: Mohammad Shahid Khan Assignee: Mohammad Shahid Khan Attachments: scheduler queue issue.png, scheduler queue positive behavior.png Clicking on the root queue will show all applications, but clicking on a leaf queue does not filter the applications related to the clicked queue. The regular expression seems to be wrong {code} q = '^' + q.substr(q.lastIndexOf(':') + 2) + '$';, {code} For example: 1. Suppose the queue name is b; then the above expression will substr at index 1 (q.lastIndexOf(':') = -1, and -1 + 2 = 1), which is wrong. It should look at index 0. 2. If the queue name is ab.x, then it will parse it to .x, but it should be x. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3846) RM Web UI queue filter is not working
[ https://issues.apache.org/jira/browse/YARN-3846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14609946#comment-14609946 ] Mohammad Shahid Khan commented on YARN-3846: I have changed the affected and target versions. RM Web UI queue filter is not working - Key: YARN-3846 URL: https://issues.apache.org/jira/browse/YARN-3846 Project: Hadoop YARN Issue Type: Bug Components: yarn Affects Versions: 3.0.0, 2.8.0 Reporter: Mohammad Shahid Khan Assignee: Mohammad Shahid Khan Attachments: scheduler queue issue.png, scheduler queue positive behavior.png Clicking on the root queue will show all applications, but clicking on a leaf queue does not filter the applications related to the clicked queue. The regular expression seems to be wrong {code} q = '^' + q.substr(q.lastIndexOf(':') + 2) + '$';, {code} For example: 1. Suppose the queue name is b; then the above expression will substr at index 1 (q.lastIndexOf(':') = -1, and -1 + 2 = 1), which is wrong. It should look at index 0. 2. If the queue name is ab.x, then it will parse it to .x, but it should be x. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3849) Too much of preemption activity causing continuous killing of containers across queues
[ https://issues.apache.org/jira/browse/YARN-3849?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sunil G updated YARN-3849: -- Attachment: 0002-YARN-3849.patch Thank you [~leftnoteasy] for the comments. I have uploaded a patch addressing the comments. Kindly check. Too much of preemption activity causing continuous killing of containers across queues - Key: YARN-3849 URL: https://issues.apache.org/jira/browse/YARN-3849 Project: Hadoop YARN Issue Type: Bug Components: capacityscheduler Affects Versions: 2.7.0 Reporter: Sunil G Assignee: Sunil G Priority: Critical Attachments: 0001-YARN-3849.patch, 0002-YARN-3849.patch Two queues are used. Each queue is given a capacity of 0.5. The Dominant Resource policy is used. 1. An app is submitted in QueueA, which consumes the full cluster capacity 2. After submitting an app in QueueB, there is some demand, invoking preemption in QueueA 3. Instead of killing only the excess over the 0.5 guaranteed capacity, we observed that all containers other than the AM are getting killed in QueueA 4. Now the app in QueueB tries to take over the cluster with the current free space. But there is some updated demand from the app in QueueA, which lost its containers earlier, and preemption is kicked in QueueB now. The scenario in steps 3 and 4 keeps happening in a loop; thus none of the apps complete. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3840) Resource Manager web ui issue when sorting application by id (with application having id 9999)
[ https://issues.apache.org/jira/browse/YARN-3840?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14609947#comment-14609947 ] Devaraj K commented on YARN-3840: - You can probably package the plugin js file that is compatible with the dt-1.9.4 release (which YARN currently uses), similar to the one we already have in hadoop-yarn-project\hadoop-yarn\hadoop-yarn-common\src\main\resources\webapps\static\dt-1.9.4\js\jquery.dataTables.min.js.gz. Resource Manager web ui issue when sorting application by id (with application having id 9999) Key: YARN-3840 URL: https://issues.apache.org/jira/browse/YARN-3840 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Affects Versions: 2.7.0 Reporter: LINTE Assignee: Mohammad Shahid Khan Attachments: RMApps.png, YARN-3840-1.patch, YARN-3840-2.patch On the WEBUI, the global main view page http://resourcemanager:8088/cluster/apps doesn't display applications over 9999. With the command line it works (# yarn application -list). Regards, Alexandre -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3793) Several NPEs when deleting local files on NM recovery
[ https://issues.apache.org/jira/browse/YARN-3793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14610401#comment-14610401 ] Jason Lowe commented on YARN-3793: -- +1 lgtm. Will commit later today if no objections. Several NPEs when deleting local files on NM recovery - Key: YARN-3793 URL: https://issues.apache.org/jira/browse/YARN-3793 Project: Hadoop YARN Issue Type: Bug Components: nodemanager Affects Versions: 2.6.0 Reporter: Karthik Kambatla Assignee: Varun Saxena Attachments: YARN-3793.01.patch, YARN-3793.02.patch When NM work-preserving restart is enabled, we see several NPEs on recovery. These seem to correspond to sub-directories that need to be deleted. I wonder if null pointers here mean incorrect tracking of these resources and a potential leak. This JIRA is to investigate and fix anything required. Logs show: {noformat} 2015-05-18 07:06:10,225 INFO org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor: Deleting absolute path : null 2015-05-18 07:06:10,224 ERROR org.apache.hadoop.yarn.server.nodemanager.DeletionService: Exception during execution of task in DeletionService java.lang.NullPointerException at org.apache.hadoop.fs.FileContext.fixRelativePart(FileContext.java:274) at org.apache.hadoop.fs.FileContext.delete(FileContext.java:755) at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.deleteAsUser(DefaultContainerExecutor.java:458) at org.apache.hadoop.yarn.server.nodemanager.DeletionService$FileDeletionTask.run(DeletionService.java:293) {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
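As a rough illustration of the kind of guard implied by the stack trace above (a sketch under assumptions, not the committed YARN-3793 fix; the class and method names are stand-ins for the NM deletion-task code):
{code}
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Sketch: a deletion task recovered on NM restart may carry a null path,
// so bail out early instead of handing null to the filesystem layer.
class DeletionTaskSketch {
  static void deletePathAsUser(Path subDir) throws IOException {
    if (subDir == null) {
      return; // nothing recorded for this task; skip rather than NPE
    }
    Files.deleteIfExists(subDir);
  }
}
{code}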
[jira] [Updated] (YARN-2681) Support bandwidth enforcement for containers while reading from HDFS
[ https://issues.apache.org/jira/browse/YARN-2681?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] cntic updated YARN-2681: Attachment: (was: yarn-site.xml.example) Support bandwidth enforcement for containers while reading from HDFS Key: YARN-2681 URL: https://issues.apache.org/jira/browse/YARN-2681 Project: Hadoop YARN Issue Type: New Feature Components: nodemanager Affects Versions: 2.5.1 Environment: Linux Reporter: cntic Labels: BB2015-05-TBR Fix For: 2.7.2 Attachments: HADOOP-2681.patch, HADOOP-2681.patch, Traffic Control Design.png To read/write data from HDFS on a data node, applications establish TCP/IP connections with the datanode. HDFS reads can be controlled by setting up the Linux Traffic Control (TC) subsystem on the data node to apply filters to the appropriate connections. The current cgroups net_cls concept can be applied neither on the node where the container is launched nor on the data node, since: - TC handles outgoing bandwidth only, so it cannot be set on the container node (an HDFS read is incoming data for the container) - Since the HDFS data node is handled by only one process, it is not possible to use net_cls to separate connections from different containers to the datanode. Tasks: 1) Extend the Resource model to define a bandwidth enforcement rate 2) Monitor TCP/IP connections established by the container handling process and its child processes 3) Set Linux Traffic Control rules on the data node based on address:port pairs in order to enforce the bandwidth of outgoing data Concept: http://www.hit.bme.hu/~do/papers/EnforcementDesign.pdf -- This message was sent by Atlassian JIRA (v6.3.4#6332)
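To make task 3 concrete: the enforcement amounts to shaping the data node's outgoing traffic per connection with standard tc/HTB commands. Below is a hedged sketch that composes such commands as strings (the device name, rate, class id, and port are made-up example values for illustration, not values from the patch):
{code}
// Illustrative only: build standard tc/htb commands that cap the outgoing
// traffic of one container's HDFS read connection, identified here by the
// client-side destination port of that connection.
public class TcRuleSketch {
  public static void main(String[] args) {
    String dev = "eth0";     // assumed network device on the data node
    String classId = "1:10"; // assumed HTB class for this container
    int dstPort = 52431;     // example client port of the read connection
    System.out.println("tc qdisc add dev " + dev + " root handle 1: htb");
    System.out.println("tc class add dev " + dev
        + " parent 1: classid " + classId + " htb rate 10mbit");
    System.out.println("tc filter add dev " + dev
        + " protocol ip parent 1: prio 1 u32 match ip dport "
        + dstPort + " 0xffff flowid " + classId);
  }
}
{code}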
[jira] [Commented] (YARN-3768) ArrayIndexOutOfBoundsException with empty environment variables
[ https://issues.apache.org/jira/browse/YARN-3768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14610538#comment-14610538 ] Hudson commented on YARN-3768: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk #2191 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2191/]) YARN-3768. ArrayIndexOutOfBoundsException with empty environment variables. (Zhihai Xu via gera) (gera: rev 6f2a41e37d0b36cdafcfff75125165f212c612a6) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/util/TestApps.java * hadoop-yarn-project/CHANGES.txt * hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/Shell.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/util/Apps.java ArrayIndexOutOfBoundsException with empty environment variables --- Key: YARN-3768 URL: https://issues.apache.org/jira/browse/YARN-3768 Project: Hadoop YARN Issue Type: Bug Components: yarn Affects Versions: 2.5.0 Reporter: Joe Ferner Assignee: zhihai xu Fix For: 2.8.0 Attachments: YARN-3768.000.patch, YARN-3768.001.patch, YARN-3768.002.patch, YARN-3768.003.patch, YARN-3768.004.patch Looking at line 80 of org.apache.hadoop.yarn.util.Apps, an index-out-of-range exception occurs if an environment variable is encountered without a value. {code} java.lang.ArrayIndexOutOfBoundsException: 1 at org.apache.hadoop.yarn.util.Apps.setEnvFromInputString(Apps.java:80) {code} I believe this occurs because Java's split method does not return trailing empty strings. Similar to this: http://stackoverflow.com/questions/14602062/java-string-split-removed-empty-values -- This message was sent by Atlassian JIRA (v6.3.4#6332)
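The split behavior is easy to reproduce in isolation (a minimal demonstration of the semantics described above, not code from the patch):
{code}
public class SplitDemo {
  public static void main(String[] args) {
    // String.split drops trailing empty strings by default, so an
    // environment entry with an empty value yields a 1-element array.
    String[] parts = "MY_VAR=".split("=");
    System.out.println(parts.length);   // 1 -> parts[1] throws AIOOBE
    // A negative limit keeps trailing empty strings.
    String[] fixed = "MY_VAR=".split("=", -1);
    System.out.println(fixed.length);   // 2; fixed[1] is ""
  }
}
{code}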
[jira] [Commented] (YARN-3827) Migrate YARN native build to new CMake framework
[ https://issues.apache.org/jira/browse/YARN-3827?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14610534#comment-14610534 ] Hudson commented on YARN-3827: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk #2191 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2191/]) YARN-3827. Migrate YARN native build to new CMake framework (Alan Burlison via Colin P. McCabe) (cmccabe: rev d0cc0380b57db5fdeb41775bb9ca42dac65928b8) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/CMakeLists.txt * hadoop-yarn-project/CHANGES.txt Migrate YARN native build to new CMake framework Key: YARN-3827 URL: https://issues.apache.org/jira/browse/YARN-3827 Project: Hadoop YARN Issue Type: Sub-task Components: build Affects Versions: 2.7.0 Reporter: Alan Burlison Assignee: Alan Burlison Fix For: 2.8.0 Attachments: YARN-3827.001.patch As per HADOOP-12036, the CMake infrastructure should be refactored and made common across all Hadoop components. This bug covers the migration of YARN to the new CMake infrastructure. This change will also add support for building YARN Native components on Solaris. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3823) Fix mismatch in default values for yarn.scheduler.maximum-allocation-vcores property
[ https://issues.apache.org/jira/browse/YARN-3823?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14610540#comment-14610540 ] Hudson commented on YARN-3823: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk #2191 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2191/]) YARN-3823. Fix mismatch in default values for (devaraj: rev 7405c59799ed1b8ad1a7c6f1b18fabf49d0b92b2) * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/yarn-default.xml Fix mismatch in default values for yarn.scheduler.maximum-allocation-vcores property Key: YARN-3823 URL: https://issues.apache.org/jira/browse/YARN-3823 Project: Hadoop YARN Issue Type: Bug Affects Versions: 2.7.0 Reporter: Ray Chiang Assignee: Ray Chiang Priority: Minor Labels: supportability Fix For: 2.8.0 Attachments: YARN-3823.001.patch, YARN-3823.002.patch In yarn-default.xml, the property is defined as: XML Property: yarn.scheduler.maximum-allocation-vcores XML Value: 32 In YarnConfiguration.java the corresponding member variable is defined as: Config Name: DEFAULT_RM_SCHEDULER_MAXIMUM_ALLOCATION_VCORES Config Value: 4 The Config value comes from YARN-193 and the default xml property comes from YARN-2. Should we keep it this way or should one of the values get updated? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3830) AbstractYarnScheduler.createReleaseCache may try to clean a null attempt
[ https://issues.apache.org/jira/browse/YARN-3830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14610394#comment-14610394 ] Hudson commented on YARN-3830: -- SUCCESS: Integrated in Hadoop-trunk-Commit #8105 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/8105/]) YARN-3830. AbstractYarnScheduler.createReleaseCache may try to clean a (devaraj: rev 80a68d60560e505b5f8e01969dc3c168a1e5a7f3) * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/TestAbstractYarnScheduler.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/AbstractYarnScheduler.java AbstractYarnScheduler.createReleaseCache may try to clean a null attempt Key: YARN-3830 URL: https://issues.apache.org/jira/browse/YARN-3830 Project: Hadoop YARN Issue Type: Bug Components: scheduler Reporter: nijel Assignee: nijel Fix For: 2.8.0 Attachments: YARN-3830_1.patch, YARN-3830_2.patch, YARN-3830_3.patch, YARN-3830_4.patch org.apache.hadoop.yarn.server.resourcemanager.scheduler.AbstractYarnScheduler.createReleaseCache() {code} protected void createReleaseCache() { // Cleanup the cache after nm expire interval. new Timer().schedule(new TimerTask() { @Override public void run() { for (SchedulerApplication<T> app : applications.values()) { T attempt = app.getCurrentAppAttempt(); synchronized (attempt) { for (ContainerId containerId : attempt.getPendingRelease()) { RMAuditLogger.logFailure( {code} Here the attempt can be null, since the attempt is created later, so a NullPointerException will occur: {code} 2015-06-19 09:29:16,195 | ERROR | Timer-3 | Thread Thread[Timer-3,5,main] threw an Exception. | YarnUncaughtExceptionHandler.java:68 java.lang.NullPointerException at org.apache.hadoop.yarn.server.resourcemanager.scheduler.AbstractYarnScheduler$1.run(AbstractYarnScheduler.java:457) at java.util.TimerThread.mainLoop(Timer.java:555) at java.util.TimerThread.run(Timer.java:505) {code} This will skip the other applications in this run. We can add a null check and continue with the other applications. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
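As a minimal sketch of that null check (SchedulerApplication here is a stand-in interface for illustration; this is not necessarily the committed patch):
{code}
import java.util.Collection;

class ReleaseCacheSketch {
  interface SchedulerApplication { Object getCurrentAppAttempt(); }

  static void cleanupPendingReleases(Collection<SchedulerApplication> apps) {
    for (SchedulerApplication app : apps) {
      Object attempt = app.getCurrentAppAttempt();
      if (attempt == null) {
        continue; // attempt is created later; skip it and keep iterating
      }
      synchronized (attempt) {
        // audit-log and clear the attempt's pendingRelease containers here
      }
    }
  }
}
{code}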
[jira] [Commented] (YARN-2194) Cgroups cease to work in RHEL7
[ https://issues.apache.org/jira/browse/YARN-2194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14610475#comment-14610475 ] Wei Yan commented on YARN-2194: --- Thanks, [~vvasudev]. Cgroups cease to work in RHEL7 -- Key: YARN-2194 URL: https://issues.apache.org/jira/browse/YARN-2194 Project: Hadoop YARN Issue Type: Bug Components: nodemanager Affects Versions: 2.7.0 Reporter: Wei Yan Assignee: Wei Yan Priority: Critical Attachments: YARN-2194-1.patch, YARN-2194-2.patch, YARN-2194-3.patch, YARN-2194-4.patch, YARN-2194-5.patch, YARN-2194-6.patch In RHEL7, the CPU controller is named cpu,cpuacct. The comma in the controller name leads to container launch failure. RHEL7 deprecates libcgroup and recommends the use of systemd. However, systemd has certain shortcomings as identified in this JIRA (see comments). This JIRA only fixes the failure, and doesn't try to use systemd. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3844) Make hadoop-yarn-project Native code -Wall-clean
[ https://issues.apache.org/jira/browse/YARN-3844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14610546#comment-14610546 ] Hadoop QA commented on YARN-3844: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 5m 13s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 1 new or modified test files. | | {color:green}+1{color} | javac | 7m 36s | There were no new javac warning messages. | | {color:green}+1{color} | release audit | 0m 20s | The applied patch does not increase the total number of release audit warnings. | | {color:red}-1{color} | whitespace | 0m 0s | The patch has 2 line(s) that end in whitespace. Use git apply --whitespace=fix. | | {color:green}+1{color} | install | 1m 31s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 31s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | yarn tests | 6m 20s | Tests passed in hadoop-yarn-server-nodemanager. | | | | 21m 34s | | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12743073/YARN-3844.005.patch | | Optional Tests | javac unit | | git revision | trunk / 80a68d6 | | whitespace | https://builds.apache.org/job/PreCommit-YARN-Build/8407/artifact/patchprocess/whitespace.txt | | hadoop-yarn-server-nodemanager test log | https://builds.apache.org/job/PreCommit-YARN-Build/8407/artifact/patchprocess/testrun_hadoop-yarn-server-nodemanager.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/8407/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf901.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/8407/console | This message was automatically generated. Make hadoop-yarn-project Native code -Wall-clean Key: YARN-3844 URL: https://issues.apache.org/jira/browse/YARN-3844 Project: Hadoop YARN Issue Type: Sub-task Components: build Affects Versions: 2.7.0 Environment: As we specify -Wall as a default compilation flag, it would be helpful if the Native code was -Wall-clean Reporter: Alan Burlison Assignee: Alan Burlison Attachments: YARN-3844.001.patch, YARN-3844.002.patch, YARN-3844.005.patch As we specify -Wall as a default compilation flag, it would be helpful if the Native code was -Wall-clean -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-2681) Support bandwidth enforcement for containers while reading from HDFS
[ https://issues.apache.org/jira/browse/YARN-2681?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] cntic updated YARN-2681: Fix Version/s: 2.7.2 Component/s: (was: capacityscheduler) (was: resourcemanager) Support bandwidth enforcement for containers while reading from HDFS Key: YARN-2681 URL: https://issues.apache.org/jira/browse/YARN-2681 Project: Hadoop YARN Issue Type: New Feature Components: nodemanager Affects Versions: 2.5.1 Environment: Linux Reporter: cntic Labels: BB2015-05-TBR Fix For: 2.7.2 Attachments: HADOOP-2681.patch, HADOOP-2681.patch, Traffic Control Design.png, yarn-site.xml.example To read/write data from HDFS on a data node, applications establish TCP/IP connections with the datanode. HDFS reads can be controlled by setting up the Linux Traffic Control (TC) subsystem on the data node to apply filters to the appropriate connections. The current cgroups net_cls concept can be applied neither on the node where the container is launched nor on the data node, since: - TC handles outgoing bandwidth only, so it cannot be set on the container node (an HDFS read is incoming data for the container) - Since the HDFS data node is handled by only one process, it is not possible to use net_cls to separate connections from different containers to the datanode. Tasks: 1) Extend the Resource model to define a bandwidth enforcement rate 2) Monitor TCP/IP connections established by the container handling process and its child processes 3) Set Linux Traffic Control rules on the data node based on address:port pairs in order to enforce the bandwidth of outgoing data Concept: http://www.hit.bme.hu/~do/papers/EnforcementDesign.pdf -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3840) Resource Manager web ui issue when sorting application by id (with application having id 9999)
[ https://issues.apache.org/jira/browse/YARN-3840?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mohammad Shahid Khan updated YARN-3840: --- Attachment: YARN-3840-3.patch Resource Manager web ui issue when sorting application by id (with application having id 9999) Key: YARN-3840 URL: https://issues.apache.org/jira/browse/YARN-3840 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Affects Versions: 2.7.0 Reporter: LINTE Assignee: Mohammad Shahid Khan Attachments: RMApps.png, YARN-3840-1.patch, YARN-3840-2.patch, YARN-3840-3.patch On the WEBUI, the global main view page http://resourcemanager:8088/cluster/apps doesn't display applications over 9999. With the command line it works (# yarn application -list). Regards, Alexandre -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2140) Add support for network IO isolation/scheduling for containers
[ https://issues.apache.org/jira/browse/YARN-2140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14610825#comment-14610825 ] Sidharta Seethana commented on YARN-2140: - Hi [~dheeren], we only address network bandwidth resource isolation in the attached design doc, not isolation of the network stack itself. I recommend taking a look at YARN-3611 for the new docker-related functionality, and please file a JIRA with your requirements. Add support for network IO isolation/scheduling for containers -- Key: YARN-2140 URL: https://issues.apache.org/jira/browse/YARN-2140 Project: Hadoop YARN Issue Type: New Feature Reporter: Wei Yan Assignee: Sidharta Seethana Attachments: NetworkAsAResourceDesign.pdf -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-313) Add Admin API for supporting node resource configuration in command line
[ https://issues.apache.org/jira/browse/YARN-313?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Inigo Goiri updated YARN-313: - Attachment: YARN-313-v4.patch Updating v3 to the latest trunk. Add Admin API for supporting node resource configuration in command line Key: YARN-313 URL: https://issues.apache.org/jira/browse/YARN-313 Project: Hadoop YARN Issue Type: Sub-task Components: client Reporter: Junping Du Assignee: Junping Du Priority: Critical Labels: BB2015-05-TBR Attachments: YARN-313-sample.patch, YARN-313-v1.patch, YARN-313-v2.patch, YARN-313-v3.patch, YARN-313-v4.patch We should provide some admin interface, e.g. yarn rmadmin -refreshResources, to support changes to a node's resources specified in a config file. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2004) Priority scheduling support in Capacity scheduler
[ https://issues.apache.org/jira/browse/YARN-2004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14611034#comment-14611034 ] Jian He commented on YARN-2004: --- - authenticateApplicationPriority: IIUC, all it does is take the config from yarn-site.xml (not capacity-scheduler.xml) and check the priority against that. I don't see much need to explicitly expose an API in the scheduler and inject the check there. Or does this method have more responsibility than that? - Given that YARN-2003 is just the API of YARN-2004 and we have to review the two together anyway, we could merge them into a single patch? That would be easier to review, and you also would not need to split the patch and upload it in two different places. And you can actually split the part about updating application priority at runtime, plus the state store changes, into a different patch. Priority scheduling support in Capacity scheduler - Key: YARN-2004 URL: https://issues.apache.org/jira/browse/YARN-2004 Project: Hadoop YARN Issue Type: Sub-task Components: capacityscheduler Reporter: Sunil G Assignee: Sunil G Attachments: 0001-YARN-2004.patch, 0002-YARN-2004.patch, 0003-YARN-2004.patch, 0004-YARN-2004.patch, 0005-YARN-2004.patch, 0006-YARN-2004.patch, 0007-YARN-2004.patch, 0008-YARN-2004.patch, 0009-YARN-2004.patch, 0010-YARN-2004.patch Based on the priority of the application, the Capacity Scheduler should be able to give preference to applications while scheduling. The Comparator<FiCaSchedulerApp> applicationComparator can be changed as below. 1. Check for application priority. If a priority is available, then return the highest-priority job. 2. Otherwise continue with the existing logic, such as App ID comparison and then timestamp comparison. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
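For illustration, the two-step ordering described above can be sketched as a comparator (the field names and the null convention are assumptions for this sketch, not the actual FiCaSchedulerApp API):
{code}
import java.util.Comparator;

class AppOrderSketch {
  static class App {
    Integer priority;  // null when the app has no priority set (assumed)
    long appId;
  }

  // 1. Higher priority first (apps without a priority sort lowest).
  // 2. Tie-break with the existing logic: smaller (older) app id first.
  static final Comparator<App> BY_PRIORITY_THEN_ID =
      Comparator.comparingInt(
              (App a) -> a.priority == null ? Integer.MIN_VALUE : a.priority)
          .reversed()
          .thenComparingLong(a -> a.appId);
}
{code}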
[jira] [Commented] (YARN-2681) Support bandwidth enforcement for containers while reading from HDFS
[ https://issues.apache.org/jira/browse/YARN-2681?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14611059#comment-14611059 ] Hadoop QA commented on YARN-2681: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:red}-1{color} | pre-patch | 24m 25s | Pre-patch trunk has 1 extant Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 12 new or modified test files. | | {color:green}+1{color} | javac | 10m 43s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 13m 53s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 32s | The applied patch does not increase the total number of release audit warnings. | | {color:red}-1{color} | checkstyle | 4m 2s | The applied patch generated 1 new checkstyle issues (total was 221, now 221). | | {color:green}+1{color} | whitespace | 0m 33s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 2m 11s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 40s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 13m 54s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | mapreduce tests | 11m 32s | Tests passed in hadoop-mapreduce-client-app. | | {color:green}+1{color} | mapreduce tests | 2m 39s | Tests passed in hadoop-mapreduce-client-core. | | {color:green}+1{color} | yarn tests | 0m 34s | Tests passed in hadoop-yarn-api. | | {color:green}+1{color} | yarn tests | 2m 32s | Tests passed in hadoop-yarn-common. | | {color:red}-1{color} | yarn tests | 6m 5s | Tests failed in hadoop-yarn-server-nodemanager. | | {color:red}-1{color} | yarn tests | 60m 12s | Tests failed in hadoop-yarn-server-resourcemanager. 
| | | | 155m 25s | | \\ \\ || Reason || Tests || | Failed unit tests | hadoop.yarn.server.nodemanager.containermanager.TestContainerManager | | | hadoop.yarn.server.nodemanager.TestEventFlow | | | hadoop.yarn.server.nodemanager.containermanager.logaggregation.TestLogAggregationService | | | hadoop.yarn.server.nodemanager.containermanager.monitor.TestContainersMonitor | | | hadoop.yarn.server.nodemanager.containermanager.launcher.TestContainerLaunch | | | hadoop.yarn.server.nodemanager.containermanager.TestNMProxy | | | hadoop.yarn.server.resourcemanager.TestAppManager | | | hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesCapacitySched | | Timed out tests | org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart | | | org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestNodeLabelContainerAllocation | | | org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestCapacityScheduler | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12743129/YARN-2681.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / eac1d18 | | Pre-patch Findbugs warnings | https://builds.apache.org/job/PreCommit-YARN-Build/8411/artifact/patchprocess/trunkFindbugsWarningshadoop-mapreduce-client-app.html | | checkstyle | https://builds.apache.org/job/PreCommit-YARN-Build/8411/artifact/patchprocess/diffcheckstylehadoop-yarn-api.txt | | hadoop-mapreduce-client-app test log | https://builds.apache.org/job/PreCommit-YARN-Build/8411/artifact/patchprocess/testrun_hadoop-mapreduce-client-app.txt | | hadoop-mapreduce-client-core test log | https://builds.apache.org/job/PreCommit-YARN-Build/8411/artifact/patchprocess/testrun_hadoop-mapreduce-client-core.txt | | hadoop-yarn-api test log | https://builds.apache.org/job/PreCommit-YARN-Build/8411/artifact/patchprocess/testrun_hadoop-yarn-api.txt | | hadoop-yarn-common test log | https://builds.apache.org/job/PreCommit-YARN-Build/8411/artifact/patchprocess/testrun_hadoop-yarn-common.txt | | hadoop-yarn-server-nodemanager test log | https://builds.apache.org/job/PreCommit-YARN-Build/8411/artifact/patchprocess/testrun_hadoop-yarn-server-nodemanager.txt | | hadoop-yarn-server-resourcemanager test log | https://builds.apache.org/job/PreCommit-YARN-Build/8411/artifact/patchprocess/testrun_hadoop-yarn-server-resourcemanager.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/8411/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf904.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/8411/console | This message was automatically generated. Support bandwidth enforcement for containers while reading from HDFS
[jira] [Commented] (YARN-313) Add Admin API for supporting node resource configuration in command line
[ https://issues.apache.org/jira/browse/YARN-313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14611046#comment-14611046 ] Hadoop QA commented on YARN-313: \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:red}-1{color} | pre-patch | 18m 40s | Findbugs (version 3.0.0) appears to be broken on trunk. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 4 new or modified test files. | | {color:red}-1{color} | javac | 2m 20s | The patch appears to cause the build to fail. | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12743143/YARN-313-v4.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / b5cdf78 | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/8413/console | This message was automatically generated. Add Admin API for supporting node resource configuration in command line Key: YARN-313 URL: https://issues.apache.org/jira/browse/YARN-313 Project: Hadoop YARN Issue Type: Sub-task Components: client Reporter: Junping Du Assignee: Junping Du Priority: Critical Labels: BB2015-05-TBR Attachments: YARN-313-sample.patch, YARN-313-v1.patch, YARN-313-v2.patch, YARN-313-v3.patch, YARN-313-v4.patch We should provide some admin interface, e.g. yarn rmadmin -refreshResources, to support changes to a node's resources specified in a config file. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3830) AbstractYarnScheduler.createReleaseCache may try to clean a null attempt
[ https://issues.apache.org/jira/browse/YARN-3830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14610576#comment-14610576 ] Hudson commented on YARN-3830: -- SUCCESS: Integrated in Hadoop-Mapreduce-trunk-Java8 #243 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/243/]) YARN-3830. AbstractYarnScheduler.createReleaseCache may try to clean a (devaraj: rev 80a68d60560e505b5f8e01969dc3c168a1e5a7f3) * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/TestAbstractYarnScheduler.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/AbstractYarnScheduler.java AbstractYarnScheduler.createReleaseCache may try to clean a null attempt Key: YARN-3830 URL: https://issues.apache.org/jira/browse/YARN-3830 Project: Hadoop YARN Issue Type: Bug Components: scheduler Reporter: nijel Assignee: nijel Fix For: 2.8.0 Attachments: YARN-3830_1.patch, YARN-3830_2.patch, YARN-3830_3.patch, YARN-3830_4.patch org.apache.hadoop.yarn.server.resourcemanager.scheduler.AbstractYarnScheduler.createReleaseCache() {code} protected void createReleaseCache() { // Cleanup the cache after nm expire interval. new Timer().schedule(new TimerTask() { @Override public void run() { for (SchedulerApplication<T> app : applications.values()) { T attempt = app.getCurrentAppAttempt(); synchronized (attempt) { for (ContainerId containerId : attempt.getPendingRelease()) { RMAuditLogger.logFailure( {code} Here the attempt can be null, since the attempt is created later, so a NullPointerException will occur: {code} 2015-06-19 09:29:16,195 | ERROR | Timer-3 | Thread Thread[Timer-3,5,main] threw an Exception. | YarnUncaughtExceptionHandler.java:68 java.lang.NullPointerException at org.apache.hadoop.yarn.server.resourcemanager.scheduler.AbstractYarnScheduler$1.run(AbstractYarnScheduler.java:457) at java.util.TimerThread.mainLoop(Timer.java:555) at java.util.TimerThread.run(Timer.java:505) {code} This will skip the other applications in this run. We can add a null check and continue with the other applications. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3841) [Storage implementation] Create HDFS backing storage implementation for ATS writes
[ https://issues.apache.org/jira/browse/YARN-3841?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14610614#comment-14610614 ] Hadoop QA commented on YARN-3841: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:red}-1{color} | patch | 0m 0s | The patch command could not apply the patch during dryrun. | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12743115/YARN-3841.001.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / 2ac87df | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/8408/console | This message was automatically generated. [Storage implementation] Create HDFS backing storage implementation for ATS writes -- Key: YARN-3841 URL: https://issues.apache.org/jira/browse/YARN-3841 Project: Hadoop YARN Issue Type: Sub-task Components: timelineserver Reporter: Tsuyoshi Ozawa Assignee: Tsuyoshi Ozawa Attachments: YARN-3841.001.patch HDFS backing storage is useful for the following scenarios: 1. For Hadoop clusters which don't run HBase. 2. As a fallback from HBase when the HBase cluster is temporarily unavailable. Quoting the ATS design document of YARN-2928: {quote} In the case the HBase storage is not available, the plugin should buffer the writes temporarily (e.g. HDFS), and flush them once the storage comes back online. Reading and writing to hdfs as the backup storage could potentially use the HDFS writer plugin unless the complexity of generalizing the HDFS writer plugin for this purpose exceeds the benefits of reusing it here. {quote} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2681) Support bandwidth enforcement for containers while reading from HDFS
[ https://issues.apache.org/jira/browse/YARN-2681?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14610644#comment-14610644 ] cntic commented on YARN-2681: - As the site doesn't allow attaching PDF files, the development guide for this feature can be found at the following link: http://www.hit.bme.hu/~dohoai/documents/HdfsTrafficControl.pdf Support bandwidth enforcement for containers while reading from HDFS Key: YARN-2681 URL: https://issues.apache.org/jira/browse/YARN-2681 Project: Hadoop YARN Issue Type: New Feature Components: nodemanager Affects Versions: 2.5.1 Environment: Linux Reporter: cntic Labels: BB2015-05-TBR Fix For: 2.7.0 Attachments: HADOOP-2681.patch, HADOOP-2681.patch, HDFS-2681.02.patch, HdfsTrafficControl_UML.png, Traffic Control Design.png To read/write data from HDFS on a data node, applications establish TCP/IP connections with the datanode. HDFS reads can be controlled by setting up the Linux Traffic Control (TC) subsystem on the data node to apply filters to the appropriate connections. The current cgroups net_cls concept can be applied neither on the node where the container is launched nor on the data node, since: - TC handles outgoing bandwidth only, so it cannot be set on the container node (an HDFS read is incoming data for the container) - Since the HDFS data node is handled by only one process, it is not possible to use net_cls to separate connections from different containers to the datanode. Tasks: 1) Extend the Resource model to define a bandwidth enforcement rate 2) Monitor TCP/IP connections established by the container handling process and its child processes 3) Set Linux Traffic Control rules on the data node based on address:port pairs in order to enforce the bandwidth of outgoing data Concept: http://www.hit.bme.hu/~do/papers/EnforcementDesign.pdf Implementation: http://www.hit.bme.hu/~dohoai/documents/HdfsTrafficControl.pdf -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3823) Fix mismatch in default values for yarn.scheduler.maximum-allocation-vcores property
[ https://issues.apache.org/jira/browse/YARN-3823?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14610674#comment-14610674 ] Ray Chiang commented on YARN-3823: -- Thanks for the review and commit! Fix mismatch in default values for yarn.scheduler.maximum-allocation-vcores property Key: YARN-3823 URL: https://issues.apache.org/jira/browse/YARN-3823 Project: Hadoop YARN Issue Type: Bug Affects Versions: 2.7.0 Reporter: Ray Chiang Assignee: Ray Chiang Priority: Minor Labels: supportability Fix For: 2.8.0 Attachments: YARN-3823.001.patch, YARN-3823.002.patch In yarn-default.xml, the property is defined as: XML Property: yarn.scheduler.maximum-allocation-vcores XML Value: 32 In YarnConfiguration.java the corresponding member variable is defined as: Config Name: DEFAULT_RM_SCHEDULER_MAXIMUM_ALLOCATION_VCORES Config Value: 4 The Config value comes from YARN-193 and the default xml property comes from YARN-2. Should we keep it this way or should one of the values get updated? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (YARN-3877) YarnClientImpl.submitApplication swallows exceptions
Steve Loughran created YARN-3877: Summary: YarnClientImpl.submitApplication swallows exceptions Key: YARN-3877 URL: https://issues.apache.org/jira/browse/YARN-3877 Project: Hadoop YARN Issue Type: Improvement Components: client Affects Versions: 2.7.2 Reporter: Steve Loughran Priority: Minor When {{YarnClientImpl.submitApplication}} spins waiting for the application to be accepted, any interruption during its sleep() calls is logged and swallowed. This makes it hard to interrupt the thread during shutdown. Really it should throw some form of exception and let the caller deal with it. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
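A hedged sketch of the behavior being requested (illustrative only, not YarnClientImpl's actual code; the helper name and message are made up):
{code}
import java.io.IOException;
import java.io.InterruptedIOException;

class SubmitPollSketch {
  // Propagate the interrupt to the caller instead of logging and swallowing
  // it, so shutdown can break out of the accept-polling loop.
  static void pollOnce(long intervalMs) throws IOException {
    try {
      Thread.sleep(intervalMs);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt(); // restore the interrupt flag
      InterruptedIOException ioe = new InterruptedIOException(
          "Interrupted while waiting for the application to be accepted");
      ioe.initCause(e);
      throw ioe;
    }
  }
}
{code}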
[jira] [Commented] (YARN-2140) Add support for network IO isolation/scheduling for containers
[ https://issues.apache.org/jira/browse/YARN-2140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14610649#comment-14610649 ] Dheeren Beborrtha commented on YARN-2140: - How do you support port-level isolation for Docker containers? For example, let's say I would like to run multiple docker containers on the same Datanode. If each of the containers is long-running and needs to advertise its ports, what is the mechanism for doing so? Add support for network IO isolation/scheduling for containers -- Key: YARN-2140 URL: https://issues.apache.org/jira/browse/YARN-2140 Project: Hadoop YARN Issue Type: New Feature Reporter: Wei Yan Assignee: Sidharta Seethana Attachments: NetworkAsAResourceDesign.pdf -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2194) Cgroups cease to work in RHEL7
[ https://issues.apache.org/jira/browse/YARN-2194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14610725#comment-14610725 ] Karthik Kambatla commented on YARN-2194: I tried the latest patch, and still run into the same issue (logs below). Did anyone try the patch with multiple local directories? {noformat} 15/07/01 10:51:32 INFO mapreduce.Job: Job job_1435771879097_0003 failed with state FAILED due to: Application application_1435771879097_0003 failed 2 times due to AM Container for appattempt_1435771879097_0003_02 exited with exitCode: -1000 For more detailed output, check application tracking page:http://krhel7-1.vpc.cloudera.com:8088/proxy/application_1435771879097_0003/Then, click on links to logs of each attempt. Diagnostics: Application application_1435771879097_0003 initialization failed (exitCode=20) with output: main : command provided 0 main : user is nobody main : requested yarn user is systest Failed to create directory /data/yarn/nm%/data1/yarn/nm/usercache/systest - No such file or directory Failing this attempt. Failing the application. {noformat} Cgroups cease to work in RHEL7 -- Key: YARN-2194 URL: https://issues.apache.org/jira/browse/YARN-2194 Project: Hadoop YARN Issue Type: Bug Components: nodemanager Affects Versions: 2.7.0 Reporter: Wei Yan Assignee: Wei Yan Priority: Critical Attachments: YARN-2194-1.patch, YARN-2194-2.patch, YARN-2194-3.patch, YARN-2194-4.patch, YARN-2194-5.patch, YARN-2194-6.patch In RHEL7, the CPU controller is named cpu,cpuacct. The comma in the controller name leads to container launch failure. RHEL7 deprecates libcgroup and recommends the use of systemd. However, systemd has certain shortcomings as identified in this JIRA (see comments). This JIRA only fixes the failure, and doesn't try to use systemd. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-2681) Support bandwidth enforcement for containers while reading from HDFS
[ https://issues.apache.org/jira/browse/YARN-2681?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] cntic updated YARN-2681: Attachment: HdfsTrafficControl_UML.png HDFS-2681.02.patch This patch delivers the full feature set of YARN-2681. Support bandwidth enforcement for containers while reading from HDFS Key: YARN-2681 URL: https://issues.apache.org/jira/browse/YARN-2681 Project: Hadoop YARN Issue Type: New Feature Components: nodemanager Affects Versions: 2.5.1 Environment: Linux Reporter: cntic Labels: BB2015-05-TBR Fix For: 2.7.0 Attachments: HADOOP-2681.patch, HADOOP-2681.patch, HDFS-2681.02.patch, HdfsTrafficControl_UML.png, Traffic Control Design.png To read/write data from HDFS on a data node, applications establish TCP/IP connections with the datanode. HDFS reads can be controlled by setting up the Linux Traffic Control (TC) subsystem on the data node to apply filters to the appropriate connections. The current cgroups net_cls concept can be applied neither on the node where the container is launched nor on the data node, since: - TC handles outgoing bandwidth only, so it cannot be set on the container node (an HDFS read is incoming data for the container) - Since the HDFS data node is handled by only one process, it is not possible to use net_cls to separate connections from different containers to the datanode. Tasks: 1) Extend the Resource model to define a bandwidth enforcement rate 2) Monitor TCP/IP connections established by the container handling process and its child processes 3) Set Linux Traffic Control rules on the data node based on address:port pairs in order to enforce the bandwidth of outgoing data Concept: http://www.hit.bme.hu/~do/papers/EnforcementDesign.pdf -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-2681) Support bandwidth enforcement for containers while reading from HDFS
[ https://issues.apache.org/jira/browse/YARN-2681?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] cntic updated YARN-2681: Description: To read/write data from HDFS on a data node, applications establish TCP/IP connections with the datanode. HDFS reads can be controlled by setting up the Linux Traffic Control (TC) subsystem on the data node to apply filters to the appropriate connections. The current cgroups net_cls concept can be applied neither on the node where the container is launched nor on the data node, since: - TC handles outgoing bandwidth only, so it cannot be set on the container node (an HDFS read is incoming data for the container) - Since the HDFS data node is handled by only one process, it is not possible to use net_cls to separate connections from different containers to the datanode. Tasks: 1) Extend the Resource model to define a bandwidth enforcement rate 2) Monitor TCP/IP connections established by the container handling process and its child processes 3) Set Linux Traffic Control rules on the data node based on address:port pairs in order to enforce the bandwidth of outgoing data Concept: http://www.hit.bme.hu/~do/papers/EnforcementDesign.pdf Implementation: http://www.hit.bme.hu/~dohoai/documents/HdfsTrafficControl.pdf was: To read/write data from HDFS on a data node, applications establish TCP/IP connections with the datanode. HDFS reads can be controlled by setting up the Linux Traffic Control (TC) subsystem on the data node to apply filters to the appropriate connections. The current cgroups net_cls concept can be applied neither on the node where the container is launched nor on the data node, since: - TC handles outgoing bandwidth only, so it cannot be set on the container node (an HDFS read is incoming data for the container) - Since the HDFS data node is handled by only one process, it is not possible to use net_cls to separate connections from different containers to the datanode. Tasks: 1) Extend the Resource model to define a bandwidth enforcement rate 2) Monitor TCP/IP connections established by the container handling process and its child processes 3) Set Linux Traffic Control rules on the data node based on address:port pairs in order to enforce the bandwidth of outgoing data Concept: http://www.hit.bme.hu/~do/papers/EnforcementDesign.pdf Support bandwidth enforcement for containers while reading from HDFS Key: YARN-2681 URL: https://issues.apache.org/jira/browse/YARN-2681 Project: Hadoop YARN Issue Type: New Feature Components: nodemanager Affects Versions: 2.5.1 Environment: Linux Reporter: cntic Labels: BB2015-05-TBR Fix For: 2.7.0 Attachments: HADOOP-2681.patch, HADOOP-2681.patch, HDFS-2681.02.patch, HdfsTrafficControl_UML.png, Traffic Control Design.png To read/write data from HDFS on a data node, applications establish TCP/IP connections with the datanode. HDFS reads can be controlled by setting up the Linux Traffic Control (TC) subsystem on the data node to apply filters to the appropriate connections. The current cgroups net_cls concept can be applied neither on the node where the container is launched nor on the data node, since: - TC handles outgoing bandwidth only, so it cannot be set on the container node (an HDFS read is incoming data for the container) - Since the HDFS data node is handled by only one process, it is not possible to use net_cls to separate connections from different containers to the datanode. Tasks: 1) Extend the Resource model to define a bandwidth enforcement rate 2) Monitor TCP/IP connections established by the container handling process and its child processes 3) Set Linux Traffic Control rules on the data node based on address:port pairs in order to enforce the bandwidth of outgoing data Concept: http://www.hit.bme.hu/~do/papers/EnforcementDesign.pdf Implementation: http://www.hit.bme.hu/~dohoai/documents/HdfsTrafficControl.pdf -- This message was sent by Atlassian JIRA (v6.3.4#6332)