[jira] [Commented] (YARN-1889) avoid creating new objects on each fair scheduler call to AppSchedulable comparator

2014-03-31 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13955762#comment-13955762
 ] 

Sandy Ryza commented on YARN-1889:
--

+1

 avoid creating new objects on each fair scheduler call to AppSchedulable 
 comparator
 ---

 Key: YARN-1889
 URL: https://issues.apache.org/jira/browse/YARN-1889
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: scheduler
Reporter: Hong Zhiguo
Priority: Minor
  Labels: reviewed
 Attachments: YARN-1889.patch, YARN-1889.patch


 In the fair scheduler, each scheduling attempt performs a full sort of the
 List of AppSchedulables, which invokes the Comparator.compare method many
 times. Both FairShareComparator and DRFComparator call
 AppSchedulable.getWeights and AppSchedulable.getPriority. A new
 ResourceWeights object is allocated on each call to getWeights, and likewise
 for getPriority. Because these methods are called very frequently, this puts
 a lot of pressure on the GC.
 The test case below shows the improvement in performance and GC behaviour:
 the results show that the GC pressure during NodeUpdate processing is
 reduced by half with this patch.
 The code to show the improvement (add it to TestFairScheduler.java):
 import java.lang.management.GarbageCollectorMXBean;
 import java.lang.management.ManagementFactory;

 public void printGCStats() {
   long totalGarbageCollections = 0;
   long garbageCollectionTime = 0;
   for (GarbageCollectorMXBean gc :
       ManagementFactory.getGarbageCollectorMXBeans()) {
     // getCollectionCount()/getCollectionTime() return -1 when undefined
     long count = gc.getCollectionCount();
     if (count >= 0) {
       totalGarbageCollections += count;
     }
     long time = gc.getCollectionTime();
     if (time >= 0) {
       garbageCollectionTime += time;
     }
   }
   System.out.println("Total Garbage Collections: "
       + totalGarbageCollections);
   System.out.println("Total Garbage Collection Time (ms): "
       + garbageCollectionTime);
 }

 @Test
 public void testImpactOnGC() throws Exception {
   scheduler.reinitialize(conf, resourceManager.getRMContext());
   // Add nodes
   int numNode = 1;
   for (int i = 0; i < numNode; ++i) {
     String host = String.format("192.1.%d.%d", i / 256, i % 256);
     RMNode node =
         MockNodes.newNodeInfo(1, Resources.createResource(1024 * 64), i, host);
     NodeAddedSchedulerEvent nodeEvent = new NodeAddedSchedulerEvent(node);
     scheduler.handle(nodeEvent);
     assertEquals(1024 * 64 * (i + 1),
         scheduler.getClusterCapacity().getMemory());
   }
   assertEquals(numNode, scheduler.getNumClusterNodes());
   assertEquals(1024 * 64 * numNode,
       scheduler.getClusterCapacity().getMemory());

   // Add apps, each app has 100 containers.
   int minReqSize =
       FairSchedulerConfiguration.DEFAULT_RM_SCHEDULER_INCREMENT_ALLOCATION_MB;
   int numApp = 8000;
   int priority = 1;
   for (int i = 1; i < numApp + 1; ++i) {
     ApplicationAttemptId attemptId = createAppAttemptId(i, 1);
     AppAddedSchedulerEvent appAddedEvent = new AppAddedSchedulerEvent(
         attemptId.getApplicationId(), "queue1", "user1");
     scheduler.handle(appAddedEvent);
     AppAttemptAddedSchedulerEvent attemptAddedEvent =
         new AppAttemptAddedSchedulerEvent(attemptId, false);
     scheduler.handle(attemptAddedEvent);
     createSchedulingRequestExistingApplication(minReqSize * 2, 1,
         priority, attemptId);
   }
   scheduler.update();
   assertEquals(numApp, scheduler.getQueueManager().getLeafQueue("queue1", true)
       .getRunnableAppSchedulables().size());

   System.out.println("GC stats before NodeUpdate processing:");
   printGCStats();
   int hb_num = 5000;
   long start = System.nanoTime();
   for (int i = 0; i < hb_num; ++i) {
     String host = String.format("192.1.%d.%d", i / 256, i % 256);
     RMNode node =
         MockNodes.newNodeInfo(1, Resources.createResource(1024 * 64), 5000, host);
     NodeUpdateSchedulerEvent nodeEvent = new NodeUpdateSchedulerEvent(node);
     scheduler.handle(nodeEvent);
   }
   long end = System.nanoTime();
   System.out.printf("processing time for a NodeUpdate in average: %d us\n",
       (end - start) / (hb_num * 1000));
   System.out.println("GC stats after NodeUpdate processing:");
   printGCStats();
 }
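 The measurement methodology above can be reproduced outside the scheduler.
 The following is a minimal, self-contained sketch (plain JDK, no YARN
 classes; all names here are illustrative, not part of the patch) that
 contrasts a comparator allocating a throwaway object per compare() call,
 mimicking the old getWeights()/getPriority() behaviour, with one that
 allocates nothing, reporting GC counts the same way printGCStats does:

 ```java
 import java.lang.management.GarbageCollectorMXBean;
 import java.lang.management.ManagementFactory;
 import java.util.Arrays;
 import java.util.Comparator;

 public class ComparatorGcDemo {
     // Total collection count across all collectors, as in printGCStats.
     static long totalGcCount() {
         long total = 0;
         for (GarbageCollectorMXBean gc :
                 ManagementFactory.getGarbageCollectorMXBeans()) {
             long count = gc.getCollectionCount();
             if (count >= 0) {  // -1 means the count is unavailable
                 total += count;
             }
         }
         return total;
     }

     public static void main(String[] args) {
         Integer[] data = new Integer[50_000];
         for (int i = 0; i < data.length; i++) {
             data[i] = (i * 31) % data.length;
         }

         // Allocating comparator: builds a fresh array on every compare(),
         // the pattern the patch removes from the scheduler comparators.
         Comparator<Integer> allocating = (a, b) -> {
             long[] key = new long[] { a, b };
             return Long.compare(key[0], key[1]);
         };
         long before = totalGcCount();
         for (int round = 0; round < 20; round++) {
             Arrays.sort(data.clone(), allocating);
         }
         System.out.println("GCs during allocating sorts: "
             + (totalGcCount() - before));

         // Reusing comparator: no per-call allocation.
         Comparator<Integer> reusing = Integer::compare;
         before = totalGcCount();
         for (int round = 0; round < 20; round++) {
             Arrays.sort(data.clone(), reusing);
         }
         System.out.println("GCs during reusing sorts: "
             + (totalGcCount() - before));
     }
 }
 ```

 The exact GC deltas depend on heap size and collector, so no particular
 numbers should be expected; the allocating variant merely tends to trigger
 more collections under a small heap.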



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-1889) avoid creating new objects on each fair scheduler call to AppSchedulable comparator

2014-03-30 Thread Hong Zhiguo (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13954925#comment-13954925
 ] 

Hong Zhiguo commented on YARN-1889:
---

Hi Fengdong,
I haven't submitted the new patch yet; I'll do it now. Sorry, I didn't have 
enough time over the weekend.



[jira] [Commented] (YARN-1889) avoid creating new objects on each fair scheduler call to AppSchedulable comparator

2014-03-30 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13954931#comment-13954931
 ] 

Hadoop QA commented on YARN-1889:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12637758/YARN-1889.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/3491//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3491//console

This message is automatically generated.


[jira] [Commented] (YARN-1889) avoid creating new objects on each fair scheduler call to AppSchedulable comparator

2014-03-30 Thread Fengdong Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13954945#comment-13954945
 ] 

Fengdong Yu commented on YARN-1889:
---

The new patch looks good to me.



[jira] [Commented] (YARN-1889) avoid creating new objects on each fair scheduler call to AppSchedulable comparator

2014-03-29 Thread Fengdong Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13951778#comment-13951778
 ] 

Fengdong Yu commented on YARN-1889:
---

Hi Zhiguo,
which of my comments did you address in your new patch? I cannot see any changes.

1. There are still tabs in the patch.
2. Move the following initialization into the constructor:
{code}
+  private Priority priority = recordFactory.newRecordInstance(Priority.class);
+  private ResourceWeights resourceWeights = new ResourceWeights();
{code}
3. As Sandy said, don't use recordFactory.newRecordInstance(Priority.class); 
use Priority.newInstance(1) instead.
4. Then priority.setPriority(1) can be removed:
{code}
   public Priority getPriority() {
 // Right now per-app priorities are not passed to scheduler,
 // so everyone has the same priority.
-Priority p = recordFactory.newRecordInstance(Priority.class);
-p.setPriority(1);
-return p;
+priority.setPriority(1);
+return priority;
   }
{code}

5. Please rename this method to getResourceWeights():
{code}
+  public ResourceWeights getResourceWeightsObject() {
+   return resourceWeights;
+  }
{code}


[jira] [Commented] (YARN-1889) avoid creating new objects on each fair scheduler call to AppSchedulable comparator

2014-03-28 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13950576#comment-13950576
 ] 

Hadoop QA commented on YARN-1889:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12637367/YARN-1889.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/3483//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3483//console

This message is automatically generated.


[jira] [Commented] (YARN-1889) avoid creating new objects on each fair scheduler call to AppSchedulable comparator

2014-03-28 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13951002#comment-13951002
 ] 

Sandy Ryza commented on YARN-1889:
--

Another nit:
Priority.newInstance should be used instead of the recordFactory.



[jira] [Commented] (YARN-1889) avoid creating new objects on each fair scheduler call to AppSchedulable comparator

2014-03-28 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13950999#comment-13950999
 ] 

Sandy Ryza commented on YARN-1889:
--

When you say GC pressure, which is going down: the number of GCs, or the 
time spent in each GC (or both)?


[jira] [Commented] (YARN-1889) avoid creating new objects on each fair scheduler call to AppSchedulable comparator

2014-03-28 Thread Fengdong Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13950749#comment-13950749
 ] 

Fengdong Yu commented on YARN-1889:
---

Good catch, Zhiguo.

Can you add some test cases to your patch?
Please also replace the tabs in your code with spaces.

{code}
+  private Priority priority = recordFactory.newRecordInstance(Priority.class);
+  private ResourceWeights resourceWeights = new ResourceWeights();
{code}

Can you move these initializations into the constructor?

{code}
+  public ResourceWeights getResourceWeightsObject() {
+   return resourceWeights;
+  }
{code}

getResourceWeights() would be a better name.
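The snippets above show the core of the patch: cache one mutable Priority and ResourceWeights per AppSchedulable instead of allocating a fresh object on every comparator call. A minimal, self-contained sketch of that reuse pattern follows; the class and field names are stand-ins for the actual YARN classes, not the real code.

```java
// Sketch of the object-reuse pattern under discussion. "ResourceWeights" and
// "AppSchedulable" here are simplified stand-ins for the YARN classes.
public class CachedWeightsSketch {

  // Stand-in for the real ResourceWeights: a small mutable value holder.
  static final class ResourceWeights {
    private float weight;
    void setWeight(float w) { this.weight = w; }
    float getWeight()       { return weight; }
  }

  // Stand-in for AppSchedulable: one cached instance per schedulable,
  // reused across every comparator invocation.
  static final class AppSchedulable {
    private final ResourceWeights resourceWeights = new ResourceWeights();

    // Before the patch: a new ResourceWeights was allocated on every call.
    // After: the cached instance is updated in place and returned, so the
    // scheduler's sort produces no per-comparison garbage.
    ResourceWeights getWeights(float appWeight) {
      resourceWeights.setWeight(appWeight);
      return resourceWeights;
    }
  }

  public static void main(String[] args) {
    AppSchedulable app = new AppSchedulable();
    ResourceWeights first = app.getWeights(1.0f);
    ResourceWeights second = app.getWeights(2.0f);
    // Every call hands back the same object, only with updated contents.
    if (first != second) throw new AssertionError("expected cached instance");
    if (second.getWeight() != 2.0f) throw new AssertionError("weight not updated");
  }
}
```

The trade-off is the usual one for this pattern: callers must treat the returned object as transient and must not hold it across calls, which is acceptable here because the comparator consumes the value immediately.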



[jira] [Commented] (YARN-1889) avoid creating new objects on each fair scheduler call to AppSchedulable comparator

2014-03-28 Thread Hong Zhiguo (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13951774#comment-13951774
 ] 

Hong Zhiguo commented on YARN-1889:
---

Hi Sandy,
during the processing of NodeUpdate events, both the number of GCs and the
accumulated GC time are reduced by about half.
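The numbers above come from taking a GC snapshot before and after the NodeUpdate loop, as the test's printGCStats does. A self-contained sketch of that before/after delta measurement, with a plain allocation loop standing in for NodeUpdate processing:

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Sketch of the measurement used in the test: snapshot total GC count and
// accumulated GC time, run a workload, and report the deltas.
public class GcDeltaSketch {

  // Returns { total collections, total collection time in ms } summed over
  // all collectors, mirroring printGCStats.
  static long[] gcSnapshot() {
    long collections = 0, timeMs = 0;
    for (GarbageCollectorMXBean gc :
         ManagementFactory.getGarbageCollectorMXBeans()) {
      long count = gc.getCollectionCount();
      if (count >= 0) collections += count;  // -1 means "undefined"
      long time = gc.getCollectionTime();
      if (time >= 0) timeMs += time;
    }
    return new long[] { collections, timeMs };
  }

  public static void main(String[] args) {
    long[] before = gcSnapshot();
    // Stand-in workload: churn short-lived objects, analogous to the
    // per-call ResourceWeights allocations the patch removes.
    long sink = 0;
    for (int i = 0; i < 5_000_000; i++) {
      sink += new byte[64].length;
    }
    long[] after = gcSnapshot();
    System.out.println("GCs during workload: " + (after[0] - before[0]));
    System.out.println("GC time (ms) during workload: " + (after[1] - before[1]));
    if (sink < 0) throw new AssertionError(); // keep the loop live
  }
}
```

Comparing the two deltas with and without the patch gives the "about half" figure; absolute values depend on the JVM and collector in use.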

