[jira] [Commented] (YARN-5479) FairScheduler: Scheduling performance improvement

2016-08-17 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15425430#comment-15425430
 ] 

Karthik Kambatla commented on YARN-5479:


I expect some performance improvements to come by way of global scheduling 
(YARN-5139), where we are considering using thread pools for some 
parallelizable operations. 

> FairScheduler: Scheduling performance improvement
> -
>
> Key: YARN-5479
> URL: https://issues.apache.org/jira/browse/YARN-5479
> Project: Hadoop YARN
> Issue Type: Improvement
> Components: fairscheduler, resourcemanager
> Affects Versions: 2.6.0
> Reporter: He Tianyi
> Assignee: He Tianyi
>
> Currently ResourceManager uses a single thread to handle async events for 
> scheduling. As the number of nodes grows, more events need to be processed 
> in time in FairScheduler. Also, an increased number of applications & queues 
> slows down the processing of each single event. 
> There are two cases in which slow processing of nodeUpdate events is 
> problematic:
> A. Global throughput is lower than the number of nodes heartbeating per 
> round. This inefficiency keeps resources from being allocated.
> B. Global throughput meets the need, but in some rounds, events from some 
> nodes cannot be processed before the next heartbeat. This makes handling 
> burst requests inefficient (e.g. a newly submitted MapReduce application 
> cannot get all of its tasks launched soon, even given enough resources).
> Pretty sure some people will encounter the problem eventually after a single 
> cluster is scaled to several thousand nodes (even with {{assignmultiple}} 
> enabled).
> This issue proposes several performance optimizations in the FairScheduler 
> {{nodeUpdate}} method. To be specific:
> A. Trading off fairness for efficiency, queue & app sorting can be skipped 
> (or should this be called 'delayed sorting'?). We can either start another 
> dedicated thread to do the sorting & updating, or perform sorting only after 
> the current result has been used several times (say, sort once in every 100 
> calls).
> B. Performing calculations on {{Resource}} instances is expensive, since at 
> least 2 objects ({{ResourceImpl}} and its proto builder) are created each 
> time (using the 'immutable' APIs). The overhead can be eliminated with a 
> lightweight implementation of Resource, which does not instantiate a builder 
> until necessary, because most instances are used as intermediate results in 
> the scheduler instead of being exchanged via IPC. Also, {{createResource}} 
> uses reflection, which can be replaced by a plain {{new}} (for scheduler 
> usage only). Furthermore, perhaps we could 'intern' resources to avoid 
> allocation.
> C. Other minor changes, such as moving the {{updateRootMetrics}} call to 
> {{update}}, making root queue metrics eventually consistent (which may 
> satisfy most needs), or introducing counters to {{getResourceUsage}} and 
> updating resource usage incrementally instead of recalculating it each time.
> With A and B, I was looking at a 4x improvement in a cluster with 2K nodes.
> Suggestions? Opinions?
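
As an illustration of the incremental-counter idea in item C of the quoted description, here is a minimal Java sketch. It assumes a simple queue tree where each allocation or release propagates a delta up to the root, so reading usage is a field access rather than a recursive sum. `QueueNode` and its methods are hypothetical names for illustration only, not FairScheduler's actual classes, and the sketch ignores locking.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical queue node (not a Hadoop class): usage is a running counter
// adjusted on allocate/release instead of being summed on every read.
class QueueNode {
    private final QueueNode parent;
    private final List<QueueNode> children = new ArrayList<>();
    private long ownMemory;  // memory allocated directly at this node
    private long usedMemory; // running counter, includes all descendants

    QueueNode(QueueNode parent) {
        this.parent = parent;
        if (parent != null) {
            parent.children.add(this);
        }
    }

    // O(depth) update: apply the delta here and at every ancestor.
    void allocate(long memory) {
        ownMemory += memory;
        for (QueueNode q = this; q != null; q = q.parent) {
            q.usedMemory += memory;
        }
    }

    void release(long memory) {
        allocate(-memory);
    }

    // O(1) read of the maintained counter.
    long getResourceUsage() {
        return usedMemory;
    }

    // Full recursive recomputation, kept only to contrast with the counter.
    long recomputeUsage() {
        long total = ownMemory;
        for (QueueNode child : children) {
            total += child.recomputeUsage();
        }
        return total;
    }
}
```

The trade-off is that every allocation path must remember to update the counter; a missed update leaves the cached value wrong until something recomputes it.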



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5479) FairScheduler: Scheduling performance improvement

2016-08-15 Thread He Tianyi (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15422033#comment-15422033
 ] 

He Tianyi commented on YARN-5479:
-

Yes, and it's a vast improvement. 
I simulated a scenario similar to production (with tens of queues and hundreds 
of running apps), and the benchmark showed heartbeat processing running 2x 
faster.




[jira] [Commented] (YARN-5479) FairScheduler: Scheduling performance improvement

2016-08-15 Thread Xianyin Xin (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15422002#comment-15422002
 ] 

Xianyin Xin commented on YARN-5479:
---

[~He Tianyi], I hope YARN-4090 can provide some information. There, the locked 
resource usage was snapshotted, and as such the performance improved greatly.




[jira] [Commented] (YARN-5479) FairScheduler: Scheduling performance improvement

2016-08-15 Thread sandflee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15420796#comment-15420796
 ] 

sandflee commented on YARN-5479:


will do, thx




[jira] [Commented] (YARN-5479) FairScheduler: Scheduling performance improvement

2016-08-14 Thread He Tianyi (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15420549#comment-15420549
 ] 

He Tianyi commented on YARN-5479:
-

Good point [~sandflee]. Would you share some performance evaluation based on 
that? Thanks.




[jira] [Commented] (YARN-5479) FairScheduler: Scheduling performance improvement

2016-08-13 Thread sandflee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15419963#comment-15419963
 ] 

sandflee commented on YARN-5479:


It seems there is no need to compute minShare/isNeed/MinShareRatio/UseToWeightRatio 
in every comparator#compute; we could snapshot these before doing the real sort.
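
A minimal Java sketch of this snapshot idea: compute each entity's sort keys once, then sort the snapshots, so the comparator only reads cached values. The `Sched` and `Snapshot` types and the simplified ordering (needy entities first, then by the relevant ratio) are illustrative assumptions, not FairScheduler's actual classes or comparator.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical stand-in for a schedulable entity (queue or app).
class Sched {
    final String name;
    final long usage;    // current resource usage, e.g. in MB
    final long minShare; // configured min share, e.g. in MB
    final double weight;

    Sched(String name, long usage, long minShare, double weight) {
        this.name = name;
        this.usage = usage;
        this.minShare = minShare;
        this.weight = weight;
    }
}

// Sort keys computed once per entity instead of once per comparison.
class Snapshot {
    final Sched sched;
    final boolean needy;           // usage is below min share
    final double minShareRatio;    // usage / minShare
    final double useToWeightRatio; // usage / weight

    Snapshot(Sched s) {
        this.sched = s;
        this.needy = s.usage < s.minShare;
        this.minShareRatio = (double) s.usage / Math.max(s.minShare, 1L);
        this.useToWeightRatio = s.usage / s.weight;
    }
}

class SnapshotSort {
    // Simplified ordering over the cached keys (an assumption for this sketch).
    static final Comparator<Snapshot> ORDER =
        Comparator.comparing((Snapshot s) -> !s.needy) // needy entities first
            .thenComparingDouble(s -> s.needy ? s.minShareRatio : s.useToWeightRatio)
            .thenComparing(s -> s.sched.name);         // stable tie-break

    static List<Sched> sort(List<Sched> input) {
        List<Snapshot> snaps = new ArrayList<>(input.size());
        for (Sched s : input) {
            snaps.add(new Snapshot(s)); // O(n) key computation
        }
        snaps.sort(ORDER);              // comparisons touch only cached fields
        List<Sched> out = new ArrayList<>(snaps.size());
        for (Snapshot s : snaps) {
            out.add(s.sched);
        }
        return out;
    }
}
```

Beyond saving work, sorting on a snapshot also keeps the keys from changing in the middle of a sort.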




[jira] [Commented] (YARN-5479) FairScheduler: Scheduling performance improvement

2016-08-10 Thread Ray Chiang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15415975#comment-15415975
 ] 

Ray Chiang commented on YARN-5479:
--

Sorry for the delay in replying.  YARN-5047 has the lead-in discussion, but 
YARN-5283 is refactoring the container scheduling into one method as well.

And I'm fine with an umbrella JIRA.  The more we break this up into individual 
features, the easier it will be to cherry-pick and judge the impact of specific 
changes.  I'd just be aware of conflicts with the entirety of the refactoring 
planned at YARN-5046.




[jira] [Commented] (YARN-5479) FairScheduler: Scheduling performance improvement

2016-08-09 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15413653#comment-15413653
 ] 

Jason Lowe commented on YARN-5479:
--

bq. While doing so does not seemly cause any problem in production (fairness is 
slightly damaged locally, but within acceptable range.

What is acceptable for your production may not be acceptable to others.  We're 
changing the requirements, and that could have ramifications for some users.  
It's hard to say, which is why I'd rather avoid going there unless absolutely 
necessary.

bq. Shall we make this issue an umbrella?

Yep, seems an appropriate place to gather performance improvements, although as 
mentioned above some of these may not be (or should not be) specific to the 
FairScheduler.




[jira] [Commented] (YARN-5479) FairScheduler: Scheduling performance improvement

2016-08-08 Thread He Tianyi (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15412817#comment-15412817
 ] 

He Tianyi commented on YARN-5479:
-

Thanks for the comments, [~rchiang] and [~jlowe]. 

bq. I'd be careful with having multiple implementations or multiple APIs for 
doing the same thing with Resource. Resource is used a lot of places in the 
Hadoop codebase and this could add confusion, even with accurate Javadocs.
Yes, multiple implementations would be confusing. I tried replacing 
{{ResourcePBImpl}} directly with the implementation I mentioned, and it looks 
like no other issues are raised. Maybe we could still stick to a single 
implementation by making it faster.
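
For illustration, a minimal Java sketch of what such a lightweight implementation could look like: plain fields for arithmetic, with the wire-format object built lazily and cached. `LightResource` and `WireResource` are hypothetical names; `WireResource` merely stands in for the protobuf message, and none of this is the actual {{ResourcePBImpl}} code. The sketch is not thread-safe.

```java
// Hypothetical stand-in for the protobuf message that would cross the wire.
final class WireResource {
    static int built = 0; // counts constructions, for illustration only
    final long memory;
    final int vcores;

    WireResource(long memory, int vcores) {
        this.memory = memory;
        this.vcores = vcores;
        built++;
    }
}

// Plain-field resource: arithmetic touches only primitives; the wire form
// is created on demand and cached until the next mutation.
final class LightResource {
    private long memory;
    private int vcores;
    private WireResource wire; // null until someone asks for it

    LightResource(long memory, int vcores) { // plain constructor, no reflection
        this.memory = memory;
        this.vcores = vcores;
    }

    long getMemory() { return memory; }
    int getVcores() { return vcores; }

    // In-place add, avoiding a new object per intermediate result.
    LightResource addTo(LightResource other) {
        memory += other.memory;
        vcores += other.vcores;
        wire = null; // cached wire form is now stale
        return this;
    }

    // Only the IPC path pays for building the wire object.
    WireResource toWire() {
        if (wire == null) {
            wire = new WireResource(memory, vcores);
        }
        return wire;
    }
}
```

With this shape, a loop that accumulates thousands of intermediate results builds no wire objects at all until `toWire()` is called.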

bq. The nodeUpdate() changes will conflict with YARN-5047 unless you plan on 
doing the same changes for CapacityScheduler and FifoScheduler.
Most changes can be done in {{attemptScheduling}}, which is dedicated to 
FairScheduler. So perhaps we can keep it that way.

bq. Minimally I think we should approach this as two (or more) separate JIRAs 
since there are two vastly different approaches to improving performance here.
Agreed. Will file separate JIRAs to address each aspect.

bq. I don't think we should start loosening the guarantees of the scheduler for 
performance reasons until we've exhausted the other ways we can improve 
performance
Certainly. However, the approach would be quite simple to implement, and doing 
so does not seem to cause any problems in production (fairness is slightly 
damaged locally, but within an acceptable range, and there is no effect 
globally, though this has not been carefully investigated yet). 
So if one must figure out how to balance resource utilization against 
fairness (since resources are costly), providing such an option (e.g. through 
configuration) may be viable. 
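
A minimal Java sketch of such an option, assuming the delayed sorting is driven by a single tunable interval: the list is re-sorted only once every `sortInterval` calls, and an interval of 1 restores sorting on every call. `DelayedSorter` and `sortInterval` are hypothetical names for illustration, not existing FairScheduler classes or configuration keys.

```java
import java.util.Comparator;
import java.util.List;

// Re-sorts a list only once every `sortInterval` calls and reuses the
// previous (possibly stale) order in between.
class DelayedSorter<T> {
    private final Comparator<? super T> order;
    private final int sortInterval; // hypothetical tunable, e.g. read from configuration
    private int callsSinceSort;
    private int sortCount;          // exposed for illustration only

    DelayedSorter(Comparator<? super T> order, int sortInterval) {
        if (sortInterval < 1) {
            throw new IllegalArgumentException("sortInterval must be >= 1");
        }
        this.order = order;
        this.sortInterval = sortInterval;
        this.callsSinceSort = sortInterval; // force a sort on the first call
    }

    // Returns the same list, sorted in place only when the interval has elapsed.
    List<T> ordered(List<T> items) {
        if (callsSinceSort >= sortInterval) {
            items.sort(order);
            callsSinceSort = 0;
            sortCount++;
        }
        callsSinceSort++;
        return items;
    }

    int getSortCount() {
        return sortCount;
    }
}
```

Defaulting the interval to 1 would preserve today's behavior, so only users who opt in would trade fairness for throughput.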



Shall we make this issue an umbrella? There are still many approaches to 
deliver better performance in FairScheduler.

> FairScheduler: Scheduling performance improvement
> -
>
> Key: YARN-5479
> URL: https://issues.apache.org/jira/browse/YARN-5479
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: fairscheduler, resourcemanager
>Affects Versions: 2.6.0
>Reporter: He Tianyi
>Assignee: He Tianyi
>
> Currently ResourceManager uses a single thread to handle async events for 
> scheduling. As number of nodes grows, more events need to be processed in 
> time in FairScheduler. Also, increased number of applications & queues slows 
> down processing of each single event. 
> There are two cases that slow processing of nodeUpdate events is problematic:
> A. global throughput is lower than number of nodes through heartbeat rounds. 
> This keeps resource from being allocated since the inefficiency.
> B. global throughput meets the need, but for some of these rounds, events of 
> some nodes cannot get processed before next heartbeat. This brings 
> inefficiency handling burst requests (i.e. newly submitted MapReduce 
> application cannot get its all task launched soon given enough resource).
> Pretty sure some people will encounter the problem eventually after a single 
> cluster is scaled to several K of nodes (even with {{assignmultiple}} 
> enabled).
> This issue proposes several performance optimizations in the FairScheduler 
> {{nodeUpdate}} method. Specifically:
> A. Trading fairness for efficiency, queue and app sorting can be skipped 
> (or should this be called 'delayed sorting'?). We can either start a 
> dedicated thread to do the sorting and updating, or re-sort only after the 
> current result has been used several times (say, sort once every 100 
> calls).
> B. Performing calculations on {{Resource}} instances is expensive, since at 
> least 2 objects ({{ResourceImpl}} and its proto builder) are created each time 
> (when using the 'immutable' APIs). The overhead can be eliminated with a 
> lightweight implementation of Resource that does not instantiate a builder 
> until necessary, because most instances are used as intermediate results in 
> the scheduler rather than exchanged via IPC. Also, {{createResource}} 
> uses reflection, which can be replaced by a plain {{new}} (for scheduler 
> usage only). Furthermore, perhaps we could 'intern' resources to avoid 
> allocation.
> C. Other minor changes, such as moving the {{updateRootMetrics}} call to 
> {{update}}, which makes root queue metrics eventually consistent (which may 
> satisfy most needs), or introducing counters in {{getResourceUsage}} so that 
> resource usage is updated incrementally instead of being recalculated each time.
> With A and B, I was seeing a 4x improvement in a cluster with 2K nodes.
> Suggestions? Opinions?
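
A minimal sketch of the 'delayed sorting' idea from proposal A, assuming the simpler of the two variants (re-sort only after the cached order has been used N times). DelayedSortedView and its parameters are hypothetical names for illustration, not FairScheduler code:

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/**
 * Hypothetical helper (not part of YARN): keeps a sorted snapshot of the
 * caller's items and re-sorts only once every resortInterval calls.
 */
final class DelayedSortedView<T> {
    private final List<T> items;               // live collection owned by the caller
    private final Comparator<? super T> order; // e.g. a fair-share comparator
    private final int resortInterval;          // e.g. 100, as suggested in the proposal
    private List<T> cached = new ArrayList<>();
    private int usesSinceSort;
    private boolean sortedOnce;

    DelayedSortedView(List<T> items, Comparator<? super T> order, int resortInterval) {
        if (resortInterval < 1) {
            throw new IllegalArgumentException("resortInterval must be >= 1");
        }
        this.items = items;
        this.order = order;
        this.resortInterval = resortInterval;
    }

    /** Returns the cached order, re-sorting once it has been used resortInterval times. */
    List<T> get() {
        if (!sortedOnce || usesSinceSort >= resortInterval) {
            cached = new ArrayList<>(items); // snapshot, so callers never see a half-sorted list
            cached.sort(order);
            usesSinceSort = 0;
            sortedOnce = true;
        }
        usesSinceSort++;
        return cached;
    }
}
```

With resortInterval set to 100 this corresponds to the "sort once every 100 calls" suggestion. Between sorts, allocation follows a possibly stale order, which is the fairness trade-off the proposal describes.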



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-5479) FairScheduler: Scheduling performance improvement

2016-08-08 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15412177#comment-15412177
 ] 

Jason Lowe commented on YARN-5479:
--

Agree the proposals are interesting.  I'd love to get the overhead of Resource 
reduced, since as you and Ray point out it's used everywhere.

Minimally I think we should approach this as two (or more) separate JIRAs since 
there are two vastly different approaches to improving performance here.  One 
is optimizing the existing algorithm while the other is proposing to change the 
requirements to allow more optimization.  I don't think we should start 
loosening the guarantees of the scheduler for performance reasons until we've 
exhausted the other ways we can improve performance.  So personally I'd rather 
see the Resource-related improvements before the others that change the 
guarantees to which users have grown accustomed.
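
For reference, the Resource-related improvement (proposal B in the description) might look roughly like the sketch below. LightResource and WireResource are hypothetical classes, not YARN APIs; WireResource merely stands in for the protobuf-backed object that is only needed when a value crosses IPC.

```java
/** Hypothetical stand-in for the protobuf-backed form; only needed for IPC. */
final class WireResource {
    final int memoryMb;
    final int vCores;

    WireResource(int memoryMb, int vCores) {
        this.memoryMb = memoryMb;
        this.vCores = vCores;
    }
}

/** Hypothetical scheduler-internal resource: plain fields, updated in place. */
final class LightResource {
    private int memoryMb;
    private int vCores;
    private WireResource wire; // built lazily, dropped whenever the value changes

    // Created with a plain constructor call, no reflection involved.
    LightResource(int memoryMb, int vCores) {
        this.memoryMb = memoryMb;
        this.vCores = vCores;
    }

    int getMemoryMb() { return memoryMb; }

    int getVCores() { return vCores; }

    /** Adds other to this instance in place; no new objects are created. */
    LightResource add(LightResource other) {
        memoryMb += other.memoryMb;
        vCores += other.vCores;
        wire = null;
        return this;
    }

    /** Subtracts other from this instance in place. */
    LightResource subtract(LightResource other) {
        memoryMb -= other.memoryMb;
        vCores -= other.vCores;
        wire = null;
        return this;
    }

    /** True if this amount fits within the given available amount. */
    boolean fitsIn(LightResource available) {
        return memoryMb <= available.memoryMb && vCores <= available.vCores;
    }

    /** Creates the wire form only when it is actually needed, then reuses it. */
    WireResource toWire() {
        if (wire == null) {
            wire = new WireResource(memoryMb, vCores);
        }
        return wire;
    }
}
```

Intermediate arithmetic in the scheduler then allocates nothing, and the wire object is created at most once per distinct value that is actually sent. Mutability is the cost: callers must not share an instance they intend to modify.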





-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5479) FairScheduler: Scheduling performance improvement

2016-08-07 Thread Ray Chiang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15411121#comment-15411121
 ] 

Ray Chiang commented on YARN-5479:
--

What you have in mind sounds interesting.  I'd have to look at parts of the 
codebase more to comment further, but just some food for thought:

- I'd be careful with having multiple implementations or multiple APIs for 
doing the same thing with Resource.  Resource is used in a lot of places in the 
Hadoop codebase and this could add confusion, even with accurate Javadocs.

- The nodeUpdate() changes will conflict with YARN-5047 unless you plan on 
doing the same changes for CapacityScheduler and FifoScheduler.




