[jira] [Commented] (YARN-6245) Add FinalResource object to reduce overhead of Resource class instancing
[ https://issues.apache.org/jira/browse/YARN-6245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16033233#comment-16033233 ]

Daryn Sharp commented on YARN-6245:
-----------------------------------

Posted to the new YARN-6679 in case you wish to pursue the immutable resources, although the scheduler really should try to reuse instances when possible.

> Add FinalResource object to reduce overhead of Resource class instancing
> ------------------------------------------------------------------------
>
> Key: YARN-6245
> URL: https://issues.apache.org/jira/browse/YARN-6245
> Project: Hadoop YARN
> Issue Type: Sub-task
> Reporter: Wangda Tan
> Attachments: observable-resource.patch, YARN-6245.preliminary-staled.1.patch
>
> There are many Resource object creations in the YARN scheduler. Because the Resource object is backed by protobuf, creating such objects is expensive and becomes a bottleneck.
> To address the problem, we can introduce a FinalResource object (is it better to call it ImmutableResource?), which is not backed by PBImpl. We can use this object in frequently invoked paths in the scheduler.

--
This message was sent by Atlassian JIRA (v6.3.15#6346)

To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6245) Add FinalResource object to reduce overhead of Resource class instancing
[ https://issues.apache.org/jira/browse/YARN-6245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16033054#comment-16033054 ]

Daryn Sharp commented on YARN-6245:
-----------------------------------

I've been OOO. I'll be posting my collection of patches today for review.
[jira] [Commented] (YARN-6245) Add FinalResource object to reduce overhead of Resource class instancing
[ https://issues.apache.org/jira/browse/YARN-6245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16020266#comment-16020266 ]

Arun Suresh commented on YARN-6245:
-----------------------------------

[~roniburd], is this similar to what you had proposed in YARN-6418?

bq. At least as a start, it's a very simple patch that substitutes in a lightweight object via Resource.newInstance that simply contains 2 longs.

I understand you had also made some changes to the ResourceCalculator.
[jira] [Commented] (YARN-6245) Add FinalResource object to reduce overhead of Resource class instancing
[ https://issues.apache.org/jira/browse/YARN-6245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16020182#comment-16020182 ]

Wangda Tan commented on YARN-6245:
----------------------------------

[~daryn], I discussed this with [~jlowe] offline and it looks like a great idea. It automatically uses the lightweight Resource while doing internal computations.

bq. ... which converts the lightweight to a pb impl as required.

I'm not sure whether this converts the lightweight Resource instance permanently or temporarily. It would be better to optimize the case where {{ProtoUtils.convertToProtoFormat(Resource)}} is invoked many times on the same Resource object reference; ideally the conversion should happen only once.
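A minimal Java sketch of the "convert only once" concern raised in the comment above. It is not taken from any patch on this issue: `WireForm` and `CachingLightResource` are hypothetical names standing in for the protobuf-backed form and the lightweight Resource, and the sketch assumes single-threaded use.

```java
/** Hypothetical stand-in for an expensive protobuf-backed form; not a YARN class. */
final class WireForm {
    static int built = 0; // counts how many wire forms were constructed
    final long memory;
    final long vcores;

    WireForm(long memory, long vcores) {
        this.memory = memory;
        this.vcores = vcores;
        built++;
    }
}

/** Hypothetical lightweight resource that remembers its converted form. */
final class CachingLightResource {
    private long memory;
    private long vcores;
    private WireForm cached; // null until first conversion, and again after any write

    CachingLightResource(long memory, long vcores) {
        this.memory = memory;
        this.vcores = vcores;
    }

    void setMemory(long memory) {
        this.memory = memory;
        cached = null; // the old wire form no longer matches
    }

    void setVcores(long vcores) {
        this.vcores = vcores;
        cached = null;
    }

    /** Converts at most once per distinct value. Not thread-safe. */
    WireForm toWireForm() {
        if (cached == null) {
            cached = new WireForm(memory, vcores);
        }
        return cached;
    }
}
```

Repeated calls to `toWireForm()` on an unchanged object return the same instance; a write invalidates the cached form so the next conversion rebuilds it.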
[jira] [Commented] (YARN-6245) Add FinalResource object to reduce overhead of Resource class instancing
[ https://issues.apache.org/jira/browse/YARN-6245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16019701#comment-16019701 ]

Daryn Sharp commented on YARN-6245:
-----------------------------------

[~jlowe] asked me to comment since we're running into 2.8 scheduler performance issues that we believe are (in part) due to pb impl based objects.

I think I've designed a means for resources received via RPC to remain {{ResourcePBImpl}}, while internally created resources are lightweight and are only converted to a PB if they will be sent over the wire.

At least as a start, it's a very simple patch that substitutes in a lightweight object via {{Resource.newInstance}} that simply contains 2 longs. I replaced usages of {{((ResourcePBImpl)r)#getProto()}} with {{ProtoUtils.convertToProtoFormat(Resource)}}, which converts the lightweight object to a pb impl as required. That's it.

We're testing today. I will post a sample patch if it looks promising.
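The design described in the comment above could look roughly like the following Java sketch. This is an illustration under stated assumptions, not the patch itself: all four classes are hypothetical stand-ins (the real `Resource`, `ResourcePBImpl` and `ProtoUtils` are not reproduced here), and the instance counter exists only to make the conversion visible.

```java
/** Hypothetical stand-in for the abstract Resource API; not the real YARN class. */
abstract class SketchResource {
    abstract long getMemory();
    abstract long getVcores();

    /** Internally created resources get the cheap implementation. */
    static SketchResource newInstance(long memory, long vcores) {
        return new LightSketchResource(memory, vcores);
    }
}

/** Lightweight form: two longs, nothing protobuf-related behind it. */
final class LightSketchResource extends SketchResource {
    private final long memory;
    private final long vcores;

    LightSketchResource(long memory, long vcores) {
        this.memory = memory;
        this.vcores = vcores;
    }

    @Override long getMemory() { return memory; }
    @Override long getVcores() { return vcores; }
}

/** Stand-in for the protobuf-backed form used on the RPC boundary. */
final class PbSketchResource extends SketchResource {
    static int instancesCreated = 0; // only here to observe conversions
    private final long memory;
    private final long vcores;

    PbSketchResource(long memory, long vcores) {
        this.memory = memory;
        this.vcores = vcores;
        instancesCreated++;
    }

    @Override long getMemory() { return memory; }
    @Override long getVcores() { return vcores; }
}

/** Stand-in for the conversion helper named in the comment. */
final class SketchProtoUtils {
    /** Converts only when needed; an already wire-backed resource passes through. */
    static PbSketchResource convertToProtoFormat(SketchResource r) {
        if (r instanceof PbSketchResource) {
            return (PbSketchResource) r;
        }
        return new PbSketchResource(r.getMemory(), r.getVcores());
    }
}
```

The design choice being illustrated is that callers never downcast to the protobuf-backed class themselves; they ask the helper for the wire form, so the internal representation can stay cheap until a value actually has to leave the process.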
[jira] [Commented] (YARN-6245) Add FinalResource object to reduce overhead of Resource class instancing
[ https://issues.apache.org/jira/browse/YARN-6245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15966166#comment-15966166 ]

Roni Burd commented on YARN-6245:
---------------------------------

Sounds good, guys. I'm having some issues with the change in trunk (I was based off 2.7) and need to refactor out the 2 changes. I will post the patch ASAP.
[jira] [Commented] (YARN-6245) Add FinalResource object to reduce overhead of Resource class instancing
[ https://issues.apache.org/jira/browse/YARN-6245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15955811#comment-15955811 ]

Wangda Tan commented on YARN-6245:
----------------------------------

[~roniburd], for patch submission, I agree with [~kasha] to submit one patch to this JIRA and close YARN-6418 as a duplicate. Please feel free to take over this JIRA, since I may not have the bandwidth to work on it in the short term.

As for the approach suggested by [~kasha], I'm a little afraid it adds too much effort. For example, we would need to clearly identify which "resource" references in the scheduler need to be PBImpl. The Resources/ResourceCalculator classes are also used outside of YARN, so updating all of them to use LightResource might be problematic. I'm fine with the approach if the effort is not too large.
[jira] [Commented] (YARN-6245) Add FinalResource object to reduce overhead of Resource class instancing
[ https://issues.apache.org/jira/browse/YARN-6245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15955780#comment-15955780 ]

Karthik Kambatla commented on YARN-6245:
----------------------------------------

bq. It works very well so I think we may not need a huge patch to remove ResourcePBImpl from scheduler entirely.

IMO, the scheduler is becoming increasingly complex. Using both the light and proto versions in the scheduler adds to this complexity. I would really like us to avoid that.

bq. So my question should I submit both changes in one patch?

It might be nice to have one patch focus on replacing the proto-based Resource in the scheduler with LightResource. This would include updating the calculators to use and return LightResource. Another patch could optimize the number of LightResource objects created. Does that sound reasonable, [~roniburd]?
[jira] [Commented] (YARN-6245) Add FinalResource object to reduce overhead of Resource class instancing
[ https://issues.apache.org/jira/browse/YARN-6245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15955593#comment-15955593 ]

Roni Burd commented on YARN-6245:
---------------------------------

I agree. The scheduler should use only LightResource.

In addition to that, I've found that even with LightResource the scheduler instantiates too many objects, so my change also modifies ResourceCalculator (and some inherited methods) in order to reuse a LightResource, especially in methods like getUserLimitAndSetHeadroom().

So my question is: should I submit both changes in one patch?
[jira] [Commented] (YARN-6245) Add FinalResource object to reduce overhead of Resource class instancing
[ https://issues.apache.org/jira/browse/YARN-6245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15954489#comment-15954489 ]

Wangda Tan commented on YARN-6245:
----------------------------------

Yeah, so basically wherever interaction with other processes such as the AM/NM is needed, the scheduler should use ResourcePBImpl.

What I have done is profile the scheduler and identify the most costly parts. It works very well, so I think we may not need a huge patch to remove ResourcePBImpl from the scheduler entirely.
[jira] [Commented] (YARN-6245) Add FinalResource object to reduce overhead of Resource class instancing
[ https://issues.apache.org/jira/browse/YARN-6245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15954477#comment-15954477 ]

Karthik Kambatla commented on YARN-6245:
----------------------------------------

One other thing. The scheduler should probably only use LightResource. Since the RM interacts with the AM, it can take care of translating to and from LightResource?
[jira] [Commented] (YARN-6245) Add FinalResource object to reduce overhead of Resource class instancing
[ https://issues.apache.org/jira/browse/YARN-6245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15954320#comment-15954320 ]

Wangda Tan commented on YARN-6245:
----------------------------------

Sounds good [~kasha].
[jira] [Commented] (YARN-6245) Add FinalResource object to reduce overhead of Resource class instancing
[ https://issues.apache.org/jira/browse/YARN-6245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15954300#comment-15954300 ]

Karthik Kambatla commented on YARN-6245:
----------------------------------------

We are in agreement. {{LightResource}} seems very useful and we should add it soon. I see use for {{ObservableResource}} in FairScheduler, but let us add it along with changes that make use of it later.
[jira] [Commented] (YARN-6245) Add FinalResource object to reduce overhead of Resource class instancing
[ https://issues.apache.org/jira/browse/YARN-6245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15954295#comment-15954295 ]

Wangda Tan commented on YARN-6245:
----------------------------------

[~kasha], actually I'm not sure we need FinalResource/ObservableResource if we already have LightResource. Do you have any specific use case that needs FinalResource/ObservableResource? I personally prefer to add only LightResource.
[jira] [Commented] (YARN-6245) Add FinalResource object to reduce overhead of Resource class instancing
[ https://issues.apache.org/jira/browse/YARN-6245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15954232#comment-15954232 ]

Karthik Kambatla commented on YARN-6245:
----------------------------------------

I think I understand some more now. :)

Now I see how LightResource helps reduce the overhead of instantiating Resource by doing away with protobuf. And it should address the requirement in the JIRA title and description?

Where would we need FinalResource? IMO, we need it in cases where we return a Resource but don't expect the caller to modify the returned Resource object. Are there other cases?

The difference between FinalResource and ObservableResource is this: the former holds a snapshot of whatever value we are returning, while the latter is just an immutable pointer to a Resource object and shows the latest value. Both could be useful, and both could be light. Do we need either or both of them?
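The snapshot-versus-pointer distinction drawn in the comment above can be sketched in Java as follows. The class and interface names are hypothetical and chosen for illustration; they are not the classes from the attached patches.

```java
/** Read-only face shared by the hypothetical classes below. */
interface ReadableResource {
    long getMemory();
    long getVcores();
}

/** Mutable resource owned by the scheduler. */
final class OwnedResource implements ReadableResource {
    private long memory;
    private long vcores;

    OwnedResource(long memory, long vcores) {
        this.memory = memory;
        this.vcores = vcores;
    }

    public long getMemory() { return memory; }
    public long getVcores() { return vcores; }

    void set(long memory, long vcores) {
        this.memory = memory;
        this.vcores = vcores;
    }
}

/** FinalResource-style: copies the values once and never changes afterwards. */
final class SnapshotResource implements ReadableResource {
    private final long memory;
    private final long vcores;

    SnapshotResource(ReadableResource source) {
        this.memory = source.getMemory();
        this.vcores = source.getVcores();
    }

    public long getMemory() { return memory; }
    public long getVcores() { return vcores; }
}

/** ObservableResource-style: exposes no setters, but reads go to the live object. */
final class ObservableResourceView implements ReadableResource {
    private final ReadableResource source;

    ObservableResourceView(ReadableResource source) {
        this.source = source;
    }

    public long getMemory() { return source.getMemory(); }
    public long getVcores() { return source.getVcores(); }
}
```

After the owner changes its values, a `SnapshotResource` still reports the old numbers while an `ObservableResourceView` reports the new ones. Neither lets the caller write, which is the property both proposals share.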
[jira] [Commented] (YARN-6245) Add FinalResource object to reduce overhead of Resource class instancing
[ https://issues.apache.org/jira/browse/YARN-6245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15954180#comment-15954180 ]

Wangda Tan commented on YARN-6245:
----------------------------------

bq. I see the difference between LightResource and FinalResource and the need for both of them. Is the latter an immutable version of the former?

Yes, you can think of FinalResource as an immutable version that is backed by LightResource.

bq. I am not sure I fully understand. Mind posting some pseudo code for how this operation should look with the API you have in mind?

Sure. In my example: {{d = (res_a * float_x + res_b) / res_c}}.

If all resources are immutable, the computation becomes:
{code}
# Please note that immutable_{op} means do the operation and return
# an immutable object.
temp_1 = immutable_multiply(res_a, float_x);
temp_2 = immutable_add(temp_1, res_b);
d = divide(temp_2, res_c);
{code}
It creates 2 temp instances during the computation.

And if we have {{LightResource}}, the computation becomes:
{code}
light_1 = light_multiply(res_a, float_x);
light_add_to(light_1, res_b);
d = divide(light_1, res_c);
{code}
This creates only 1 temp instance.
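The pseudo code in the comment above translates to Java roughly as follows. This is a sketch under assumptions: `TempCountingResource` and `MultiOpSketch` are hypothetical names, the resource has only memory and vcores, and the final divide is taken to be a scalar ratio over memory alone, which the comment does not specify. A static counter makes the number of temporaries observable.

```java
/** Hypothetical two-field resource with an allocation counter. */
final class TempCountingResource {
    static int allocations = 0; // bumped by every constructor call
    long memory;
    long vcores;

    TempCountingResource(long memory, long vcores) {
        this.memory = memory;
        this.vcores = vcores;
        allocations++;
    }
}

final class MultiOpSketch {
    /** Returns a fresh object holding a * x. */
    static TempCountingResource multiply(TempCountingResource a, double x) {
        return new TempCountingResource((long) (a.memory * x), (long) (a.vcores * x));
    }

    /** Immutable style: returns a fresh object holding a + b. */
    static TempCountingResource add(TempCountingResource a, TempCountingResource b) {
        return new TempCountingResource(a.memory + b.memory, a.vcores + b.vcores);
    }

    /** In-place style: adds b into target and allocates nothing. */
    static TempCountingResource addTo(TempCountingResource target, TempCountingResource b) {
        target.memory += b.memory;
        target.vcores += b.vcores;
        return target;
    }

    /** Assumed semantics: ratio of the memory dimension only. */
    static double divide(TempCountingResource a, TempCountingResource b) {
        return (double) a.memory / b.memory;
    }

    /** d = (a * x + b) / c with two temporaries. */
    static double immutableStyle(TempCountingResource a, double x,
                                 TempCountingResource b, TempCountingResource c) {
        TempCountingResource temp1 = multiply(a, x);
        TempCountingResource temp2 = add(temp1, b);
        return divide(temp2, c);
    }

    /** d = (a * x + b) / c with one temporary that is updated in place. */
    static double inPlaceStyle(TempCountingResource a, double x,
                               TempCountingResource b, TempCountingResource c) {
        TempCountingResource light = multiply(a, x);
        addTo(light, b);
        return divide(light, c);
    }
}
```

Both variants leave the inputs untouched and return the same value; the only difference is how many short-lived objects each call creates.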
[jira] [Commented] (YARN-6245) Add FinalResource object to reduce overhead of Resource class instancing
[ https://issues.apache.org/jira/browse/YARN-6245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15954014#comment-15954014 ]

Karthik Kambatla commented on YARN-6245:
----------------------------------------

[~leftnoteasy] - I see the difference between LightResource and FinalResource and the need for both of them. Is the latter an immutable version of the former?

bq. solve the multi-ops calculation, like d = (res_a * float_x + res_b) / res_c. During the calculation, we need to create some temporary resource instances. A observable copy is not enough because we need to write resource values of these temp resource instances.

I am not sure I fully understand. Mind posting some pseudo code for how this operation *should* look with the API you have in mind?
[jira] [Commented] (YARN-6245) Add FinalResource object to reduce overhead of Resource class instancing
[ https://issues.apache.org/jira/browse/YARN-6245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15953885#comment-15953885 ]

Wangda Tan commented on YARN-6245:
----------------------------------

[~roniburd], all your suggestions make sense to me. In addition, if the new {{LightResource}} is targeted at the *internal* computation overhead, I think we can safely assume it will be used in a single-threaded environment and that no additional locks are needed. That can reduce overhead and delays even further.

+[~sunilg]/[~vvasudev], who are looking at similar issues while doing perf tests of YARN-3926.

Thoughts?
[jira] [Commented] (YARN-6245) Add FinalResource object to reduce overhead of Resource class instancing
[ https://issues.apache.org/jira/browse/YARN-6245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15952107#comment-15952107 ]

Roni Burd commented on YARN-6245:
---------------------------------

We have run SLS for a couple of weeks, and we observed that when we have 4500 machines, the overhead of Resource accounts for >60% of latencies.

Through profiling, we found that this is because of 2 issues, some identified here:
1) Resource is backed by protobuf, which is super expensive.
2) Resource is treated as immutable 'most' of the time when doing calculations (for example, use of ResourceCalculator.add() instead of addTo()).

Point #2 is especially hard. We observed that in our scenario, where we have 20K scheduler keys and 4500 machines, some methods like getHeadroom or computeUserLimits can get called almost 16 million times, and methods like normalize() or normalizeAndRoundUp() each instantiate a new Resource object. If you take that into account, you end up with the constructors of Resource, NetworkResource, etc. getting called >40,000,000 times. This rate is too high and it creates unnecessary OldGen GCs.

In addition to FinalResource, I would like to propose a LightResource that is not only *not* backed by protobuf but is also settable. We would also need to add some methods (like the addTo/add pair) that, instead of creating a new object, simply set the properties of the old one. These would be used in several places in the capacity scheduler where math is performed only so the result can be compared in an IF statement later.

Let me know what you think...
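The reuse pattern proposed in the comment above (compute into a settable object, compare, discard) might look like this in Java. It is a sketch only: `ScratchResource`, `NormalizeSketch` and `normalizeInto` are hypothetical names, and the rounding rule (raise to a minimum, then round up to a multiple of a step) is an assumed simplification rather than the actual ResourceCalculator logic.

```java
/** Hypothetical settable two-field resource with a creation counter. */
final class ScratchResource {
    static long created = 0; // bumped by every constructor call
    long memory;
    long vcores;

    ScratchResource(long memory, long vcores) {
        this.memory = memory;
        this.vcores = vcores;
        created++;
    }

    void set(long memory, long vcores) {
        this.memory = memory;
        this.vcores = vcores;
    }
}

final class NormalizeSketch {
    /** Rounds value up to the next multiple of step (step must be positive). */
    static long roundUp(long value, long step) {
        return ((value + step - 1) / step) * step;
    }

    /** Allocating variant: every call returns a new object. */
    static ScratchResource normalize(ScratchResource ask, ScratchResource min, ScratchResource step) {
        return new ScratchResource(
            roundUp(Math.max(ask.memory, min.memory), step.memory),
            roundUp(Math.max(ask.vcores, min.vcores), step.vcores));
    }

    /** Reusing variant: writes the result into out and allocates nothing. */
    static ScratchResource normalizeInto(ScratchResource out, ScratchResource ask,
                                         ScratchResource min, ScratchResource step) {
        out.set(
            roundUp(Math.max(ask.memory, min.memory), step.memory),
            roundUp(Math.max(ask.vcores, min.vcores), step.vcores));
        return out;
    }

    static boolean fits(ScratchResource r, ScratchResource limit) {
        return r.memory <= limit.memory && r.vcores <= limit.vcores;
    }

    /** Counts asks that fit under limit, using one scratch object for the whole loop. */
    static int countFitting(ScratchResource[] asks, ScratchResource min,
                            ScratchResource step, ScratchResource limit) {
        ScratchResource scratch = new ScratchResource(0, 0);
        int fitting = 0;
        for (ScratchResource ask : asks) {
            if (fits(normalizeInto(scratch, ask, min, step), limit)) {
                fitting++;
            }
        }
        return fitting;
    }
}
```

The scratch object must not escape the loop or be shared between threads, since each iteration overwrites it. That restriction is the price of avoiding one allocation per comparison.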
[jira] [Commented] (YARN-6245) Add FinalResource object to reduce overhead of Resource class instancing
[ https://issues.apache.org/jira/browse/YARN-6245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15951449#comment-15951449 ]

Wangda Tan commented on YARN-6245:
----------------------------------

[~kasha], sorry for the long delay in responding; I missed your previous comment.

I think we are talking about two different things.
1) The FinalResource object is targeted at the overhead of PBImpl. It's a read-only copy of the original PB-based resource object.
2) getObservableCopy can produce a similar object, but it cannot handle multi-op calculations like {{d = (res_a * float_x + res_b) / res_c}}. During the calculation we need to create some temporary resource instances. An observable copy is not enough, because we need to write the resource values of these temp instances.

I agree YARN-3926 is trickier; it might be more expensive to construct a "FinalResource". We need to be careful to get comparable performance.
[jira] [Commented] (YARN-6245) Add FinalResource object to reduce overhead of Resource class instancing
[ https://issues.apache.org/jira/browse/YARN-6245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15890378#comment-15890378 ]
Karthik Kambatla commented on YARN-6245:
--
[~leftnoteasy] - I didn't understand your question about immutable_add. In FairScheduler, there are a lot of {{Resources.addTo}} calls. These can be performed on a {{Resource}}, and the corresponding getter can return {{Resource.getObservableCopy}}.
YARN-3926 needs to be incorporated carefully. I haven't looked at the code there, but will we still be using Resource? If yes, we will have to implement all the methods in Resource.
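One possible reading of the suggestion in this comment, sketched under stated assumptions: the owner keeps a mutable object and updates it in place (in the spirit of {{Resources.addTo}}), while getters hand out a read-only view that always reflects the current values. ReadableResource and MutableUsage are hypothetical names, and treating an "observable copy" as a live view is a guess at the intent, not a description of observable-resource.patch.

```java
// Hypothetical read-only interface; the real Resource class has no such split.
interface ReadableResource {
    long getMemory();
    int getVcores();
}

// Hypothetical mutable resource standing in for a scheduler-owned Resource.
final class MutableUsage implements ReadableResource {
    private long memory;
    private int vcores;

    MutableUsage(long memory, int vcores) {
        this.memory = memory;
        this.vcores = vcores;
    }

    @Override
    public long getMemory() { return memory; }

    @Override
    public int getVcores() { return vcores; }

    // In-place add, analogous in spirit to Resources.addTo.
    void addTo(ReadableResource other) {
        memory += other.getMemory();
        vcores += other.getVcores();
    }

    // Assumed meaning of "observable copy": a view with no setters that
    // reads through to this object, so no values are copied per call.
    ReadableResource getObservableCopy() {
        return new ReadableResource() {
            @Override
            public long getMemory() { return MutableUsage.this.memory; }

            @Override
            public int getVcores() { return MutableUsage.this.vcores; }
        };
    }
}
```

Callers holding the view cannot modify the usage, but they do see later updates. The sketch does no synchronization, so it says nothing about the races discussed elsewhere in this thread.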
[jira] [Commented] (YARN-6245) Add FinalResource object to reduce overhead of Resource class instancing
[ https://issues.apache.org/jira/browse/YARN-6245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15886660#comment-15886660 ]
Wangda Tan commented on YARN-6245:
--
Thanks [~kasha],
bq. What do you think of an approach where ImmutableResource extends Resource but throws an UnsupportedException on modification?
Sounds good to me.
bq. Resource could have a getImmutableCopy() method that returns a copy.
So does it mean {{immutable_add(a, b) = new ImmutableResource(add(a.getImmutable(), b.getImmutable()))}}? If so, this approach might be inefficient, since it will generate lots of immutable objects. And in the context of YARN-3926, I'm not sure what the most efficient way is to handle an ImmutableResource object with extensible types.
[jira] [Commented] (YARN-6245) Add FinalResource object to reduce overhead of Resource class instancing
[ https://issues.apache.org/jira/browse/YARN-6245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15886640#comment-15886640 ]
Karthik Kambatla commented on YARN-6245:
--
I have only skimmed the patch. What do you think of an approach where ImmutableResource extends Resource but throws an UnsupportedException on modification? I have not measured it, but instantiation might not be as expensive given that there is no protobuf.
Resource could have a getImmutableCopy() method that returns a copy. We could experiment with creating this copy on every get call or only on an update.
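A minimal sketch of the approach proposed in this comment, using plain Java fields instead of protobuf. SimpleResource and ImmutableSimpleResource are hypothetical stand-ins, not the real YARN Resource or anything in the attached patch. The comment says "UnsupportedException"; the sketch assumes the standard java.lang.UnsupportedOperationException is what was meant.

```java
// Hypothetical stand-in for YARN's Resource; NOT the real Hadoop class.
class SimpleResource {
    protected long memory;
    protected int vcores;

    SimpleResource(long memory, int vcores) {
        this.memory = memory;
        this.vcores = vcores;
    }

    long getMemory() { return memory; }
    int getVcores() { return vcores; }

    void setMemory(long memory) { this.memory = memory; }
    void setVcores(int vcores) { this.vcores = vcores; }

    // Returns a snapshot of the current values that rejects modification.
    SimpleResource getImmutableCopy() {
        return new ImmutableSimpleResource(memory, vcores);
    }
}

// Subclass that keeps the getters but refuses every setter.
class ImmutableSimpleResource extends SimpleResource {
    ImmutableSimpleResource(long memory, int vcores) {
        super(memory, vcores);
    }

    @Override
    void setMemory(long memory) {
        throw new UnsupportedOperationException("immutable resource");
    }

    @Override
    void setVcores(int vcores) {
        throw new UnsupportedOperationException("immutable resource");
    }

    // Already immutable, so no further copy is needed.
    @Override
    SimpleResource getImmutableCopy() {
        return this;
    }
}
```

This version creates the copy on every getImmutableCopy() call. The alternative mentioned in the comment, creating it only on an update, would cache the snapshot and rebuild it in the setters; which one is cheaper depends on the read/write mix and has not been measured here.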
[jira] [Commented] (YARN-6245) Add FinalResource object to reduce overhead of Resource class instancing
[ https://issues.apache.org/jira/browse/YARN-6245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15886632#comment-15886632 ]
Karthik Kambatla commented on YARN-6245:
--
I am very much in favor of doing this; I was even considering bringing it up.
Because there is no ImmutableResource, I see that we do a combination of (1) cloning the Resource, (2) locking at every usage, or (3) to avoid the performance penalties, just leaving the race in place and hoping no one uses that Resource inappropriately.