Github user HeartSaVioR commented on the issue:
https://github.com/apache/storm/pull/2261
I have now introduced a 'skip checking update count' to avoid calling
System.currentTimeMillis() on every call. The trade-off is clear: we now have
to call AtomicInteger.incrementAndGet() on every call instead.
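
To make the trade-off concrete, here is a minimal sketch of the idea, not the
actual code in the PR. The class name, field names, and the injectable clock
are all hypothetical and only there for illustration; the real refresh of
load information is replaced by a counter.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.LongSupplier;

// Hypothetical sketch: all names here are illustrative, not taken from the PR.
class SkipCheckingRefresher {
    private final int skipCheckingUpdateCount; // N: read the clock once per N calls
    private final long updateIntervalMs;       // M: minimum time between refreshes
    private final LongSupplier clock;          // e.g. System::currentTimeMillis
    private final AtomicInteger callCount = new AtomicInteger(0);
    private final AtomicInteger refreshCount = new AtomicInteger(0);
    private final AtomicLong lastUpdateMs;

    SkipCheckingRefresher(int skipCheckingUpdateCount, long updateIntervalMs,
                          LongSupplier clock) {
        this.skipCheckingUpdateCount = skipCheckingUpdateCount;
        this.updateIntervalMs = updateIntervalMs;
        this.clock = clock;
        this.lastUpdateMs = new AtomicLong(clock.getAsLong());
    }

    /** Called once per tuple; returns true if this call refreshed the load info. */
    boolean onCall() {
        // Cost paid on every call: one atomic increment.
        int n = callCount.incrementAndGet();
        if (n % skipCheckingUpdateCount != 0) {
            return false; // skip the clock read entirely
        }
        // Cost paid once per N calls: one clock read.
        long now = clock.getAsLong();
        long last = lastUpdateMs.get();
        if (now - last < updateIntervalMs) {
            return false; // checked, but not yet due
        }
        // compareAndSet lets only one of several racing threads do the refresh.
        if (!lastUpdateMs.compareAndSet(last, now)) {
            return false;
        }
        refreshCount.incrementAndGet(); // placeholder for the real refresh work
        return true;
    }

    int getRefreshCount() {
        return refreshCount.get();
    }
}
```

With N = 100, a refresh can only happen on every 100th call, so a refresh that
is already due by the clock is delayed until the counter reaches the next
multiple of N.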
I set the skip checking update count to 10, 100, 1000, 10000, and 100000, and
re-ran the tests. The results are below:
> 10
>> testBenchmarkLoadAwareShuffleGroupingEvenLoad
Duration: 48650 ms
Duration: 49058 ms
Duration: 48445 ms
>> testBenchmarkLoadAwareShuffleGroupingEvenLoadAndMultiThreaded
Max duration among threads is : 159369 ms
Max duration among threads is : 130093 ms
Max duration among threads is : 138557 ms
> 100
>> testBenchmarkLoadAwareShuffleGroupingEvenLoad
Duration: 41093 ms
Duration: 40393 ms
Duration: 40524 ms
>> testBenchmarkLoadAwareShuffleGroupingEvenLoadAndMultiThreaded
Max duration among threads is : 142575 ms
Max duration among threads is : 139276 ms
Max duration among threads is : 145470 ms
> 1000
>> testBenchmarkLoadAwareShuffleGroupingEvenLoad
Duration: 40238 ms
Duration: 39715 ms
Duration: 39242 ms
>> testBenchmarkLoadAwareShuffleGroupingEvenLoadAndMultiThreaded
Max duration among threads is : 168089 ms
Max duration among threads is : 161082 ms
Max duration among threads is : 169998 ms
> 10000
>> testBenchmarkLoadAwareShuffleGroupingEvenLoad
Duration: 40535 ms
Duration: 39319 ms
Duration: 46815 ms
>> testBenchmarkLoadAwareShuffleGroupingEvenLoadAndMultiThreaded
Max duration among threads is : 140426 ms
Max duration among threads is : 166214 ms
Max duration among threads is : 169368 ms
> 100000
>> testBenchmarkLoadAwareShuffleGroupingEvenLoad
Duration: 39801 ms
Duration: 39535 ms
Duration: 39537 ms
>> testBenchmarkLoadAwareShuffleGroupingEvenLoadAndMultiThreaded
Max duration among threads is : 147115 ms
Max duration among threads is : 140722 ms
Max duration among threads is : 172955 ms
The results fluctuate, but overall we can see that the change is an
improvement.
The multi-threaded case now fluctuates more and performs worse than before,
but it is still faster than the old LASG. What we really gain is the
single-threaded performance: the duration dropped by more than half compared
to before.
The value of 'skip checking update count' should be chosen so that we can
reasonably answer the question: "Are we OK to delay updating load information
if fewer than N (the value) calls occurred within 1 sec?" We may want to put
more effort into finding the value (given that the test results were not
stable enough), but at least from these results, 100 seems to be a good value.
Higher values don't show a linear performance improvement.
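
As a rough cross-check of that reading, the single-threaded durations listed
above can be averaged per setting. The sketch below only does arithmetic on
the numbers already reported; the class and method names are made up for
illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: averages the single-threaded durations reported above.
class SingleThreadMeans {
    /** Arithmetic mean of the given durations, in milliseconds. */
    static double mean(long... durationsMs) {
        long sum = 0;
        for (long d : durationsMs) {
            sum += d;
        }
        return (double) sum / durationsMs.length;
    }

    /** Mean duration per skip count, in the order the runs were reported. */
    static Map<Integer, Double> meansBySkipCount() {
        Map<Integer, Double> means = new LinkedHashMap<>();
        means.put(10, mean(48650, 49058, 48445));
        means.put(100, mean(41093, 40393, 40524));
        means.put(1000, mean(40238, 39715, 39242));
        means.put(10000, mean(40535, 39319, 46815));
        means.put(100000, mean(39801, 39535, 39537));
        return means;
    }

    public static void main(String[] args) {
        meansBySkipCount().forEach((skipCount, ms) ->
            System.out.printf("skip count %6d: mean %.0f ms%n", skipCount, ms));
    }
}
```

The means come out at roughly 48.7 s for 10, 40.7 s for 100, 39.7 s for 1000,
42.2 s for 10000, and 39.6 s for 100000. Most of the gain appears between 10
and 100, and with only three runs per setting the differences beyond 100 are
within the noise.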
Btw, the update duration (M secs) is another variable to explore. We may also
need to see how often the origin load information gets updated, since it is
meaningless for LASG to update the information more often than the origin load
information gets updated.