[
https://issues.apache.org/jira/browse/HADOOP-5185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12671754#action_12671754
]
Matei Zaharia commented on HADOOP-5185:
---------------------------------------
As a temporary fix, feel free to submit a patch that scales up the interval
based on cluster size or heartbeat interval. Or, if there's a way to make
getTotalSlots non-synchronized or cache its result, we should do that, as there
is no reason to call this method all the time.
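A rough illustration of both suggestions (not the actual FairScheduler code): scale the update interval with the cluster size or heartbeat interval, and cache the slot count so hot paths never take the JobTracker lock. All names and constants below are made up for the example.
{code:java}
// Illustrative sketch only -- not the real FairScheduler implementation.
// Shows (a) deriving the update interval from cluster size and (b) caching
// the total slot count so readers avoid the synchronized JobTracker call.
public class UpdateIntervalSketch {
  private static final long MIN_UPDATE_INTERVAL_MS = 500;  // assumed floor
  private static final long PER_NODE_EXTRA_MS = 10;        // assumed scaling factor

  private volatile int cachedTotalSlots = 0;               // refreshed by the update thread

  /** Grow the interval linearly with the number of task trackers. */
  long updateIntervalMs(int numTaskTrackers) {
    return MIN_UPDATE_INTERVAL_MS + PER_NODE_EXTRA_MS * (long) numTaskTrackers;
  }

  /** Called from the periodic update thread while it holds the JobTracker lock. */
  void refreshTotalSlots(int totalSlotsFromJobTracker) {
    cachedTotalSlots = totalSlotsFromJobTracker;
  }

  /** Hot-path callers read the cached value without synchronization. */
  int getTotalSlots() {
    return cachedTotalSlots;
  }
}
{code}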
Incidentally, if we change the fair scheduler logic to not use deficits anymore
(which I'm proposing in HADOOP-4803 and seems like a better idea the more I
think of it), the update thread could start running much less frequently. The
reason it runs so often now is to make the deficit computations accurate so
that we don't have too many tasks per job starting/finishing in-between update
calls. If we removed deficits, I think the main reason we'd need periodic
updates would be preemption, and that check could happen much less frequently.
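To make the idea concrete, here is a purely illustrative sketch (intervals and method names invented for the example) of an update loop that runs rarely and triggers the preemption check on its own, even longer timer:
{code:java}
// Illustrative sketch only: with deficits removed, the update loop can run
// infrequently and defer the expensive preemption check to a longer interval.
public class UpdateLoopSketch implements Runnable {
  private static final long UPDATE_INTERVAL_MS = 5_000;      // assumed, instead of 500ms
  private static final long PREEMPTION_INTERVAL_MS = 15_000; // assumed

  private long lastPreemptionCheck = 0;

  @Override
  public void run() {
    while (!Thread.currentThread().isInterrupted()) {
      try {
        Thread.sleep(UPDATE_INTERVAL_MS);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        break;
      }
      update();                                              // cheap bookkeeping
      long now = System.currentTimeMillis();
      if (now - lastPreemptionCheck >= PREEMPTION_INTERVAL_MS) {
        checkForPreemption();                                // the expensive part
        lastPreemptionCheck = now;
      }
    }
  }

  private void update() { /* recompute fair shares; no deficit tracking */ }

  private void checkForPreemption() { /* reclaim slots from over-scheduled jobs */ }
}
{code}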
> Update thread in FairScheduler runs too frequently
> --------------------------------------------------
>
> Key: HADOOP-5185
> URL: https://issues.apache.org/jira/browse/HADOOP-5185
> Project: Hadoop Core
> Issue Type: Bug
> Components: contrib/fair-share
> Reporter: Vinod K V
>
> The UpdateThread in FairScheduler runs every 500ms (hardcoded). This proves
> to be very costly on large clusters. The UpdateThread tries to acquire the
> lock on the JobTracker object that often, which seriously affects heartbeat
> processing, among other things. The update interval should be a function of
> the cluster size, or at a minimum it should be configurable, with a
> reasonably high default value.
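The description above asks for the interval to at least be configurable. A minimal sketch of how that could look with Hadoop's Configuration API; the property name and default are invented for this example.
{code:java}
import org.apache.hadoop.conf.Configuration;

// Illustrative sketch only: read the update interval from the configuration
// instead of hardcoding 500ms. The property name below is hypothetical.
public class ConfigurableIntervalSketch {
  static final String UPDATE_INTERVAL_KEY = "mapred.fairscheduler.update.interval";
  static final long DEFAULT_UPDATE_INTERVAL_MS = 5_000;  // assumed "reasonably high" default

  static long updateIntervalMs(Configuration conf) {
    return conf.getLong(UPDATE_INTERVAL_KEY, DEFAULT_UPDATE_INTERVAL_MS);
  }
}
{code}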