Hi -
We are concerned with some loads accumulating on certain workers, and having 
that negatively impact the efficiency of the system.

Would a CustomStreamGrouping that is based on the broadcast of load stats be a 
way to deal with this?

For example, in our system, the content stream and processing overhead can be 
adjusted by external systems, creating stream with processing costs that vary 
per tuple.

We are currently using shuffleGrouping, which means that there is a chance that 
the most “expensive” tuples will land on the same task for processing, which 
would be bad (for those tuples) since other tasks might be underutilized by 
comparison.

Fieldgrouping isn’t helpful since it may also route loads disproportionately 
across the cluster.

So, I’m wondering if we can create a CustomStreamGrouping that will:
- periodically post worker stats (heap/cpu usage) + worker task assignments to 
zookeeper
- periodically read worker stats + task assignments of other workers from 
zookeeper
- during chooseTasks() impl, refer to the worker stats to determine which 
worker is least-loaded, and use the taskIds for that worker to route the tuple.


I haven’t tried this yet, but I know there is an issue 
(https://issues.apache.org/jira/browse/STORM-162) created a while back 
suggesting to add load balancing to shuffle grouping, and this seems like a 
simpler alternative.

Thanks
Tyson

Reply via email to