rzo1 commented on issue #8704:
URL: https://github.com/apache/storm/issues/8704#issuecomment-4516737183
This has been there since the Clojure days. I did inspect the history but
didn't find anything useful. The shuffle got added in clojure with 2ec670bc7 -
the method comment even says "returns list of list of slots, reverse sorted by
number of slots" .
I think what we are seeing here is:
- The original sort implements first-fit-decreasing bin packing: demands
(distributionToSortedAmounts) are sorted descending and matched against
hosts sorted descending by
free slots. The assignment loop only .peek()s the head host and skips a
demand if the head can't
satisfy it (it doesn't scan the rest). With the list sorted, the head is
the host with the most
slots, maximizing the chance of a fit.
- The shuffle randomizes the head, so a host with too few slots can land
in front and cause a demand
to be skipped even when capacity exists elsewhere i.e. shuffle can produce
worse packing than the
sort alone.
- The likely intent of shuffle most likely was to spread isolated
topologies randomly across machines rather than always packing onto the
biggest hosts, but as written it simply nullifies the sort.
Perhaps dropping the shuffle would be a valid way to go ahead.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]