rzo1 commented on issue #8704:
URL: https://github.com/apache/storm/issues/8704#issuecomment-4516737183

   This has been there since the Clojure days. I did inspect the history but 
didn't find anything useful. The shuffle got added in clojure with 2ec670bc7 - 
the method comment even says "returns list of list of slots, reverse sorted by 
number of slots" .
   
   I think what we are seeing here is:
   
   - The original sort implements first-fit-decreasing bin packing: demands
     (distributionToSortedAmounts) are sorted descending and matched against 
hosts sorted descending by
     free slots. The assignment loop only .peek()s the head host and skips a 
demand if the head can't
     satisfy it (it doesn't scan the rest). With the list sorted, the head is 
the host with the most
     slots, maximizing the chance of a fit.
     - The shuffle randomizes the head, so a host with too few slots can land 
in front and cause a demand
     to be skipped even when capacity exists elsewhere i.e. shuffle can produce 
worse packing than the
     sort alone.
     - The likely intent of shuffle most likely was to spread isolated 
topologies randomly across machines rather than  always packing onto the 
biggest hosts, but as written it simply nullifies the sort.
   
   Perhaps dropping the shuffle would be a valid way to go ahead.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to