[ 
https://issues.apache.org/jira/browse/STORM-132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14092999#comment-14092999
 ] 

ASF GitHub Bot commented on STORM-132:
--------------------------------------

Github user d2r commented on a diff in the pull request:

    https://github.com/apache/incubator-storm/pull/36#discussion_r16066139
  
    --- Diff: storm-core/test/clj/backtype/storm/scheduler_test.clj ---
    @@ -259,3 +260,27 @@
         (is (= false (.isSlotOccupied cluster (WorkerSlot. "supervisor1" (int 3)))))
         (is (= false (.isSlotOccupied cluster (WorkerSlot. "supervisor1" (int 5)))))
         ))
    +
    +(deftest test-sort-slots
    +  ;; test supervisor2 has more free slots
    +  (is (= '(["supervisor2" 6700] ["supervisor1" 6700]
    +           ["supervisor2" 6701] ["supervisor1" 6701]
    +           ["supervisor2" 6702])
    +         (sort-slots [["supervisor1" 6700] ["supervisor1" 6701]
    +                      ["supervisor2" 6700] ["supervisor2" 6701] ["supervisor2" 6702]
    +                      ])))
    +  ;; test supervisor3 has more free slots
    +  (is (= '(["supervisor3" 6700] ["supervisor2" 6700] ["supervisor1" 6700]
    +           ["supervisor3" 6701] ["supervisor2" 6701] ["supervisor1" 6701]
    +           ["supervisor3" 6702] ["supervisor2" 6702]
    +           ["supervisor3" 6703])
    +         (sort-slots [["supervisor1" 6700] ["supervisor1" 6701]
    +                      ["supervisor2" 6700] ["supervisor2" 6701] ["supervisor2" 6702]
    +                      ["supervisor3" 6700] ["supervisor3" 6701] ["supervisor3" 6702] ["supervisor3" 6703]
    +                      ])))
    +  ;; test supervisor1 and supervisor2 has the same free slot
    +  (is (= '(["supervisor1" 6700] ["supervisor2" 6700]
    +           ["supervisor1" 6701] ["supervisor2" 6701])
    +         (sort-slots [["supervisor1" 6700] ["supervisor1" 6701] ["supervisor2" 6700] ["supervisor2" 6701]])))
    +    )
    --- End diff --
    
    This test passes with or without your change.  Do we need this test?
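
The three assertions in the diff are consistent with a sort-slots that groups slots by supervisor, orders the groups by free-slot count (largest first, ties kept in input order), and then interleaves the groups. Below is a minimal Clojure sketch under that assumption. The names `sort-slots-sketch` and this `interleave-all` are hypothetical, written for illustration only, and are not taken from the pull request or from EvenScheduler.

```clojure
(defn interleave-all
  "Like interleave, but keeps going until every input sequence is exhausted.
  Hypothetical helper for this sketch."
  [& colls]
  (let [non-empty (filter seq colls)]
    (when (seq non-empty)
      (concat (map first non-empty)
              (apply interleave-all (map rest non-empty))))))

(defn sort-slots-sketch
  "Assumed behaviour: the supervisor with the most free slots comes first,
  then slots are taken round-robin across supervisors."
  [all-slots]
  (let [grouped (group-by first all-slots)
        ;; Keep supervisors in first-seen order so ties are deterministic.
        nodes   (distinct (map first all-slots))
        by-node (map grouped nodes)
        ;; sort-by is stable, so equal counts keep their input order.
        most-free-first (sort-by count > by-node)]
    (apply interleave-all most-free-first)))

;; First case from the test in the diff:
(sort-slots-sketch [["supervisor1" 6700] ["supervisor1" 6701]
                    ["supervisor2" 6700] ["supervisor2" 6701]
                    ["supervisor2" 6702]])
;; => (["supervisor2" 6700] ["supervisor1" 6700]
;;     ["supervisor2" 6701] ["supervisor1" 6701]
;;     ["supervisor2" 6702])
```

If the existing sort-slots already behaves this way, a test of this shape would pass regardless of the change, which may be what the question above is pointing at.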


> Default Storm scheduler should evenly use resources when managing multiple 
> topologies
> -------------------------------------------------------------------------------------
>
>                 Key: STORM-132
>                 URL: https://issues.apache.org/jira/browse/STORM-132
>             Project: Apache Storm (Incubating)
>          Issue Type: Improvement
>            Reporter: James Xu
>
> https://github.com/nathanmarz/storm/issues/359
> Currently, a single topology is evenly spread across the cluster, but this is 
> not the case for multiple topologies (it targets one node first, then the 
> rest). The default scheduler should order the hosts for scheduling by which 
> has the fewest slots used.
> ----------
> lyogavin: To confirm, we want to first look at the number of used slots on 
> each host and try to balance that number across all the hosts. 
> For the assignment inside each host, we also try to evenly balance the usage 
> of each slot. Am I understanding correctly?
> While coding, I realized it's actually a little complicated. It looks like 
> there are several policies we want to consider:
> 1. Evenness of resource usage. (Do we want to evaluate evenness according to 
> the number of slots used on each host or the number of executors? The number 
> of executors may be better, but it also makes things very complicated.)
> 2. Least rescheduling. We probably also want to change the assignment as 
> little as possible. This looks like why DefaultScheduler.bad-slots is 
> coded the current way.
> 3. Number of workers.
> Then the question is how we prioritize those policies. Sometimes they 
> conflict with each other. For example, the most even distribution may 
> sometimes need the most reassignment.
> Any thoughts?
> ----------
> xumingming: @lyogavin I think you might have over-thought this. As 
> @nathanmarz already confirmed, you just need to update 
> EvenScheduler.sort-slots to make sure the slots in the least used node appear 
> first in the available-slots list.
> ----------
> lyogavin: Thanks James. I see what you mean, so I won't worry about 
> executor usage and will only consider the balance of slot usage.
> But I think simply changing sort-slots to sort the slots by slot usage may 
> still not work well. For example, say there are 2 hosts with 10 slots each. 
> The 1st one has 1 slot used, the 2nd one has 2 slots used, and we want to 
> assign another topology with 8 workers. If we simply sort the slots by 
> usage, we'll end up with the list [0 0 0 0 0 0 0 0 1 1], so the new 
> assignment would come entirely from the 1st host. Not balanced.
> So in the above pull request, I implemented a solution similar to a 
> watershed algorithm. It first picks slots from the least used host until 
> that host uses the same number of slots as the second least used host. 
> Then it picks slots evenly from the 2 least used hosts until they reach the 
> 3rd one. Iterating this way, we get the best-balanced assignment.
> What do you think?
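
The two-host example in the last comment can be worked through with a small Clojure sketch. It is not the code from the pull request. It assumes a hypothetical `usage` map of host name to `[used-slots total-slots]`, and it replaces the watershed procedure with a simpler greedy rule (always take a slot from the currently least used host that still has a free slot), which should level the hosts in a comparable way. The names `naive-pick` and `balanced-pick` are made up for illustration.

```clojure
(defn naive-pick
  "Sort hosts by used slots, list every free slot in that order, take n.
  Hypothetical model of 'just sort the slots by usage'."
  [usage n]
  (->> usage
       (sort-by (fn [[host [used _]]] [used host]))
       (mapcat (fn [[host [used capacity]]]
                 (repeat (- capacity used) host)))
       (take n)
       vec))

(defn balanced-pick
  "Greedy levelling: each pick goes to the least used host that still has
  a free slot (ties broken by host name). Assumption: stands in for the
  watershed-style approach described above, not the PR's implementation."
  [usage n]
  (loop [used  (into {} (for [[host [u _]] usage] [host u]))
         picks []]
    (let [open (filter (fn [[host u]] (< u (second (usage host)))) used)]
      (if (or (= (count picks) n) (empty? open))
        picks
        (let [[host _] (first (sort-by (fn [[h u]] [u h]) open))]
          (recur (update-in used [host] inc) (conj picks host)))))))

;; 2 hosts with 10 slots each; 1 and 2 slots already used; 8 new workers.
(def usage {"host1" [1 10] "host2" [2 10]})

(frequencies (naive-pick usage 8))    ;; => {"host1" 8}
(frequencies (balanced-pick usage 8)) ;; => {"host1" 5, "host2" 3}
```

With the naive ordering all 8 workers land on the first host (9 used versus 2). The greedy levelling ends at 6 and 5 used slots.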



--
This message was sent by Atlassian JIRA
(v6.2#6252)
