[ 
https://issues.apache.org/jira/browse/MESOS-9725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16835725#comment-16835725
 ] 

Meng Zhu commented on MESOS-9725:
---------------------------------

Based on a recent internal test, the sort() does not take much time. And this 
ticket would introduce some extra complexities.

The review above (https://reviews.apache.org/r/70497/) is pretty ready except 
one issue that still needs to figure out. In the review, we used a hashmap and 
used double as the key. This worries us because of the double precision issue. 
A solution is to use rational numbers. 

Given the benefit and complexity of the patch, we decided to shelve it for now. 
Move this ticket back to `accepted`.

> Perform incremental sorting in the random sorter.
> -------------------------------------------------
>
>                 Key: MESOS-9725
>                 URL: https://issues.apache.org/jira/browse/MESOS-9725
>             Project: Mesos
>          Issue Type: Improvement
>            Reporter: Meng Zhu
>            Assignee: Meng Zhu
>            Priority: Major
>              Labels: performance, resource-management
>
> By doing random sampling every time as the caller asks for the next client 
> (See MESOS-9722) we could avoid the cost of full shuffling and only pay as we 
> go.
> While the hope is to do each random sampling with O(1) cost, the presence of 
> weights complicates the matter. We will need to pay O(log( n )) for every 
> sample even with fancy data structures like segment tree or binary index 
> trees (naive ones will result in O( n ) since we need to look at every node's 
> weights). And the current full node shuffling is already optimal (nlog( n )) 
> if all nodes are picked.
> However, since the number of *distinct* weights is usually much smaller 
> comparing to the size of clients, we can minimize the sample cost by picking 
> a client in two steps:
> Step1: randomly pick a group of clients that has the same weight by 
> generating a weighted random number.
> Step2: Once a vector of clients is chosen, randomly sample a specific client 
> within the group. Since all the clients in the chosen vector have the same 
> weight, we do not need to consider any weights.
>  
> Since the size of distinct weights is usually much smaller comparing to the 
> size of clients, this way, we minimize the cost of generating weighted random 
> numbers which are linear with the size of weights.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Reply via email to