eric-haibin-lin commented on a change in pull request #15124: [MXNET-1294] 
Priority-based parameter propagation for improved data parallel training 
throughput
URL: https://github.com/apache/incubator-mxnet/pull/15124#discussion_r291448734
 
 

 ##########
 File path: src/kvstore/kvstore_dist.h
 ##########
 @@ -544,31 +592,25 @@ class KVStoreDist : public KVStoreLocal {
       const int num_servers = krs.size();
       CHECK_GT(num_servers, 0);
 
-      // a simple heuristic for load balance
-      if (num_arr_elems < bigarray_bound_) {
-        // send it to a single random picked server
-        int server = (key * 9973) % num_servers;
-        ps::Key ps_key = krs[server].begin() + key;
-        CHECK_LT(ps_key, krs[server].end());
+      /**
+       * Round-Robin key assignment
+       */
+      int64_t params = pskv_size;
+      int64_t slice_bound = bigarray_bound_ * num_bytes;
+      static ps::Key server = 0;
 
 Review comment:
   Should we pick a random server to start? The assignment looks biased toward 
server 0: if we have many small parameters (e.g. bias terms), aren't they more 
likely to all be assigned to server 0?
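
To make the concern concrete, here is a minimal standalone sketch of round-robin assignment with a configurable starting server. It is not the PR's code: `RoundRobinAssigner`, `RandomStart`, and `CountAssignments` are hypothetical names used only for illustration. It also assumes every worker must compute the same key-to-server mapping, so a random start would have to come from a seed shared by all workers.

```cpp
#include <cassert>
#include <random>
#include <vector>

// Hypothetical helper (not part of KVStoreDist): hands out server indices
// in round-robin order, beginning at a caller-chosen start offset.
class RoundRobinAssigner {
 public:
  RoundRobinAssigner(int num_servers, int start)
      : num_servers_(num_servers), next_(0) {
    assert(num_servers > 0);
    assert(start >= 0);
    next_ = start % num_servers;
  }

  // Returns the server for the next key, then advances the cursor.
  int Next() {
    const int server = next_;
    next_ = (next_ + 1) % num_servers_;
    return server;
  }

 private:
  int num_servers_;
  int next_;
};

// Picks a start offset uniformly from [0, num_servers).
// Assumption: the generator is seeded identically on every worker,
// otherwise workers would disagree on where each key lives.
int RandomStart(int num_servers, std::mt19937* rng) {
  assert(num_servers > 0);
  std::uniform_int_distribution<int> dist(0, num_servers - 1);
  return dist(*rng);
}

// Counts how many of num_keys small keys land on each server.
std::vector<int> CountAssignments(int num_servers, int start, int num_keys) {
  RoundRobinAssigner assigner(num_servers, start);
  std::vector<int> counts(num_servers, 0);
  for (int i = 0; i < num_keys; ++i) {
    ++counts[assigner.Next()];
  }
  return counts;
}
```

With 4 servers and 5 small keys, starting at 0 leaves server 0 holding the extra key, while starting at 2 moves it to server 2. Whether this matters in practice depends on how the remaining large arrays are sliced, which this sketch does not model.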

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services