Hi, I'm investigating this issue: https://github.com/apache/incubator-mxnet/issues/12994
To me this code seems suspicious, as it doesn't do what the comment states:
https://github.com/apache/incubator-mxnet/blob/master/src/kvstore/gpu_topology.h#L577

I don't think the depth of the binary tree is calculated correctly. For example, a tree of three nodes should have two levels, but a tree of four nodes should have three, and a tree of zero nodes should have zero.

Any ideas whether this is indeed buggy, or is there something hidden that I'm missing?

Test code to check:

    #include <iostream>
    #include <string>
    #include <cstdlib>
    #include <cassert>
    #include <vector>
    #include <stdexcept>

    using namespace std;

    inline int ComputeDepth(int n) {
      for (int depth = 0; depth < 16; ++depth) {
        int num = 2 << depth;
        if (n <= num)
          return depth+1;
      }
      return 0;
    }

    int main(int argc, char *argv[]) {
      for (size_t i=0; i<64; ++i)
        cout << "ComputeDepth(" << i << ") = " << ComputeDepth(i) << endl;
    }

Output:

    ComputeDepth(0) = 1
    ComputeDepth(1) = 1
    ComputeDepth(2) = 1
    ComputeDepth(3) = 2
    ComputeDepth(4) = 2
    ComputeDepth(5) = 3
    ComputeDepth(6) = 3
    ComputeDepth(7) = 3
    ComputeDepth(8) = 3
    ComputeDepth(9) = 4
    ComputeDepth(10) = 4
    ComputeDepth(11) = 4
    ComputeDepth(12) = 4
    ComputeDepth(13) = 4
    ComputeDepth(14) = 4
    ComputeDepth(15) = 4
    ComputeDepth(16) = 4
    ComputeDepth(17) = 5
    ComputeDepth(18) = 5
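
For comparison, here is a minimal sketch of the depth the examples above would expect (three nodes give two levels, four nodes give three, zero nodes give zero). It assumes that n counts every node of a complete binary tree; if the MXNet code intends n to mean something else, such as the number of leaves, this would not be the right reference. The name ExpectedDepth is hypothetical and does not come from the MXNet source.

```cpp
// Hypothetical helper, not part of MXNet: number of levels in a complete
// binary tree that holds n nodes in total (assumption: n counts all nodes,
// not only the leaves).
inline int ExpectedDepth(int n) {
  int depth = 0;
  // Each level doubles the capacity, so count how many times n can be
  // halved before it reaches zero. For n > 0 this is floor(log2(n)) + 1.
  while (n > 0) {
    ++depth;
    n >>= 1;
  }
  return depth;
}
```

Under that assumption the results differ from ComputeDepth at several inputs: ExpectedDepth gives 0 for n = 0, 2 for n = 2, and 3 for n = 4, where the test output above shows 1, 1, and 2.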