BullDemonKing commented on issue #13588: Accelerate DGL csr neighbor sampling
URL: https://github.com/apache/incubator-mxnet/pull/13588#issuecomment-445521810

This PR gives roughly a 2-3x speedup over the implementation in the master branch.

The speed in the master branch:
```
-------------------------------------------------------Uniform Bechmark---------------------------------------------------------
time: 0.0651424407959 (s) thread: 32 seed size: 1000 num_hops: 1 num neighbors: 16 max vertices: 1000000 sample_num_ver: 32000032 sample_num_edge: 325274
time: 0.07034907341 (s) thread: 32 seed size: 2000 num_hops: 1 num neighbors: 16 max vertices: 1000000 sample_num_ver: 32000032 sample_num_edge: 647972
time: 0.130498409271 (s) thread: 32 seed size: 4000 num_hops: 1 num neighbors: 16 max vertices: 1000000 sample_num_ver: 32000032 sample_num_edge: 1300821
time: 0.251270961761 (s) thread: 32 seed size: 8000 num_hops: 1 num neighbors: 16 max vertices: 1000000 sample_num_ver: 32000032 sample_num_edge: 2598921
time: 0.453542613983 (s) thread: 32 seed size: 1000 num_hops: 2 num neighbors: 16 max vertices: 1000000 sample_num_ver: 32000032 sample_num_edge: 5276791
time: 0.940950870514 (s) thread: 32 seed size: 2000 num_hops: 2 num neighbors: 16 max vertices: 1000000 sample_num_ver: 32000032 sample_num_edge: 10515177
time: 1.6341050148 (s) thread: 32 seed size: 4000 num_hops: 2 num neighbors: 16 max vertices: 1000000 sample_num_ver: 32000032 sample_num_edge: 20668389
time: 2.81761116982 (s) thread: 32 seed size: 8000 num_hops: 2 num neighbors: 16 max vertices: 1000000 sample_num_ver: 32000032 sample_num_edge: 39889110
-----------------------------------------------------Non-Uniform Bechmark-------------------------------------------------------
time: 0.0410866260529 (s) thread: 32 seed size: 1000 num_hops: 1 num neighbors: 16 max vertices: 1000000 sample_num_ver: 32000032 sample_num_edge: 323537
time: 0.0525182723999 (s) thread: 32 seed size: 2000 num_hops: 1 num neighbors: 16 max vertices: 1000000 sample_num_ver: 32000032 sample_num_edge: 649116
time: 0.108578252792 (s) thread: 32 seed size: 4000 num_hops: 1 num neighbors: 16 max vertices: 1000000 sample_num_ver: 32000032 sample_num_edge: 1301567
time: 0.210351467133 (s) thread: 32 seed size: 8000 num_hops: 1 num neighbors: 16 max vertices: 1000000 sample_num_ver: 32000032 sample_num_edge: 2599414
time: 0.483125591278 (s) thread: 32 seed size: 1000 num_hops: 2 num neighbors: 16 max vertices: 1000000 sample_num_ver: 32000032 sample_num_edge: 5334587
time: 0.998479986191 (s) thread: 32 seed size: 2000 num_hops: 2 num neighbors: 16 max vertices: 1000000 sample_num_ver: 32000032 sample_num_edge: 10499253
time: 1.81847443581 (s) thread: 32 seed size: 4000 num_hops: 2 num neighbors: 16 max vertices: 1000000 sample_num_ver: 32000032 sample_num_edge: 20587578
time: 3.07297439575 (s) thread: 32 seed size: 8000 num_hops: 2 num neighbors: 16 max vertices: 1000000 sample_num_ver: 32000032 sample_num_edge: 39759114
```

The speed in this PR:
```
-------------------------------------------------------Uniform Bechmark---------------------------------------------------------
time: 0.0666475772858 (s) thread: 32 seed size: 1000 num_hops: 1 num neighbors: 16 max vertices: 1000000 sample_num_ver: 32000032 sample_num_edge: 326446
time: 0.0367893695831 (s) thread: 32 seed size: 2000 num_hops: 1 num neighbors: 16 max vertices: 1000000 sample_num_ver: 32000032 sample_num_edge: 649095
time: 0.0448307514191 (s) thread: 32 seed size: 4000 num_hops: 1 num neighbors: 16 max vertices: 1000000 sample_num_ver: 32000032 sample_num_edge: 1299422
time: 0.152100610733 (s) thread: 32 seed size: 8000 num_hops: 1 num neighbors: 16 max vertices: 1000000 sample_num_ver: 32000032 sample_num_edge: 2599672
time: 0.204541826248 (s) thread: 32 seed size: 1000 num_hops: 2 num neighbors: 16 max vertices: 1000000 sample_num_ver: 32000032 sample_num_edge: 5303749
time: 0.416042566299 (s) thread: 32 seed size: 2000 num_hops: 2 num neighbors: 16 max vertices: 1000000 sample_num_ver: 32000032 sample_num_edge: 10552953
time: 0.617377996445 (s) thread: 32 seed size: 4000 num_hops: 2 num neighbors: 16 max vertices: 1000000 sample_num_ver: 32000032 sample_num_edge: 20603133
time: 1.2061047554 (s) thread: 32 seed size: 8000 num_hops: 2 num neighbors: 16 max vertices: 1000000 sample_num_ver: 32000032 sample_num_edge: 39946812
-----------------------------------------------------Non-Uniform Bechmark-------------------------------------------------------
time: 0.033527803421 (s) thread: 32 seed size: 1000 num_hops: 1 num neighbors: 16 max vertices: 1000000 sample_num_ver: 32000032 sample_num_edge: 323132
time: 0.0407970905304 (s) thread: 32 seed size: 2000 num_hops: 1 num neighbors: 16 max vertices: 1000000 sample_num_ver: 32000032 sample_num_edge: 651792
time: 0.0628390789032 (s) thread: 32 seed size: 4000 num_hops: 1 num neighbors: 16 max vertices: 1000000 sample_num_ver: 32000032 sample_num_edge: 1300238
time: 0.100573396683 (s) thread: 32 seed size: 8000 num_hops: 1 num neighbors: 16 max vertices: 1000000 sample_num_ver: 32000032 sample_num_edge: 2598552
time: 0.259836006165 (s) thread: 32 seed size: 1000 num_hops: 2 num neighbors: 16 max vertices: 1000000 sample_num_ver: 32000032 sample_num_edge: 5318377
time: 0.468389415741 (s) thread: 32 seed size: 2000 num_hops: 2 num neighbors: 16 max vertices: 1000000 sample_num_ver: 32000032 sample_num_edge: 10490639
time: 0.778582191467 (s) thread: 32 seed size: 4000 num_hops: 2 num neighbors: 16 max vertices: 1000000 sample_num_ver: 32000032 sample_num_edge: 20579780
time: 1.31772165298 (s) thread: 32 seed size: 8000 num_hops: 2 num neighbors: 16 max vertices: 1000000 sample_num_ver: 32000032 sample_num_edge: 39680991
```
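
For readers who want to check the speedup per configuration, the ratio of master time to PR time can be computed directly from the two runs above. This is only a minimal sketch: the `RESULTS` layout and the `speedups` helper are illustrative names, not part of the PR or its benchmark script, and the timings are copied by hand from the output.

```python
# Timings in seconds, copied from the two benchmark runs above.
# Each row: (benchmark, seed_size, num_hops, master_time, pr_time).
# RESULTS and speedups() are hypothetical names used for illustration only.
RESULTS = [
    ("uniform", 1000, 1, 0.0651424407959, 0.0666475772858),
    ("uniform", 2000, 1, 0.07034907341, 0.0367893695831),
    ("uniform", 4000, 1, 0.130498409271, 0.0448307514191),
    ("uniform", 8000, 1, 0.251270961761, 0.152100610733),
    ("uniform", 1000, 2, 0.453542613983, 0.204541826248),
    ("uniform", 2000, 2, 0.940950870514, 0.416042566299),
    ("uniform", 4000, 2, 1.6341050148, 0.617377996445),
    ("uniform", 8000, 2, 2.81761116982, 1.2061047554),
    ("non-uniform", 1000, 1, 0.0410866260529, 0.033527803421),
    ("non-uniform", 2000, 1, 0.0525182723999, 0.0407970905304),
    ("non-uniform", 4000, 1, 0.108578252792, 0.0628390789032),
    ("non-uniform", 8000, 1, 0.210351467133, 0.100573396683),
    ("non-uniform", 1000, 2, 0.483125591278, 0.259836006165),
    ("non-uniform", 2000, 2, 0.998479986191, 0.468389415741),
    ("non-uniform", 4000, 2, 1.81847443581, 0.778582191467),
    ("non-uniform", 8000, 2, 3.07297439575, 1.31772165298),
]


def speedups(rows):
    """Return (benchmark, seed_size, num_hops, master_time / pr_time) per row."""
    return [(bench, seed, hops, master / pr)
            for bench, seed, hops, master, pr in rows]


ratios = speedups(RESULTS)
for bench, seed, hops, ratio in ratios:
    print("%-11s seed size=%4d num_hops=%d speedup=%.2fx"
          % (bench, seed, hops, ratio))
```

On these figures the 2-hop runs come out at roughly 1.9x to 2.6x, while several 1-hop runs show smaller gains. Each configuration was timed once here, so the ratios should be read as indicative rather than exact.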
