johnbroughton2017 commented on issue #9884: How to speed up prediction run time? Copying gpu->cpu takes a long time
URL: https://github.com/apache/incubator-mxnet/issues/9884#issuecomment-368356112

Follow-up. Found this more interesting. Using caffenet instead of resnet50, it looks like this:

  batch size    mod.forward() (ms)    mod.get_outputs...asnumpy() (ms)
------------  --------------------  ----------------------------------
          16                 156.6                                61.3
          32                 183.4                                28.9
          48                 166.4                                25.3
          64                 166.7                                32.1
          80                 171.3                                38.6
          96                 181.8                                33.4
         112                 181.4                                41.6
         128                 188.2                                46.8
         144                 236.5                                61.2
         160                 193.1                                54.4
         176                 195.8                                61.8
         192                 198.9                                65.9
         208                 196.7                                70.3
         224                 199.5                                75.3
         240                 203.5                                77.4
         256                   206                                81.9

The output dimension should be the same but for some reason the data copying time is reduced a lot. Cannot figure out why

-- John
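A possible way to narrow this down, offered as a sketch rather than a confirmed diagnosis: MXNet queues operators asynchronously, so mod.forward() can return before the GPU work has finished, and a later blocking call such as asnumpy() may absorb whatever compute is still pending. Whether that explains the numbers above is an assumption, not something measured in this thread. The helper below (`time_stages` is a hypothetical name) times each stage separately; the stages shown are `time.sleep` stand-ins so it runs without a GPU, and the MXNet calls that might replace them are given only in the comments.

```python
import time


def time_stages(stages, repeats=5):
    """Return the mean wall-clock time in ms for each (name, callable) stage."""
    totals = {name: 0.0 for name, _ in stages}
    for _ in range(repeats):
        for name, fn in stages:
            start = time.perf_counter()
            fn()
            totals[name] += (time.perf_counter() - start) * 1000.0
    return {name: total / repeats for name, total in totals.items()}


# Stand-in stages so the sketch runs anywhere. With MXNet they might be:
#   ("forward", lambda: mod.forward(batch, is_train=False))
#   ("wait",    lambda: mx.nd.waitall())  # block until queued work is done
#   ("copy",    lambda: mod.get_outputs()[0].asnumpy())
# where `mod` and `batch` are assumed to be the Module and DataBatch
# from the benchmark; they are not defined in this sketch.
demo_stages = [
    ("forward", lambda: time.sleep(0.002)),
    ("wait", lambda: time.sleep(0.005)),
    ("copy", lambda: time.sleep(0.001)),
]

timings = time_stages(demo_stages)
for name, ms in timings.items():
    print(f"{name:8s} {ms:8.2f} ms")
```

If the "wait" stage turns out to dominate with the real calls, the time previously attributed to the copy would mostly be compute finishing, which could differ between caffenet and resnet50 even when the output dimension is the same.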