ptrendx commented on issue #7778: float16 has no performance improvement
URL: https://github.com/apache/incubator-mxnet/issues/7778#issuecomment-327687093

V100 as in Volta? If so, the speed you are seeing is quite low even for fp32, so I would say you are limited by IO here. The default number of IO threads is 4, which is way too low to saturate 4 V100s. Could you test with the --data-nthreads N option (N should be 10+, at least for the fp16 version)? Also, if that does not solve the issue, could you test single-GPU numbers?
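The two experiments suggested in the comment (raising the IO thread count, then falling back to a single GPU) could be scripted roughly as below. This is only a sketch: the script name `train_imagenet.py`, the `--gpus` and `--dtype` flags, and the specific thread counts are assumptions about the reporter's setup; only `--data-nthreads` comes from the comment itself. The helper just builds the command lines and does not launch anything.

```python
import shlex


def build_commands(script="train_imagenet.py",   # assumed script name
                   gpus="0,1,2,3",               # assumed: 4 GPUs, as in the issue
                   dtype="float16",              # assumed flag value for the fp16 run
                   thread_counts=(4, 10, 16)):   # 4 is the default; 10+ is the suggestion
    """Return one argv list per --data-nthreads value to try."""
    commands = []
    for n in thread_counts:
        commands.append([
            "python", script,
            "--gpus", gpus,
            "--dtype", dtype,
            "--data-nthreads", str(n),
        ])
    return commands


def format_command(argv):
    """Render an argv list as a shell-safe string for copy and paste."""
    return " ".join(shlex.quote(arg) for arg in argv)


if __name__ == "__main__":
    # Step 1: sweep the IO thread count on all four GPUs.
    for argv in build_commands():
        print(format_command(argv))
    # Step 2: if throughput does not improve, repeat on a single GPU.
    for argv in build_commands(gpus="0"):
        print(format_command(argv))
```

If images/sec rises with the thread count and then levels off, the run was most likely IO-bound, as the comment suspects. If it stays flat even on one GPU, the bottleneck is probably elsewhere.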
