[GitHub] ptrendx commented on issue #9774: mx.io.ImageRecordIter does not respect dtype argument / FP16 performance on Volta

2018-02-20 Thread GitBox
ptrendx commented on issue #9774: mx.io.ImageRecordIter does not respect dtype argument / FP16 performance on Volta URL: https://github.com/apache/incubator-mxnet/issues/9774#issuecomment-367150719 Just in case, try synthetic data with `--benchmark 1` - with 24 threads I bet you are still
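For context, a rough sketch (mine, not from the thread) of what a synthetic-data run isolates: random tensors already in memory stand in for decoded images, so any throughput gap against `mx.io.ImageRecordIter` points at the input pipeline rather than the GPUs. Shapes and the class count below are illustrative assumptions.

```python
import mxnet as mx
import numpy as np

# Illustrative shapes: global batch of 256, ImageNet-sized inputs.
batch_size, shape = 256, (3, 224, 224)

# Pre-generated random data; no JPEG decoding or augmentation happens,
# so iteration speed reflects compute, not the data pipeline.
data = mx.nd.random.uniform(shape=(batch_size * 10,) + shape)
label = mx.nd.array(np.random.randint(0, 1000, (batch_size * 10,)))

synthetic_iter = mx.io.NDArrayIter(data=data, label=label,
                                   batch_size=batch_size)
```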

[GitHub] ptrendx commented on issue #9774: mx.io.ImageRecordIter does not respect dtype argument / FP16 performance on Volta

2018-02-20 Thread GitBox
ptrendx commented on issue #9774: mx.io.ImageRecordIter does not respect dtype argument / FP16 performance on Volta URL: https://github.com/apache/incubator-mxnet/issues/9774#issuecomment-367138608 I was asking about the imagenet script. If you use a smaller batch size like 256 for 8 GPUs
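As a reminder of why the per-GPU batch size matters here (my illustration, not from the comment): MXNet's data-parallel `Module` splits the global batch evenly across the listed contexts, so a global batch of 256 on 8 GPUs leaves only 32 images per V100 per step, which may be too little work to keep fp16 tensor cores busy.

```python
import mxnet as mx

# Data-parallel training splits the global batch across contexts:
# 256 images / 8 GPUs = 32 images per GPU per step.
ctx = [mx.gpu(i) for i in range(8)]
# `net` is assumed to be your network symbol (e.g. ResNet-50):
# mod = mx.mod.Module(symbol=net, context=ctx)
```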

[GitHub] ptrendx commented on issue #9774: mx.io.ImageRecordIter does not respect dtype argument / FP16 performance on Volta

2018-02-20 Thread GitBox
ptrendx commented on issue #9774: mx.io.ImageRecordIter does not respect dtype argument / FP16 performance on Volta URL: https://github.com/apache/incubator-mxnet/issues/9774#issuecomment-367057821 @rahul003 Could you paste here how you invoked the benchmark script? Did you set the

[GitHub] ptrendx commented on issue #9774: mx.io.ImageRecordIter does not respect dtype argument

2018-02-14 Thread GitBox
ptrendx commented on issue #9774: mx.io.ImageRecordIter does not respect dtype argument URL: https://github.com/apache/incubator-mxnet/issues/9774#issuecomment-365780010 There are a few possible explanations. The most probable one is the workspace size for convolutions. I tried pitching
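For readers unfamiliar with the knob being blamed: `mx.sym.Convolution`'s `workspace` parameter caps the scratch memory (in MB) that cuDNN may use when selecting convolution algorithms, and a cap that is too tight can force slower algorithms, which tends to hurt fp16 more than fp32. A minimal sketch with illustrative layer parameters:

```python
import mxnet as mx

data = mx.sym.Variable('data')
# `workspace` is an upper bound in MB on cuDNN scratch space for this
# layer; raising it lets cuDNN pick faster (memory-hungrier) algorithms.
conv = mx.sym.Convolution(data=data, num_filter=64, kernel=(7, 7),
                          stride=(2, 2), pad=(3, 3),
                          workspace=2048, name='conv0')
```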

[GitHub] ptrendx commented on issue #9774: mx.io.ImageRecordIter does not respect dtype argument

2018-02-13 Thread GitBox
ptrendx commented on issue #9774: mx.io.ImageRecordIter does not respect dtype argument URL: https://github.com/apache/incubator-mxnet/issues/9774#issuecomment-365484238 The engine does not seem to differentiate between the first layer and subsequent layers, in that it considers data going into

[GitHub] ptrendx commented on issue #9774: mx.io.ImageRecordIter does not respect dtype argument

2018-02-13 Thread GitBox
ptrendx commented on issue #9774: mx.io.ImageRecordIter does not respect dtype argument URL: https://github.com/apache/incubator-mxnet/issues/9774#issuecomment-365483069 I don't think it will perform better than producing fp32 and then casting to fp16 at the beginning of the training.
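The pattern being referred to keeps the iterator producing fp32 and inserts a single cast at the network input, so the rest of the symbol runs in fp16. A minimal sketch of that arrangement:

```python
import mxnet as mx

data = mx.sym.Variable('data')                  # iterator yields fp32 batches
data = mx.sym.Cast(data=data, dtype='float16')  # one cast at the input
# ... build the rest of the network on `data`; weights and activations
# then run in fp16, with the loss typically cast back to fp32 for stability.
```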