roywei edited a comment on issue #15099: Revert "Improve FC perf when no_bias=False"
URL: https://github.com/apache/incubator-mxnet/pull/15099#issuecomment-497227403

@stu1130 tried with ~2G and it failed. It also seems that 8G failed in #15100.
I just tried 10G and it passed:
```
ci/build.py --docker-registry mxnetci --platform ubuntu_build_cuda --docker-build-retries 3 --shm-size 10000m /work/runtime_functions.sh build_ubuntu_gpu_mkldnn
```

With 500m, everything before https://github.com/apache/incubator-mxnet/pull/15033 worked fine. Everything after that commit fails with 500m.
I am now also able to reproduce this locally with @anirudh2290's command.

So this can't be related only to shared memory.
1. Why did the code change in https://github.com/apache/incubator-mxnet/pull/15033 cause CI to require more space? (It changed the request for additional temp space so that it is no longer limited to MKLDNN, but that should not affect the build?)
2. Why did the CI check pass for the initial PR?
3. How do we fix the local build?

I would prefer to merge this and keep https://github.com/apache/incubator-mxnet/issues/15084 open until we find the root cause.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]

With regards,
Apache Git Services
