[GitHub] cjolivier01 commented on a change in pull request #9862: Fix a race condition in converting data layouts in MKLDNN.
cjolivier01 commented on a change in pull request #9862: Fix a race condition in converting data layouts in MKLDNN. URL: https://github.com/apache/incubator-mxnet/pull/9862#discussion_r170288951 ## File path: src/ndarray/ndarray.cc ## @@ -1017,6 +1017,7 @@ inline void CopyFromToDnsImpl(const NDArray& from, const NDArray& to, RunContext // with Copy(). NDArray tmp_from = from; if (tmp_from.IsMKLDNNData()) { + // TODO(zhengda) tmp_from should be cached. Review comment: There are many decaffeinated brands out there that are just as tasty as the real thing. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] cjolivier01 commented on a change in pull request #9862: Fix a race condition in converting data layouts in MKLDNN.
cjolivier01 commented on a change in pull request #9862: Fix a race condition in converting data layouts in MKLDNN. URL: https://github.com/apache/incubator-mxnet/pull/9862#discussion_r170162245 ## File path: src/operator/nn/mkldnn/mkldnn_base.cc ## @@ -270,9 +270,24 @@ void FallBackCompute(FCompute fn, const nnvm::NodeAttrs , const std::vector , const std::vector ) { std::vector in_blobs(inputs.size()); + std::vector in_bufs; for (size_t i = 0; i < in_blobs.size(); i++) { +// If the input data isn't stored in the default format, we shouldn't +// call data() directly, which will change the layout of the NDArray. +// Instead, we should save the converted data in another NDArray. +// TODO(zhengda) we should use temp space to save the converted data. +if (inputs[i].IsDefaultData()) { in_blobs[i] = inputs[i].data(); +} else { + in_bufs.emplace_back(inputs[i].shape(), inputs[i].ctx(), Review comment: implementations vary, but this gives a general idea: http://www.drdobbs.com/c-made-easier-how-vectors-grow/184401375 This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] cjolivier01 commented on a change in pull request #9862: Fix a race condition in converting data layouts in MKLDNN.
cjolivier01 commented on a change in pull request #9862: Fix a race condition in converting data layouts in MKLDNN. URL: https://github.com/apache/incubator-mxnet/pull/9862#discussion_r170161984 ## File path: src/ndarray/ndarray.cc ## @@ -1017,6 +1017,7 @@ inline void CopyFromToDnsImpl(const NDArray& from, const NDArray& to, RunContext // with Copy(). NDArray tmp_from = from; if (tmp_from.IsMKLDNNData()) { + // TODO(zhengda) tmp_from should be cached. Review comment: if it doesn?t require const_cast<>, then it?s perfectly legal. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] cjolivier01 commented on a change in pull request #9862: Fix a race condition in converting data layouts in MKLDNN.
cjolivier01 commented on a change in pull request #9862: Fix a race condition in converting data layouts in MKLDNN. URL: https://github.com/apache/incubator-mxnet/pull/9862#discussion_r170161424 ## File path: src/operator/nn/mkldnn/mkldnn_base.cc ## @@ -270,9 +270,24 @@ void FallBackCompute(FCompute fn, const nnvm::NodeAttrs , const std::vector , const std::vector ) { std::vector in_blobs(inputs.size()); + std::vector in_bufs; for (size_t i = 0; i < in_blobs.size(); i++) { +// If the input data isn't stored in the default format, we shouldn't +// call data() directly, which will change the layout of the NDArray. +// Instead, we should save the converted data in another NDArray. +// TODO(zhengda) we should use temp space to save the converted data. +if (inputs[i].IsDefaultData()) { in_blobs[i] = inputs[i].data(); +} else { + in_bufs.emplace_back(inputs[i].shape(), inputs[i].ctx(), Review comment: when you add items to a vector, when it runs out of space for the items (which generally happens on the second item sometimes), it allocates memory for 2x the current number of items and then copies all of the items (copy constructor or move, depending on a number of things), into to the new space and then frees the old space. this can happen over and over again, and the whole reallocate and copy can be very expensive. they tend to even show up as hotspots in vtune. so if you know how many items a vector may have, it?s always a good idea to call reserve() because otherwise you are guaranteed a lot of extra reallocing And copying. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] cjolivier01 commented on a change in pull request #9862: Fix a race condition in converting data layouts in MKLDNN.
cjolivier01 commented on a change in pull request #9862: Fix a race condition in converting data layouts in MKLDNN. URL: https://github.com/apache/incubator-mxnet/pull/9862#discussion_r170117620 ## File path: tests/python/gpu/test_gluon_model_zoo_gpu.py ## @@ -37,11 +37,11 @@ def download_data(): return mx.test_utils.download( 'http://data.mxnet.io/data/val-5k-256.rec', VAL_DATA) -@unittest.skip("test fails intermittently. temporarily disabled.") @with_seed() def test_inference(): all_models = ['resnet50_v1', 'vgg19_bn', 'alexnet', #'inceptionv3', 'densenet201', 'squeezenet1.0', 'mobilenet0.25'] +mx.random.seed(2) Review comment: Why would you want to set it the same for each run? This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] cjolivier01 commented on a change in pull request #9862: Fix a race condition in converting data layouts in MKLDNN.
cjolivier01 commented on a change in pull request #9862: Fix a race condition in converting data layouts in MKLDNN. URL: https://github.com/apache/incubator-mxnet/pull/9862#discussion_r170117170 ## File path: src/operator/nn/mkldnn/mkldnn_base.cc ## @@ -270,9 +270,24 @@ void FallBackCompute(FCompute fn, const nnvm::NodeAttrs , const std::vector , const std::vector ) { std::vector in_blobs(inputs.size()); + std::vector in_bufs; for (size_t i = 0; i < in_blobs.size(); i++) { +// If the input data isn't stored in the default format, we shouldn't +// call data() directly, which will change the layout of the NDArray. +// Instead, we should save the converted data in another NDArray. +// TODO(zhengda) we should use temp space to save the converted data. +if (inputs[i].IsDefaultData()) { in_blobs[i] = inputs[i].data(); +} else { + in_bufs.emplace_back(inputs[i].shape(), inputs[i].ctx(), + false, inputs[i].dtype()); + auto mem = inputs[i].GetMKLDNNData(); Review comment: nit: auto should be used for obvious types (ie auto foo = static_cast(bar)) or stuff like iterators which are super-long. This is just a simnple var. For readability here, it's be awesome to see what this type actually is. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] cjolivier01 commented on a change in pull request #9862: Fix a race condition in converting data layouts in MKLDNN.
cjolivier01 commented on a change in pull request #9862: Fix a race condition in converting data layouts in MKLDNN. URL: https://github.com/apache/incubator-mxnet/pull/9862#discussion_r170116831 ## File path: src/operator/nn/mkldnn/mkldnn_base.cc ## @@ -270,9 +270,24 @@ void FallBackCompute(FCompute fn, const nnvm::NodeAttrs , const std::vector , const std::vector ) { std::vector in_blobs(inputs.size()); + std::vector in_bufs; for (size_t i = 0; i < in_blobs.size(); i++) { +// If the input data isn't stored in the default format, we shouldn't +// call data() directly, which will change the layout of the NDArray. +// Instead, we should save the converted data in another NDArray. +// TODO(zhengda) we should use temp space to save the converted data. +if (inputs[i].IsDefaultData()) { in_blobs[i] = inputs[i].data(); +} else { + in_bufs.emplace_back(inputs[i].shape(), inputs[i].ctx(), Review comment: nit: it would be nice to call in_bufs.reserve(in_blobs.size()) on this first pass (ie if in_bufs.empty()) If there is often more than one in_bufs to be inserted This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services