[GitHub] marcoabreu commented on a change in pull request #9862: Fix a race condition in converting data layouts in MKLDNN.

2018-02-27 Thread GitBox
marcoabreu commented on a change in pull request #9862: Fix a race condition in 
converting data layouts in MKLDNN.
URL: https://github.com/apache/incubator-mxnet/pull/9862#discussion_r171123124
 
 

 ##
 File path: src/ndarray/ndarray.cc
 ##
 @@ -375,7 +375,45 @@ void NDArray::Chunk::Reorder2Default() {
   CheckAndAlloc(def_pd.get_size());
   // TODO(zhengda) We need to avoid memory copy here.
   memcpy(shandle.dptr, def_mem->get_data_handle(), def_pd.get_size());
-  mkl_mem_.reset(new mkldnn::memory(def_pd, shandle.dptr));
+  mkl_mem_ = nullptr;
+}
+
+void NDArray::Chunk::MKLDNNDataReorder(const mkldnn::memory::primitive_desc 
) {
+  // If the memory already uses the specified layout, don't do anything.
+  if (mkl_mem_ != nullptr && mkl_mem_->get_primitive_desc() == pd)
+return;
+  auto _pd = pd;
+  auto _desc = _pd.desc();
+  auto def_format = GetDefaultFormat(_desc);
+  // If the memory is default, don't do anything.
+  if (def_format == _desc.data.format && IsDefault())
+return;
+  // If the specified layout is default, we should use Reorder2Default.
+  if (def_format == _desc.data.format) {
+Reorder2Default();
+return;
+  }
+
+  std::shared_ptr new_mem(new mkldnn::memory(pd));
+  std::shared_ptr old_mem;
+  if (IsDefault()) {
+auto def_pd = GetPrimitiveDesc(pd, def_format);
+old_mem.reset(new mkldnn::memory(def_pd, shandle.dptr));
+  } else {
+old_mem = this->mkl_mem_;
+  }
+  CHECK(old_mem->get_primitive_desc().desc().data.ndims == _desc.data.ndims);
+
+  // This may be called in MKLDNN operators. We can't use MKLDNNStream here.
+  std::vector net;
+  net.push_back(mkldnn::reorder(*old_mem, *new_mem));
+  mkldnn::stream(mkldnn::stream::kind::eager).submit(net).wait();
+
+  CHECK(shandle.size >= pd.get_size());
+  CheckAndAlloc(pd.get_size());
+  // TODO(zhengda) We need to avoid memory copy here.
 
 Review comment:
   Open TODO


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] marcoabreu commented on a change in pull request #9862: Fix a race condition in converting data layouts in MKLDNN.

2018-02-27 Thread GitBox
marcoabreu commented on a change in pull request #9862: Fix a race condition in 
converting data layouts in MKLDNN.
URL: https://github.com/apache/incubator-mxnet/pull/9862#discussion_r171123210
 
 

 ##
 File path: src/ndarray/ndarray.cc
 ##
 @@ -1017,6 +1017,7 @@ inline void CopyFromToDnsImpl(const NDArray& from, const 
NDArray& to, RunContext
 // with Copy().
 NDArray tmp_from = from;
 if (tmp_from.IsMKLDNNData()) {
+  // TODO(zhengda) tmp_from should be cached.
 
 Review comment:
   Open TODO


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] marcoabreu commented on a change in pull request #9862: Fix a race condition in converting data layouts in MKLDNN.

2018-02-23 Thread GitBox
marcoabreu commented on a change in pull request #9862: Fix a race condition in 
converting data layouts in MKLDNN.
URL: https://github.com/apache/incubator-mxnet/pull/9862#discussion_r170297168
 
 

 ##
 File path: src/ndarray/ndarray.cc
 ##
 @@ -1017,6 +1017,7 @@ inline void CopyFromToDnsImpl(const NDArray& from, const 
NDArray& to, RunContext
 // with Copy().
 NDArray tmp_from = from;
 if (tmp_from.IsMKLDNNData()) {
+  // TODO(zhengda) tmp_from should be cached.
 
 Review comment:
   Whatever you guys are talking here... I think it's a pretty basic thing that 
immutable means immutable and that a getter should never modify the underlying 
data in any way. I have no clue about C++, but I'm quite certain that this 
constraint is quite clearly defined across the industry. Could we please get 
back to a productive discussion?


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] marcoabreu commented on a change in pull request #9862: Fix a race condition in converting data layouts in MKLDNN.

2018-02-22 Thread GitBox
marcoabreu commented on a change in pull request #9862: Fix a race condition in 
converting data layouts in MKLDNN.
URL: https://github.com/apache/incubator-mxnet/pull/9862#discussion_r170132857
 
 

 ##
 File path: tests/python/gpu/test_gluon_model_zoo_gpu.py
 ##
 @@ -37,8 +37,7 @@ def download_data():
 return mx.test_utils.download(
 'http://data.mxnet.io/data/val-5k-256.rec', VAL_DATA)
 
-@unittest.skip("test fails intermittently. temporarily disabled.")
-@with_seed()
+@with_seed(1)
 
 Review comment:
   Please remove custom seed


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] marcoabreu commented on a change in pull request #9862: Fix a race condition in converting data layouts in MKLDNN.

2018-02-22 Thread GitBox
marcoabreu commented on a change in pull request #9862: Fix a race condition in 
converting data layouts in MKLDNN.
URL: https://github.com/apache/incubator-mxnet/pull/9862#discussion_r170132926
 
 

 ##
 File path: tests/python/gpu/test_gluon_model_zoo_gpu.py
 ##
 @@ -99,7 +100,7 @@ def get_nn_model(name):
 # Seed 1521019752 produced a failure on the Py2 MKLDNN-GPU CI runner
 # on 2/16/2018 that was not reproducible.  Problem could be timing related or
 # based on non-deterministic algo selection.
-@with_seed()
+@with_seed(1)
 
 Review comment:
   Please remove custom seed


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] marcoabreu commented on a change in pull request #9862: Fix a race condition in converting data layouts in MKLDNN.

2018-02-22 Thread GitBox
marcoabreu commented on a change in pull request #9862: Fix a race condition in 
converting data layouts in MKLDNN.
URL: https://github.com/apache/incubator-mxnet/pull/9862#discussion_r170132782
 
 

 ##
 File path: src/ndarray/ndarray.cc
 ##
 @@ -1017,6 +1017,7 @@ inline void CopyFromToDnsImpl(const NDArray& from, const 
NDArray& to, RunContext
 // with Copy().
 NDArray tmp_from = from;
 if (tmp_from.IsMKLDNNData()) {
+  // TODO(zhengda) tmp_from should be cached.
 
 Review comment:
   In think it's best practice and that we should lead by example :) In the 
last months I haven't seen any TODO (besides the ones in your MKLDNN PR) 
checked into MXNet. 
   
   We got rid of the script to automatically close issues. For now, GitHub 
issues are the right place to track these things.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] marcoabreu commented on a change in pull request #9862: Fix a race condition in converting data layouts in MKLDNN.

2018-02-22 Thread GitBox
marcoabreu commented on a change in pull request #9862: Fix a race condition in 
converting data layouts in MKLDNN.
URL: https://github.com/apache/incubator-mxnet/pull/9862#discussion_r170119050
 
 

 ##
 File path: src/ndarray/ndarray.cc
 ##
 @@ -1017,6 +1017,7 @@ inline void CopyFromToDnsImpl(const NDArray& from, const 
NDArray& to, RunContext
 // with Copy().
 NDArray tmp_from = from;
 if (tmp_from.IsMKLDNNData()) {
+  // TODO(zhengda) tmp_from should be cached.
 
 Review comment:
   Yes. Ideally, there should be no TODOs at all. But in cases like this it's 
acceptable to have them, but they must be tracked as otherwise we lose overview.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] marcoabreu commented on a change in pull request #9862: Fix a race condition in converting data layouts in MKLDNN.

2018-02-22 Thread GitBox
marcoabreu commented on a change in pull request #9862: Fix a race condition in 
converting data layouts in MKLDNN.
URL: https://github.com/apache/incubator-mxnet/pull/9862#discussion_r170113340
 
 

 ##
 File path: tests/python/gpu/test_gluon_model_zoo_gpu.py
 ##
 @@ -37,11 +37,11 @@ def download_data():
 return mx.test_utils.download(
 'http://data.mxnet.io/data/val-5k-256.rec', VAL_DATA)
 
-@unittest.skip("test fails intermittently. temporarily disabled.")
 @with_seed()
 def test_inference():
 all_models = ['resnet50_v1', 'vgg19_bn', 'alexnet', #'inceptionv3',
   'densenet201', 'squeezenet1.0', 'mobilenet0.25']
+mx.random.seed(2)
 
 Review comment:
   please don't use mx.random.seed due to the introduction of the @with_seed() 
decorator


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] marcoabreu commented on a change in pull request #9862: Fix a race condition in converting data layouts in MKLDNN.

2018-02-22 Thread GitBox
marcoabreu commented on a change in pull request #9862: Fix a race condition in 
converting data layouts in MKLDNN.
URL: https://github.com/apache/incubator-mxnet/pull/9862#discussion_r170113089
 
 

 ##
 File path: src/ndarray/ndarray.cc
 ##
 @@ -1017,6 +1017,7 @@ inline void CopyFromToDnsImpl(const NDArray& from, const 
NDArray& to, RunContext
 // with Copy().
 NDArray tmp_from = from;
 if (tmp_from.IsMKLDNNData()) {
+  // TODO(zhengda) tmp_from should be cached.
 
 Review comment:
   Please create an issue to track this


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] marcoabreu commented on a change in pull request #9862: Fix a race condition in converting data layouts in MKLDNN.

2018-02-22 Thread GitBox
marcoabreu commented on a change in pull request #9862: Fix a race condition in 
converting data layouts in MKLDNN.
URL: https://github.com/apache/incubator-mxnet/pull/9862#discussion_r170113380
 
 

 ##
 File path: tests/python/gpu/test_gluon_model_zoo_gpu.py
 ##
 @@ -105,6 +107,7 @@ def test_training():
 # TODO(zhengda) mobilenet can't pass this test even without MKLDNN.
 all_models = ['resnet18_v1', 'densenet121']
 
+mx.random.seed(1)
 
 Review comment:
   please don't use mx.random.seed due to the introduction of the @with_seed() 
decorator


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services