[GitHub] cjolivier01 commented on a change in pull request #10090: [MXNET-86] Revert to pre-profile-changes copy code

2018-03-13 Thread GitBox
cjolivier01 commented on a change in pull request #10090: [MXNET-86] Revert to 
pre-profile-changes copy code
URL: https://github.com/apache/incubator-mxnet/pull/10090#discussion_r174324064
 
 

 ##
 File path: tests/python/unittest/test_gluon_data.py
 ##
 @@ -112,6 +117,76 @@ def test_multi_worker():
     for i, batch in enumerate(loader):
         assert (batch.asnumpy() == i).all()
 
+@with_seed()
+def test_multi_worker_data_loader():
+    class Dummy(Dataset):
+        def __init__(self, random_shape):
+            self.random_shape = random_shape
+
+        def __getitem__(self, idx):
+            key = idx
+            if self.random_shape:
+                out = np.random.uniform(size=(random.randint(1000, 1100), 40))
+                labels = np.random.uniform(size=(random.randint(10, 15)))
+            else:
+                out = np.random.uniform(size=(1000, 40))
+                labels = np.random.uniform(size=(10))
+            return key, out, labels
+
+        def __len__(self):
+            return 50
+
+        def batchify(self, data):
+            """
+            Collate data into batch. Use shared memory for stacking.
+
+            :param data: a list of array, with layout of 'NTC'.
+            :return: (x, x's unpadded lengths) if labels are not supplied,
+                     else (x, x's unpadded lengths, y, y's unpadded lengths).
+            """
+
+            # input layout is NTC
+            keys, inputs, labels = [item[0] for item in data], [item[1] for item in data], \
+                                   [item[2] for item in data]
+
+            if len(data) > 1:
+                max_data_len = max([seq.shape[0] for seq in inputs])
+                max_labels_len = 0 if not labels else max([seq.shape[0] for seq in labels])
+            else:
+                max_data_len = inputs[0].shape[0]
+                max_labels_len = 0 if not labels else labels[0].shape[0]
+
+            x_lens = [item.shape[0] for item in inputs]
+            y_lens = [item.shape[0] for item in labels]
+
+            for i, seq in enumerate(inputs):
+                pad_len = max_data_len - seq.shape[0]
+                inputs[i] = np.pad(seq, ((0, pad_len), (0, 0)), 'constant', constant_values=0)
+                labels[i] = np.pad(labels[i], (0, max_labels_len - labels[i].shape[0]),
+                                   'constant', constant_values=-1)
+
+            inputs = np.asarray(inputs, dtype=np.float32)
+            if labels is not None:
+                labels = np.asarray(labels, dtype=np.float32)
+            inputs = inputs.transpose((1, 0, 2))
+            labels = labels.transpose((1, 0))
+
+            return (nd.array(inputs, dtype=inputs.dtype, ctx=context.Context('cpu_shared', 0)),
 
 Review comment:
   What's being tested is the fork()
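   For context, a minimal sketch of the mechanism under test (the dataset and
   shapes here are illustrative, not the test's): with num_workers > 0, Gluon's
   DataLoader forks worker processes, so dataset access and batchification run
   in forked children.

   ```python
   import numpy as np
   from mxnet.gluon.data import ArrayDataset, DataLoader

   # Hypothetical dataset; only the forking behavior matters here.
   dataset = ArrayDataset(np.random.uniform(size=(50, 20)).astype('float32'))

   # num_workers=2 forks two worker processes; each batch is produced in a
   # child process and handed back to the parent.
   loader = DataLoader(dataset, batch_size=10, num_workers=2)
   for batch in loader:
       assert batch.shape == (10, 20)
   ```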




[GitHub] cjolivier01 commented on a change in pull request #10090: [MXNET-86] Revert to pre-profile-changes copy code

2018-03-13 Thread GitBox
cjolivier01 commented on a change in pull request #10090: [MXNET-86] Revert to 
pre-profile-changes copy code
URL: https://github.com/apache/incubator-mxnet/pull/10090#discussion_r174307840
 
 

 ##
 File path: tests/python/unittest/test_gluon_data.py
 ##
 @@ -112,6 +117,76 @@ def test_multi_worker():
     for i, batch in enumerate(loader):
         assert (batch.asnumpy() == i).all()
 
+@with_seed()
+def test_multi_worker_data_loader():
+    class Dummy(Dataset):
+        def __init__(self, random_shape):
+            self.random_shape = random_shape
+
+        def __getitem__(self, idx):
+            key = idx
+            if self.random_shape:
+                out = np.random.uniform(size=(random.randint(1000, 1100), 40))
+                labels = np.random.uniform(size=(random.randint(10, 15)))
+            else:
+                out = np.random.uniform(size=(1000, 40))
+                labels = np.random.uniform(size=(10))
+            return key, out, labels
+
+        def __len__(self):
+            return 50
+
+        def batchify(self, data):
+            """
+            Collate data into batch. Use shared memory for stacking.
+
+            :param data: a list of array, with layout of 'NTC'.
+            :return: (x, x's unpadded lengths) if labels are not supplied,
+                     else (x, x's unpadded lengths, y, y's unpadded lengths).
+            """
+
+            # input layout is NTC
+            keys, inputs, labels = [item[0] for item in data], [item[1] for item in data], \
+                                   [item[2] for item in data]
+
+            if len(data) > 1:
+                max_data_len = max([seq.shape[0] for seq in inputs])
+                max_labels_len = 0 if not labels else max([seq.shape[0] for seq in labels])
+            else:
+                max_data_len = inputs[0].shape[0]
+                max_labels_len = 0 if not labels else labels[0].shape[0]
+
+            x_lens = [item.shape[0] for item in inputs]
+            y_lens = [item.shape[0] for item in labels]
+
+            for i, seq in enumerate(inputs):
+                pad_len = max_data_len - seq.shape[0]
+                inputs[i] = np.pad(seq, ((0, pad_len), (0, 0)), 'constant', constant_values=0)
+                labels[i] = np.pad(labels[i], (0, max_labels_len - labels[i].shape[0]),
+                                   'constant', constant_values=-1)
+
+            inputs = np.asarray(inputs, dtype=np.float32)
+            if labels is not None:
+                labels = np.asarray(labels, dtype=np.float32)
+            inputs = inputs.transpose((1, 0, 2))
+            labels = labels.transpose((1, 0))
+
+            return (nd.array(inputs, dtype=inputs.dtype, ctx=context.Context('cpu_shared', 0)),
 
 Review comment:
   I meant Marco's comment; I hadn't seen yours yet at the time. Since the
   workers are forked, any memory allocated after the fork would need to be
   cpu_shared, so it makes sense to use the cpu_shared context.
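
   As a hedged illustration of that point (the array contents and shapes here
   are made up): an NDArray created with the cpu_shared context lives in shared
   memory, so a forked worker can return it to the parent without a copy.

   ```python
   import numpy as np
   from mxnet import nd, context

   # Hypothetical collated batch; what matters is the context it lands in.
   batch = np.zeros((40, 1000), dtype='float32')

   # 'cpu_shared' places the array in shared memory visible across fork().
   shared = nd.array(batch, ctx=context.Context('cpu_shared', 0))
   assert shared.context == context.Context('cpu_shared', 0)
   ```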




[GitHub] cjolivier01 commented on a change in pull request #10090: [MXNET-86] Revert to pre-profile-changes copy code

2018-03-13 Thread GitBox
cjolivier01 commented on a change in pull request #10090: [MXNET-86] Revert to 
pre-profile-changes copy code
URL: https://github.com/apache/incubator-mxnet/pull/10090#discussion_r174307538
 
 

 ##
 File path: src/ndarray/ndarray_function.cc
 ##
 @@ -26,35 +26,24 @@
 #include "./ndarray_function.h"
 #include "./ndarray_function-inl.h"
 #include "../common/utils.h"
-#include "../operator/mxnet_op.h"
 
 namespace mxnet {
 namespace ndarray {
 template<>
 void Copy<cpu, cpu>(const TBlob &from, TBlob *to,
                     Context from_ctx, Context to_ctx,
                     RunContext ctx) {
-  using namespace mxnet::op;
   MSHADOW_TYPE_SWITCH(to->type_flag_, DType, {
     if (to->type_flag_ == from.type_flag_) {
-      TBlob dest = to->FlatTo1D<cpu, DType>();
-      TBlob src = from.FlatTo1D<cpu, DType>();
-      const size_t size = src.Size();
-      if (dest.CheckContiguous() && src.CheckContiguous() && size >= 2 /* non-trivial size */) {
-        CHECK_EQ(dest.shape_, src.shape_)
-          << "Copy:shape mismatch:" << dest.shape_ << " vs " << src.shape_;
-        mxnet_op::Kernel<op_with_req<mshadow_op::identity, kWriteTo>, cpu>::Launch(
-          ctx.get_stream<cpu>(), src.Size(), dest.dptr<DType>(), src.dptr<DType>());
 
 Review comment:
   Looks like someone tried to fix it once: 
https://patchwork.ozlabs.org/patch/319827/




[GitHub] cjolivier01 commented on a change in pull request #10090: [MXNET-86] Revert to pre-profile-changes copy code

2018-03-13 Thread GitBox
cjolivier01 commented on a change in pull request #10090: [MXNET-86] Revert to 
pre-profile-changes copy code
URL: https://github.com/apache/incubator-mxnet/pull/10090#discussion_r174307193
 
 

 ##
 File path: tests/python/unittest/test_gluon_data.py
 ##
 @@ -112,6 +117,76 @@ def test_multi_worker():
     for i, batch in enumerate(loader):
         assert (batch.asnumpy() == i).all()
 
+@with_seed()
+def test_multi_worker_data_loader():
+    class Dummy(Dataset):
+        def __init__(self, random_shape):
+            self.random_shape = random_shape
+
+        def __getitem__(self, idx):
+            key = idx
+            if self.random_shape:
+                out = np.random.uniform(size=(random.randint(1000, 1100), 40))
+                labels = np.random.uniform(size=(random.randint(10, 15)))
+            else:
+                out = np.random.uniform(size=(1000, 40))
+                labels = np.random.uniform(size=(10))
+            return key, out, labels
+
+        def __len__(self):
+            return 50
+
+        def batchify(self, data):
+            """
+            Collate data into batch. Use shared memory for stacking.
+
+            :param data: a list of array, with layout of 'NTC'.
+            :return: (x, x's unpadded lengths) if labels are not supplied,
+                     else (x, x's unpadded lengths, y, y's unpadded lengths).
+            """
+
+            # input layout is NTC
+            keys, inputs, labels = [item[0] for item in data], [item[1] for item in data], \
+                                   [item[2] for item in data]
+
+            if len(data) > 1:
+                max_data_len = max([seq.shape[0] for seq in inputs])
+                max_labels_len = 0 if not labels else max([seq.shape[0] for seq in labels])
+            else:
+                max_data_len = inputs[0].shape[0]
+                max_labels_len = 0 if not labels else labels[0].shape[0]
+
+            x_lens = [item.shape[0] for item in inputs]
+            y_lens = [item.shape[0] for item in labels]
+
+            for i, seq in enumerate(inputs):
+                pad_len = max_data_len - seq.shape[0]
+                inputs[i] = np.pad(seq, ((0, pad_len), (0, 0)), 'constant', constant_values=0)
+                labels[i] = np.pad(labels[i], (0, max_labels_len - labels[i].shape[0]),
+                                   'constant', constant_values=-1)
+
+            inputs = np.asarray(inputs, dtype=np.float32)
+            if labels is not None:
+                labels = np.asarray(labels, dtype=np.float32)
+            inputs = inputs.transpose((1, 0, 2))
+            labels = labels.transpose((1, 0))
+
+            return (nd.array(inputs, dtype=inputs.dtype, ctx=context.Context('cpu_shared', 0)),
+                    nd.array(x_lens, ctx=context.Context('cpu_shared', 0))) \
+                if labels is None else (
+                    nd.array(inputs, dtype=inputs.dtype, ctx=context.Context('cpu_shared', 0)),
+                    nd.array(x_lens, ctx=context.Context('cpu_shared', 0)),
+                    nd.array(labels, dtype=labels.dtype, ctx=context.Context('cpu_shared', 0)),
+                    nd.array(y_lens, ctx=context.Context('cpu_shared', 0)))
+
+
+    data = Dummy(True)
+    loader = DataLoader(data, batch_size=40, batchify_fn=data.batchify, num_workers=2)
 
 Review comment:
   ok




[GitHub] cjolivier01 commented on a change in pull request #10090: [MXNET-86] Revert to pre-profile-changes copy code

2018-03-13 Thread GitBox
cjolivier01 commented on a change in pull request #10090: [MXNET-86] Revert to 
pre-profile-changes copy code
URL: https://github.com/apache/incubator-mxnet/pull/10090#discussion_r174306094
 
 

 ##
 File path: tests/python/unittest/test_gluon_data.py
 ##
 @@ -112,6 +117,76 @@ def test_multi_worker():
     for i, batch in enumerate(loader):
         assert (batch.asnumpy() == i).all()
 
+@with_seed()
+def test_multi_worker_data_loader():
+    class Dummy(Dataset):
+        def __init__(self, random_shape):
+            self.random_shape = random_shape
+
+        def __getitem__(self, idx):
+            key = idx
+            if self.random_shape:
+                out = np.random.uniform(size=(random.randint(1000, 1100), 40))
+                labels = np.random.uniform(size=(random.randint(10, 15)))
+            else:
+                out = np.random.uniform(size=(1000, 40))
+                labels = np.random.uniform(size=(10))
+            return key, out, labels
+
+        def __len__(self):
+            return 50
+
+        def batchify(self, data):
+            """
+            Collate data into batch. Use shared memory for stacking.
+
+            :param data: a list of array, with layout of 'NTC'.
+            :return: (x, x's unpadded lengths) if labels are not supplied,
+                     else (x, x's unpadded lengths, y, y's unpadded lengths).
+            """
+
+            # input layout is NTC
+            keys, inputs, labels = [item[0] for item in data], [item[1] for item in data], \
+                                   [item[2] for item in data]
+
+            if len(data) > 1:
+                max_data_len = max([seq.shape[0] for seq in inputs])
+                max_labels_len = 0 if not labels else max([seq.shape[0] for seq in labels])
+            else:
+                max_data_len = inputs[0].shape[0]
+                max_labels_len = 0 if not labels else labels[0].shape[0]
+
+            x_lens = [item.shape[0] for item in inputs]
+            y_lens = [item.shape[0] for item in labels]
+
+            for i, seq in enumerate(inputs):
+                pad_len = max_data_len - seq.shape[0]
+                inputs[i] = np.pad(seq, ((0, pad_len), (0, 0)), 'constant', constant_values=0)
+                labels[i] = np.pad(labels[i], (0, max_labels_len - labels[i].shape[0]),
+                                   'constant', constant_values=-1)
+
+            inputs = np.asarray(inputs, dtype=np.float32)
+            if labels is not None:
+                labels = np.asarray(labels, dtype=np.float32)
+            inputs = inputs.transpose((1, 0, 2))
+            labels = labels.transpose((1, 0))
+
+            return (nd.array(inputs, dtype=inputs.dtype, ctx=context.Context('cpu_shared', 0)),
 
 Review comment:
   Why would you think that?




[GitHub] cjolivier01 commented on a change in pull request #10090: [MXNET-86] Revert to pre-profile-changes copy code

2018-03-13 Thread GitBox
cjolivier01 commented on a change in pull request #10090: [MXNET-86] Revert to 
pre-profile-changes copy code
URL: https://github.com/apache/incubator-mxnet/pull/10090#discussion_r174304792
 
 

 ##
 File path: src/ndarray/ndarray_function.cc
 ##
 @@ -26,35 +26,24 @@
 #include "./ndarray_function.h"
 #include "./ndarray_function-inl.h"
 #include "../common/utils.h"
-#include "../operator/mxnet_op.h"
 
 namespace mxnet {
 namespace ndarray {
 template<>
 void Copy<cpu, cpu>(const TBlob &from, TBlob *to,
                     Context from_ctx, Context to_ctx,
                     RunContext ctx) {
-  using namespace mxnet::op;
   MSHADOW_TYPE_SWITCH(to->type_flag_, DType, {
     if (to->type_flag_ == from.type_flag_) {
-      TBlob dest = to->FlatTo1D<cpu, DType>();
-      TBlob src = from.FlatTo1D<cpu, DType>();
-      const size_t size = src.Size();
-      if (dest.CheckContiguous() && src.CheckContiguous() && size >= 2 /* non-trivial size */) {
-        CHECK_EQ(dest.shape_, src.shape_)
-          << "Copy:shape mismatch:" << dest.shape_ << " vs " << src.shape_;
-        mxnet_op::Kernel<op_with_req<mshadow_op::identity, kWriteTo>, cpu>::Launch(
-          ctx.get_stream<cpu>(), src.Size(), dest.dptr<DType>(), src.dptr<DType>());
 
 Review comment:
   ```cpp
   /*
   Reset the library so execution in the child starts "all over again" with
   clean data structures in initial states.  Don't worry about freeing memory
   allocated by parent, just abandon it to be safe.
   */
   static void
   __kmp_atfork_child (void)
   {
   ```




[GitHub] cjolivier01 commented on a change in pull request #10090: [MXNET-86] Revert to pre-profile-changes copy code

2018-03-13 Thread GitBox
cjolivier01 commented on a change in pull request #10090: [MXNET-86] Revert to 
pre-profile-changes copy code
URL: https://github.com/apache/incubator-mxnet/pull/10090#discussion_r174304476
 
 

 ##
 File path: src/ndarray/ndarray_function.cc
 ##
 @@ -26,35 +26,24 @@
 #include "./ndarray_function.h"
 #include "./ndarray_function-inl.h"
 #include "../common/utils.h"
-#include "../operator/mxnet_op.h"
 
 namespace mxnet {
 namespace ndarray {
 template<>
 void Copy<cpu, cpu>(const TBlob &from, TBlob *to,
                     Context from_ctx, Context to_ctx,
                     RunContext ctx) {
-  using namespace mxnet::op;
   MSHADOW_TYPE_SWITCH(to->type_flag_, DType, {
     if (to->type_flag_ == from.type_flag_) {
-      TBlob dest = to->FlatTo1D<cpu, DType>();
-      TBlob src = from.FlatTo1D<cpu, DType>();
-      const size_t size = src.Size();
-      if (dest.CheckContiguous() && src.CheckContiguous() && size >= 2 /* non-trivial size */) {
-        CHECK_EQ(dest.shape_, src.shape_)
-          << "Copy:shape mismatch:" << dest.shape_ << " vs " << src.shape_;
-        mxnet_op::Kernel<op_with_req<mshadow_op::identity, kWriteTo>, cpu>::Launch(
-          ctx.get_stream<cpu>(), src.Size(), dest.dptr<DType>(), src.dptr<DType>());
 
 Review comment:
   At fork, Intel OMP sets all of its init vars back to uninitialized, so that
   it recreates its state in the forked child:
   
   https://github.com/pmodels/bolt/blob/master/runtime/src/z_Linux_util.c#L1536
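
   A rough Python-level analogue of that pthread_atfork hook, as a sketch
   (the state dict and handler name are illustrative; os.register_at_fork
   requires Python 3.7+): register a child handler that marks runtime state
   uninitialized so the first use after fork() re-bootstraps it.

   ```python
   import os

   # Hypothetical runtime state standing in for the OMP library's globals.
   state = {'initialized': True}

   def _atfork_child():
       # The child abandons inherited state and re-creates it lazily,
       # mirroring what __kmp_atfork_child does for Intel OMP.
       state['initialized'] = False

   os.register_at_fork(after_in_child=_atfork_child)
   ```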




[GitHub] cjolivier01 commented on a change in pull request #10090: [MXNET-86] Revert to pre-profile-changes copy code

2018-03-13 Thread GitBox
cjolivier01 commented on a change in pull request #10090: [MXNET-86] Revert to 
pre-profile-changes copy code
URL: https://github.com/apache/incubator-mxnet/pull/10090#discussion_r174303574
 
 

 ##
 File path: src/ndarray/ndarray_function.cc
 ##
 @@ -26,35 +26,24 @@
 #include "./ndarray_function.h"
 #include "./ndarray_function-inl.h"
 #include "../common/utils.h"
-#include "../operator/mxnet_op.h"
 
 namespace mxnet {
 namespace ndarray {
 template<>
 void Copy<cpu, cpu>(const TBlob &from, TBlob *to,
                     Context from_ctx, Context to_ctx,
                     RunContext ctx) {
-  using namespace mxnet::op;
   MSHADOW_TYPE_SWITCH(to->type_flag_, DType, {
     if (to->type_flag_ == from.type_flag_) {
-      TBlob dest = to->FlatTo1D<cpu, DType>();
-      TBlob src = from.FlatTo1D<cpu, DType>();
-      const size_t size = src.Size();
-      if (dest.CheckContiguous() && src.CheckContiguous() && size >= 2 /* non-trivial size */) {
-        CHECK_EQ(dest.shape_, src.shape_)
-          << "Copy:shape mismatch:" << dest.shape_ << " vs " << src.shape_;
-        mxnet_op::Kernel<op_with_req<mshadow_op::identity, kWriteTo>, cpu>::Launch(
-          ctx.get_stream<cpu>(), src.Size(), dest.dptr<DType>(), src.dptr<DType>());
 
 Review comment:
   Ha! Nice find! That's exactly it.




[GitHub] cjolivier01 commented on a change in pull request #10090: [MXNET-86] Revert to pre-profile-changes copy code

2018-03-13 Thread GitBox
cjolivier01 commented on a change in pull request #10090: [MXNET-86] Revert to 
pre-profile-changes copy code
URL: https://github.com/apache/incubator-mxnet/pull/10090#discussion_r174300231
 
 

 ##
 File path: src/ndarray/ndarray_function.cc
 ##
 @@ -26,35 +26,24 @@
 #include "./ndarray_function.h"
 #include "./ndarray_function-inl.h"
 #include "../common/utils.h"
-#include "../operator/mxnet_op.h"
 
 namespace mxnet {
 namespace ndarray {
 template<>
 void Copy<cpu, cpu>(const TBlob &from, TBlob *to,
                     Context from_ctx, Context to_ctx,
                     RunContext ctx) {
-  using namespace mxnet::op;
   MSHADOW_TYPE_SWITCH(to->type_flag_, DType, {
     if (to->type_flag_ == from.type_flag_) {
-      TBlob dest = to->FlatTo1D<cpu, DType>();
-      TBlob src = from.FlatTo1D<cpu, DType>();
-      const size_t size = src.Size();
-      if (dest.CheckContiguous() && src.CheckContiguous() && size >= 2 /* non-trivial size */) {
-        CHECK_EQ(dest.shape_, src.shape_)
-          << "Copy:shape mismatch:" << dest.shape_ << " vs " << src.shape_;
-        mxnet_op::Kernel<op_with_req<mshadow_op::identity, kWriteTo>, cpu>::Launch(
-          ctx.get_stream<cpu>(), src.Size(), dest.dptr<DType>(), src.dptr<DType>());
 
 Review comment:
   I think fork() doesn't play nicely with libgomp in this code path: it works
   fine when linked against MKL, which uses libiomp5, and if you build with
   cmake, which builds and links against Intel OMP (in 3rdparty), it also works
   fine even without MKL enabled.
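
   A hedged sketch of the failure mode being described, not something to run
   in production tests (the ops and shapes are illustrative): warm up the
   OpenMP thread pool in the parent, fork, then run another OMP-backed kernel
   in the child. Under libgomp the child can deadlock; under libiomp5 / Intel
   OMP the runtime re-initializes in the child and proceeds.

   ```python
   import os
   from mxnet import nd

   a = nd.ones((1000, 40))
   (a * 2).wait_to_read()       # OMP threads created in the parent

   pid = os.fork()
   if pid == 0:
       (a * 2).wait_to_read()   # may hang here when linked with libgomp
       os._exit(0)
   os.waitpid(pid, 0)
   ```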




[GitHub] cjolivier01 commented on a change in pull request #10090: [MXNET-86] Revert to pre-profile-changes copy code

2018-03-13 Thread GitBox
cjolivier01 commented on a change in pull request #10090: [MXNET-86] Revert to 
pre-profile-changes copy code
URL: https://github.com/apache/incubator-mxnet/pull/10090#discussion_r174295804
 
 

 ##
 File path: src/ndarray/ndarray_function.cc
 ##
 @@ -26,35 +26,24 @@
 #include "./ndarray_function.h"
 #include "./ndarray_function-inl.h"
 #include "../common/utils.h"
-#include "../operator/mxnet_op.h"
 
 namespace mxnet {
 namespace ndarray {
 template<>
 void Copy<cpu, cpu>(const TBlob &from, TBlob *to,
                     Context from_ctx, Context to_ctx,
                     RunContext ctx) {
-  using namespace mxnet::op;
   MSHADOW_TYPE_SWITCH(to->type_flag_, DType, {
     if (to->type_flag_ == from.type_flag_) {
-      TBlob dest = to->FlatTo1D<cpu, DType>();
-      TBlob src = from.FlatTo1D<cpu, DType>();
-      const size_t size = src.Size();
-      if (dest.CheckContiguous() && src.CheckContiguous() && size >= 2 /* non-trivial size */) {
-        CHECK_EQ(dest.shape_, src.shape_)
-          << "Copy:shape mismatch:" << dest.shape_ << " vs " << src.shape_;
-        mxnet_op::Kernel<op_with_req<mshadow_op::identity, kWriteTo>, cpu>::Launch(
-          ctx.get_stream<cpu>(), src.Size(), dest.dptr<DType>(), src.dptr<DType>());
 
 Review comment:
   It does get stuck in the kernel launch code for whatever reason.




[GitHub] cjolivier01 commented on a change in pull request #10090: [MXNET-86] Revert to pre-profile-changes copy code

2018-03-13 Thread GitBox
cjolivier01 commented on a change in pull request #10090: [MXNET-86] Revert to 
pre-profile-changes copy code
URL: https://github.com/apache/incubator-mxnet/pull/10090#discussion_r174295660
 
 

 ##
 File path: src/ndarray/ndarray_function.cc
 ##
 @@ -26,35 +26,24 @@
 #include "./ndarray_function.h"
 #include "./ndarray_function-inl.h"
 #include "../common/utils.h"
-#include "../operator/mxnet_op.h"
 
 namespace mxnet {
 namespace ndarray {
 template<>
 void Copy<cpu, cpu>(const TBlob &from, TBlob *to,
                     Context from_ctx, Context to_ctx,
                     RunContext ctx) {
-  using namespace mxnet::op;
   MSHADOW_TYPE_SWITCH(to->type_flag_, DType, {
     if (to->type_flag_ == from.type_flag_) {
-      TBlob dest = to->FlatTo1D<cpu, DType>();
-      TBlob src = from.FlatTo1D<cpu, DType>();
-      const size_t size = src.Size();
-      if (dest.CheckContiguous() && src.CheckContiguous() && size >= 2 /* non-trivial size */) {
-        CHECK_EQ(dest.shape_, src.shape_)
-          << "Copy:shape mismatch:" << dest.shape_ << " vs " << src.shape_;
-        mxnet_op::Kernel<op_with_req<mshadow_op::identity, kWriteTo>, cpu>::Launch(
-          ctx.get_stream<cpu>(), src.Size(), dest.dptr<DType>(), src.dptr<DType>());
 
 Review comment:
   I don't see why not, but I am just going to revert now because the change 
wasn't worth the trouble it caused.

