[GitHub] [incubator-mxnet] marcoabreu commented on a change in pull request #16555: Upgrade MKL-DNN dependency to v1.0

2019-10-21 Thread GitBox
marcoabreu commented on a change in pull request #16555: Upgrade MKL-DNN 
dependency to v1.0
URL: https://github.com/apache/incubator-mxnet/pull/16555#discussion_r336997096
 
 

 ##
 File path: tests/python/mkl/test_subgraph.py
 ##
 @@ -727,6 +731,7 @@ def test_pos_conv_add():
 check_fusion(net, data_shape, attrs)
 
 @with_seed()
+@unittest.skip('skip for MKL-DNN 1.0 integration: 
https://github.com/apache/incubator-mxnet/projects/16')
 
 Review comment:
   These will be resolved before merge, right?




[GitHub] [incubator-mxnet] marcoabreu commented on a change in pull request #16555: Upgrade MKL-DNN dependency to v1.0

2019-10-21 Thread GitBox
marcoabreu commented on a change in pull request #16555: Upgrade MKL-DNN 
dependency to v1.0
URL: https://github.com/apache/incubator-mxnet/pull/16555#discussion_r336996708
 
 

 ##
 File path: tests/python/unittest/test_operator.py
 ##
 @@ -36,6 +36,14 @@
 import os
 
 def check_rnn_consistency(cell1, cell2, T, N, I, H, grad_req, rtol=1e-2, 
atol=1e-4):
+if default_context().device_type == 'cpu':
+# NOTE(zixuanweeei): Currently, we don't add `add` requests support on 
fused mkl-dnn rnn operator.
 
 Review comment:
   Tracking issue?




[GitHub] [incubator-mxnet] pengzhao-intel commented on issue #16555: Upgrade MKL-DNN dependency to v1.0

2019-10-21 Thread GitBox
pengzhao-intel commented on issue #16555: Upgrade MKL-DNN dependency to v1.0
URL: https://github.com/apache/incubator-mxnet/pull/16555#issuecomment-544489090
 
 
   @samskalicky @lanking520 @apeforest @szha @eric-haibin-lin @matteosal 
   This is the major version update of MKL-DNN from 0.x to 1.x. We have 
validated it internally and got the expected performance and accuracy. 
   Please help verify your customer use cases.
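   As a starting point, a minimal sanity check (a sketch assuming an MKL-DNN-enabled build; the layer shape is arbitrary):

```python
import time
import mxnet as mx
from mxnet.runtime import Features

print('MKLDNN enabled:', Features().is_enabled('MKLDNN'))

# rough CPU smoke test: time a convolution forward pass
data = mx.nd.random.uniform(shape=(32, 64, 56, 56), ctx=mx.cpu())
conv = mx.gluon.nn.Conv2D(channels=64, kernel_size=3, padding=1)
conv.initialize(ctx=mx.cpu())
conv(data).wait_to_read()                      # warm-up
start = time.time()
for _ in range(10):
    conv(data).wait_to_read()
print('avg forward: %.2f ms' % ((time.time() - start) / 10 * 1000))
```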




[GitHub] [incubator-mxnet] pengzhao-intel commented on issue #16563: Python CPU single thread configuration

2019-10-21 Thread GitBox
pengzhao-intel commented on issue #16563: Python CPU single thread configuration
URL: 
https://github.com/apache/incubator-mxnet/issues/16563#issuecomment-544485526
 
 
   @ZhennanQin could help you with this topic :)




[GitHub] [incubator-mxnet] JiangZhaoh opened a new pull request #16564: [numpy] add numpy operator : append

2019-10-21 Thread GitBox
JiangZhaoh opened a new pull request #16564: [numpy] add numpy operator : append
URL: https://github.com/apache/incubator-mxnet/pull/16564
 
 
   ## Description ##
   Implementation of numpy append operator in mxnet.
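   A quick usage sketch (assuming the operator mirrors the official NumPy signature `np.append(arr, values, axis=None)`):

```python
from mxnet import np, npx
npx.set_np()

a = np.array([[1, 2], [3, 4]])
b = np.array([[5, 6]])
print(np.append(a, b, axis=0))   # [[1. 2.] [3. 4.] [5. 6.]]
print(np.append(a, b))           # axis=None flattens: [1. 2. 3. 4. 5. 6.]
```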
   
   ## Checklist ##
   ### Essentials ###
   Please feel free to remove inapplicable items for your PR.
   - [ ] The PR title starts with [MXNET-$JIRA_ID], where $JIRA_ID refers to 
the relevant [JIRA issue](https://issues.apache.org/jira/projects/MXNET/issues) 
created (except PRs with tiny changes)
   - [ ] Changes are complete (i.e. I finished coding on this PR)
   - [ ] All changes have test coverage:
   - Unit tests are added for small changes to verify correctness (e.g. adding 
a new operator)
   - Nightly tests are added for complicated/long-running ones (e.g. changing 
distributed kvstore)
   - Build tests will be added for build configuration changes (e.g. adding a 
new build option with NCCL)
   - [ ] Code is well-documented: 
   - For user-facing API changes, API doc string has been updated. 
   - For new C++ functions in header files, their functionalities and arguments 
are documented. 
   - For new examples, README.md is added to explain what the example does, 
the source of the dataset, expected performance on the test set, and a reference to 
the original paper if applicable
   - Check the API doc at 
https://mxnet-ci-doc.s3-accelerate.dualstack.amazonaws.com/PR-$PR_ID/$BUILD_ID/index.html
   - [ ] To the best of my knowledge, examples are either not affected by this 
change, or have been fixed to be compatible with this change
   
   ### Changes ###
   - [ ] Feature1, tests, (and when applicable, API doc)
   - [ ] Feature2, tests, (and when applicable, API doc)
   
   ## Comments ##
   - If this change is a backward incompatible change, why must this change be 
made.
   - Interesting edge cases to note here
   




[GitHub] [incubator-mxnet] wkcn commented on a change in pull request #16532: fix dropout gpu seed

2019-10-21 Thread GitBox
wkcn commented on a change in pull request #16532: fix dropout gpu seed
URL: https://github.com/apache/incubator-mxnet/pull/16532#discussion_r336965736
 
 

 ##
 File path: src/operator/nn/dropout-inl.h
 ##
 @@ -253,7 +253,16 @@ class DropoutOp {
const TBlob ,
const TBlob ) {
   Stream<xpu> *s = ctx.get_stream<xpu>();
-
+  Random<xpu, unsigned> *prnd = ctx.requested[1].get_random<xpu, unsigned>(s);
+  Tensor<xpu, 1, char> workspace =
+      ctx.requested[2].get_space_typed<xpu, 1, char>(Shape1(1), s);
+  // slice workspace
+  char *workspace_ptr = workspace.dptr_;
+  Tensor<xpu, 1, unsigned> random_number =
+      Tensor<xpu, 1, unsigned>(reinterpret_cast<unsigned *>(workspace_ptr),
+                               Shape1(1), s);
+  prnd->GetRandInt(random_number);
+  uint64_t seed_ = 17 + reinterpret_cast<uint64_t>(&random_number[0]) % 4096;
 
 Review comment:
   Sorry that I didn't express it clearly. If different GPUs use the same seed, 
the dropout results on the different GPUs will be the same. When training a 
model, do the GPUs use different random seeds?
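   A small sketch of the comparison I have in mind, assuming a machine with 2 GPUs (`mode='always'` just forces dropout outside of training):

```python
import mxnet as mx

mx.random.seed(42)                       # fix the global seed for every device
data = mx.nd.ones((4, 4))
masks = []
for i in range(2):
    x = data.copyto(mx.gpu(i))
    masks.append(mx.nd.Dropout(x, p=0.5, mode='always').asnumpy())
# identical masks would mean both GPUs drew from the same seed;
# different masks would mean each device uses its own seed
print((masks[0] == masks[1]).all())
```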




[GitHub] [incubator-mxnet] deHsien commented on issue #16563: Python CPU single thread configuration

2019-10-21 Thread GitBox
deHsien commented on issue #16563: Python CPU single thread configuration
URL: 
https://github.com/apache/incubator-mxnet/issues/16563#issuecomment-544470746
 
 
   @pengzhao-intel
   May I have your help? Thank you.




[GitHub] [incubator-mxnet] roywei commented on a change in pull request #16532: fix dropout gpu seed

2019-10-21 Thread GitBox
roywei commented on a change in pull request #16532: fix dropout gpu seed
URL: https://github.com/apache/incubator-mxnet/pull/16532#discussion_r336958379
 
 

 ##
 File path: src/operator/nn/dropout-inl.h
 ##
 @@ -253,7 +253,16 @@ class DropoutOp {
const TBlob ,
const TBlob ) {
   Stream<xpu> *s = ctx.get_stream<xpu>();
-
+  Random<xpu, unsigned> *prnd = ctx.requested[1].get_random<xpu, unsigned>(s);
+  Tensor<xpu, 1, char> workspace =
+      ctx.requested[2].get_space_typed<xpu, 1, char>(Shape1(1), s);
+  // slice workspace
+  char *workspace_ptr = workspace.dptr_;
+  Tensor<xpu, 1, unsigned> random_number =
+      Tensor<xpu, 1, unsigned>(reinterpret_cast<unsigned *>(workspace_ptr),
+                               Shape1(1), s);
+  prnd->GetRandInt(random_number);
+  uint64_t seed_ = 17 + reinterpret_cast<uint64_t>(&random_number[0]) % 4096;
 
 Review comment:
   this will give a segfault during dropout. Also, why would dropout on multi-GPU 
return different results? I thought the seed is fixed globally, so dropout on 
different GPUs would use the same seed and thus return the same result?
   




[GitHub] [incubator-mxnet] deHsien opened a new issue #16563: Python CPU single thread configuration

2019-10-21 Thread GitBox
deHsien opened a new issue #16563: Python CPU single thread configuration
URL: https://github.com/apache/incubator-mxnet/issues/16563
 
 
   I'm going to participate in a submission that is limited to a single CPU thread. My 
implementation is in Python, and I found that when importing mxnet, a thread pool is 
created. After trying for a long time, I can only reach a thread pool with 8 
threads using MXNET_ENGINE_TYPE=NaiveEngine and OMP_NUM_THREADS=1.  
   Is there a configuration to reach a thread pool with 1 thread?  
   Thank you.
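   For reference, a sketch of the environment variables I am experimenting with (the last three are assumptions about what else might limit the pool; all of them must be set before `import mxnet`):

```python
import os
os.environ['MXNET_ENGINE_TYPE'] = 'NaiveEngine'    # synchronous engine, no scheduler threads
os.environ['OMP_NUM_THREADS'] = '1'                # OpenMP compute threads
os.environ['MKL_NUM_THREADS'] = '1'                # MKL/MKL-DNN threads, if built with MKL
os.environ['MXNET_CPU_WORKER_NTHREADS'] = '1'      # engine CPU worker threads
os.environ['MXNET_CPU_PRIORITY_NTHREADS'] = '1'    # priority-queue worker threads

import mxnet as mx
print(mx.nd.ones((2, 2)) + 1)                      # quick smoke test
```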




[GitHub] [incubator-mxnet] fumingxing2015 opened a new issue #16562: Same model but different time-consuming

2019-10-21 Thread GitBox
fumingxing2015 opened a new issue #16562: Same model  but different 
time-consuming
URL: https://github.com/apache/incubator-mxnet/issues/16562
 
 
   I trained a model (MobileNet v1.0) on my own data. When I tested the 
inference time, I found it is slower than the built-in model from 
gluon.model_zoo, about twice as slow. The models have the same structure; the only 
difference is the parameter values. When I checked the parameters, I found some 
values are very small. Why can different parameter values lead to different inference 
time? Should I set some special option to avoid this situation?
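   One guess I want to rule out is subnormal (denormal) float32 weights, which many CPUs process much more slowly. A minimal check (the parameter file name is a placeholder):

```python
import numpy as np
import mxnet as mx
from mxnet.gluon.model_zoo import vision

net = vision.mobilenet1_0()
net.load_parameters('my_mobilenet.params')        # placeholder for my trained params
tiny = np.finfo(np.float32).tiny                  # smallest normal float32 (~1.18e-38)
subnormal = 0
for name, param in net.collect_params().items():
    w = param.data().asnumpy()
    subnormal += int(np.count_nonzero((w != 0) & (np.abs(w) < tiny)))
print('subnormal weight count:', subnormal)
```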




[GitHub] [incubator-mxnet] marcoabreu commented on a change in pull request #16532: fix dropout gpu seed

2019-10-21 Thread GitBox
marcoabreu commented on a change in pull request #16532: fix dropout gpu seed
URL: https://github.com/apache/incubator-mxnet/pull/16532#discussion_r336933874
 
 

 ##
 File path: tests/nightly/test_dropout_with_seed.py
 ##
 @@ -0,0 +1,43 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+import mxnet as mx
+import numpy as np
+from mxnet.test_utils import assert_almost_equal
+
+
+def test_dropout_with_seed():
 
 Review comment:
   In line 29 you're generating a random number to feed the seed, though, so 
that generator needs to be seeded as well.
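   Something like this is what I mean (a sketch using the helpers from tests/python/unittest/common.py; the actual test body in this PR may differ):

```python
from common import with_seed          # helper from tests/python/unittest/common.py
import numpy as np
import mxnet as mx
from mxnet.test_utils import assert_almost_equal

@with_seed()
def test_dropout_with_seed():
    # the seed drawn here is reproducible because @with_seed() seeds numpy first
    seed = np.random.randint(0, 100000)
    data = mx.nd.ones((100, 100))
    mx.random.seed(seed)
    a = mx.nd.Dropout(data, p=0.5, mode='always')
    mx.random.seed(seed)
    b = mx.nd.Dropout(data, p=0.5, mode='always')
    assert_almost_equal(a.asnumpy(), b.asnumpy())
```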




[GitHub] [incubator-mxnet] roywei commented on a change in pull request #16532: fix dropout gpu seed

2019-10-21 Thread GitBox
roywei commented on a change in pull request #16532: fix dropout gpu seed
URL: https://github.com/apache/incubator-mxnet/pull/16532#discussion_r336932559
 
 

 ##
 File path: tests/nightly/test_dropout_with_seed.py
 ##
 @@ -0,0 +1,43 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+import mxnet as mx
+import numpy as np
+from mxnet.test_utils import assert_almost_equal
+
+
+def test_dropout_with_seed():
 
 Review comment:
   I'm manually choosing a random seed and setting it before each forward pass, so 
the with_seed decorator will not take effect. See comment: 
https://github.com/apache/incubator-mxnet/pull/16532#discussion_r336370708




[GitHub] [incubator-mxnet] marcoabreu commented on issue #16532: fix dropout gpu seed

2019-10-21 Thread GitBox
marcoabreu commented on issue #16532: fix dropout gpu seed
URL: https://github.com/apache/incubator-mxnet/pull/16532#issuecomment-544439351
 
 
   CI runs on g3.8xlarge instances with 2 GPUs.




[GitHub] [incubator-mxnet] marcoabreu commented on a change in pull request #16532: fix dropout gpu seed

2019-10-21 Thread GitBox
marcoabreu commented on a change in pull request #16532: fix dropout gpu seed
URL: https://github.com/apache/incubator-mxnet/pull/16532#discussion_r336927572
 
 

 ##
 File path: tests/nightly/test_dropout_with_seed.py
 ##
 @@ -0,0 +1,43 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+import mxnet as mx
+import numpy as np
+from mxnet.test_utils import assert_almost_equal
+
+
+def test_dropout_with_seed():
 
 Review comment:
   Add the `@with_seed()` annotation.




[GitHub] [incubator-mxnet] wkcn commented on a change in pull request #16532: fix dropout gpu seed

2019-10-21 Thread GitBox
wkcn commented on a change in pull request #16532: fix dropout gpu seed
URL: https://github.com/apache/incubator-mxnet/pull/16532#discussion_r336905990
 
 

 ##
 File path: src/operator/nn/dropout-inl.h
 ##
 @@ -262,7 +262,7 @@ class DropoutOp {
 Tensor<xpu, 1, unsigned>(reinterpret_cast<unsigned *>(workspace_ptr),
                          Shape1(1), s);
   prnd->GetRandInt(random_number);
-  uint64_t seed_ = 17 + reinterpret_cast<uint64_t>(&random_number) % 4096;
+  uint64_t seed_ = 17 + reinterpret_cast<uint64_t>(random_number.dptr_) % 4096;
 
 Review comment:
   I think it should be `uint64_t seed_ = 17 + 
static_cast<uint64_t>(random_number[0]) % 4096;`, because the type of 
`random_number[0]` is `unsigned`.
   
   
https://github.com/apache/incubator-mxnet/blob/master/3rdparty/mshadow/mshadow/tensor.h#L591




[GitHub] [incubator-mxnet] wkcn commented on a change in pull request #16532: fix dropout gpu seed

2019-10-21 Thread GitBox
wkcn commented on a change in pull request #16532: fix dropout gpu seed
URL: https://github.com/apache/incubator-mxnet/pull/16532#discussion_r336905640
 
 

 ##
 File path: src/operator/nn/dropout-inl.h
 ##
 @@ -253,7 +253,16 @@ class DropoutOp {
const TBlob ,
const TBlob ) {
   Stream<xpu> *s = ctx.get_stream<xpu>();
-
+  Random<xpu, unsigned> *prnd = ctx.requested[1].get_random<xpu, unsigned>(s);
+  Tensor<xpu, 1, char> workspace =
+      ctx.requested[2].get_space_typed<xpu, 1, char>(Shape1(1), s);
+  // slice workspace
+  char *workspace_ptr = workspace.dptr_;
+  Tensor<xpu, 1, unsigned> random_number =
+      Tensor<xpu, 1, unsigned>(reinterpret_cast<unsigned *>(workspace_ptr),
+                               Shape1(1), s);
+  prnd->GetRandInt(random_number);
+  uint64_t seed_ = 17 + reinterpret_cast<uint64_t>(&random_number[0]) % 4096;
 
 Review comment:
   I think it should be `uint64_t seed_ = 17 + 
static_cast<uint64_t>(random_number[0]) % 4096;`, because the type of 
`random_number[0]` is `unsigned`.
   
   
https://github.com/apache/incubator-mxnet/blob/master/3rdparty/mshadow/mshadow/tensor.h#L591




[GitHub] [incubator-mxnet] roywei commented on issue #16535: CI Travis time out

2019-10-21 Thread GitBox
roywei commented on issue #16535: CI Travis time out 
URL: 
https://github.com/apache/incubator-mxnet/issues/16535#issuecomment-544410671
 
 
   Update: I disabled Travis directly, as we are not testing anything there now.
   We can add it back if we can figure out how to get more resources on Travis, 
according to #13136.




[GitHub] [incubator-mxnet] haojin2 opened a new pull request #16561: Pickler override for np ndarrays

2019-10-21 Thread GitBox
haojin2 opened a new pull request #16561: Pickler override for np ndarrays
URL: https://github.com/apache/incubator-mxnet/pull/16561
 
 
   ## Description ##
   ...for speeding up data loading under numpy mode.
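   A quick sketch of what the override enables (assuming standard pickle semantics; this is what DataLoader workers rely on in numpy mode):

```python
import pickle
from mxnet import np, npx
npx.set_np()                                   # enable numpy-compatible mode

a = np.arange(6).reshape(2, 3)
buf = pickle.dumps(a)                          # goes through the overridden reducer
b = pickle.loads(buf)
print((a.asnumpy() == b.asnumpy()).all())      # True: the array round-trips through pickle
```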
   
   ## Checklist ##
   ### Essentials ###
   Please feel free to remove inapplicable items for your PR.
   - [ ] The PR title starts with [MXNET-$JIRA_ID], where $JIRA_ID refers to 
the relevant [JIRA issue](https://issues.apache.org/jira/projects/MXNET/issues) 
created (except PRs with tiny changes)
   - [ ] Changes are complete (i.e. I finished coding on this PR)
   - [ ] All changes have test coverage:
   - Unit tests are added for small changes to verify correctness (e.g. adding 
a new operator)
   - Nightly tests are added for complicated/long-running ones (e.g. changing 
distributed kvstore)
   - Build tests will be added for build configuration changes (e.g. adding a 
new build option with NCCL)
   - [ ] Code is well-documented: 
   - For user-facing API changes, API doc string has been updated. 
   - For new C++ functions in header files, their functionalities and arguments 
are documented. 
   - For new examples, README.md is added to explain what the example does, 
the source of the dataset, expected performance on the test set, and a reference to 
the original paper if applicable
   - Check the API doc at 
https://mxnet-ci-doc.s3-accelerate.dualstack.amazonaws.com/PR-$PR_ID/$BUILD_ID/index.html
   - [ ] To the best of my knowledge, examples are either not affected by this 
change, or have been fixed to be compatible with this change
   
   ### Changes ###
   - [x] Similar pickler override as nd.NDArray for np.ndarray
   - [x] Unit test
   
   ## Comments ##
   - If this change is a backward incompatible change, why must this change be 
made.
   - Interesting edge cases to note here
   




[GitHub] [incubator-mxnet] roywei commented on issue #16532: fix dropout gpu seed

2019-10-21 Thread GitBox
roywei commented on issue #16532: fix dropout gpu seed
URL: https://github.com/apache/incubator-mxnet/pull/16532#issuecomment-544401913
 
 
   > Hi @roywei , could you please add a unit-test for multi-GPU? The result of 
Dropout on multi-GPU should be different.
   > Thanks a lot : )
   
   I believe GPU unit tests are running on instances with 1 GPU. I will try to 
move the entire test to the nightly tests, which use P3 instances with 4 
GPUs; I can add a multi-GPU test there. Hopefully the seed can be properly fixed 
with fewer parallel jobs on the CI workers.




[GitHub] [incubator-mxnet] roywei commented on a change in pull request #16532: fix dropout gpu seed

2019-10-21 Thread GitBox
roywei commented on a change in pull request #16532: fix dropout gpu seed
URL: https://github.com/apache/incubator-mxnet/pull/16532#discussion_r336884918
 
 

 ##
 File path: src/operator/nn/dropout-inl.h
 ##
 @@ -253,7 +253,16 @@ class DropoutOp {
const TBlob ,
const TBlob ) {
   Stream<xpu> *s = ctx.get_stream<xpu>();
-
+  Random<xpu, unsigned> *prnd = ctx.requested[1].get_random<xpu, unsigned>(s);
+  Tensor<xpu, 1, char> workspace =
+      ctx.requested[2].get_space_typed<xpu, 1, char>(Shape1(1), s);
+  // slice workspace
+  char *workspace_ptr = workspace.dptr_;
+  Tensor<xpu, 1, unsigned> random_number =
+      Tensor<xpu, 1, unsigned>(reinterpret_cast<unsigned *>(workspace_ptr),
+                               Shape1(1), s);
+  prnd->GetRandInt(random_number);
+  uint64_t seed_ = 17 + reinterpret_cast<uint64_t>(&random_number[0]) % 4096;
 
 Review comment:
   I'm just keeping the original logic here:
   
https://github.com/roywei/incubator-mxnet/blob/master/src/operator/nn/dropout-inl.h#L95
   
   and here:
   
https://github.com/roywei/incubator-mxnet/blob/master/src/operator/nn/dropout-inl.h#L491
   
   while fixing the seed to respect the MXNet seed.




[GitHub] [incubator-mxnet] wkcn commented on a change in pull request #16532: fix dropout gpu seed

2019-10-21 Thread GitBox
wkcn commented on a change in pull request #16532: fix dropout gpu seed
URL: https://github.com/apache/incubator-mxnet/pull/16532#discussion_r336881255
 
 

 ##
 File path: src/operator/nn/dropout-inl.h
 ##
 @@ -253,7 +253,16 @@ class DropoutOp {
const TBlob ,
const TBlob ) {
   Stream<xpu> *s = ctx.get_stream<xpu>();
-
+  Random<xpu, unsigned> *prnd = ctx.requested[1].get_random<xpu, unsigned>(s);
+  Tensor<xpu, 1, char> workspace =
+      ctx.requested[2].get_space_typed<xpu, 1, char>(Shape1(1), s);
+  // slice workspace
+  char *workspace_ptr = workspace.dptr_;
+  Tensor<xpu, 1, unsigned> random_number =
+      Tensor<xpu, 1, unsigned>(reinterpret_cast<unsigned *>(workspace_ptr),
+                               Shape1(1), s);
+  prnd->GetRandInt(random_number);
+  uint64_t seed_ = 17 + reinterpret_cast<uint64_t>(&random_number[0]) % 4096;
 
 Review comment:
   Why does it need the modulus operator `%`? The modulus operator restricts 
`seed_` to the range 17 ~ 4096+17.




[GitHub] [incubator-mxnet] perdasilva commented on issue #16547: [CD] Adds python docker pipeline

2019-10-21 Thread GitBox
perdasilva commented on issue #16547: [CD] Adds python docker pipeline
URL: https://github.com/apache/incubator-mxnet/pull/16547#issuecomment-544393237
 
 
   @aaronmarkham if you are cool with everything, please merge when you get a 
chance =D




[GitHub] [incubator-mxnet] roywei commented on issue #16532: fix dropout gpu seed

2019-10-21 Thread GitBox
roywei commented on issue #16532: fix dropout gpu seed
URL: https://github.com/apache/incubator-mxnet/pull/16532#issuecomment-544388824
 
 
   I am able to reproduce CI failure locally now by running the following on 
P3.8x with DLAMI
   `ci/build.py --docker-registry mxnetci --nvidiadocker --platform 
ubuntu_gpu_cu101 --docker-build-retries 3 --shm-size 500m 
/work/runtime_functions.sh unittest_ubuntu_python2_gpu`
   
   result:
   ```
   ==
   FAIL: test_operator_gpu.test_dropout_with_seed
   --
   Traceback (most recent call last):
 File "/usr/local/lib/python2.7/dist-packages/nose/case.py", line 197, in 
runTest
   self.test(*self.arg)
 File "/usr/local/lib/python2.7/dist-packages/nose/util.py", line 620, in 
newfunc
   return func(*arg, **kw)
 File "/work/mxnet/tests/python/gpu/../unittest/common.py", line 177, in 
test_new
   orig_test(*args, **kwargs)
 File "/work/mxnet/tests/python/gpu/../unittest/test_operator.py", line 
6946, in test_dropout_with_seed
   assert_almost_equal(b.asnumpy(), c.asnumpy())
 File "/work/mxnet/python/mxnet/test_utils.py", line 624, in 
assert_almost_equal
   raise AssertionError(msg)
   AssertionError:
   Items are not equal:
   Error 10002004087734272.00 exceeds tolerance rtol=1.00e-05, 
atol=1.00e-20 (mismatch at least 0.11%).
   Location of maximum error: (0, 1), a=2., b=0.
ACTUAL: array([[0., 2., 2., ..., 2., 0., 0.],
  [0., 2., 2., ..., 0., 0., 2.],
  [2., 2., 2., ..., 0., 0., 2.],...
DESIRED: array([[2., 0., 2., ..., 2., 0., 2.],
  [2., 2., 2., ..., 2., 2., 2.],
  [2., 0., 0., ..., 2., 2., 2.],...
    >> begin captured stdout << -
   
   *** Maximum errors for vector of size 1:  rtol=1e-05, atol=1e-20
   - >> end captured stdout << --
    >> begin captured logging << 
   common: INFO: Setting test np/mx/python random seeds, use 
MXNET_TEST_SEED=179619306 to reproduce.
   - >> end captured logging << -
   
   ```
   
   **However, running the same test standalone with the same seed in the CI 
environment passed:**
   ```
   MXNET_TEST_SEED=179619306 nosetests --logging-level=DEBUG --verbose -s  
tests/python/gpu/test_operator_gpu.py:test_dropout_with_seed
   [INFO] Setting module np/mx/python random seeds, use 
MXNET_MODULE_SEED=980748466 to reproduce.
   [WARNING] *** test-level seed set: all "@with_seed()" tests run 
deterministically ***
   test_operator_gpu.test_dropout_with_seed ... [INFO] Setting test 
np/mx/python random seeds, use MXNET_TEST_SEED=179619306 to reproduce.
   [07:36:44] ../src/base.cc:84: Upgrade advisory: this mxnet has been built 
against cuDNN lib version 7401, which is older than the oldest version tested 
by CI (7600).  Set MXNET_CUDNN_LIB_CHECKING=0 to quiet this warning.
   ok
   
   --
   Ran 1 test in 13.896s
   
   OK
   ```




[GitHub] [incubator-mxnet] classicsong opened a new issue #16560: It is easy to crash MXNet when tensor goes larger

2019-10-21 Thread GitBox
classicsong opened a new issue #16560: It is easy to crash MXNet when tensor 
goes larger
URL: https://github.com/apache/incubator-mxnet/issues/16560
 
 
   ## Description
   When I use a large tensor, it is easy to crash the MXNet kernel.
   Use the following Python code to reproduce:
   
   ```
   >>> import mxnet.ndarray as nd
   
   >>> a = nd.random.randn(4, 256, 1, 100, 100)
   >>> b = nd.broadcast_axis(a, axis=2, size=256)
   >>> b.size
   262144
   >>> b.asnumpy()
   CRASH HERE
   ```
   The error looks like an int32 overflow on shape.size (see the check below).
   Is there an easy way to fix this? The only way I have found is to compile MXNet 
with USE_INT64_TENSOR_SIZE = ON, which is slower than the default build.
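   A quick check of the numbers (a sketch assuming the default build keeps the size in 32 bits), which reproduces the strange value in the error message below:

```python
import numpy as np

shape = (4, 256, 256, 100, 100)            # shape of b after broadcast_axis(axis=2, size=256)
n = int(np.prod(shape, dtype=np.int64))    # 2,621,440,000 elements
print(n > np.iinfo(np.int32).max)          # True: larger than INT32_MAX (2,147,483,647)
wrapped = np.array([n]).astype(np.uint32).astype(np.int32)[0]
print(wrapped)                             # -1673527296 once the size is squeezed into 32 bits
print(np.array([wrapped], dtype=np.int64).astype(np.uint64)[0])
                                           # 18446744072036024320, the value in the error message
```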
   
   ## Environment info (Required)
   mxnet 1.5.1 (pip3 install)
   
   Package used (Python/R/Scala/Julia):
   Python
   
   ## Error Message:
   ```
   Traceback (most recent call last):
 File "", line 1, in 
 File "/usr/local/lib/python3.5/dist-packages/mxnet/ndarray/ndarray.py", 
line 1996, in asnumpy
   ctypes.c_size_t(data.size)))
 File "/usr/local/lib/python3.5/dist-packages/mxnet/base.py", line 253, in 
check_call
   raise MXNetError(py_str(_LIB.MXGetLastError()))
    mxnet.base.MXNetError: [07:26:09] include/mxnet/././tensor_blob.h:290: Check 
failed: this->shape_.Size() == static_cast<size_t>(shape.Size()) (262144 
vs. 18446744072036024320) : TBlob.get_with_shape: new and old shape do not 
match total elements
   ```




[GitHub] [incubator-mxnet] JiangZhaoh closed pull request #16549: [numpy] operator test

2019-10-21 Thread GitBox
JiangZhaoh closed pull request #16549: [numpy] operator test
URL: https://github.com/apache/incubator-mxnet/pull/16549
 
 
   




[GitHub] [incubator-mxnet] reminisce opened a new issue #16559: Tracking mxnet.numpy operator issues for 1.6.0 release

2019-10-21 Thread GitBox
reminisce opened a new issue #16559: Tracking mxnet.numpy operator issues for 
1.6.0 release
URL: https://github.com/apache/incubator-mxnet/issues/16559
 
 
   ## Issues from running D2L 
[numpy2](https://github.com/d2l-ai/d2l-en/tree/numpy2) branch
   
   1. chapter_preliminaries/probability.md
   
   ```
   TypeError: no implementation found for 'numpy.broadcast_arrays' on types 
that implement __array_function__: [, ]
   ```
   
   2. chapter_preliminaries/probability.md
   ```
   MXNetError: [09:16:23] src/operator/numpy/np_true_divide.cc:43: Check 
failed: lhs_dtype == rhs_dtype (7 vs. 0) : true_divide currently only supports 
same dtype for dividend and divisor
   ```
   
   3. chapter_multilayer-perceptrons/dropout.md
   ```
   mask = np.random.uniform(0, 1, X.shape) > drop_prob
   return mask * X / (1.0-drop_prob) 
    Fails on the second line; looks like a problem with multiplying a boolean 
array by a float
   ```
   
   4. chapter_deep-learning-computation/parameters.md
   ```
   data[:] = np.random.uniform(-10, 10, data.shape)
   data *= np.abs(data) >= 5
   Fails on the 2nd line, still the multiplication between boolean & float
   ```
   
   5. chapter_recurrent-neural-networks/seq2seq.md
   ```
   MXNetError: [10:46:26] src/operator/numpy/np_true_divide.cc:43: Check failed:
   lhs_dtype == rhs_dtype (0 vs. 6) : true_divide currently only supports
   same dtype for dividend and divisor

   in train_s2s_ch8(model, data_iter, lr, num_epochs, ctx)
        24     metric.add(l.sum(), num_tokens)
        25     if epoch % 10 == 0:
   ---> 26         animator.add(epoch, (metric[0]/metric[1],))
        27     print('loss %.3f, %d tokens/sec on %s ' % (
        28         metric[0]/metric[1], metric[1]/timer.stop(), ctx))
   ```
   
   6. All notebooks with train_s2s_ch8
   ```
   MXNetError: [10:46:26] src/operator/numpy/np_true_divide.cc:43: Check failed:
   lhs_dtype == rhs_dtype (0 vs. 6) : true_divide currently only supports
   same dtype for dividend and divisor

   in train_s2s_ch8(model, data_iter, lr, num_epochs, ctx)
        24     metric.add(l.sum(), num_tokens)
        25     if epoch % 10 == 0:
   ---> 26         animator.add(epoch, (metric[0]/metric[1],))
        27     print('loss %.3f, %d tokens/sec on %s ' % (
        28         metric[0]/metric[1], metric[1]/timer.stop(), ctx))
   ```
   
   7. chapter_optimization/optimization-intro.md
   ```
   ValueError: mxnet.numpy operator `` 
has not been registered in the _NUMPY_ARRAY_FUNCTION_LIST. Please make sure you 
are using NumPy >= 1.17.0 and the operator implementation is compatible with 
NumPy. Then add the operator name to the list.
   ```
   
   8. chapter_optimization/convexity.md
   ```
   Same as above
   ```
   
   9. chapter_natural-language-processing/sentiment-analysis-rnn.md
   ```
   
   ---------------------------------------------------------------------------
   MXNetError                                Traceback (most recent call last)
   <ipython-input> in <module>
         2 trainer = gluon.Trainer(net.collect_params(), 'adam', {'learning_rate': lr})
         3 loss = gluon.loss.SoftmaxCrossEntropyLoss()
   ----> 4 d2l.train_ch12(net, train_iter, test_iter, loss, trainer, num_epochs, ctx)

   ~/d2l-numpy/d2l/d2l.py in train_ch12(net, train_iter, test_iter, loss, trainer, num_epochs, ctx_list, split_f)
      1054             timer.start()
      1055             l, acc = train_batch_ch12(
   -> 1056                 net, features, labels, loss, trainer, ctx_list, split_f)
      1057             metric.add(l, acc, labels.shape[0], labels.size)
      1058             timer.stop()

   ~/d2l-numpy/d2l/d2l.py in train_batch_ch12(net, features, labels, loss, trainer, ctx_list, split_f)
      1037     l.backward()
      1038     trainer.step(features.shape[0])
   -> 1039     train_loss_sum = sum([float(l.sum()) for l in ls])
      1040     train_acc_sum = sum(d2l.accuracy(py, y) for py, y in zip(pys, ys))
      1041     return train_loss_sum, train_acc_sum

   ~/d2l-numpy/d2l/d2l.py in <listcomp>(.0)
      1037     l.backward()
      1038     trainer.step(features.shape[0])
   -> 1039     train_loss_sum = sum([float(l.sum()) for l in ls])
      1040     train_acc_sum = sum(d2l.accuracy(py, y) for py, y in zip(pys, ys))
      1041     return train_loss_sum, train_acc_sum

   ~/mxnet_master/python/mxnet/numpy/multiarray.py in __float__(self)
       791         if num_elements != 1:
       792             raise TypeError('only size-1 arrays can be converted to Python scalars')
   --> 793         return float(self.item())
       794
       795     def __int__(self):

   ~/mxnet_master/python/mxnet/numpy/multiarray.py in item(self, *args)
       830         """
       831         # TODO(junwu): no need to call asnumpy() on the whole array.
   --> 832         return self.asnumpy().item(*args)
       833
       834     @property

   ~/mxnet_master/python/mxnet/ndarray/ndarray.py in asnumpy(self)
      2517             self.handle,
      2518             data.ctypes.data_as(ctypes.c_void_p),
   -> 2519

[GitHub] [incubator-mxnet] kshitij12345 commented on issue #15909: [numpy] random.rand

2019-10-20 Thread GitBox
kshitij12345 commented on issue #15909: [numpy] random.rand
URL: https://github.com/apache/incubator-mxnet/pull/15909#issuecomment-544349290
 
 
   @reminisce Oh, I did not notice it. Thank you. Will close this one.




[GitHub] [incubator-mxnet] kshitij12345 closed pull request #15909: [numpy] random.rand

2019-10-20 Thread GitBox
kshitij12345 closed pull request #15909: [numpy] random.rand
URL: https://github.com/apache/incubator-mxnet/pull/15909
 
 
   




[GitHub] [incubator-mxnet] szha opened a new pull request #16558: [GITHUB] split issue templates

2019-10-20 Thread GitBox
szha opened a new pull request #16558: [GITHUB] split issue templates
URL: https://github.com/apache/incubator-mxnet/pull/16558
 
 
   ## Description ##
   This PR makes use of the multiple issue template feature on GitHub by 
splitting the issue templates into bug report, feature request, and flaky test 
report.
   
   ## Checklist ##
   ### Essentials ###
   - [x] Changes are complete (i.e. I finished coding on this PR)
   
   ### Changes ###
   - [x] split the issue templates into bug report, feature request, and flaky 
test report.
   




[GitHub] [incubator-mxnet] szha commented on issue #16508: Distributed Training using MXNet with Horovod

2019-10-20 Thread GitBox
szha commented on issue #16508: Distributed Training using MXNet with Horovod
URL: 
https://github.com/apache/incubator-mxnet/issues/16508#issuecomment-544343011
 
 
   @gentelyang while you're waiting for a response, I'd like to share a 
full-fledged example in GluonNLP for BERT-pretraining with horovod. You can 
find it in [this 
page](http://gluon-nlp.mxnet.io/model_zoo/bert/index.html#run-pre-training)
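   For quick reference, the core Horovod wiring is small (a rough sketch assuming `horovod.mxnet` is installed; the model and hyper-parameters are placeholders; see the GluonNLP script above for the real thing):

```python
import mxnet as mx
import horovod.mxnet as hvd
from mxnet import gluon

hvd.init()                                     # launched as one process per GPU, e.g. via horovodrun
ctx = mx.gpu(hvd.local_rank())

net = gluon.model_zoo.vision.resnet18_v1()     # placeholder model
net.initialize(mx.init.Xavier(), ctx=ctx)
net(mx.nd.zeros((1, 3, 224, 224), ctx=ctx))    # trigger shape inference before broadcasting
params = net.collect_params()

hvd.broadcast_parameters(params, root_rank=0)  # keep initial weights identical on all workers
trainer = hvd.DistributedTrainer(              # allreduce-averages gradients across workers
    params, 'sgd', {'learning_rate': 0.01 * hvd.size()})
```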




[GitHub] [incubator-mxnet] szha commented on issue #16234: [MXNET-1426] Fix the wrong result of sum, mean, argmin, argmax when inputs contain inf or nan

2019-10-20 Thread GitBox
szha commented on issue #16234: [MXNET-1426] Fix the wrong result of sum, mean, 
argmin, argmax when inputs contain inf or nan
URL: https://github.com/apache/incubator-mxnet/pull/16234#issuecomment-544342329
 
 
   cc @reminisce 




[GitHub] [incubator-mxnet] szha commented on issue #16553: Fix for wrong reqs set after switching from training to inference

2019-10-20 Thread GitBox
szha commented on issue #16553: Fix for wrong reqs set after switching from 
training to inference
URL: https://github.com/apache/incubator-mxnet/pull/16553#issuecomment-544339535
 
 
   sounds good to both




[GitHub] [incubator-mxnet] ptrendx commented on issue #16553: Fix for wrong reqs set after switching from training to inference

2019-10-20 Thread GitBox
ptrendx commented on issue #16553: Fix for wrong reqs set after switching from 
training to inference
URL: https://github.com/apache/incubator-mxnet/pull/16553#issuecomment-544336113
 
 
   For 1: this error was found when working on pointwise fusion, which uses 
reqs to compile the right code. I don't know if there is any other operator 
which may not write to an output if there is no backward pass for it (I know 
`mx.sym.contrib.box_nms` would have that behavior, but the current 
implementation writes to the output only used for backward every time, without 
looking at the req value). I can add the test in pointwise fusion PR, show that 
it fails and then merge with master once this fix is in to show that it does 
not fail anymore.
   
   For 2: I agree, probably as `const static` members of `CachedOp`?




[GitHub] [incubator-mxnet] szha commented on issue #16553: Fix for wrong reqs set after switching from training to inference

2019-10-20 Thread GitBox
szha commented on issue #16553: Fix for wrong reqs set after switching from 
training to inference
URL: https://github.com/apache/incubator-mxnet/pull/16553#issuecomment-544334924
 
 
   As follow-up:
   1. ideally we should be able to test for this explicitly. given that we lack 
the utility for such testing, I want to avoid blocking this fix on this basis
   2. the keys/magic strings should ideally be centralized, maybe in an enum.




[GitHub] [incubator-mxnet] knjwhn opened a new issue #16557: Where is the place that mxnet call cblas_gemm if I use openblas?

2019-10-20 Thread GitBox
knjwhn opened a new issue #16557: Where is the place that mxnet call cblas_gemm 
if I use openblas?
URL: https://github.com/apache/incubator-mxnet/issues/16557
 
 
   Hello guys, I made some modifications in OpenBLAS (added a new function 
cblas_mygemm) and I want to use it in MXNet. Is there a place in the MXNet source 
code that calls cblas_sgemm()? Hope for your help.




[GitHub] [incubator-mxnet] knjwhn commented on issue #13991: Telling about libopenblas to mxnet config files

2019-10-20 Thread GitBox
knjwhn commented on issue #13991: Telling about libopenblas to mxnet config 
files
URL: 
https://github.com/apache/incubator-mxnet/issues/13991#issuecomment-54480
 
 
   Hello guys, I made some modifications in OpenBLAS (added a new cblas_mygemm) and 
I want to use it in MXNet. Is there a place in the MXNet source code that calls 
cblas_sgemm()? Hope for your help.




[GitHub] [incubator-mxnet] szha commented on a change in pull request #16532: fix dropout gpu seed

2019-10-20 Thread GitBox
szha commented on a change in pull request #16532: fix dropout gpu seed
URL: https://github.com/apache/incubator-mxnet/pull/16532#discussion_r336825080
 
 

 ##
 File path: src/operator/nn/dropout-inl.h
 ##
 @@ -262,7 +262,7 @@ class DropoutOp {
 Tensor<xpu, 1, unsigned>(reinterpret_cast<unsigned *>(workspace_ptr),
                          Shape1(1), s);
   prnd->GetRandInt(random_number);
-  uint64_t seed_ = 17 + reinterpret_cast<uint64_t>(&random_number) % 4096;
+  uint64_t seed_ = 17 + reinterpret_cast<uint64_t>(random_number.dptr_) % 4096;
 
 Review comment:
   dptr_ is a pointer and I don't think you mean to use its address here.




[GitHub] [incubator-mxnet] roywei edited a comment on issue #16532: fix dropout gpu seed

2019-10-20 Thread GitBox
roywei edited a comment on issue #16532: fix dropout gpu seed
URL: https://github.com/apache/incubator-mxnet/pull/16532#issuecomment-544327630
 
 
   @wkcn Thanks for the review!
   Still investigating why the local unit test passed but CI constantly failed; it 
seems the seed is not fixed in CI.
   
   On local GPU, running the following passed:
   1. nosetests single test passed 1 times
   2. nosetests all test_operator_gpu passed
   3. running 
https://github.com/apache/incubator-mxnet/issues/15662#issuecomment-540911324 
directly from Python
   
   However, this test failed on CI with both `mx.seed(fixed_seed)` and 
`@with_seed(fixed_seed)` decorator.
   
   I will try to reproduce the CI failure locally first, or try to add this to the 
nightly tests, where fewer nosetests are executed at the same time. I suspect other 
nosetests running in parallel on CI workers affect the result.




[GitHub] [incubator-mxnet] roywei commented on issue #16532: fix dropout gpu seed

2019-10-20 Thread GitBox
roywei commented on issue #16532: fix dropout gpu seed
URL: https://github.com/apache/incubator-mxnet/pull/16532#issuecomment-544327665
 
 
   cc @eric-haibin-lin @sxjscience 




[GitHub] [incubator-mxnet] roywei commented on issue #16532: fix dropout gpu seed

2019-10-20 Thread GitBox
roywei commented on issue #16532: fix dropout gpu seed
URL: https://github.com/apache/incubator-mxnet/pull/16532#issuecomment-544327630
 
 
   Still investigating why the local unit test passed but CI constantly failed; it 
seems the seed is not fixed in CI.
   
   On local GPU, running the following passed:
   1. nosetests single test passed 1 times
   2. nosetests all test_operator_gpu passed
   3. running 
https://github.com/apache/incubator-mxnet/issues/15662#issuecomment-540911324 
directly from Python
   
   However, this test failed on CI with both `mx.seed(fixed_seed)` and 
`@with_seed(fixed_seed)` decorator.
   
   I will try to reproduce the CI failure locally first, or try to add this to the 
nightly tests, where fewer nosetests are executed at the same time. I suspect other 
nosetests running in parallel on CI workers affect the result.




[GitHub] [incubator-mxnet] roywei commented on a change in pull request #16532: fix dropout gpu seed

2019-10-20 Thread GitBox
roywei commented on a change in pull request #16532: fix dropout gpu seed
URL: https://github.com/apache/incubator-mxnet/pull/16532#discussion_r336821250
 
 

 ##
 File path: src/operator/nn/dropout-inl.h
 ##
 @@ -262,7 +262,7 @@ class DropoutOp {
 Tensor<xpu, 1, unsigned>(reinterpret_cast<unsigned *>(workspace_ptr),
                          Shape1(1), s);
   prnd->GetRandInt(random_number);
-  uint64_t seed_ = 17 + reinterpret_cast<uint64_t>(&random_number) % 4096;
+  uint64_t seed_ = 17 + reinterpret_cast<uint64_t>(random_number.dptr_) % 4096;
 
 Review comment:
   I get `dropout-inl.h(265): error: invalid type conversion`




[GitHub] [incubator-mxnet] szha commented on a change in pull request #16207: Bump numpy version >=1.17.0

2019-10-20 Thread GitBox
szha commented on a change in pull request #16207: Bump numpy version >=1.17.0
URL: https://github.com/apache/incubator-mxnet/pull/16207#discussion_r336814314
 
 

 ##
 File path: python/setup.py
 ##
 @@ -30,7 +30,12 @@
 else:
 from setuptools import setup
 from setuptools.extension import Extension
-kwargs = {'install_requires': ['numpy>1.16.0,<2.0.0', 
'requests>=2.20.0,<3', 'graphviz<0.9.0,>=0.8.1'], 'zip_safe': False}
+kwargs = {
+'install_requires': ['requests>=2.20.0,<3', 'graphviz<0.9.0,>=0.8.1']
+.append('numpy>=1.17.0,<2.0.0' if sys.version_info.major >= 3 and 
sys.version_info.minor >= 5
+else 'numpy>1.16.0,<2.0.0'),
 
 Review comment:
   What's the concern with just switching to `>= 1.17`? Was it CI?
   setup.py may be evaluated on a platform that builds binary distributions, 
which can be different from the platform on which the distribution is installed. In 
this scenario, the condition won't be evaluated on the target platform.
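   For illustration, a sketch using a PEP 508 environment marker, which pip evaluates on the install platform rather than the build platform (note also that `list.append` returns `None`, so the chained `.append(...)` above would need restructuring anyway):

```python
# setup.py sketch: the numpy pin is chosen at install time on the target machine
kwargs = {
    'install_requires': [
        'requests>=2.20.0,<3',
        'graphviz<0.9.0,>=0.8.1',
        'numpy>=1.17.0,<2.0.0; python_version>="3.5"',
        'numpy>1.16.0,<2.0.0; python_version<"3.5"',
    ],
    'zip_safe': False,
}
```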




[GitHub] [incubator-mxnet] JiangZhaoh opened a new pull request #16556: [numpy]op test in new pattern

2019-10-20 Thread GitBox
JiangZhaoh opened a new pull request #16556: [numpy]op test in new pattern
URL: https://github.com/apache/incubator-mxnet/pull/16556
 
 
   ## Description ##
   Add numpy test:
   - stack
   - var
   - vdot
   - vstack
   - zeros_like
   
   Some tests that use TVM cannot pass CI because of an environment error:
   - equal
   - not_equal
   - less
   - less_equal
   - greater
   - greater_equal
   
   
   ## Checklist ##
   ### Essentials ###
   Please feel free to remove inapplicable items for your PR.
   - [ ] The PR title starts with [MXNET-$JIRA_ID], where $JIRA_ID refers to 
the relevant [JIRA issue](https://issues.apache.org/jira/projects/MXNET/issues) 
created (except PRs with tiny changes)
   - [ ] Changes are complete (i.e. I finished coding on this PR)
   - [ ] All changes have test coverage:
   - Unit tests are added for small changes to verify correctness (e.g. adding 
a new operator)
   - Nightly tests are added for complicated/long-running ones (e.g. changing 
distributed kvstore)
   - Build tests will be added for build configuration changes (e.g. adding a 
new build option with NCCL)
   - [ ] Code is well-documented: 
   - For user-facing API changes, API doc string has been updated. 
   - For new C++ functions in header files, their functionalities and arguments 
are documented. 
   - For new examples, README.md is added to explain what the example does, 
the source of the dataset, expected performance on the test set, and a reference to 
the original paper if applicable
   - Check the API doc at 
https://mxnet-ci-doc.s3-accelerate.dualstack.amazonaws.com/PR-$PR_ID/$BUILD_ID/index.html
   - [ ] To the best of my knowledge, examples are either not affected by this 
change, or have been fixed to be compatible with this change
   
   ### Changes ###
   - [ ] Feature1, tests, (and when applicable, API doc)
   - [ ] Feature2, tests, (and when applicable, API doc)
   
   ## Comments ##
   - If this change is a backward incompatible change, why must this change be 
made.
   - Interesting edge cases to note here
   




[GitHub] [incubator-mxnet] szha commented on issue #16531: Just use fp16 is slower than mixed_precision! Why?

2019-10-20 Thread GitBox
szha commented on issue #16531: Just use fp16 is slower than mixed_precision! 
Why?
URL: 
https://github.com/apache/incubator-mxnet/issues/16531#issuecomment-544314191
 
 
   Hi @ZHAIXINGZHAIYUE, please pay attention to the request for your script as 
it's impossible for others to help without knowing your workload. Thanks.




[GitHub] [incubator-mxnet] ZHAIXINGZHAIYUE commented on issue #16531: Just use fp16 is slower than mixed_precision! Why?

2019-10-20 Thread GitBox
ZHAIXINGZHAIYUE commented on issue #16531: Just use fp16 is slower than 
mixed_precision! Why?
URL: 
https://github.com/apache/incubator-mxnet/issues/16531#issuecomment-544313684
 
 
   @anirudh2290 I mean the training speed. mixed_precision: 1000 samples/s; pure 
fp16: 900 samples/s.




[GitHub] [incubator-mxnet] KellenSunderland merged pull request #15399: Add unit tests for TensorRT integration and fix some bugs

2019-10-20 Thread GitBox
KellenSunderland merged pull request #15399: Add unit tests for TensorRT 
integration and fix some bugs
URL: https://github.com/apache/incubator-mxnet/pull/15399
 
 
   




[GitHub] [incubator-mxnet] haojin2 commented on a change in pull request #16391: cuDNN non-persistant bidirectional RNN dgrad sync fix

2019-10-20 Thread GitBox
haojin2 commented on a change in pull request #16391: cuDNN non-persistant 
bidirectional RNN dgrad sync fix
URL: https://github.com/apache/incubator-mxnet/pull/16391#discussion_r336800317
 
 

 ##
 File path: src/operator/rnn-inl.h
 ##
 @@ -1484,6 +1493,20 @@ class RNNOp {
 }
 #endif  // MXNET_USE_CUDNN == 1 && defined(__CUDACC__)
   }
+
+#if MXNET_USE_CUDNN == 1 && defined(__CUDACC__)
+  // cuDNN versions up to and including v7.6.4 did not sync a last dgrad 
kernel back to the main
+  // cudnn handle's stream (non-persistant algo, bidirectional only).  This 
could result in silent
+  // non-determinstic failures with very low probability, seen more often when 
wgrad is bypassed.
+  inline void SyncDgrad() {
+if (CUDNN_VERSION <= 7604 && dgrad_sync_needed_) {
+  // Without blocking the CPU, create a synchronization point of all 
current GPU activity.  No
+  // need to call cudaStreamWaitEvent- cudaEventRecord on the legacy 
default stream suffices.
+  CUDA_CALL(cudaEventRecord(dgrad_sync_event_, cudaStreamLegacy));
 
 Review comment:
   Hi @DickJC123, I'm encountering `cudaErrorInvalidResourceHandle` error here 
when I'm trying to run this 
[notebook](https://github.com/d2l-ai/d2l-en/blob/numpy2/chapter_natural-language-processing/sentiment-analysis-rnn.md)
 and this 
[notebook](https://github.com/d2l-ai/d2l-en/blob/master/chapter_natural-language-processing/sentiment-analysis-rnn.md)
 in dive into deep learning textbook. Could you help with a fix to that?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [incubator-mxnet] haojin2 commented on a change in pull request #16391: cuDNN non-persistant bidirectional RNN dgrad sync fix

2019-10-20 Thread GitBox
haojin2 commented on a change in pull request #16391: cuDNN non-persistant 
bidirectional RNN dgrad sync fix
URL: https://github.com/apache/incubator-mxnet/pull/16391#discussion_r336800317
 
 

 ##
 File path: src/operator/rnn-inl.h
 ##
 @@ -1484,6 +1493,20 @@ class RNNOp {
 }
 #endif  // MXNET_USE_CUDNN == 1 && defined(__CUDACC__)
   }
+
+#if MXNET_USE_CUDNN == 1 && defined(__CUDACC__)
+  // cuDNN versions up to and including v7.6.4 did not sync a last dgrad 
kernel back to the main
+  // cudnn handle's stream (non-persistant algo, bidirectional only).  This 
could result in silent
+  // non-determinstic failures with very low probability, seen more often when 
wgrad is bypassed.
+  inline void SyncDgrad() {
+if (CUDNN_VERSION <= 7604 && dgrad_sync_needed_) {
+  // Without blocking the CPU, create a synchronization point of all 
current GPU activity.  No
+  // need to call cudaStreamWaitEvent- cudaEventRecord on the legacy 
default stream suffices.
+  CUDA_CALL(cudaEventRecord(dgrad_sync_event_, cudaStreamLegacy));
 
 Review comment:
   Hi @DickJC123, I'm encountering `cudaErrorInvalidResourceHandle` error here 
when I'm trying to run this 
[notebook](https://github.com/d2l-ai/d2l-en/blob/numpy2/chapter_natural-language-processing/sentiment-analysis-rnn.md)
 in dive into deep learning textbook. Could you help with a fix to that?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [incubator-mxnet] haojin2 commented on a change in pull request #16391: cuDNN non-persistant bidirectional RNN dgrad sync fix

2019-10-20 Thread GitBox
haojin2 commented on a change in pull request #16391: cuDNN non-persistant 
bidirectional RNN dgrad sync fix
URL: https://github.com/apache/incubator-mxnet/pull/16391#discussion_r336800317
 
 

 ##
 File path: src/operator/rnn-inl.h
 ##
 @@ -1484,6 +1493,20 @@ class RNNOp {
 }
 #endif  // MXNET_USE_CUDNN == 1 && defined(__CUDACC__)
   }
+
+#if MXNET_USE_CUDNN == 1 && defined(__CUDACC__)
+  // cuDNN versions up to and including v7.6.4 did not sync a last dgrad 
kernel back to the main
+  // cudnn handle's stream (non-persistant algo, bidirectional only).  This 
could result in silent
+  // non-determinstic failures with very low probability, seen more often when 
wgrad is bypassed.
+  inline void SyncDgrad() {
+if (CUDNN_VERSION <= 7604 && dgrad_sync_needed_) {
+  // Without blocking the CPU, create a synchronization point of all 
current GPU activity.  No
+  // need to call cudaStreamWaitEvent- cudaEventRecord on the legacy 
default stream suffices.
+  CUDA_CALL(cudaEventRecord(dgrad_sync_event_, cudaStreamLegacy));
 
 Review comment:
   Hi @DickJC123, I'm encountering `cudaErrorInvalidResourceHandle` error here 
when I'm trying to run this 
[notebook](https://github.com/d2l-ai/d2l-en/blob/numpy2/chapter_natural-language-processing/sentiment-analysis-rnn.md)
 in dive into deep learning textbook. Could you help with a fix to that?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [incubator-mxnet] haojin2 merged pull request #16537: Fix numpy bugs

2019-10-20 Thread GitBox
haojin2 merged pull request #16537: Fix numpy bugs
URL: https://github.com/apache/incubator-mxnet/pull/16537
 
 
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [incubator-mxnet] haojin2 commented on issue #14415: [Test Failure] Clojure Integration

2019-10-20 Thread GitBox
haojin2 commented on issue #14415: [Test Failure] Clojure Integration
URL: 
https://github.com/apache/incubator-mxnet/issues/14415#issuecomment-544285908
 
 
   @gigasquid Okay, I see the cause. But no hurry on the fix for the test; I think @reminisce has also made changes to make that test work now. Thanks a lot for your prompt reply! I'll also close the issue now.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [incubator-mxnet] haojin2 closed issue #14415: [Test Failure] Clojure Integration

2019-10-20 Thread GitBox
haojin2 closed issue #14415: [Test Failure] Clojure Integration
URL: https://github.com/apache/incubator-mxnet/issues/14415
 
 
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [incubator-mxnet] gigasquid commented on issue #14415: [Test Failure] Clojure Integration

2019-10-20 Thread GitBox
gigasquid commented on issue #14415: [Test Failure] Clojure Integration
URL: 
https://github.com/apache/incubator-mxnet/issues/14415#issuecomment-544280438
 
 
   The failing test is a bit too brittle - it verifies the saved model of the MNIST example. A new attribute, `"attrs" {"is_np_shape" ["int" 0]}`, has been added, so the test is failing. This test needs to be reworked; I will add a PR to your PR to disable it in the meantime: https://github.com/reminisce/mxnet/pull/19
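   
   As an illustration only (in Python, not the Clojure change linked above): one way to make such a comparison robust is to drop volatile graph-level attributes such as `is_np_shape` before diffing the saved symbol JSON. A minimal sketch:
   
   ```python
   # Illustration only: helper name and attribute list are made up for this sketch.
   import json

   def strip_volatile_attrs(symbol_json, keys=('is_np_shape',)):
       """Drop graph-level attributes that vary between builds before comparing."""
       graph = json.loads(symbol_json)
       attrs = graph.get('attrs', {})
       for key in keys:
           attrs.pop(key, None)
       return graph

   # robust comparison:
   # strip_volatile_attrs(ref_json) == strip_volatile_attrs(new_json)
   ```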


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [incubator-mxnet] gigasquid commented on issue #14415: [Test Failure] Clojure Integration

2019-10-20 Thread GitBox
gigasquid commented on issue #14415: [Test Failure] Clojure Integration
URL: 
https://github.com/apache/incubator-mxnet/issues/14415#issuecomment-544275775
 
 
   looking into it


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [incubator-mxnet] reminisce commented on issue #15909: [numpy] random.rand

2019-10-20 Thread GitBox
reminisce commented on issue #15909: [numpy] random.rand
URL: https://github.com/apache/incubator-mxnet/pull/15909#issuecomment-54427
 
 
   @kshitij12345 Thanks for your contribution. It's merged through https://github.com/apache/incubator-mxnet/pull/16554. I added the missing implementation in `mxnet/ndarray/numpy/random.py`.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [incubator-mxnet] reminisce merged pull request #16554: [DO NOT SQUASH MERGE] Batch merge PRs

2019-10-20 Thread GitBox
reminisce merged pull request #16554: [DO NOT SQUASH MERGE] Batch merge PRs
URL: https://github.com/apache/incubator-mxnet/pull/16554
 
 
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [incubator-mxnet] kshitij12345 edited a comment on issue #13958: Windows builds running out of heap space in CI

2019-10-20 Thread GitBox
kshitij12345 edited a comment on issue #13958: Windows builds running out of 
heap space in CI
URL: 
https://github.com/apache/incubator-mxnet/issues/13958#issuecomment-544235887
 
 
   
http://jenkins.mxnet-ci.amazon-ml.com/blue/organizations/jenkins/mxnet-validation%2Fwindows-cpu/detail/PR-15909/6/
   
   Probably because `broadcast_reduce_op.h` is too big.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [incubator-mxnet] kshitij12345 commented on issue #13958: Windows builds running out of heap space in CI

2019-10-20 Thread GitBox
kshitij12345 commented on issue #13958: Windows builds running out of heap 
space in CI
URL: 
https://github.com/apache/incubator-mxnet/issues/13958#issuecomment-544235887
 
 
   
http://jenkins.mxnet-ci.amazon-ml.com/blue/organizations/jenkins/mxnet-validation%2Fwindows-cpu/detail/PR-15909/6/


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [incubator-mxnet] TaoLv opened a new pull request #16555: Upgrade MKL-DNN dependency to v1.0

2019-10-20 Thread GitBox
TaoLv opened a new pull request #16555: Upgrade MKL-DNN dependency to v1.0
URL: https://github.com/apache/incubator-mxnet/pull/16555
 
 
   ## Description ##
   This is an effort from the whole Intel MXNet team. Credits belong to 
everyone in the team.
   
   This PR upgrades the 3rdparty/mkldnn dependency to its v1.0.x release, which has many breaking API changes and is not backward compatible with the previous v0.x versions. Hence this PR also changes the integration code in operators while keeping the integration methodology and software architecture the same. Development has been done on the feature branch `mkldnn-v1.0` and tracked via the GitHub project: https://github.com/apache/incubator-mxnet/projects/16. 
   
   Please see the discussion here: 
https://lists.apache.org/thread.html/f46ab920f18795496eafe713e6e9e561c684e06189085cec17b401dc@%3Cdev.mxnet.apache.org%3E
   Please see MKL-DNN v1.0 RFC here: 
https://github.com/intel/mkl-dnn/tree/rfc-api-changes-v1.0/doc/rfc/api-v1.0
   
   As mentioned in the dev@ thread, this PR also removes the mklml and iomp5 libraries which were previously distributed along with the MXNet pip package. So it also fixes the license issue in https://github.com/apache/incubator-mxnet/issues/15544. It's a requirement for the 1.6.0 release.
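   
   As a side note, a minimal sketch (assuming an MXNet build recent enough to expose `mxnet.runtime.Features`, roughly 1.5+) for checking whether a given wheel actually has MKL-DNN enabled before validating workloads against this change:
   
   ```python
   # Minimal sketch: confirm the installed MXNet build ships with MKL-DNN.
   from mxnet.runtime import Features

   features = Features()
   print(features.is_enabled('MKLDNN'))  # True when the build includes MKL-DNN
   ```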
   
   ## Checklist ##
   ### Essentials ###
   Please feel free to remove inapplicable items for your PR.
   - [ ] The PR title starts with [MXNET-$JIRA_ID], where $JIRA_ID refers to 
the relevant [JIRA issue](https://issues.apache.org/jira/projects/MXNET/issues) 
created (except PRs with tiny changes)
   - [ ] Changes are complete (i.e. I finished coding on this PR)
   - [ ] All changes have test coverage:
   - Unit tests are added for small changes to verify correctness (e.g. adding 
a new operator)
   - Nightly tests are added for complicated/long-running ones (e.g. changing 
distributed kvstore)
   - Build tests will be added for build configuration changes (e.g. adding a 
new build option with NCCL)
   - [ ] Code is well-documented: 
   - For user-facing API changes, API doc string has been updated. 
   - For new C++ functions in header files, their functionalities and arguments 
are documented. 
   - For new examples, README.md is added to explain what the example does, the source of the dataset, the expected performance on the test set, and a reference to the original paper if applicable
   - Check the API doc at 
https://mxnet-ci-doc.s3-accelerate.dualstack.amazonaws.com/PR-$PR_ID/$BUILD_ID/index.html
   - [ ] To the best of my knowledge, examples are either not affected by this change, or have been fixed to be compatible with this change
   
   ### Changes ###
   - [ ] Feature1, tests, (and when applicable, API doc)
   - [ ] Feature2, tests, (and when applicable, API doc)
   
   ## Comments ##
   - If this change is a backward incompatible change, why must this change be 
made.
   - Interesting edge cases to note here
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [incubator-mxnet] pedro-abundio-wang commented on issue #16527: ErrStr:no kernel image is available for execution on the device

2019-10-20 Thread GitBox
pedro-abundio-wang commented on issue #16527: ErrStr:no kernel image is 
available for execution on the device
URL: 
https://github.com/apache/incubator-mxnet/issues/16527#issuecomment-544230734
 
 
   I am working on my laptop using GeForce GT 750M


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [incubator-mxnet] kshitij12345 commented on issue #15331: [fix] missing input log higher order.

2019-10-20 Thread GitBox
kshitij12345 commented on issue #15331: [fix] missing input log higher order.
URL: https://github.com/apache/incubator-mxnet/pull/15331#issuecomment-544230413
 
 
   @sxjscience @apeforest @larroy Gentle Ping.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [incubator-mxnet] kshitij12345 commented on issue #15909: [numpy] random.rand

2019-10-20 Thread GitBox
kshitij12345 commented on issue #15909: [numpy] random.rand
URL: https://github.com/apache/incubator-mxnet/pull/15909#issuecomment-544229905
 
 
   @reminisce @haojin2 @xidulu Sorry for the delayed action. Please review.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [incubator-mxnet] RoeePeleg edited a comment on issue #16427: mxnet.recordio.pack example fails in python3

2019-10-20 Thread GitBox
RoeePeleg edited a comment on issue #16427: mxnet.recordio.pack example fails 
in python3
URL: 
https://github.com/apache/incubator-mxnet/issues/16427#issuecomment-544228137
 
 
   > Use `s = b'...'` @RoeePeleg
   > You can learn the difference of _string_ between python2 and python3 
(encoding).
   
   @tuanzhangCS
   I know I can use bytes instead of str to make it work. I just wanted to note that the example and the documentation of the method are not correct.
   
https://github.com/apache/incubator-mxnet/blob/master/python/mxnet/recordio.py#L370-L371
   
https://github.com/apache/incubator-mxnet/blob/master/python/mxnet/recordio.py#L378-L386
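   
   For reference, a minimal sketch of the bytes-vs-str distinction being pointed out here (the header values are illustrative, not taken from the docs):
   
   ```python
   # Sketch only: IRHeader values below are arbitrary placeholders.
   import mxnet as mx

   header = mx.recordio.IRHeader(flag=0, label=1.0, id=0, id2=0)
   packed = mx.recordio.pack(header, b'raw bytes payload')  # bytes works on Python 3
   # mx.recordio.pack(header, 'a str payload')              # plain str fails on Python 3
   print(len(packed))
   ```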
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [incubator-mxnet] RoeePeleg commented on issue #16427: mxnet.recordio.pack example fails in python3

2019-10-20 Thread GitBox
RoeePeleg commented on issue #16427: mxnet.recordio.pack example fails in 
python3
URL: 
https://github.com/apache/incubator-mxnet/issues/16427#issuecomment-544228137
 
 
   > Use `s = b'...'` @RoeePeleg
   > You can learn the difference of _string_ between python2 and python3 
(encoding).
   
   I know I can use bytes instead of str to make it work. I just wanted to note that the example and the documentation of the method are not correct.
   
https://github.com/apache/incubator-mxnet/blob/master/python/mxnet/recordio.py#L370-L371
   
https://github.com/apache/incubator-mxnet/blob/master/python/mxnet/recordio.py#L378-L386
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [incubator-mxnet] ptrendx commented on issue #16527: ErrStr:no kernel image is available for execution on the device

2019-10-20 Thread GitBox
ptrendx commented on issue #16527: ErrStr:no kernel image is available for 
execution on the device
URL: 
https://github.com/apache/incubator-mxnet/issues/16527#issuecomment-544225295
 
 
   What is the GPU you are using?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [incubator-mxnet] reminisce merged pull request #16530: [Numpy] SVD outputs tuple

2019-10-20 Thread GitBox
reminisce merged pull request #16530: [Numpy] SVD outputs tuple
URL: https://github.com/apache/incubator-mxnet/pull/16530
 
 
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [incubator-mxnet] reminisce opened a new pull request #16554: [DO NOT SQUASH MERGE] Batch merge PRs

2019-10-19 Thread GitBox
reminisce opened a new pull request #16554: [DO NOT SQUASH MERGE] Batch merge 
PRs
URL: https://github.com/apache/incubator-mxnet/pull/16554
 
 
   ## Description ##
   Batch merge PRs that have been reviewed and approved for 1.6 release.
   
   ## Checklist ##
   ### Essentials ###
   Please feel free to remove inapplicable items for your PR.
   - [ ] The PR title starts with [MXNET-$JIRA_ID], where $JIRA_ID refers to 
the relevant [JIRA issue](https://issues.apache.org/jira/projects/MXNET/issues) 
created (except PRs with tiny changes)
   - [ ] Changes are complete (i.e. I finished coding on this PR)
   - [ ] All changes have test coverage:
   - Unit tests are added for small changes to verify correctness (e.g. adding 
a new operator)
   - Nightly tests are added for complicated/long-running ones (e.g. changing 
distributed kvstore)
   - Build tests will be added for build configuration changes (e.g. adding a 
new build option with NCCL)
   - [ ] Code is well-documented: 
   - For user-facing API changes, API doc string has been updated. 
   - For new C++ functions in header files, their functionalities and arguments 
are documented. 
   - For new examples, README.md is added to explain what the example does, the source of the dataset, the expected performance on the test set, and a reference to the original paper if applicable
   - Check the API doc at 
https://mxnet-ci-doc.s3-accelerate.dualstack.amazonaws.com/PR-$PR_ID/$BUILD_ID/index.html
   - [ ] To the best of my knowledge, examples are either not affected by this change, or have been fixed to be compatible with this change
   
   ### Changes ###
   - [ ] Feature1, tests, (and when applicable, API doc)
   - [ ] Feature2, tests, (and when applicable, API doc)
   
   ## Comments ##
   - If this change is a backward incompatible change, why must this change be 
made.
   - Interesting edge cases to note here
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [incubator-mxnet] reminisce commented on a change in pull request #16537: Fix numpy bugs

2019-10-19 Thread GitBox
reminisce commented on a change in pull request #16537: Fix numpy bugs
URL: https://github.com/apache/incubator-mxnet/pull/16537#discussion_r336763825
 
 

 ##
 File path: tests/python/unittest/test_symbol.py
 ##
 @@ -414,6 +416,47 @@ def test_gen_atomic_symbol_multiple_outputs():
bidirectional=True, state_outputs=True, mode='lstm')
 atomic_sym = s._gen_atomic_symbol()
 
+
+def test_load_save_symbol():
+batch_size = 10
+num_hdidden = 128
+num_features = 784
+
+def get_net():
+data = mx.sym.var('data')
+weight = mx.sym.var('weight', shape=(num_hdidden, 0))
+return mx.sym.FullyConnected(data, weight, num_hidden=num_hdidden)
+
+for flag1 in [False, True]:
 
 Review comment:
   I will rename them in another PR to save the CI cycle for this one.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [incubator-mxnet] haojin2 commented on a change in pull request #16537: Fix numpy bugs

2019-10-19 Thread GitBox
haojin2 commented on a change in pull request #16537: Fix numpy bugs
URL: https://github.com/apache/incubator-mxnet/pull/16537#discussion_r336763849
 
 

 ##
 File path: tests/python/unittest/test_symbol.py
 ##
 @@ -414,6 +416,47 @@ def test_gen_atomic_symbol_multiple_outputs():
bidirectional=True, state_outputs=True, mode='lstm')
 atomic_sym = s._gen_atomic_symbol()
 
+
+def test_load_save_symbol():
+batch_size = 10
+num_hdidden = 128
+num_features = 784
+
+def get_net():
+data = mx.sym.var('data')
+weight = mx.sym.var('weight', shape=(num_hdidden, 0))
+return mx.sym.FullyConnected(data, weight, num_hidden=num_hdidden)
+
+for flag1 in [False, True]:
 
 Review comment:
   Okay, I'll resolve this one for now.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [incubator-mxnet] reminisce commented on a change in pull request #16537: Fix numpy bugs

2019-10-19 Thread GitBox
reminisce commented on a change in pull request #16537: Fix numpy bugs
URL: https://github.com/apache/incubator-mxnet/pull/16537#discussion_r336763763
 
 

 ##
 File path: tests/python/unittest/test_symbol.py
 ##
 @@ -414,6 +416,47 @@ def test_gen_atomic_symbol_multiple_outputs():
bidirectional=True, state_outputs=True, mode='lstm')
 atomic_sym = s._gen_atomic_symbol()
 
+
+def test_load_save_symbol():
+batch_size = 10
+num_hdidden = 128
+num_features = 784
+
+def get_net():
+data = mx.sym.var('data')
+weight = mx.sym.var('weight', shape=(num_hdidden, 0))
+return mx.sym.FullyConnected(data, weight, num_hidden=num_hdidden)
+
+for flag1 in [False, True]:
 
 Review comment:
   I will rename them in another PR to save the CI cycle for this one.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [incubator-mxnet] haojin2 commented on a change in pull request #16537: Fix numpy bugs

2019-10-19 Thread GitBox
haojin2 commented on a change in pull request #16537: Fix numpy bugs
URL: https://github.com/apache/incubator-mxnet/pull/16537#discussion_r336763659
 
 

 ##
 File path: tests/python/unittest/test_symbol.py
 ##
 @@ -414,6 +416,47 @@ def test_gen_atomic_symbol_multiple_outputs():
bidirectional=True, state_outputs=True, mode='lstm')
 atomic_sym = s._gen_atomic_symbol()
 
+
+def test_load_save_symbol():
+batch_size = 10
+num_hdidden = 128
+num_features = 784
+
+def get_net():
+data = mx.sym.var('data')
+weight = mx.sym.var('weight', shape=(num_hdidden, 0))
+return mx.sym.FullyConnected(data, weight, num_hidden=num_hdidden)
+
+for flag1 in [False, True]:
 
 Review comment:
   Maybe we need better names for `flag1` and `flag2`?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [incubator-mxnet] ptrendx opened a new pull request #16553: Fix for wrong reqs set after switching from training to inference

2019-10-19 Thread GitBox
ptrendx opened a new pull request #16553: Fix for wrong reqs set after 
switching from training to inference
URL: https://github.com/apache/incubator-mxnet/pull/16553
 
 
   ## Description ##
   `StaticAllocMemory` in `CachedOp` used the `storage_inplace_index` attribute (generated by the `MXPlanMemory` graph pass) to assign reqs for edges in the graph. However, the `MXPlanMemory` pass is called only once per type of memory plan (full, forward or backward). While the memory plan itself is stored in separate attributes in the graph (`forward_mem_plan`, `full_mem_plan`, etc.), the `storage_inplace_index` (and so the reqs) was overwritten by the last `MXPlanMemory` call.
   
   The following code:
   
   ```
   with mx.autograd.record():
  result = net(x)
   result.backward()
   
   result2 = net(x)
   
   with mx.autograd.record():
  result3 = net(x)
   result3.backward()
   ```
   
    first calls plan memory for the full graph and then for just the forward graph. The third invocation of `net` does not invoke a plan memory pass. Let us assume that inside `net` there is an op that produces the output needed for the backward pass only when that output is actually required (req not set to `kNullOp`). Since the reqs are overwritten by the second `net` invocation, that output's req is set to `kNullOp` (because there is no backward pass there). The third invocation of `net` then does not change the req value, so the op does not produce the required output - the `result3` gradient is therefore computed from stale values and is wrong.
   
    This PR fixes it by changing `StaticAllocMemory` to use per-memory-plan values to assign reqs (`storage_inplace_index_forward`, etc.), keeping the benefits of caching (`MXPlanMemory` called once per type) while ensuring correctness.
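    
    For illustration, a minimal verification sketch (the network, shapes and tolerance are made up, not taken from this PR) that exercises the train -> infer -> train sequence described above on the static-alloc `CachedOp` path:
    
    ```python
    # Verification sketch only: model and shapes are illustrative.
    import mxnet as mx
    from mxnet.gluon import nn

    net = nn.Dense(4)
    net.initialize()
    net.hybridize(static_alloc=True)  # CachedOp path with static memory planning

    x = mx.nd.ones((2, 8))
    x.attach_grad()

    with mx.autograd.record():        # 1) training pass: full memory plan
        out = net(x)
    out.backward()
    grad_ref = x.grad.copy()

    net(x)                            # 2) inference pass: forward-only memory plan

    with mx.autograd.record():        # 3) training pass again: must match step 1
        out = net(x)
    out.backward()

    assert mx.nd.norm(x.grad - grad_ref).asscalar() < 1e-6
    ```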
   
   @eric-haibin-lin 
   
   ## Checklist ##
   ### Essentials ###
   Please feel free to remove inapplicable items for your PR.
   - [x] Changes are complete (i.e. I finished coding on this PR)
   - [ ] All changes have test coverage:
   - Unit tests are added for small changes to verify correctness (e.g. adding 
a new operator)


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [incubator-mxnet] TheJacobKim commented on a change in pull request #16550: Python Docstring Convetion

2019-10-19 Thread GitBox
TheJacobKim commented on a change in pull request #16550: Python Docstring 
Convetion
URL: https://github.com/apache/incubator-mxnet/pull/16550#discussion_r336763069
 
 

 ##
 File path: python/mxnet/log.py
 ##
 @@ -81,7 +81,6 @@ def getLogger(name=None, filename=None, filemode=None, 
level=WARNING):
 """Gets a customized logger.
 
 .. note:: `getLogger` is deprecated. Use `get_logger` instead.
-
 
 Review comment:
   Thanks! I fixed it.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [incubator-mxnet] reminisce commented on a change in pull request #16549: [numpy] operator test

2019-10-19 Thread GitBox
reminisce commented on a change in pull request #16549: [numpy] operator test
URL: https://github.com/apache/incubator-mxnet/pull/16549#discussion_r336762735
 
 

 ##
 File path: python/mxnet/numpy_dispatch_protocol.py
 ##
 @@ -168,6 +170,10 @@ def _register_array_function():
 _NUMPY_ARRAY_UFUNC_LIST = [
 'abs',
 'add',
+#'equal',
 
 Review comment:
   Please add these names after you fix the test as suggested in the comment 
below.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [incubator-mxnet] reminisce commented on a change in pull request #16549: [numpy] operator test

2019-10-19 Thread GitBox
reminisce commented on a change in pull request #16549: [numpy] operator test
URL: https://github.com/apache/incubator-mxnet/pull/16549#discussion_r336762727
 
 

 ##
 File path: tests/python/unittest/test_numpy_interoperability.py
 ##
 @@ -825,6 +892,11 @@ def _prepare_workloads():
 _add_workload_arctan2()
 _add_workload_copysign()
 _add_workload_degrees()
+#_add_workload_equal(array_pool)
 
 Review comment:
   You can do this:
   ```python
   if is_op_runnable():
   _add_workload_equal(array_pool)
   ...
   ```


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [incubator-mxnet] chenliu0831 removed a comment on issue #13909: gluon.utils.split_and_load(even_split=False) fails if num of contexts > num of data

2019-10-19 Thread GitBox
chenliu0831 removed a comment on issue #13909: 
gluon.utils.split_and_load(even_split=False) fails if num of contexts > num of 
data
URL: 
https://github.com/apache/incubator-mxnet/issues/13909#issuecomment-544216547
 
 
   Should this happen when `last_batch=rollover`? I'm also seeing this error 
for 1.4 even with rollover mode


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [incubator-mxnet] reminisce merged pull request #16436: Add sum for boolean type when not built with TVM

2019-10-19 Thread GitBox
reminisce merged pull request #16436: Add sum for boolean type when not built 
with TVM
URL: https://github.com/apache/incubator-mxnet/pull/16436
 
 
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [incubator-mxnet] ptrendx edited a comment on issue #15589: [Discussion] 1.6.0 Roadmap

2019-10-19 Thread GitBox
ptrendx edited a comment on issue #15589: [Discussion] 1.6.0 Roadmap
URL: 
https://github.com/apache/incubator-mxnet/issues/15589#issuecomment-526373840
 
 
   We have multiple improvements to BERT inference and training speed that we 
would like to be part of 1.6 release:
- [x] Softmax optimizations (#15545 )
- [ ] Pointwise fusion for GPU (#15167 )
- [ ] Eliminate common expressions (#15657 )
- [x] Bias speed improvements (#16039 )
- [x] Aggregated AdamW optimizer (#16398)
- [x] Aggregated zeroing of the gradients (#16446)
- [x] Aggregated sum of squares operator (also used in LARS, #16122)
- [x] Embedding gradient optimization (#16355)
- [ ] Faster multihead attention operator (#16408)


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [incubator-mxnet] chenliu0831 edited a comment on issue #13909: gluon.utils.split_and_load(even_split=False) fails if num of contexts > num of data

2019-10-19 Thread GitBox
chenliu0831 edited a comment on issue #13909: 
gluon.utils.split_and_load(even_split=False) fails if num of contexts > num of 
data
URL: 
https://github.com/apache/incubator-mxnet/issues/13909#issuecomment-544216547
 
 
   Should this happen when `last_batch=rollover`?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [incubator-mxnet] chenliu0831 edited a comment on issue #13909: gluon.utils.split_and_load(even_split=False) fails if num of contexts > num of data

2019-10-19 Thread GitBox
chenliu0831 edited a comment on issue #13909: 
gluon.utils.split_and_load(even_split=False) fails if num of contexts > num of 
data
URL: 
https://github.com/apache/incubator-mxnet/issues/13909#issuecomment-544216547
 
 
   Should this happen when `last_batch=rollover`? I'm also seeing this error 
for 1.4 even with rollover mode
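   
   For context, a minimal sketch reproducing the situation in the issue title, where there are more contexts than samples (contexts and shapes are illustrative):
   
   ```python
   # Repro sketch only: on affected versions the split_and_load call below fails.
   import mxnet as mx
   from mxnet.gluon.utils import split_and_load

   data = mx.nd.arange(3).reshape((3, 1))               # only 3 samples
   ctxs = [mx.cpu(0), mx.cpu(1), mx.cpu(2), mx.cpu(3)]  # 4 contexts
   parts = split_and_load(data, ctx_list=ctxs, even_split=False)
   print([p.shape for p in parts])
   ```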


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [incubator-mxnet] chenliu0831 commented on issue #13909: gluon.utils.split_and_load(even_split=False) fails if num of contexts > num of data

2019-10-19 Thread GitBox
chenliu0831 commented on issue #13909: 
gluon.utils.split_and_load(even_split=False) fails if num of contexts > num of 
data
URL: 
https://github.com/apache/incubator-mxnet/issues/13909#issuecomment-544216547
 
 
   Should this happen when `last_batch= rollover`?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [incubator-mxnet] haojin2 edited a comment on issue #14415: [Test Failure] Clojure Integration

2019-10-19 Thread GitBox
haojin2 edited a comment on issue #14415: [Test Failure] Clojure Integration
URL: 
https://github.com/apache/incubator-mxnet/issues/14415#issuecomment-544214694
 
 
   @gigasquid Seems like this test is failing again now, but for a different reason here:
   
http://jenkins.mxnet-ci.amazon-ml.com/blue/organizations/jenkins/mxnet-validation%2Funix-cpu/detail/PR-16537/11/pipeline
   
   ```
   lein test imclassification.train-mnist-test
   
   Starting Training of MNIST 
   
   Running with context devices of [#object[org.apache.mxnet.Context 0x5e67a490 
cpu(0)]]
   
   [15:44:21] src/io/iter_mnist.cc:110: MNISTIter: load 6 images, 
shuffle=1, shape=(10,784)
   
   [15:44:22] src/io/iter_mnist.cc:110: MNISTIter: load 1 images, 
shuffle=1, shape=(10,784)
   
   WARN  org.apache.mxnet.DataDesc: Found Undefined Layout, will use default 
index 0 for batch axis
   
   WARN  org.apache.mxnet.DataDesc: Found Undefined Layout, will use default 
index 0 for batch axis
   
   WARN  org.apache.mxnet.DataDesc: Found Undefined Layout, will use default 
index 0 for batch axis
   
   WARN  org.apache.mxnet.DataDesc: Found Undefined Layout, will use default 
index 0 for batch axis
   
   WARN  org.apache.mxnet.DataDesc: Found Undefined Layout, will use default 
index 0 for batch axis
   
   INFO  org.apache.mxnet.module.BaseModule: Epoch[0] Train-accuracy=0.13231666
   
   INFO  org.apache.mxnet.module.BaseModule: Epoch[0] Time cost=7499
   
   INFO  org.apache.mxnet.module.BaseModule: Epoch[0] Validation-accuracy=0.338
   
   INFO  org.apache.mxnet.module.BaseModule: Epoch[1] Train-accuracy=0.71955
   
   INFO  org.apache.mxnet.module.BaseModule: Epoch[1] Time cost=6314
   
   INFO  org.apache.mxnet.module.BaseModule: Epoch[1] Validation-accuracy=0.8542
   
   INFO  org.apache.mxnet.module.Module: Saved checkpoint to 
target/test-0002.params
   
   Finish fit
   
   
   
   lein test :only imclassification.train-mnist-test/mnist-two-epochs-test
   
   
   
   FAIL in (mnist-two-epochs-test) (train_mnist_test.clj:38)
   
   expected: (= (file-to-filtered-seq "test/test-symbol.json.ref") 
(file-to-filtered-seq "target/test-symbol.json"))
   
 actual: (not (= ("{" "  \"nodes\": [" "{" "  \"op\": \"null\", " " 
 \"name\": \"data\", " "  \"inputs\": []" "}, " "{" "  
\"op\": \"null\", " "  \"name\": \"fc1_weight\", " "  \"attrs\": 
{\"num_hidden\": \"128\"}, " "  \"inputs\": []" "}, " "{" "  
\"op\": \"null\", " "  \"name\": \"fc1_bias\", " "  \"attrs\": 
{\"num_hidden\": \"128\"}, " "  \"inputs\": []" "}, " "{" "  
\"op\": \"FullyConnected\", " "  \"name\": \"fc1\", " "  \"attrs\": 
{\"num_hidden\": \"128\"}, " "  \"inputs\": [[0, 0, 0], [1, 0, 0], [2, 0, 
0]]" "}, " "{" "  \"op\": \"Activation\", " "  \"name\": 
\"relu1\", " "  \"attrs\": {\"act_type\": \"relu\"}, " "  \"inputs\": 
[[3, 0, 0]]" "}, " "{" "  \"op\": \"null\", " "  \"name\": 
\"fc2_weight\", " "  \"attrs\": {\"num_hidden\": \"64\"}, " "  
\"inputs\": []" "}, " "{" "  \"op\": \"null\", " "  \"name\": 
\"fc2_bias\", " "  \"attrs\": {\"num_hidden\": \"64\"}, " "  
\"inputs\": []" "}, " "{" "  \"op\": \"FullyConnected\", " "  
\"name\": \"fc2\", " "  \"attrs\": {\"num_hidden\": \"64\"}, " "  
\"inputs\": [[4, 0, 0], [5, 0, 0], [6, 0, 0]]" "}, " "{" "  \"op\": 
\"Activation\", " "  \"name\": \"relu2\", " "  \"attrs\": 
{\"act_type\": \"relu\"}, " "  \"inputs\": [[7, 0, 0]]" "}, " "{" " 
 \"op\": \"null\", " "  \"name\": \"fc3_weight\", " "  \"attrs\": 
{\"num_hidden\": \"10\"}, " "  \"inputs\": []" "}, " "{" "  
\"op\": \"null\", " "  \"name\": \"fc3_bias\", " "  \"attrs\": 
{\"num_hidden\": \"10\"}, " "  \"inputs\": []" "}, " "{" "  
\"op\": \"FullyConnected\", " "  \"name\": \"fc3\", " "  \"attrs\": 
{\"num_hidden\": \"10\"}, " "  \"inputs\": [[8, 0, 0], [9, 0, 0], [10, 0, 
0]]" "}, " "{" "  \"op\": \"null\", " "  \"name\": 
\"softmax_label\", " "  \"inputs\": []" "}, " "{" "  \"op\": 
\"SoftmaxOutput\", " "  \"name\": \"softmax\", " "  \"inputs\": [[11, 
0, 0], [12, 0, 0]]" "}" "  ], " "  \"arg_nodes\": [0, 1, 2, 5, 6, 9, 10, 
12], " "  \"node_row_ptr\": [" "0, " "1, " "2, " "3, " "4, 
" "5, " "6, " "7, " "8, " "9, " "10, " "11, " "
12, " "13, " "14" "  ], " "  \"heads\": [[13, 0, 0]], " "}") ("{" "  
\"nodes\": [" "{" "  \"op\": \"null\", " "  \"name\": \"data\", " " 
 \"inputs\": []" "}, " "{" "  \"op\": \"null\", " "  
\"name\": \"fc1_weight\", " "  \"attrs\": {\"num_hidden\": \"128\"}, " "
  \"inputs\": []" "}, " "{" "  \"op\": \"null\", " "  \"name\": 
\"fc1_bias\", " "  \"attrs\": {\"num_hidden\": 

[GitHub] [incubator-mxnet] haojin2 commented on issue #14415: [Test Failure] Clojure Integration

2019-10-19 Thread GitBox
haojin2 commented on issue #14415: [Test Failure] Clojure Integration
URL: 
https://github.com/apache/incubator-mxnet/issues/14415#issuecomment-544214739
 
 
   @gigasquid Could you help with identifying the cause so that we could fix 
this ASAP? Thanks!


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [incubator-mxnet] haojin2 commented on issue #14415: [Test Failure] Clojure Integration

2019-10-19 Thread GitBox
haojin2 commented on issue #14415: [Test Failure] Clojure Integration
URL: 
https://github.com/apache/incubator-mxnet/issues/14415#issuecomment-544214694
 
 
   @gigasquid Seems like this test is failing again now, but for a different reason:
   ```
   lein test imclassification.train-mnist-test
   
   Starting Training of MNIST 
   
   Running with context devices of [#object[org.apache.mxnet.Context 0x5e67a490 
cpu(0)]]
   
   [15:44:21] src/io/iter_mnist.cc:110: MNISTIter: load 6 images, 
shuffle=1, shape=(10,784)
   
   [15:44:22] src/io/iter_mnist.cc:110: MNISTIter: load 1 images, 
shuffle=1, shape=(10,784)
   
   WARN  org.apache.mxnet.DataDesc: Found Undefined Layout, will use default 
index 0 for batch axis
   
   WARN  org.apache.mxnet.DataDesc: Found Undefined Layout, will use default 
index 0 for batch axis
   
   WARN  org.apache.mxnet.DataDesc: Found Undefined Layout, will use default 
index 0 for batch axis
   
   WARN  org.apache.mxnet.DataDesc: Found Undefined Layout, will use default 
index 0 for batch axis
   
   WARN  org.apache.mxnet.DataDesc: Found Undefined Layout, will use default 
index 0 for batch axis
   
   INFO  org.apache.mxnet.module.BaseModule: Epoch[0] Train-accuracy=0.13231666
   
   INFO  org.apache.mxnet.module.BaseModule: Epoch[0] Time cost=7499
   
   INFO  org.apache.mxnet.module.BaseModule: Epoch[0] Validation-accuracy=0.338
   
   INFO  org.apache.mxnet.module.BaseModule: Epoch[1] Train-accuracy=0.71955
   
   INFO  org.apache.mxnet.module.BaseModule: Epoch[1] Time cost=6314
   
   INFO  org.apache.mxnet.module.BaseModule: Epoch[1] Validation-accuracy=0.8542
   
   INFO  org.apache.mxnet.module.Module: Saved checkpoint to 
target/test-0002.params
   
   Finish fit
   
   
   
   lein test :only imclassification.train-mnist-test/mnist-two-epochs-test
   
   
   
   FAIL in (mnist-two-epochs-test) (train_mnist_test.clj:38)
   
   expected: (= (file-to-filtered-seq "test/test-symbol.json.ref") 
(file-to-filtered-seq "target/test-symbol.json"))
   
 actual: (not (= ("{" "  \"nodes\": [" "{" "  \"op\": \"null\", " " 
 \"name\": \"data\", " "  \"inputs\": []" "}, " "{" "  
\"op\": \"null\", " "  \"name\": \"fc1_weight\", " "  \"attrs\": 
{\"num_hidden\": \"128\"}, " "  \"inputs\": []" "}, " "{" "  
\"op\": \"null\", " "  \"name\": \"fc1_bias\", " "  \"attrs\": 
{\"num_hidden\": \"128\"}, " "  \"inputs\": []" "}, " "{" "  
\"op\": \"FullyConnected\", " "  \"name\": \"fc1\", " "  \"attrs\": 
{\"num_hidden\": \"128\"}, " "  \"inputs\": [[0, 0, 0], [1, 0, 0], [2, 0, 
0]]" "}, " "{" "  \"op\": \"Activation\", " "  \"name\": 
\"relu1\", " "  \"attrs\": {\"act_type\": \"relu\"}, " "  \"inputs\": 
[[3, 0, 0]]" "}, " "{" "  \"op\": \"null\", " "  \"name\": 
\"fc2_weight\", " "  \"attrs\": {\"num_hidden\": \"64\"}, " "  
\"inputs\": []" "}, " "{" "  \"op\": \"null\", " "  \"name\": 
\"fc2_bias\", " "  \"attrs\": {\"num_hidden\": \"64\"}, " "  
\"inputs\": []" "}, " "{" "  \"op\": \"FullyConnected\", " "  
\"name\": \"fc2\", " "  \"attrs\": {\"num_hidden\": \"64\"}, " "  
\"inputs\": [[4, 0, 0], [5, 0, 0], [6, 0, 0]]" "}, " "{" "  \"op\": 
\"Activation\", " "  \"name\": \"relu2\", " "  \"attrs\": 
{\"act_type\": \"relu\"}, " "  \"inputs\": [[7, 0, 0]]" "}, " "{" " 
 \"op\": \"null\", " "  \"name\": \"fc3_weight\", " "  \"attrs\": 
{\"num_hidden\": \"10\"}, " "  \"inputs\": []" "}, " "{" "  
\"op\": \"null\", " "  \"name\": \"fc3_bias\", " "  \"attrs\": 
{\"num_hidden\": \"10\"}, " "  \"inputs\": []" "}, " "{" "  
\"op\": \"FullyConnected\", " "  \"name\": \"fc3\", " "  \"attrs\": 
{\"num_hidden\": \"10\"}, " "  \"inputs\": [[8, 0, 0], [9, 0, 0], [10, 0, 
0]]" "}, " "{" "  \"op\": \"null\", " "  \"name\": 
\"softmax_label\", " "  \"inputs\": []" "}, " "{" "  \"op\": 
\"SoftmaxOutput\", " "  \"name\": \"softmax\", " "  \"inputs\": [[11, 
0, 0], [12, 0, 0]]" "}" "  ], " "  \"arg_nodes\": [0, 1, 2, 5, 6, 9, 10, 
12], " "  \"node_row_ptr\": [" "0, " "1, " "2, " "3, " "4, 
" "5, " "6, " "7, " "8, " "9, " "10, " "11, " "
12, " "13, " "14" "  ], " "  \"heads\": [[13, 0, 0]], " "}") ("{" "  
\"nodes\": [" "{" "  \"op\": \"null\", " "  \"name\": \"data\", " " 
 \"inputs\": []" "}, " "{" "  \"op\": \"null\", " "  
\"name\": \"fc1_weight\", " "  \"attrs\": {\"num_hidden\": \"128\"}, " "
  \"inputs\": []" "}, " "{" "  \"op\": \"null\", " "  \"name\": 
\"fc1_bias\", " "  \"attrs\": {\"num_hidden\": \"128\"}, " "  
\"inputs\": []" "}, " "{" "  \"op\": \"FullyConnected\", " "  
\"name\": \"fc1\", " "  \"attrs\": 

[GitHub] [incubator-mxnet] perdasilva opened a new issue #14415: [Test Failure] Clojure Integration

2019-10-19 Thread GitBox
perdasilva opened a new issue #14415: [Test Failure] Clojure Integration
URL: https://github.com/apache/incubator-mxnet/issues/14415
 
 
   ## Description
   
   Seems the scala package tests are failing the Clojure Integration tests on CI
   
http://jenkins.mxnet-ci.amazon-ml.com/blue/organizations/jenkins/mxnet-validation%2Funix-cpu/detail/master/405/pipeline
   
   Seems related to #14402
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [incubator-mxnet] pengzhao-intel merged pull request #16551: [mkldnn-v1.0] Change MXNET_USE_MKLDNN from 100 back to 1

2019-10-19 Thread GitBox
pengzhao-intel merged pull request #16551: [mkldnn-v1.0] Change 
MXNET_USE_MKLDNN from 100 back to 1
URL: https://github.com/apache/incubator-mxnet/pull/16551
 
 
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [incubator-mxnet] pengzhao-intel commented on issue #16551: [mkldnn-v1.0] Change MXNET_USE_MKLDNN from 100 back to 1

2019-10-19 Thread GitBox
pengzhao-intel commented on issue #16551: [mkldnn-v1.0] Change MXNET_USE_MKLDNN 
from 100 back to 1
URL: https://github.com/apache/incubator-mxnet/pull/16551#issuecomment-544212976
 
 
   > What does this do?
   
   This was only a trick used while we were upgrading MKL-DNN to the 1.0 version. Now all the code has been changed to the new version, so we switch the number back to normal.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [incubator-mxnet] reminisce merged pull request #16493: Tests of interoperability of numpy dispatch

2019-10-19 Thread GitBox
reminisce merged pull request #16493: Tests of interoperability of numpy 
dispatch
URL: https://github.com/apache/incubator-mxnet/pull/16493
 
 
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [incubator-mxnet] sxjscience merged pull request #16465: [Gluon] [Fix] Fix HybridBlock when hybridize is not called

2019-10-19 Thread GitBox
sxjscience merged pull request #16465: [Gluon] [Fix] Fix HybridBlock when 
hybridize is not called
URL: https://github.com/apache/incubator-mxnet/pull/16465
 
 
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [incubator-mxnet] access2rohit commented on issue #16497: Large Vector tests for DGL Ops Part 2

2019-10-19 Thread GitBox
access2rohit commented on issue #16497: Large Vector tests for DGL Ops Part 2
URL: https://github.com/apache/incubator-mxnet/pull/16497#issuecomment-544207768
 
 
   @anirudh2290 can we merge this PR now ?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [incubator-mxnet] eric-haibin-lin edited a comment on issue #16398: Aggregated adamw update

2019-10-19 Thread GitBox
eric-haibin-lin edited a comment on issue #16398: Aggregated adamw update
URL: https://github.com/apache/incubator-mxnet/pull/16398#issuecomment-544206811
 
 
   Thank you @drivanov @ptrendx 


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [incubator-mxnet] eric-haibin-lin merged pull request #16398: Aggregated adamw update

2019-10-19 Thread GitBox
eric-haibin-lin merged pull request #16398: Aggregated adamw update
URL: https://github.com/apache/incubator-mxnet/pull/16398
 
 
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [incubator-mxnet] eric-haibin-lin commented on issue #16398: Aggregated adamw update

2019-10-19 Thread GitBox
eric-haibin-lin commented on issue #16398: Aggregated adamw update
URL: https://github.com/apache/incubator-mxnet/pull/16398#issuecomment-544206811
 
 
   Thank you @drivanov 


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [incubator-mxnet] access2rohit commented on issue #16416: Dgl ops 2

2019-10-19 Thread GitBox
access2rohit commented on issue #16416: Dgl ops 2
URL: https://github.com/apache/incubator-mxnet/pull/16416#issuecomment-544205148
 
 
   @ChaiBapchya can you restart the sanity test on your PR?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [incubator-mxnet] zhreshold commented on a change in pull request #16542: [WIP] Faster GPU NMS optimization

2019-10-19 Thread GitBox
zhreshold commented on a change in pull request #16542: [WIP] Faster GPU NMS 
optimization
URL: https://github.com/apache/incubator-mxnet/pull/16542#discussion_r336755641
 
 

 ##
 File path: src/operator/contrib/bounding_box.cu
 ##
 @@ -24,14 +24,701 @@
   * \author Joshua Zhang
   */
 
+#include 
+
 #include "./bounding_box-inl.cuh"
 #include "./bounding_box-inl.h"
 #include "../elemwise_op_common.h"
 
 namespace mxnet {
 namespace op {
+
+namespace {
+
+using mshadow::Tensor;
+using mshadow::Stream;
+
+template 
+struct TempWorkspace {
+  index_t scores_temp_space;
+  DType* scores;
+  index_t scratch_space;
+  uint8_t* scratch;
+  index_t buffer_space;
+  DType* buffer;
+  index_t nms_scratch_space;
+  uint32_t* nms_scratch;
+  index_t indices_temp_spaces;
+  index_t* indices;
+};
+
+inline index_t ceil_div(index_t x, index_t y) {
+  return (x + y - 1) / y;
+}
+
+inline index_t align(index_t x, index_t alignment) {
+  return ceil_div(x, alignment)  * alignment;
+}
+
+template 
+__global__ void FilterAndPrepareAuxData_kernel(const DType* data, DType* out, 
DType* scores,
+   index_t num_elements_per_batch,
+   const index_t element_width,
+   const index_t N,
+   const float threshold,
+   const int id_index, const int 
score_index,
+   const int background_id) {
+  index_t tid = blockIdx.x * blockDim.x + threadIdx.x;
+  bool first_in_element = (tid % element_width == 0);
+  index_t start_of_my_element = tid - (tid % element_width);
+
+  if (tid < N) {
+DType my_score = data[start_of_my_element + score_index];
+bool filtered_out = my_score <= threshold;
+if (id_index != -1 && background_id != -1) {
+  DType my_id = data[start_of_my_element + id_index];
+  filtered_out = filtered_out || (my_id == background_id);
+}
+if (!filtered_out) {
+  out[tid] = data[tid];
+} else {
+  out[tid] = -1;
+  my_score = -1;
+}
+
+if (first_in_element) {
+  index_t offset = tid / element_width;
+  scores[offset] = my_score;
+}
+  }
+}
+
+template 
+void FilterAndPrepareAuxData(const Tensor& data,
+ Tensor* out,
+ const TempWorkspace& workspace,
+ const BoxNMSParam& param,
+ Stream* s) {
+  const int n_threads = 512;
+  index_t N = data.shape_.Size();
+  const auto blocks = ceil_div(N, n_threads);
+  FilterAndPrepareAuxData_kernel<<::GetStream(s)>>>(
+data.dptr_, out->dptr_, workspace.scores,
+data.shape_[1], data.shape_[2], N,
+param.valid_thresh, param.id_index,
+param.score_index, param.background_id);
+}
+
+template 
+__global__ void CompactData_kernel(const index_t* indices, const DType* source,
 
 Review comment:
   Mixed naming convention, not sure if cpplint complains about it.
   Same for lots of kernel functions in the rest of this file


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [incubator-mxnet] perdasilva commented on issue #16547: [CD] Adds python docker pipeline

2019-10-19 Thread GitBox
perdasilva commented on issue #16547: [CD] Adds python docker pipeline
URL: https://github.com/apache/incubator-mxnet/pull/16547#issuecomment-544197606
 
 
   Okay, [looks 
good](http://jenkins.mxnet-ci-dev.amazon-ml.com/blue/organizations/jenkins/restricted-mxnet-cd%2Fmxnet-cd-release-job/detail/mxnet-cd-release-job/320/pipeline)
 now. I'm glad you took the time to have a look! Thank you!!


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [incubator-mxnet] perdasilva commented on issue #16547: [CD] Adds python docker pipeline

2019-10-19 Thread GitBox
perdasilva commented on issue #16547: [CD] Adds python docker pipeline
URL: https://github.com/apache/incubator-mxnet/pull/16547#issuecomment-544195090
 
 
   @aaronmarkham nope, that's bad. I've rotated the credentials. I've checked 
the DockerHub account and nothing has changed there. Thanks for catching this! 
I've updated the logging configuration and I'm running a job to double check.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [incubator-mxnet] aaronmarkham commented on issue #16547: [CD] Adds python docker pipeline

2019-10-19 Thread GitBox
aaronmarkham commented on issue #16547: [CD] Adds python docker pipeline
URL: https://github.com/apache/incubator-mxnet/pull/16547#issuecomment-544190021
 
 
   Just noticed this:
   `"SecretString":"{\\"username\\": \\"mxnetcddev\\", \\"password\\": 
\\"uuz:C^P;2lOR3Z:mNi|:8A6^}UUKtq9F\\"}"`
   And the security token is output in the log too... maybe they're one time 
use? Or?...


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [incubator-mxnet] aaronmarkham commented on issue #16512: detect number of procs during sphinx build

2019-10-19 Thread GitBox
aaronmarkham commented on issue #16512: detect number of procs during sphinx 
build
URL: https://github.com/apache/incubator-mxnet/pull/16512#issuecomment-544178178
 
 
   Kind of ridiculous how many times I've had to restart the tests on this PR.
   
   Restarting centos-gpu now due to failing here: 
http://jenkins.mxnet-ci.amazon-ml.com/blue/organizations/jenkins/mxnet-validation%2Fcentos-gpu/detail/PR-16512/4/pipeline


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


  1   2   3   4   5   6   7   8   9   10   >