[GitHub] haojin2 commented on issue #10981: build ends with error : elemwise_binary_op-inl.h:78:3574: error: expected primary-expression before ‘template’

2018-05-16 Thread GitBox
haojin2 commented on issue #10981: build ends with error : 
elemwise_binary_op-inl.h:78:3574: error: expected primary-expression before 
‘template’
URL: 
https://github.com/apache/incubator-mxnet/issues/10981#issuecomment-389749310
 
 
   @phoenixbai A quick solution to this is to upgrade your CUDA version to 8.0 
or later if possible. I'm currently working on a fix for this but that may take 
longer than a simple upgrade.




[GitHub] eric-haibin-lin commented on issue #10981: build ends with error : elemwise_binary_op-inl.h:78:3574: error: expected primary-expression before ‘template’

2018-05-16 Thread GitBox
eric-haibin-lin commented on issue #10981: build ends with error : 
elemwise_binary_op-inl.h:78:3574: error: expected primary-expression before 
‘template’
URL: 
https://github.com/apache/incubator-mxnet/issues/10981#issuecomment-389748559
 
 
   @haojin2 




[GitHub] reminisce commented on a change in pull request #10451: [WIP] Add Foreach

2018-05-16 Thread GitBox
reminisce commented on a change in pull request #10451: [WIP] Add Foreach
URL: https://github.com/apache/incubator-mxnet/pull/10451#discussion_r188841316
 
 

 ##
 File path: src/operator/nn/control_flow.cc
 ##
 @@ -0,0 +1,594 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include "../operator_common.h"
+#include "../elemwise_op_common.h"
+#include "../../imperative/imperative_utils.h"
+
+namespace mxnet {
+namespace op {
+
+struct ForeachParam : public dmlc::Parameter<ForeachParam> {
+  int num_args;
+  int dim;
+  int num_outputs;
+  int num_out_data;
+  nnvm::Tuple<dim_t> in_state_locs;
+  nnvm::Tuple<dim_t> in_data_locs;
+  DMLC_DECLARE_PARAMETER(ForeachParam) {
+    DMLC_DECLARE_FIELD(num_args).set_lower_bound(1)
+    .describe("Number of inputs.");
+    DMLC_DECLARE_FIELD(dim).set_default(1)
+    .describe("the dimension of the input array to iterate.");
+    DMLC_DECLARE_FIELD(num_outputs)
+    .describe("The number of outputs of the subgraph.");
+    DMLC_DECLARE_FIELD(num_out_data)
+    .describe("The number of output data of the subgraph.");
+    DMLC_DECLARE_FIELD(in_state_locs)
+    .describe("The locations of loop states among the inputs.");
+    DMLC_DECLARE_FIELD(in_data_locs)
+    .describe("The locations of input data among the inputs.");
+  }
+};  // struct ForeachParam
+
+DMLC_REGISTER_PARAMETER(ForeachParam);
+
+// The input arguments are ordered in the following order:
+// in, state0, state1, ...
+// We need to reorder them in the same order as the input nodes of the subgraph.
+template<typename T>
+static std::vector<T> ReorderInputs(const std::vector<T> &in, const nnvm::IndexedGraph& idx) {
+  std::vector<T> ret(in.size());
+  CHECK_EQ(idx.input_nodes().size(), in.size());
+  for (size_t i = 0; i < idx.input_nodes().size(); i++) {
+    std::string name = idx[idx.input_nodes()[i]].source->attrs.name;
+    if (name == "in") {
+      ret[i] = in[0];
+    } else {
+      auto idx_str = name.substr(5);
+      int idx = std::stoi(idx_str);
+      ret[i] = in[idx + 1];
+    }
+  }
+  return ret;
+}
+
+class ForeachState {
+  // These are output arrays from all iterations.
+  // They also contain the Op state for each CachedOp.
+  std::vector<std::vector<NDArray> > all_outputs;
+  std::vector<std::vector<NDArray> > all_inputs;
+  std::vector<std::vector<NDArray> > all_gradients;
+  std::vector<CachedOpPtr> iter_ops;
+
+ public:
+  Symbol subgraph_sym;
+  nnvm::Graph subgraph;
+  ForeachParam params;
+
+  ForeachState(const Symbol &g, const ForeachParam &params) {
+    this->subgraph_sym = g;
+    this->subgraph.outputs = g.outputs;
+    this->params = params;
+  }
+
+  void Forward(std::vector<NDArray> cinputs,
+               const std::vector<OpReqType>& req,
+               std::vector<NDArray> coutputs, bool is_recording);
+  void Backward(int iter_no, std::vector<NDArray> ograds,
+                const std::vector<OpReqType> &req,
+                std::vector<NDArray> igrads);
+  void Cleanup() {
+    all_outputs.clear();
+    all_inputs.clear();
+    all_gradients.clear();
+    iter_ops.clear();
+  }
+};
+
+void ForeachState::Forward(std::vector<NDArray> cinputs,
 
 Review comment:
   Why copy `std::vector`?
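
   A minimal sketch of the suggestion, assuming `Forward` only reads these
   vectors (an illustrative signature, not necessarily the PR's final code):

   // Passing std::vector<NDArray> by value copies every element on each call.
   // If the callee does not need its own copy, const references avoid that:
   void Forward(const std::vector<NDArray>& cinputs,
                const std::vector<OpReqType>& req,
                const std::vector<NDArray>& coutputs, bool is_recording);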




[GitHub] reminisce commented on a change in pull request #10451: [WIP] Add Foreach

2018-05-16 Thread GitBox
reminisce commented on a change in pull request #10451: [WIP] Add Foreach
URL: https://github.com/apache/incubator-mxnet/pull/10451#discussion_r188508037
 
 

 ##
 File path: src/executor/exec_pass.h
 ##
 @@ -64,6 +64,9 @@ class OpExecutor {
   OpContext op_ctx;
   /*! \brief virtual destructor */
   virtual ~OpExecutor() {}
+  virtual bool HasSubgraph() const {
 
 Review comment:
   Make it pure virtual?
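
   For reference, the pure-virtual form would look like this (a trimmed sketch
   of the class, assuming every concrete executor can answer the question
   directly):

   class OpExecutor {
    public:
     virtual ~OpExecutor() {}
     // "= 0" makes OpExecutor abstract: every subclass must now override
     // HasSubgraph() explicitly instead of inheriting a silent default.
     virtual bool HasSubgraph() const = 0;
   };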




[GitHub] reminisce commented on a change in pull request #10451: [WIP] Add Foreach

2018-05-16 Thread GitBox
reminisce commented on a change in pull request #10451: [WIP] Add Foreach
URL: https://github.com/apache/incubator-mxnet/pull/10451#discussion_r188505156
 
 

 ##
 File path: python/mxnet/symbol/contrib.py
 ##
 @@ -91,3 +98,99 @@ def rand_zipfian(true_classes, num_sampled, range_max):
     expected_prob_sampled = ((sampled_cls_fp64 + 2.0) / (sampled_cls_fp64 + 1.0)).log() / log_range
     expected_count_sampled = expected_prob_sampled * num_sampled
     return sampled_classes, expected_count_true, expected_count_sampled
+
+def _get_graph_inputs(subg, name, prefix):
 
 Review comment:
   Where is `prefix` used?




[GitHub] reminisce commented on a change in pull request #10451: [WIP] Add Foreach

2018-05-16 Thread GitBox
reminisce commented on a change in pull request #10451: [WIP] Add Foreach
URL: https://github.com/apache/incubator-mxnet/pull/10451#discussion_r188782087
 
 

 ##
 File path: src/operator/nn/control_flow.cc
 ##
 @@ -0,0 +1,594 @@
+...
+class ForeachState {
+  ...
+  void Forward(std::vector<NDArray> cinputs,
 
 Review comment:
   Why copy `std::vector`?




[GitHub] reminisce commented on a change in pull request #10451: [WIP] Add Foreach

2018-05-16 Thread GitBox
reminisce commented on a change in pull request #10451: [WIP] Add Foreach
URL: https://github.com/apache/incubator-mxnet/pull/10451#discussion_r188511608
 
 

 ##
 File path: src/operator/nn/control_flow.cc
 ##
 @@ -0,0 +1,594 @@
+...
+
+void ForeachState::Forward(std::vector<NDArray> cinputs,
+                           const std::vector<OpReqType>& req,
+                           std::vector<NDArray> coutputs, bool is_recording) {
+  using namespace nnvm;
+  using namespace imperative;
+
+  bool orig_is_record;
+  if (is_recording)
+    orig_is_record = Imperative::Get()->set_is_recording(true);
+  else
+    orig_is_record = Imperative::Get()->is_recording();
+
+  std::vector<NDArray*> inputs(cinputs.size());
+  std::vector<NDArray*> outputs(coutputs.size());
+  for (size_t i = 0; i < inputs.size(); i++)
+    inputs[i] = &cinputs[i];
+  for (size_t i = 0; i < outputs.size(); i++)
+    outputs[i] = &coutputs[i];
+
+  if (is_recording) {
+    all_inputs.push_back(cinputs);
+    std::vector<NDArray> gradients(cinputs.size());
+    std::vector<NDArray*> input_ptrs(cinputs.size());
+    std::vector<NDArray*> gradient_ptrs(cinputs.size());
+    std::vector<mx_uint> grad_reqs(cinputs.size());
+    for (size_t i = 0; i < gradients.size(); i++) {
+      gradients[i] = NDArray(cinputs[i].shape(), cinputs[i].ctx(),
+                             true, cinputs[i].dtype());
+      input_ptrs[i] = &cinputs[i];
+      gradient_ptrs[i] = &gradients[i];
+      grad_reqs[i] = kWriteTo;
+    }
+    Imperative::Get()->MarkVariables(input_ptrs, 

[GitHub] reminisce commented on a change in pull request #10451: [WIP] Add Foreach

2018-05-16 Thread GitBox
reminisce commented on a change in pull request #10451: [WIP] Add Foreach
URL: https://github.com/apache/incubator-mxnet/pull/10451#discussion_r188507490
 
 

 ##
 File path: src/c_api/c_api_symbolic.cc
 ##
 @@ -344,6 +345,33 @@ int MXSymbolGetAtomicSymbolName(AtomicSymbolCreator creator,
   API_END();
 }
 
+int MXSymbolGetInputSymbols(SymbolHandle sym, SymbolHandle **out_arr, int *out_size) {
+  API_BEGIN();
+  nnvm::Symbol *s = static_cast<nnvm::Symbol*>(sym);
+  nnvm::Graph g;
+  g.outputs = s->outputs;
+  std::vector<nnvm::Symbol*> input_syms;
+  const nnvm::IndexedGraph& idx = g.indexed_graph();
+  size_t max_out_size = *out_size;
+  // Go through all nodes and return the ones representing variables.
+  for (size_t i = 0; i < idx.num_nodes(); i++) {
+    const nnvm::Node &n = *idx[i].source;
+    for (const nnvm::NodeEntry &e : n.inputs) {
+      auto p = e.node;
+      if (p->is_variable()) {
+        nnvm::Symbol *s = new nnvm::Symbol();
+        s->outputs.push_back(e);
+        input_syms.push_back(s);
+        std::cout << p->attrs.name << std::endl;
+      }
+    }
+  }
+  CHECK(input_syms.size() <= max_out_size);
+  *out_size = input_syms.size();
+  memcpy(out_arr, input_syms.data(), sizeof(*out_arr) * input_syms.size());
+  API_END();
 
 Review comment:
   Use `API_END_HANDLE_ERROR` to clean up `input_syms` if an exception is thrown, to prevent a memory leak.
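
   A sketch of the suggested pattern, with an illustrative cleanup statement:

   // API_END_HANDLE_ERROR runs the given statement in the catch block before
   // returning the error code, so the heap-allocated symbols are freed when
   // an exception (e.g. from the CHECK) unwinds the API call.
   API_END_HANDLE_ERROR(for (nnvm::Symbol *ptr : input_syms) delete ptr;);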




[GitHub] reminisce commented on a change in pull request #10451: [WIP] Add Foreach

2018-05-16 Thread GitBox
reminisce commented on a change in pull request #10451: [WIP] Add Foreach
URL: https://github.com/apache/incubator-mxnet/pull/10451#discussion_r188506820
 
 

 ##
 File path: python/mxnet/symbol/contrib.py
 ##
 @@ -91,3 +98,99 @@ def rand_zipfian(true_classes, num_sampled, range_max):
     expected_prob_sampled = ((sampled_cls_fp64 + 2.0) / (sampled_cls_fp64 + 1.0)).log() / log_range
     expected_count_sampled = expected_prob_sampled * num_sampled
     return sampled_classes, expected_count_true, expected_count_sampled
+
+def _get_graph_inputs(subg, name, prefix):
+    num_handles = ctypes.c_int(1000)
+    handles = c_array(SymbolHandle, [SymbolHandle(0) for i in range(1000)])
+    check_call(_LIB.MXSymbolGetInputSymbols(subg.handle, handles,
+                                            ctypes.byref(num_handles)))
+
+    syms = []
+    for i in range(num_handles.value):
+        s = Symbol(handles[i])
+        syms.append(s)
+    return syms
+
+def foreach(func, input, init_states, back_prop=False, name="foreach"):
+    assert isinstance(init_states, list), "init_states should be a list"
+    states = []
+    with AttrScope(subgraph_name=name):
+        if isinstance(input, list):
+            in_eles = [symbol.var(sym.name) for sym in input]
+        else:
+            in_eles = symbol.var(input.name)
+        for s in init_states:
+            states.append(symbol.var(s.name))
+
+        sym_out = func(in_eles, states)
+        # The function should return a tuple. The first element goes to
+        # the output of the function. The second element is a list.
+        assert isinstance(sym_out, tuple), "func should return a tuple (out, states)"
+        assert isinstance(sym_out[1], list), \
+            "the second element in the returned tuple should be a list"
+        assert len(sym_out[1]) == len(init_states), \
+            "the number of output states (%d) should be the same as input states (%d)" \
+            % (len(sym_out[1]), len(init_states))
+
+        if (isinstance(sym_out[0], list)):
+            flat_out = sym_out[0]
+        else:
+            flat_out = [sym_out[0]]
+        num_out_data = len(flat_out)
+        for s in sym_out[1]:
+            # There is a problem if the outputs are the same as the inputs
+            # or the first output.
+            # TODO this is a temp fix.
+            flat_out.append(symbol.op.identity(s))
+    g = symbol.Group(flat_out)
+    input_syms = _get_graph_inputs(g, name, "ro_var")
+
+    if (isinstance(input, list)):
+        num_inputs = len(input)
+    else:
+        num_inputs = 1
+
+    # Here we need to find out how the input symbols are ordered as well as
+    # where the loop states are located in the list of inputs.
+
+    # This dict contains the symbols of the subgraph.
+    input_syms = {sym.name:sym for sym in input_syms}
+    gin_names = input_syms.keys()
+    # This array contains the symbols for the inputs of foreach.
+    # They are ordered according to the inputs of the subgraph.
+    ordered_ins = []
+    states_map = {sym.name:sym for sym in init_states}
+    state_names = states_map.keys()
+    data_syms = _as_list(input)
+    data_map = {sym.name:sym for sym in data_syms}
+    data_names = data_map.keys()
+    in_state_locs = []
+    in_data_locs = []
+    for in_name in g.list_inputs():
+        assert in_name in gin_names, "The input variable %s can't be found in graph inputs: %s" \
+            % (in_name, str(gin_names))
+        if (in_name in state_names):
+            ordered_ins.append(states_map[in_name])
+            in_state_locs.append(len(ordered_ins) - 1)
+        elif (in_name in data_names):
+            ordered_ins.append(data_map[in_name])
+            in_data_locs.append(len(ordered_ins) - 1)
+        else:
+            ordered_ins.append(input_syms[in_name])
+
+    num_outputs = len(flat_out)
+    num_states = len(state_names)
+    ret = symbol._internal._foreach(g, *ordered_ins, num_outputs=num_outputs,
+                                    num_out_data=num_out_data, in_state_locs=in_state_locs,
+                                    in_data_locs=in_data_locs)
+    if (num_outputs - num_states > 1):
 
 Review comment:
   No parentheses.




[GitHub] reminisce commented on a change in pull request #10451: [WIP] Add Foreach

2018-05-16 Thread GitBox
reminisce commented on a change in pull request #10451: [WIP] Add Foreach
URL: https://github.com/apache/incubator-mxnet/pull/10451#discussion_r188507205
 
 

 ##
 File path: src/c_api/c_api_symbolic.cc
 ##
 @@ -344,6 +345,33 @@ int MXSymbolGetAtomicSymbolName(AtomicSymbolCreator creator,
+...
+      if (p->is_variable()) {
+        nnvm::Symbol *s = new nnvm::Symbol();
+        s->outputs.push_back(e);
+        input_syms.push_back(s);
+        std::cout << p->attrs.name << std::endl;
 
 Review comment:
   Remove this line.




[GitHub] reminisce commented on a change in pull request #10451: [WIP] Add Foreach

2018-05-16 Thread GitBox
reminisce commented on a change in pull request #10451: [WIP] Add Foreach
URL: https://github.com/apache/incubator-mxnet/pull/10451#discussion_r188504970
 
 

 ##
 File path: python/mxnet/symbol/contrib.py
 ##
 @@ -91,3 +98,99 @@ def rand_zipfian(true_classes, num_sampled, range_max):
+...
+        if (isinstance(sym_out[0], list)):
 
 Review comment:
   No parentheses needed. They would be flagged as a coding style error in PyCharm.




[GitHub] reminisce commented on a change in pull request #10451: [WIP] Add Foreach

2018-05-16 Thread GitBox
reminisce commented on a change in pull request #10451: [WIP] Add Foreach
URL: https://github.com/apache/incubator-mxnet/pull/10451#discussion_r188838801
 
 

 ##
 File path: src/operator/nn/control_flow.cc
 ##
 @@ -0,0 +1,594 @@
+...
+    Imperative::Get()->MarkVariables(input_ptrs, 

[GitHub] reminisce commented on a change in pull request #10451: [WIP] Add Foreach

2018-05-16 Thread GitBox
reminisce commented on a change in pull request #10451: [WIP] Add Foreach
URL: https://github.com/apache/incubator-mxnet/pull/10451#discussion_r188506678
 
 

 ##
 File path: python/mxnet/symbol/contrib.py
 ##
 @@ -91,3 +98,99 @@ def rand_zipfian(true_classes, num_sampled, range_max):
+...
+        if (in_name in state_names):
 
 Review comment:
   No parentheses.




[GitHub] reminisce commented on a change in pull request #10451: [WIP] Add Foreach

2018-05-16 Thread GitBox
reminisce commented on a change in pull request #10451: [WIP] Add Foreach
URL: https://github.com/apache/incubator-mxnet/pull/10451#discussion_r188506709
 
 

 ##
 File path: python/mxnet/symbol/contrib.py
 ##
 @@ -91,3 +98,99 @@ def rand_zipfian(true_classes, num_sampled, range_max):
+...
+        elif (in_name in data_names):
 
 Review comment:
   Same here. No parentheses.




[GitHub] reminisce commented on a change in pull request #10451: [WIP] Add Foreach

2018-05-16 Thread GitBox
reminisce commented on a change in pull request #10451: [WIP] Add Foreach
URL: https://github.com/apache/incubator-mxnet/pull/10451#discussion_r188505538
 
 

 ##
 File path: python/mxnet/symbol/contrib.py
 ##
 @@ -91,3 +98,99 @@ def rand_zipfian(true_classes, num_sampled, range_max):
+...
+def foreach(func, input, init_states, back_prop=False, name="foreach"):
 
 Review comment:
   `input` is a built-in name in Python. Does it make sense to call it `data`, which can be both singular and plural?




[GitHub] reminisce commented on a change in pull request #10451: [WIP] Add Foreach

2018-05-16 Thread GitBox
reminisce commented on a change in pull request #10451: [WIP] Add Foreach
URL: https://github.com/apache/incubator-mxnet/pull/10451#discussion_r188840339
 
 

 ##
 File path: src/operator/nn/control_flow.cc
 ##
 @@ -0,0 +1,594 @@
+...
+// The input arguments are ordered in the following order:
+// in, state0, state1, ...
+// We need to reorder them in the same order as the input nodes of the subgraph.
+template<typename T>
+static std::vector<T> ReorderInputs(const std::vector<T> &in, const nnvm::IndexedGraph& idx) {
 
 Review comment:
   Where is this used?




[GitHub] eric-haibin-lin opened a new pull request #10983: [MXNET-427] Fix trainer.load_state by removing param_dict from optimizer state pickle

2018-05-16 Thread GitBox
eric-haibin-lin opened a new pull request #10983: [MXNET-427] Fix 
trainer.load_state by removing param_dict from optimizer state pickle
URL: https://github.com/apache/incubator-mxnet/pull/10983
 
 
   ## Description ##
   See JIRA item for more details. @piiswrong @szha please review.
   
   ## Checklist ##
   ### Essentials ###
   Please feel free to remove inapplicable items for your PR.
   - [ ] The PR title starts with [MXNET-$JIRA_ID], where $JIRA_ID refers to 
the relevant [JIRA issue](https://issues.apache.org/jira/projects/MXNET/issues) 
created (except PRs with tiny changes)
   - [ ] Changes are complete (i.e. I finished coding on this PR)
   - [ ] All changes have test coverage:
   - Unit tests are added for small changes to verify correctness (e.g. adding 
a new operator)
   - Nightly tests are added for complicated/long-running ones (e.g. changing 
distributed kvstore)
   - Build tests will be added for build configuration changes (e.g. adding a 
new build option with NCCL)
   - [ ] Code is well-documented: 
   - For user-facing API changes, API doc string has been updated. 
   - For new C++ functions in header files, their functionalities and arguments 
are documented. 
   - For new examples, README.md is added to explain what the example does, the source of the dataset, expected performance on the test set, and a reference to the original paper if applicable
   - Check the API doc at 
http://mxnet-ci-doc.s3-accelerate.dualstack.amazonaws.com/PR-$PR_ID/$BUILD_ID/index.html
   - [ ] To the best of my knowledge, examples are either not affected by this change, or have been fixed to be compatible with this change
   
   ### Changes ###
   - [ ] Feature1, tests, (and when applicable, API doc)
   - [ ] Feature2, tests, (and when applicable, API doc)
   
   ## Comments ##
   - If this change is a backward incompatible change, why must this change be 
made.
   - Interesting edge cases to note here
   




[GitHub] pengzhao-intel commented on issue #10933: remove unnecessary checks on convolution parameters

2018-05-16 Thread GitBox
pengzhao-intel commented on issue #10933: remove unnecessary checks on 
convolution parameters
URL: https://github.com/apache/incubator-mxnet/pull/10933#issuecomment-389746651
 
 
   @zheng-da @piiswrong This is a tiny change to remove unnecessary code; please help review.




[GitHub] szha opened a new pull request #10982: fix rnn layer kernel forward

2018-05-16 Thread GitBox
szha opened a new pull request #10982: fix rnn layer kernel forward
URL: https://github.com/apache/incubator-mxnet/pull/10982
 
 
   ## Description ##
   fix rnn layer condition for kernel forward.
   
   ## Checklist ##
   ### Essentials ###
   Please feel free to remove inapplicable items for your PR.
   - [x] Changes are complete (i.e. I finished coding on this PR)
   - [x] All changes have test coverage:
   - Unit tests are added for small changes to verify correctness (e.g. adding 
a new operator)
   - [x] To the best of my knowledge, examples are either not affected by this change, or have been fixed to be compatible with this change
   
   ### Changes ###
   - [x] fix rnn layer condition for kernel forward




[GitHub] chinakook commented on issue #10972: Gluon's PReLU is very slow and a fix to it

2018-05-16 Thread GitBox
chinakook commented on issue #10972: Gluon's PReLU is very slow and a fix to it
URL: 
https://github.com/apache/incubator-mxnet/issues/10972#issuecomment-389741834
 
 
   I think there is no need to broadcast when multiplying a scalar and a matrix. Maybe some other operation is more suitable for this kind of multiplication.




[GitHub] ThomasDelteil commented on issue #10979: Fix bugs in MKLDNN.

2018-05-16 Thread GitBox
ThomasDelteil commented on issue #10979: Fix bugs in MKLDNN.
URL: https://github.com/apache/incubator-mxnet/pull/10979#issuecomment-389741368
 
 
   :+1: for the descriptive title and PR description




[GitHub] szha commented on issue #9686: APIs that might be a good idea to break in 2.0

2018-05-16 Thread GitBox
szha commented on issue #9686: APIs that might be a good idea to break in 2.0
URL: 
https://github.com/apache/incubator-mxnet/issues/9686#issuecomment-389740264
 
 
   contrib.ctc_loss should be made into a supported operator.




[GitHub] yifeim commented on a change in pull request #10946: Remove kvstore calls from FM example

2018-05-16 Thread GitBox
yifeim commented on a change in pull request #10946: Remove kvstore calls from 
FM example
URL: https://github.com/apache/incubator-mxnet/pull/10946#discussion_r188834985
 
 

 ##
 File path: example/sparse/factorization_machine/train.py
 ##
 @@ -75,6 +76,16 @@
 assert(args.data_train is not None and args.data_test is not None), \
   "dataset for training or test is missing"
 
+def batch_row_ids(data_batch):
+    """ Generate row ids based on the current mini-batch """
+    idx = batch.data[0].indices
 
 Review comment:
   Thanks. Very good to know.




[GitHub] eric-haibin-lin commented on a change in pull request #10946: Remove kvstore calls from FM example

2018-05-16 Thread GitBox
eric-haibin-lin commented on a change in pull request #10946: Remove kvstore 
calls from FM example
URL: https://github.com/apache/incubator-mxnet/pull/10946#discussion_r188834717
 
 

 ##
 File path: example/sparse/factorization_machine/train.py
 ##
 @@ -75,6 +76,16 @@
 assert(args.data_train is not None and args.data_test is not None), \
   "dataset for training or test is missing"
 
+def batch_row_ids(data_batch):
+    """ Generate row ids based on the current mini-batch """
+    idx = batch.data[0].indices
 
 Review comment:
   I think mxnet returns an ordered list for NDArrayIter: 
   https://github.com/apache/incubator-mxnet/blob/master/python/mxnet/io.py#L520
   Short answer: don't use a dict.




[GitHub] eric-haibin-lin closed issue #10950: The negative samples shall be other classes?

2018-05-16 Thread GitBox
eric-haibin-lin closed issue #10950: The negative samples shall be other 
classes?
URL: https://github.com/apache/incubator-mxnet/issues/10950
 
 
   




[incubator-mxnet] branch master updated: Sampling negative samples other classes only (#10980)

2018-05-16 Thread haibin
This is an automated email from the ASF dual-hosted git repository.

haibin pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/incubator-mxnet.git


The following commit(s) were added to refs/heads/master by this push:
 new 632f514  Sampling negative samples other classes only (#10980)
632f514 is described below

commit 632f5140a69b51ff87129e6742e5be46684dc58c
Author: Jon 
AuthorDate: Thu May 17 13:39:37 2018 +0930

Sampling negative samples other classes only (#10980)

The original code had a chance to sample negative samples from the same class as the anchors.
---
 example/gluon/embedding_learning/model.py | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/example/gluon/embedding_learning/model.py 
b/example/gluon/embedding_learning/model.py
index 91f7735..f82240e 100644
--- a/example/gluon/embedding_learning/model.py
+++ b/example/gluon/embedding_learning/model.py
@@ -108,6 +108,7 @@ class DistanceWeightedSampling(HybridBlock):
         mask = np.ones(weights.shape)
         for i in range(0, n, k):
             mask[i:i+k, i:i+k] = 0
+        mask_uniform_probs = mask * (1.0/(n-k))
 
         weights = weights * F.array(mask) * (distance < self.nonzero_loss_cutoff)
         weights_sum = F.sum(weights, axis=1, keepdims=True)
@@ -125,7 +126,7 @@ class DistanceWeightedSampling(HybridBlock):
                 n_indices += np.random.choice(n, k-1, p=np_weights[i]).tolist()
             else:
                 # all samples are above the cutoff so we sample uniformly
-                n_indices += np.random.choice(n, k-1).tolist()
+                n_indices += np.random.choice(n, k-1, p=mask_uniform_probs[i]).tolist()
             for j in range(block_idx * k, (block_idx + 1) * k):
                 if j != i:
                     a_indices.append(i)



[GitHub] eric-haibin-lin closed pull request #10980: Sampling negative samples other classes only

2018-05-16 Thread GitBox
eric-haibin-lin closed pull request #10980: Sampling negative samples other 
classes only
URL: https://github.com/apache/incubator-mxnet/pull/10980
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/example/gluon/embedding_learning/model.py 
b/example/gluon/embedding_learning/model.py
index 91f7735497d..f82240e2cd5 100644
--- a/example/gluon/embedding_learning/model.py
+++ b/example/gluon/embedding_learning/model.py
@@ -108,6 +108,7 @@ def hybrid_forward(self, F, x):
         mask = np.ones(weights.shape)
         for i in range(0, n, k):
             mask[i:i+k, i:i+k] = 0
+        mask_uniform_probs = mask * (1.0/(n-k))
 
         weights = weights * F.array(mask) * (distance < self.nonzero_loss_cutoff)
         weights_sum = F.sum(weights, axis=1, keepdims=True)
@@ -125,7 +126,7 @@ def hybrid_forward(self, F, x):
                 n_indices += np.random.choice(n, k-1, p=np_weights[i]).tolist()
             else:
                 # all samples are above the cutoff so we sample uniformly
-                n_indices += np.random.choice(n, k-1).tolist()
+                n_indices += np.random.choice(n, k-1, p=mask_uniform_probs[i]).tolist()
             for j in range(block_idx * k, (block_idx + 1) * k):
                 if j != i:
                     a_indices.append(i)


 




[GitHub] chaoyuaw commented on issue #10980: Sampling negative samples other classes only

2018-05-16 Thread GitBox
chaoyuaw commented on issue #10980: Sampling negative samples other classes only
URL: https://github.com/apache/incubator-mxnet/pull/10980#issuecomment-389737007
 
 
   Thank you @jonbakerfish ! This PR looks good to me. 




[GitHub] zhanghang1989 commented on issue #10536: [MXNET-317] Add Data Parallel

2018-05-16 Thread GitBox
zhanghang1989 commented on issue #10536: [MXNET-317] Add Data Parallel
URL: https://github.com/apache/incubator-mxnet/pull/10536#issuecomment-389736213
 
 
   I have finished editing this repo. Could you start reviewing? 
@eric-haibin-lin @piiswrong Please see the deployed docs 
http://mxnet-ci-doc.s3-accelerate.dualstack.amazonaws.com/PR-10536/12/api/python/gluon/contrib.html?highlight=dataparallel#mxnet.gluon.contrib.parallel.DataParallelModel




[GitHub] anirudhacharya closed pull request #10909: ONNX Documentation change.

2018-05-16 Thread GitBox
anirudhacharya closed pull request #10909: ONNX Documentation change.
URL: https://github.com/apache/incubator-mxnet/pull/10909
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/docs/api/python/contrib/onnx.md b/docs/api/python/contrib/onnx.md
index 44aabaf4419..f1f3af8e646 100644
--- a/docs/api/python/contrib/onnx.md
+++ b/docs/api/python/contrib/onnx.md
@@ -13,7 +13,7 @@ With ONNX format support for MXNet, developers can build and train models with a
 ```
 
 ### Installation Instructions
-- To use this module developers need to **install ONNX**, which requires protobuf compiler to be installed separately. Please follow the [instructions to install ONNX and its dependencies](https://github.com/onnx/onnx#installation). Once installed, you can go through the tutorials on how to use this module.
+- To use this module developers need to **install ONNX**, which requires the protobuf compiler to be installed separately. Please follow the [instructions to install ONNX and its dependencies](https://github.com/onnx/onnx#installation). **MXNet currently supports ONNX v1.1.1**. Once installed, you can go through the tutorials on how to use this module.
 
 
 This document describes all the ONNX-MXNet APIs.
@@ -47,4 +47,4 @@ This document describes all the ONNX-MXNet APIs.
 
 ```
 
-auto_index("api-reference");
\ No newline at end of file
+auto_index("api-reference");


 




[GitHub] anirudhacharya commented on issue #10909: ONNX Documentation change.

2018-05-16 Thread GitBox
anirudhacharya commented on issue #10909: ONNX Documentation change.
URL: https://github.com/apache/incubator-mxnet/pull/10909#issuecomment-389736048
 
 
   included in this PR - https://github.com/apache/incubator-mxnet/pull/10512




[incubator-mxnet] branch master updated: [MXNET-309] [ONNX-MXNet] Model Metadata API (#10512)

2018-05-16 Thread anirudh2290
This is an automated email from the ASF dual-hosted git repository.

anirudh2290 pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/incubator-mxnet.git


The following commit(s) were added to refs/heads/master by this push:
 new 7641759  [MXNET-309] [ONNX-MXNet] Model Metadata API (#10512)
7641759 is described below

commit 76417594e56a85ec0cc9412b9dd2c7e2ab581d8b
Author: Anirudh 
AuthorDate: Wed May 16 20:36:54 2018 -0700

[MXNET-309] [ONNX-MXNet] Model Metadata API (#10512)

* metadata api

* pylint changes

* move logic to import_onnx

* test fix

* doc API

* rerun CI.

* fix comments

* docs fix
---
 docs/api/python/contrib/onnx.md   |  6 +++--
 docs/tutorials/onnx/inference_on_onnx_model.md| 19 ++
 example/onnx/super_resolution.py  |  6 ++---
 python/mxnet/contrib/onnx/__init__.py |  3 +--
 python/mxnet/contrib/onnx/_import/import_model.py | 30 +++
 python/mxnet/contrib/onnx/_import/import_onnx.py  | 23 +
 tests/python-pytest/onnx/onnx_test.py | 27 +++-
 7 files changed, 95 insertions(+), 19 deletions(-)

diff --git a/docs/api/python/contrib/onnx.md b/docs/api/python/contrib/onnx.md
index 44aabaf..6fb546f 100644
--- a/docs/api/python/contrib/onnx.md
+++ b/docs/api/python/contrib/onnx.md
@@ -13,7 +13,7 @@ With ONNX format support for MXNet, developers can build and 
train models with a
 ```
 
 ### Installation Instructions
-- To use this module developers need to **install ONNX**, which requires 
protobuf compiler to be installed separately. Please follow the [instructions 
to install ONNX and its 
dependencies](https://github.com/onnx/onnx#installation). Once installed, you 
can go through the tutorials on how to use this module.
+- To use this module developers need to **install ONNX**, which requires the 
protobuf compiler to be installed separately. Please follow the [instructions 
to install ONNX and its 
dependencies](https://github.com/onnx/onnx#installation). **MXNet currently 
supports ONNX v1.1.1**. Once installed, you can go through the tutorials on how 
to use this module.
 
 
 This document describes all the ONNX-MXNet APIs.
@@ -23,6 +23,7 @@ This document describes all the ONNX-MXNet APIs.
 :nosignatures:
 
 mxnet.contrib.onnx.import_model
+mxnet.contrib.onnx.get_model_metadata
 ```
 
 ## ONNX Tutorials
@@ -43,7 +44,8 @@ This document describes all the ONNX-MXNet APIs.
 ```eval_rst
 
 .. automodule:: mxnet.contrib.onnx
-:members: import_model 
+:members: import_model
+:members: get_model_metadata
 
 ```
 
diff --git a/docs/tutorials/onnx/inference_on_onnx_model.md 
b/docs/tutorials/onnx/inference_on_onnx_model.md
index f342dad..3d4072a 100644
--- a/docs/tutorials/onnx/inference_on_onnx_model.md
+++ b/docs/tutorials/onnx/inference_on_onnx_model.md
@@ -104,17 +104,26 @@ We pick a context, GPU if available, otherwise CPU
 ctx = mx.gpu() if mx.test_utils.list_gpus() else mx.cpu()
 ```
 
-We obtain the data names of the inputs to the model, by listing all the inputs 
to the symbol graph and excluding the argument and auxiliary parameters from 
that list:
+We obtain the data names of the inputs to the model by using the model 
metadata API: 
 
 ```python
-data_names = [graph_input for graph_input in sym.list_inputs()
-  if graph_input not in arg_params and graph_input not in 
aux_params]
-print(data_names)
+model_metadata = onnx_mxnet.get_model_metadata(onnx_path)
+print(model_metadata)
 ```
 
+```
+{'output_tensor_data': [(u'gpu_0/softmax_1', (1L, 1000L))],
+ 'input_tensor_data': [(u'gpu_0/data_0', (1L, 3L, 224L, 224L))]}
+```
 
-```['gpu_0/data_0']```
+```python
+data_names = [inputs[0] for inputs in model_metadata.get('input_tensor_data')]
+print(data_names)
+```
 
+```
+[u'gpu_0/data_0']
+```
 
 And load them into a MXNet Gluon symbol block. 
 
diff --git a/example/onnx/super_resolution.py b/example/onnx/super_resolution.py
index a52f1a8..fcb8ccc 100644
--- a/example/onnx/super_resolution.py
+++ b/example/onnx/super_resolution.py
@@ -55,10 +55,8 @@ def get_test_image():
 
 def perform_inference(sym, arg_params, aux_params, input_img, img_cb, img_cr):
 """Perform inference on image using mxnet"""
-# To fetch the data names of the input to the model we list the inputs of 
the symbol graph
-# and exclude the argument and auxiliary parameters from the list
-data_names = [graph_input for graph_input in sym.list_inputs()
-  if graph_input not in arg_params and graph_input not in 
aux_params]
+metadata = onnx_mxnet.get_model_metadata('super_resolution.onnx')
+data_names = [input_name[0] for input_name in 
metadata.get('input_tensor_data')]
 # create module
 mod = mx.mod.Module(symbol=sym, data_names=data_names, label_names=None)
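
Taken together, the committed changes suggest the following usage pattern (a minimal sketch; `model.onnx` is a placeholder path):

```python
import mxnet as mx
import mxnet.contrib.onnx as onnx_mxnet

# Import the ONNX model: returns the symbol plus argument/auxiliary parameters
sym, arg_params, aux_params = onnx_mxnet.import_model('model.onnx')

# Discover input names and shapes without walking the symbol graph
metadata = onnx_mxnet.get_model_metadata('model.onnx')
data_names = [name for name, _ in metadata.get('input_tensor_data')]

# Bind a module using the discovered input names
mod = mx.mod.Module(symbol=sym, data_names=data_names, label_names=None)
```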
 

[GitHub] anirudh2290 closed pull request #10512: [MXNET-309] [ONNX-MXNet] Model Metadata API

2018-05-16 Thread GitBox
anirudh2290 closed pull request #10512: [MXNET-309] [ONNX-MXNet] Model Metadata 
API
URL: https://github.com/apache/incubator-mxnet/pull/10512
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/docs/api/python/contrib/onnx.md b/docs/api/python/contrib/onnx.md
index 44aabaf4419..6fb546fc2b4 100644
--- a/docs/api/python/contrib/onnx.md
+++ b/docs/api/python/contrib/onnx.md
@@ -13,7 +13,7 @@ With ONNX format support for MXNet, developers can build and 
train models with a
 ```
 
 ### Installation Instructions
-- To use this module developers need to **install ONNX**, which requires 
protobuf compiler to be installed separately. Please follow the [instructions 
to install ONNX and its 
dependencies](https://github.com/onnx/onnx#installation). Once installed, you 
can go through the tutorials on how to use this module.
+- To use this module developers need to **install ONNX**, which requires the 
protobuf compiler to be installed separately. Please follow the [instructions 
to install ONNX and its 
dependencies](https://github.com/onnx/onnx#installation). **MXNet currently 
supports ONNX v1.1.1**. Once installed, you can go through the tutorials on how 
to use this module.
 
 
 This document describes all the ONNX-MXNet APIs.
@@ -23,6 +23,7 @@ This document describes all the ONNX-MXNet APIs.
 :nosignatures:
 
 mxnet.contrib.onnx.import_model
+mxnet.contrib.onnx.get_model_metadata
 ```
 
 ## ONNX Tutorials
@@ -43,7 +44,8 @@ This document describes all the ONNX-MXNet APIs.
 ```eval_rst
 
 .. automodule:: mxnet.contrib.onnx
-:members: import_model 
+:members: import_model
+:members: get_model_metadata
 
 ```
 
diff --git a/docs/tutorials/onnx/inference_on_onnx_model.md 
b/docs/tutorials/onnx/inference_on_onnx_model.md
index f342dad9bea..3d4072a5415 100644
--- a/docs/tutorials/onnx/inference_on_onnx_model.md
+++ b/docs/tutorials/onnx/inference_on_onnx_model.md
@@ -104,17 +104,26 @@ We pick a context, GPU if available, otherwise CPU
 ctx = mx.gpu() if mx.test_utils.list_gpus() else mx.cpu()
 ```
 
-We obtain the data names of the inputs to the model, by listing all the inputs 
to the symbol graph and excluding the argument and auxiliary parameters from 
that list:
+We obtain the data names of the inputs to the model by using the model 
metadata API: 
 
 ```python
-data_names = [graph_input for graph_input in sym.list_inputs()
-  if graph_input not in arg_params and graph_input not in 
aux_params]
-print(data_names)
+model_metadata = onnx_mxnet.get_model_metadata(onnx_path)
+print(model_metadata)
 ```
 
+```
+{'output_tensor_data': [(u'gpu_0/softmax_1', (1L, 1000L))],
+ 'input_tensor_data': [(u'gpu_0/data_0', (1L, 3L, 224L, 224L))]}
+```
 
-```['gpu_0/data_0']```
+```python
+data_names = [inputs[0] for inputs in model_metadata.get('input_tensor_data')]
+print(data_names)
+```
 
+```
+[u'gpu_0/data_0']
+```
 
 And load them into a MXNet Gluon symbol block. 
 
diff --git a/example/onnx/super_resolution.py b/example/onnx/super_resolution.py
index a52f1a892a6..fcb8ccc88ed 100644
--- a/example/onnx/super_resolution.py
+++ b/example/onnx/super_resolution.py
@@ -55,10 +55,8 @@ def get_test_image():
 
 def perform_inference(sym, arg_params, aux_params, input_img, img_cb, img_cr):
 """Perform inference on image using mxnet"""
-# To fetch the data names of the input to the model we list the inputs of 
the symbol graph
-# and exclude the argument and auxiliary parameters from the list
-data_names = [graph_input for graph_input in sym.list_inputs()
-  if graph_input not in arg_params and graph_input not in 
aux_params]
+metadata = onnx_mxnet.get_model_metadata('super_resolution.onnx')
+data_names = [input_name[0] for input_name in 
metadata.get('input_tensor_data')]
 # create module
 mod = mx.mod.Module(symbol=sym, data_names=data_names, label_names=None)
 mod.bind(for_training=False, data_shapes=[(data_names[0], 
input_img.shape)])
diff --git a/python/mxnet/contrib/onnx/__init__.py 
b/python/mxnet/contrib/onnx/__init__.py
index 169ac673455..fb8488ca4f2 100644
--- a/python/mxnet/contrib/onnx/__init__.py
+++ b/python/mxnet/contrib/onnx/__init__.py
@@ -14,7 +14,6 @@
 # KIND, either express or implied.  See the License for the
 # specific language governing permissions and limitations
 # under the License.
-
 """Module for ONNX model format support for Apache MXNet."""
 
-from ._import.import_model import import_model
+from ._import.import_model import import_model, get_model_metadata
diff --git a/python/mxnet/contrib/onnx/_import/import_model.py 
b/python/mxnet/contrib/onnx/_import/import_model.py
index 1bd4b418bc3..4e4d7863755 100644
--- a/python/mxnet/contrib/onnx/_import/import_model.py
+++ 

[GitHub] mrkumar83 commented on issue #10957: Make inner transform activation configurable for LSTMCell

2018-05-16 Thread GitBox
mrkumar83 commented on issue #10957: Make inner transform activation 
configurable for LSTMCell
URL: https://github.com/apache/incubator-mxnet/pull/10957#issuecomment-389722273
 
 
   Added the `recurrent_activation` and `activation` parameters.
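
A sketch of how the configurable activations might look once merged; the keyword names come from the comment above, the values are assumptions:

```python
from mxnet.gluon import rnn

# Hypothetical usage of the activations made configurable by this PR
cell = rnn.LSTMCell(hidden_size=128,
                    recurrent_activation='sigmoid',
                    activation='tanh')
```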




[incubator-mxnet-site] branch asf-site updated: Bump the publish timestamp.

2018-05-16 Thread zhasheng
This is an automated email from the ASF dual-hosted git repository.

zhasheng pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/incubator-mxnet-site.git


The following commit(s) were added to refs/heads/asf-site by this push:
 new 87b677d  Bump the publish timestamp.
87b677d is described below

commit 87b677d6d328ec68b9152d573da1498aa54ca4f8
Author: mxnet-ci 
AuthorDate: Thu May 17 02:09:16 2018 +

Bump the publish timestamp.
---
 date.txt | 1 +
 1 file changed, 1 insertion(+)

diff --git a/date.txt b/date.txt
new file mode 100644
index 000..429839d
--- /dev/null
+++ b/date.txt
@@ -0,0 +1 @@
+Thu May 17 02:09:16 UTC 2018

-- 
To stop receiving notification emails like this one, please contact
zhash...@apache.org.


[GitHub] phoenixbai opened a new issue #10981: build ends with error : elemwise_binary_op-inl.h:78:3574: error: expected primary-expression before ‘template’

2018-05-16 Thread GitBox
phoenixbai opened a new issue #10981: build ends with error : 
elemwise_binary_op-inl.h:78:3574: error: expected primary-expression before 
‘template’
URL: https://github.com/apache/incubator-mxnet/issues/10981
 
 
   ## Description
   I tried to build from source, and it exits with the error below.
   I tried googling and failed to find anything helpful.
   What could be the cause? Please help.
   ```
   src/operator/tensor/./elemwise_binary_op-inl.h: In static member function 
‘static void 
mxnet::op::ElemwiseBinaryOp::RspRspOp(mshadow::Stream*, const 
nnvm::NodeAttrs&, const mxnet::OpContext&, const mxnet::NDArray&, const 
mxnet::NDArray&, mxnet::OpReqType, const mxnet::NDArray&, bool, bool, bool, 
bool)’:
   src/operator/tensor/./elemwise_binary_op-inl.h:78:3574: error: expected primary-expression before ‘template’
      MSHADOW_IDX_TYPE_SWITCH(rsp.aux_type(rowsparse::kIdx), IType, {
                                                                    ^
   src/operator/tensor/./elemwise_binary_op-inl.h:78:3675: error: expected primary-expression before ‘template’
      MSHADOW_IDX_TYPE_SWITCH(rsp.aux_type(rowsparse::kIdx), IType, {
   ```
   
   ## Environment info (Required)
   
   ```
   $python diagnose.py
   --Python Info--
   ('Version  :', '2.7.14')
   ('Compiler :', 'GCC 7.2.0')
   ('Build:', ('default', 'Dec  7 2017 17:05:42'))
   ('Arch :', ('64bit', ''))
   Pip 

[GitHub] chinakook commented on issue #10972: Gluon's PReLU is very slow and a fix to it

2018-05-16 Thread GitBox
chinakook commented on issue #10972: Gluon's PReLU is very slow and a fix to it
URL: 
https://github.com/apache/incubator-mxnet/issues/10972#issuecomment-389720576
 
 
   Yes, I investigated the leakyrelu-inl.h source code. There is indeed a 
broadcast operation when the shape of the 'alpha' parameter is 1.
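
A minimal sketch of the per-channel-alpha direction, assuming one learnable alpha per channel so no scalar broadcast is needed (the class and its layout are hypothetical, not necessarily the fix proposed in the issue):

```python
import mxnet as mx
from mxnet import gluon

class ChannelPReLU(gluon.HybridBlock):
    """Hypothetical PReLU variant with one learnable alpha per channel."""
    def __init__(self, num_channels, **kwargs):
        super(ChannelPReLU, self).__init__(**kwargs)
        with self.name_scope():
            self.alpha = self.params.get('alpha', shape=(num_channels,),
                                         init=mx.init.Constant(0.25))

    def hybrid_forward(self, F, x, alpha):
        # Per-channel slope on the negative part; reshape for NCHW broadcast
        a = alpha.reshape((1, -1, 1, 1))
        return F.maximum(x, 0) + F.broadcast_mul(a, F.minimum(x, 0))
```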




[GitHub] xinyu-intel commented on issue #10613: Add Windows MKLDNN Building Instruction

2018-05-16 Thread GitBox
xinyu-intel commented on issue #10613: Add Windows MKLDNN Building Instruction
URL: https://github.com/apache/incubator-mxnet/pull/10613#issuecomment-389716125
 
 
   @zheng-da @marcoabreu  hi, please help review this pr. Thanks!




[GitHub] jonbakerfish commented on issue #10950: The negative samples shall be other classes?

2018-05-16 Thread GitBox
jonbakerfish commented on issue #10950: The negative samples shall be other 
classes?
URL: 
https://github.com/apache/incubator-mxnet/issues/10950#issuecomment-389712212
 
 
   Hi @chaoyuaw , a PR is created 
https://github.com/apache/incubator-mxnet/pull/10980. Please have a look. 
Thanks.




[GitHub] jonbakerfish opened a new pull request #10980: Sampling negative samples other classes only

2018-05-16 Thread GitBox
jonbakerfish opened a new pull request #10980: Sampling negative samples other 
classes only
URL: https://github.com/apache/incubator-mxnet/pull/10980
 
 
   ## Description ##
   The original code has a chance to sample negative examples from the same class as the anchor.
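
The gist of the fix, as a standalone sketch with assumed names (`labels` holds the class of each candidate; sampling is restricted to classes different from the anchor's):

```python
import numpy as np

def sample_negative(labels, anchor_idx, rng=np.random):
    """Sample an index whose class differs from the anchor's class."""
    anchor_class = labels[anchor_idx]
    # Assumes at least one candidate of a different class exists
    candidates = np.where(labels != anchor_class)[0]
    return rng.choice(candidates)
```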
   




[GitHub] aaronmarkham commented on issue #10978: removed script tags that are being interpreted as html

2018-05-16 Thread GitBox
aaronmarkham commented on issue #10978: removed script tags that are being 
interpreted as html
URL: https://github.com/apache/incubator-mxnet/pull/10978#issuecomment-389710197
 
 
   FYI, local builds show that the page is interpreted just fine: tags are 
properly escaped and even render correctly. I tried a few variations and it 
appears to be fine. 




[GitHub] zhanghang1989 commented on issue #10852: [MXNET-411] Add ROI Align

2018-05-16 Thread GitBox
zhanghang1989 commented on issue #10852: [MXNET-411] Add ROI Align
URL: https://github.com/apache/incubator-mxnet/pull/10852#issuecomment-389709407
 
 
   @piiswrong @zhreshold I have added the unit tests. Please see the updates. 
Thanks!




[GitHub] pengzhao-intel commented on issue #10979: Fix bugs in MKLDNN.

2018-05-16 Thread GitBox
pengzhao-intel commented on issue #10979: Fix bugs in MKLDNN.
URL: https://github.com/apache/incubator-mxnet/pull/10979#issuecomment-389707273
 
 
   @zheng-da  Could you add more description (or change the title) to make the 
PR readable?
   
   Please try the a3c case: `const mkldnn::memory *NDArray::GetMKLDNNData` will 
return nullptr since the two memory formats are not equal.
   Do you think we need to fix that issue in this PR too? 
   
https://github.com/apache/incubator-mxnet/tree/master/example/reinforcement-learning/a3c
   




[GitHub] azai91 commented on issue #10836: mxnet-mkldnn(v1.2.0) to use whole CPU cores on machines without hyperthreading

2018-05-16 Thread GitBox
azai91 commented on issue #10836: mxnet-mkldnn(v1.2.0) to use whole CPU cores 
on machines without hyperthreading
URL: 
https://github.com/apache/incubator-mxnet/issues/10836#issuecomment-389704315
 
 
   Is there any issue with just using the following to get the number of cores?
   
   ```
   #include <thread>
   unsigned concurrentThreadsSupported = std::thread::hardware_concurrency();
   ```
   
   
https://stackoverflow.com/questions/150355/programmatically-find-the-number-of-cores-on-a-machine
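
For what it's worth, `hardware_concurrency()` (like Python's `os.cpu_count()`) reports logical processors, while this issue is about physical cores. A rough, Linux-only sketch of the distinction (illustrative; parsing /proc/cpuinfo is not portable):

```python
import os

def logical_cores():
    # Logical processors, analogous to std::thread::hardware_concurrency()
    return os.cpu_count()

def physical_cores():
    # Count unique (physical id, core id) pairs from /proc/cpuinfo
    cores = set()
    phys = core = None
    with open('/proc/cpuinfo') as f:
        for line in f:
            if line.startswith('physical id'):
                phys = line.split(':')[1].strip()
            elif line.startswith('core id'):
                core = line.split(':')[1].strip()
                cores.add((phys, core))
    # Fall back to the logical count if the fields are absent (VMs, ARM)
    return len(cores) or logical_cores()
```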
   
   




[GitHub] yifeim commented on a change in pull request #10946: Remove kvstore calls from FM example

2018-05-16 Thread GitBox
yifeim commented on a change in pull request #10946: Remove kvstore calls from 
FM example
URL: https://github.com/apache/incubator-mxnet/pull/10946#discussion_r188799809
 
 

 ##
 File path: example/sparse/factorization_machine/train.py
 ##
 @@ -75,6 +76,16 @@
 assert(args.data_train is not None and args.data_test is not None), \
   "dataset for training or test is missing"
 
+def batch_row_ids(data_batch):
+    """ Generate row ids based on the current mini-batch """
+    idx = batch.data[0].indices
 
 Review comment:
   Typo: the parameter is named `data_batch` but the body uses `batch`. Also, 
consider what to do when the batch contains multiple data fields. To elaborate: 
if my batch.data contains multiple sources provided through a dictionary (order 
is not enforced), how do I address the correct field? batch.provide_data 
returned None to me. 
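
A corrected sketch of the helper with the typo fixed (the return value is illustrative; the rest of the function is truncated in the quote above):

```python
def batch_row_ids(data_batch):
    """Generate row ids based on the current mini-batch."""
    # Use the parameter consistently instead of the undefined name `batch`
    idx = data_batch.data[0].indices
    return idx
```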




[GitHub] zheng-da opened a new pull request #10979: Fix bugs in MKLDNN.

2018-05-16 Thread GitBox
zheng-da opened a new pull request #10979: Fix bugs in MKLDNN.
URL: https://github.com/apache/incubator-mxnet/pull/10979
 
 
   ## Description ##
   (Brief description on what this PR is about)
   
   ## Checklist ##
   ### Essentials ###
   Please feel free to remove inapplicable items for your PR.
   - [ ] The PR title starts with [MXNET-$JIRA_ID], where $JIRA_ID refers to 
the relevant [JIRA issue](https://issues.apache.org/jira/projects/MXNET/issues) 
created (except PRs with tiny changes)
   - [ ] Changes are complete (i.e. I finished coding on this PR)
   - [ ] All changes have test coverage:
   - Unit tests are added for small changes to verify correctness (e.g. adding 
a new operator)
   - Nightly tests are added for complicated/long-running ones (e.g. changing 
distributed kvstore)
   - Build tests will be added for build configuration changes (e.g. adding a 
new build option with NCCL)
   - [ ] Code is well-documented: 
   - For user-facing API changes, API doc string has been updated. 
   - For new C++ functions in header files, their functionalities and arguments 
are documented. 
   - For new examples, README.md is added to explain what the example does, 
the source of the dataset, expected performance on the test set, and a reference 
to the original paper if applicable
   - Check the API doc at 
http://mxnet-ci-doc.s3-accelerate.dualstack.amazonaws.com/PR-$PR_ID/$BUILD_ID/index.html
   - [ ] To the best of my knowledge, examples are either not affected by this 
change, or have been fixed to be compatible with this change
   
   ### Changes ###
   - [ ] Feature1, tests, (and when applicable, API doc)
   - [ ] Feature2, tests, (and when applicable, API doc)
   
   ## Comments ##
   - If this change is a backward incompatible change, why must this change be 
made.
   - Interesting edge cases to note here
   




[GitHub] ThomasDelteil commented on issue #10959: [MXNET-423] Gluon Model Zoo Pre Trained Model tutorial

2018-05-16 Thread GitBox
ThomasDelteil commented on issue #10959: [MXNET-423] Gluon Model Zoo Pre 
Trained Model tutorial
URL: https://github.com/apache/incubator-mxnet/pull/10959#issuecomment-389695869
 
 
   @indhub Thanks for the review! I have updated accordingly.
   
   I have linked to two fine-tuning tutorials which cover the subject quite 
well, especially the straight dope one, since it uses a model from the model 
zoo as well. 
   
   As for freezing the parameters during fine-tuning, my experience has been 
that it consistently underperformed compared to full network fine-tuning, even 
with smaller datasets. Do you have any use-case or datasets where freezing 
parameters during fine-tuning would be advantageous? In my tests, I thought I 
would at least see faster convergence when freezing all but the last layer, but 
it took roughly the same time and got worse accuracy than full network training.




[GitHub] indhub commented on a change in pull request #10956: [MXNET-307] Fix flaky tutorial tests from CI

2018-05-16 Thread GitBox
indhub commented on a change in pull request #10956: [MXNET-307] Fix flaky 
tutorial tests from CI
URL: https://github.com/apache/incubator-mxnet/pull/10956#discussion_r188796904
 
 

 ##
 File path: docs/tutorials/python/predict_image.md
 ##
 @@ -1,33 +1,28 @@
 # Predict with pre-trained models
 
-This tutorial explains how to recognize objects in an image with a
-pre-trained model, and how to perform feature extraction.
+This tutorial explains how to recognize objects in an image with a pre-trained 
model, and how to perform feature extraction.
 
 ## Prerequisites
 
 To complete this tutorial, we need:
 
 - MXNet. See the instructions for your operating system in [Setup and 
Installation](http://mxnet.io/install/index.html)
 
-- [Python Requests](http://docs.python-requests.org/en/master/), 
[Matplotlib](https://matplotlib.org/) and [Jupyter 
Notebook](http://jupyter.org/index.html).
+- [Matplotlib](https://matplotlib.org/) and [Jupyter 
Notebook](http://jupyter.org/index.html).
 
 ```
-$ pip install requests matplotlib jupyter opencv-python
+$ pip install matplotlib
 ```
 
 ## Loading
 
-We first download a pre-trained ResNet 152 layer that is trained on the full
-ImageNet dataset with over 10 million images and 10 thousand classes. A
-pre-trained model contains two parts, a json file containing the model
-definition and a binary file containing the parameters. In addition, there may 
be
-a text file for the labels.
+We first download a pre-trained ResNet 18 layer that is trained on the 
ImageNet dataset with over 1 million images and one thousand classes. A 
pre-trained model contains two parts, a json file containing the model 
definition and a binary file containing the parameters. In addition, there may 
be a `synset.txt` text file for the labels.
 
 Review comment:
   'ResNet 18 model' or 'ResNet 18 layer model'




[GitHub] indhub commented on a change in pull request #10959: [MXNET-423] Gluon Model Zoo Pre Trained Model tutorial

2018-05-16 Thread GitBox
indhub commented on a change in pull request #10959: [MXNET-423] Gluon Model 
Zoo Pre Trained Model tutorial
URL: https://github.com/apache/incubator-mxnet/pull/10959#discussion_r188791651
 
 

 ##
 File path: docs/tutorials/gluon/pretrained_models.md
 ##
 @@ -0,0 +1,373 @@
+
+# Using pre-trained models in MXNet
+
+In this tutorial we will see how to use multiple pre-trained models with 
Apache MXNet. First, let's download three image classification models from the 
Apache MXNet [Gluon model 
zoo](https://mxnet.incubator.apache.org/api/python/gluon/model_zoo.html).
+* **DenseNet-121** ([research paper](https://arxiv.org/abs/1608.06993)), 
improved state of the art on [ImageNet 
dataset](http://image-net.org/challenges/LSVRC) in 2016.
+* **MobileNet** ([research paper](https://arxiv.org/abs/1704.04861)), 
MobileNets are based on a streamlined architecture that uses depth-wise 
separable convolutions to build light weight deep neural networks, suitable for 
mobile applications.
+* **ResNet-18** ([research paper](https://arxiv.org/abs/1512.03385v1)), the 
-152 version is the 2015 winner in multiple categories.
+
+Why would you want to try multiple models? Why not just pick the one with the 
best accuracy? As we will see later in the tutorial, even though these models 
have been trained on the same data set and optimized for maximum accuracy, they 
do behave slightly differently on specific images. In addition, prediction 
speed can vary, and that's an important factor for many applications. By trying 
a few pretrained models, you have an opportunity to find a model that can be a 
good fit for solving your business problem.
+
+
+```python
+import mxnet as mx
+from mxnet import gluon, nd
+from mxnet.gluon.model_zoo import vision
+import matplotlib.pyplot as plt
+import numpy as np
+import json
+%matplotlib inline
+```
+
+## Loading the model
+
+The [Gluon Model 
Zoo](https://mxnet.incubator.apache.org/api/python/gluon/model_zoo.html) 
provides a collection of off-the-shelf models. You can get the ImageNet 
pre-trained model by using `pretrained=True`. 
+If you want to train on your own classification problem from scratch, you can 
get an untrained network with a specific number of classes using the 
`classes=10` for example
+
+We can specify the *context* where we want to run the model: the default 
behavior is to use a CPU context. There are two reasons for this:
+* First, this will allow you to test the notebook even if your machine is not 
equipped with a GPU :)
+* Second, we're going to predict a single image and we don't have any specific 
performance requirements. For production applications where you'd want to 
predict large batches of images with the best possible throughput, a GPU could 
definitely be the way to go.
+* If you want to use a GPU, make sure you have pip installed the right version 
of mxnet, or you will get an error when using the `mx.gpu()` context. Refer to 
the [install instructions](http://mxnet.incubator.apache.org/install/index.html)
+
+
+```python
+# We set the context to CPU, you can switch to GPU if you have one and 
installed a compatible version of MXNet 
+ctx = mx.cpu() 
+```
+
+
+```python
+# We can load the three models
+densenet121 = vision.densenet121(pretrained=True, ctx=ctx)
+mobileNet = vision.mobilenet0_5(pretrained=True, ctx=ctx)
+resnet18 = vision.resnet18_v1(pretrained=True, ctx=ctx)
+```
+
+We can look at the description of the MobileNet network for example, which has 
a relatively simple though deep architecture
+
+
+```python
+print(mobileNet)
+```
+
+MobileNet(
+  (features): HybridSequential(
+(0): Conv2D(3 -> 16, kernel_size=(3, 3), stride=(2, 2), padding=(1, 
1), bias=False)
+(1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, 
use_global_stats=False, in_channels=16)
+(2): Activation(relu)
+(3): Conv2D(1 -> 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 
1), groups=16, bias=False)
+(4): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, 
use_global_stats=False, in_channels=16)
+(5): Activation(relu)
+(6): Conv2D(16 -> 32, kernel_size=(1, 1), stride=(1, 1), bias=False)
+(7): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, 
use_global_stats=False, in_channels=32)
+(8): Activation(relu)
+(9): Conv2D(1 -> 32, kernel_size=(3, 3), stride=(2, 2), padding=(1, 
1), groups=32, bias=False)
+(10): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, 
use_global_stats=False, in_channels=32)
+(11): Activation(relu)
+(12): Conv2D(32 -> 64, kernel_size=(1, 1), stride=(1, 1), bias=False)
+(13): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, 
use_global_stats=False, in_channels=64)
+(14): Activation(relu)
+(15): Conv2D(1 -> 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 
1), groups=64, bias=False)
+(16): BatchNorm(axis=1, eps=1e-05, momentum=0.9, 

[GitHub] indhub commented on a change in pull request #10959: [MXNET-423] Gluon Model Zoo Pre Trained Model tutorial

2018-05-16 Thread GitBox
indhub commented on a change in pull request #10959: [MXNET-423] Gluon Model 
Zoo Pre Trained Model tutorial
URL: https://github.com/apache/incubator-mxnet/pull/10959#discussion_r188788759
 
 

 ##
 File path: docs/tutorials/gluon/pretrained_models.md
 ##
 @@ -0,0 +1,373 @@
+
+# Using pre-trained models in MXNet
+
+In this tutorial we will see how to use multiple pre-trained models with 
Apache MXNet. First, let's download three image classification models from the 
Apache MXNet [Gluon model 
zoo](https://mxnet.incubator.apache.org/api/python/gluon/model_zoo.html).
+* **DenseNet-121** ([research paper](https://arxiv.org/abs/1608.06993)), 
improved state of the art on [ImageNet 
dataset](http://image-net.org/challenges/LSVRC) in 2016.
+* **MobileNet** ([research paper](https://arxiv.org/abs/1704.04861)), 
MobileNets are based on a streamlined architecture that uses depth-wise 
separable convolutions to build light weight deep neural networks, suitable for 
mobile applications.
+* **ResNet-18** ([research paper](https://arxiv.org/abs/1512.03385v1)), the 
-152 version is the 2015 winner in multiple categories.
+
+Why would you want to try multiple models? Why not just pick the one with the 
best accuracy? As we will see later in the tutorial, even though these models 
have been trained on the same data set and optimized for maximum accuracy, they 
do behave slightly differently on specific images. In addition, prediction 
speed can vary, and that's an important factor for many applications. By trying 
a few pretrained models, you have an opportunity to find a model that can be a 
good fit for solving your business problem.
+
+
+```python
+import mxnet as mx
+from mxnet import gluon, nd
+from mxnet.gluon.model_zoo import vision
+import matplotlib.pyplot as plt
+import numpy as np
+import json
+%matplotlib inline
+```
+
+## Loading the model
+
+The [Gluon Model 
Zoo](https://mxnet.incubator.apache.org/api/python/gluon/model_zoo.html) 
provides a collection of off-the-shelf models. You can get the ImageNet 
pre-trained model by using `pretrained=True`. 
+If you want to train on your own classification problem from scratch, you can 
get an untrained network with a specific number of classes using the 
`classes=10` for example
 
 Review comment:
   using the `classes` parameter. For example,
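
For instance, something along these lines (a hedged illustration of the suggested wording, using the model zoo's documented `classes` argument):

```python
from mxnet.gluon.model_zoo import vision

# An untrained ResNet-18 with 10 output classes, e.g. for a 10-class problem
net = vision.resnet18_v1(classes=10, pretrained=False)
```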




[GitHub] indhub commented on a change in pull request #10959: [MXNET-423] Gluon Model Zoo Pre Trained Model tutorial

2018-05-16 Thread GitBox
indhub commented on a change in pull request #10959: [MXNET-423] Gluon Model 
Zoo Pre Trained Model tutorial
URL: https://github.com/apache/incubator-mxnet/pull/10959#discussion_r188792363
 
 

 ##
 File path: docs/tutorials/gluon/pretrained_models.md
 ##
 @@ -0,0 +1,373 @@
+
+# Using pre-trained models in MXNet
+
+In this tutorial we will see how to use multiple pre-trained models with 
Apache MXNet. First, let's download three image classification models from the 
Apache MXNet [Gluon model 
zoo](https://mxnet.incubator.apache.org/api/python/gluon/model_zoo.html).
+* **DenseNet-121** ([research paper](https://arxiv.org/abs/1608.06993)), 
improved state of the art on [ImageNet 
dataset](http://image-net.org/challenges/LSVRC) in 2016.
+* **MobileNet** ([research paper](https://arxiv.org/abs/1704.04861)), 
MobileNets are based on a streamlined architecture that uses depth-wise 
separable convolutions to build light weight deep neural networks, suitable for 
mobile applications.
+* **ResNet-18** ([research paper](https://arxiv.org/abs/1512.03385v1)), the 
-152 version is the 2015 winner in multiple categories.
+
+Why would you want to try multiple models? Why not just pick the one with the 
best accuracy? As we will see later in the tutorial, even though these models 
have been trained on the same data set and optimized for maximum accuracy, they 
do behave slightly differently on specific images. In addition, prediction 
speed can vary, and that's an important factor for many applications. By trying 
a few pretrained models, you have an opportunity to find a model that can be a 
good fit for solving your business problem.
+
+
+```python
+import mxnet as mx
+from mxnet import gluon, nd
+from mxnet.gluon.model_zoo import vision
+import matplotlib.pyplot as plt
+import numpy as np
+import json
+%matplotlib inline
+```
+
+## Loading the model
+
+The [Gluon Model 
Zoo](https://mxnet.incubator.apache.org/api/python/gluon/model_zoo.html) 
provides a collection of off-the-shelf models. You can get the ImageNet 
pre-trained model by using `pretrained=True`. 
+If you want to train on your own classification problem from scratch, you can 
get an untrained network with a specific number of classes using the 
`classes=10` for example
+
+We can specify the *context* where we want to run the model: the default 
behavior is to use a CPU context. There are two reasons for this:
+* First, this will allow you to test the notebook even if your machine is not 
equipped with a GPU :)
+* Second, we're going to predict a single image and we don't have any specific 
performance requirements. For production applications where you'd want to 
predict large batches of images with the best possible throughput, a GPU could 
definitely be the way to go.
+* If you want to use a GPU, make sure you have pip installed the right version 
of mxnet, or you will get an error when using the `mx.gpu()` context. Refer to 
the [install instructions](http://mxnet.incubator.apache.org/install/index.html)
+
+
+```python
+# We set the context to CPU, you can switch to GPU if you have one and 
installed a compatible version of MXNet 
+ctx = mx.cpu() 
+```
+
+
+```python
+# We can load the three models
+densenet121 = vision.densenet121(pretrained=True, ctx=ctx)
+mobileNet = vision.mobilenet0_5(pretrained=True, ctx=ctx)
+resnet18 = vision.resnet18_v1(pretrained=True, ctx=ctx)
+```
+
+We can look at the description of the MobileNet network for example, which has 
a relatively simple though deep architecture
+
+
+```python
+print(mobileNet)
+```
+
+MobileNet(
+  (features): HybridSequential(
+(0): Conv2D(3 -> 16, kernel_size=(3, 3), stride=(2, 2), padding=(1, 
1), bias=False)
+(1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, 
use_global_stats=False, in_channels=16)
+(2): Activation(relu)
+(3): Conv2D(1 -> 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 
1), groups=16, bias=False)
+(4): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, 
use_global_stats=False, in_channels=16)
+(5): Activation(relu)
+(6): Conv2D(16 -> 32, kernel_size=(1, 1), stride=(1, 1), bias=False)
+(7): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, 
use_global_stats=False, in_channels=32)
+(8): Activation(relu)
+(9): Conv2D(1 -> 32, kernel_size=(3, 3), stride=(2, 2), padding=(1, 
1), groups=32, bias=False)
+(10): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, 
use_global_stats=False, in_channels=32)
+(11): Activation(relu)
+(12): Conv2D(32 -> 64, kernel_size=(1, 1), stride=(1, 1), bias=False)
+(13): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, 
use_global_stats=False, in_channels=64)
+(14): Activation(relu)
+(15): Conv2D(1 -> 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 
1), groups=64, bias=False)
+(16): BatchNorm(axis=1, eps=1e-05, momentum=0.9, 

[GitHub] indhub commented on a change in pull request #10959: [MXNET-423] Gluon Model Zoo Pre Trained Model tutorial

2018-05-16 Thread GitBox
indhub commented on a change in pull request #10959: [MXNET-423] Gluon Model 
Zoo Pre Trained Model tutorial
URL: https://github.com/apache/incubator-mxnet/pull/10959#discussion_r188791808
 
 

 ##
 File path: docs/tutorials/gluon/pretrained_models.md
 ##
 @@ -0,0 +1,373 @@
+
+# Using pre-trained models in MXNet
+
+In this tutorial we will see how to use multiple pre-trained models with 
Apache MXNet. First, let's download three image classification models from the 
Apache MXNet [Gluon model 
zoo](https://mxnet.incubator.apache.org/api/python/gluon/model_zoo.html).
+* **DenseNet-121** ([research paper](https://arxiv.org/abs/1608.06993)), 
improved state of the art on [ImageNet 
dataset](http://image-net.org/challenges/LSVRC) in 2016.
+* **MobileNet** ([research paper](https://arxiv.org/abs/1704.04861)), 
MobileNets are based on a streamlined architecture that uses depth-wise 
separable convolutions to build light weight deep neural networks, suitable for 
mobile applications.
+* **ResNet-18** ([research paper](https://arxiv.org/abs/1512.03385v1)), the 
-152 version is the 2015 winner in multiple categories.
+
+Why would you want to try multiple models? Why not just pick the one with the 
best accuracy? As we will see later in the tutorial, even though these models 
have been trained on the same data set and optimized for maximum accuracy, they 
do behave slightly differently on specific images. In addition, prediction 
speed can vary, and that's an important factor for many applications. By trying 
a few pretrained models, you have an opportunity to find a model that can be a 
good fit for solving your business problem.
+
+
+```python
+import mxnet as mx
+from mxnet import gluon, nd
+from mxnet.gluon.model_zoo import vision
+import matplotlib.pyplot as plt
+import numpy as np
+import json
+%matplotlib inline
+```
+
+## Loading the model
+
+The [Gluon Model 
Zoo](https://mxnet.incubator.apache.org/api/python/gluon/model_zoo.html) 
provides a collection of off-the-shelf models. You can get the ImageNet 
pre-trained model by using `pretrained=True`. 
+If you want to train on your own classification problem from scratch, you can 
get an untrained network with a specific number of classes using the 
`classes=10` for example
+
+We can specify the *context* where we want to run the model: the default 
behavior is to use a CPU context. There are two reasons for this:
+* First, this will allow you to test the notebook even if your machine is not 
equipped with a GPU :)
+* Second, we're going to predict a single image and we don't have any specific 
performance requirements. For production applications where you'd want to 
predict large batches of images with the best possible throughput, a GPU could 
definitely be the way to go.
+* If you want to use a GPU, make sure you have pip installed the right version 
of mxnet, or you will get an error when using the `mx.gpu()` context. Refer to 
the [install instructions](http://mxnet.incubator.apache.org/install/index.html)
+
+
+```python
+# We set the context to CPU, you can switch to GPU if you have one and 
installed a compatible version of MXNet 
+ctx = mx.cpu() 
+```
+
+
+```python
+# We can load the three models
+densenet121 = vision.densenet121(pretrained=True, ctx=ctx)
+mobileNet = vision.mobilenet0_5(pretrained=True, ctx=ctx)
+resnet18 = vision.resnet18_v1(pretrained=True, ctx=ctx)
+```
+
+We can look at the description of the MobileNet network for example, which has 
a relatively simple though deep architecture
+
+
+```python
+print(mobileNet)
+```
+
+MobileNet(
+  (features): HybridSequential(
+(0): Conv2D(3 -> 16, kernel_size=(3, 3), stride=(2, 2), padding=(1, 
1), bias=False)
+(1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, 
use_global_stats=False, in_channels=16)
+(2): Activation(relu)
+(3): Conv2D(1 -> 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 
1), groups=16, bias=False)
+(4): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, 
use_global_stats=False, in_channels=16)
+(5): Activation(relu)
+(6): Conv2D(16 -> 32, kernel_size=(1, 1), stride=(1, 1), bias=False)
+(7): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, 
use_global_stats=False, in_channels=32)
+(8): Activation(relu)
+(9): Conv2D(1 -> 32, kernel_size=(3, 3), stride=(2, 2), padding=(1, 
1), groups=32, bias=False)
+(10): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, 
use_global_stats=False, in_channels=32)
+(11): Activation(relu)
+(12): Conv2D(32 -> 64, kernel_size=(1, 1), stride=(1, 1), bias=False)
+(13): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, 
use_global_stats=False, in_channels=64)
+(14): Activation(relu)
+(15): Conv2D(1 -> 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 
1), groups=64, bias=False)
+(16): BatchNorm(axis=1, eps=1e-05, momentum=0.9, 

[GitHub] aaronmarkham opened a new pull request #10978: removed script tags that are being interpreted as html

2018-05-16 Thread GitBox
aaronmarkham opened a new pull request #10978: removed script tags that are 
being interpreted as html
URL: https://github.com/apache/incubator-mxnet/pull/10978
 
 
   ## Description ##
   Minor bug fix for docs rendered on the site. 
   
   Markdown files with script tags in them are being converted to working HTML 
tags. This is probably a fairly interesting security vulnerability, but for now 
I'm just fixing how the site looks by removing the offending tags from the 
markdown.
   
   

[GitHub] zhanghang1989 commented on a change in pull request #10852: [MXNET-411] Add ROI Align

2018-05-16 Thread GitBox
zhanghang1989 commented on a change in pull request #10852: [MXNET-411] Add ROI 
Align
URL: https://github.com/apache/incubator-mxnet/pull/10852#discussion_r188783355
 
 

 ##
 File path: tests/python/unittest/test_operator.py
 ##
 @@ -6018,6 +6019,24 @@ def test_context_num_gpus():
 if str(e).find("CUDA") == -1:
 raise e
 
+
+@with_seed()
+def test_op_roi_align():
+ctx=default_context()
+data = mx.symbol.Variable(name='data')
+rois = mx.symbol.Variable(name='rois')
+test = mx.symbol.contrib.ROIAlign(data=data, rois=rois, pooled_size=(4, 
4), spatial_scale=1)
+
+x1 = np.random.rand(4, 1, 12, 12).astype('float64')
+x2 = np.array([[0, 1.1, 1.1, 6.2, 6.2], [2, 6.1, 2.1, 8.2, 11.2], [1, 3.1, 
1.1, 5.2, 10.2]], dtype='float64')
+
+check_numeric_gradient(sym=test, location=[x1, x2],
+   grad_nodes={'data':'write', 'rois':'null'},
+   numeric_eps=1e-4, rtol=1e-1, atol=1e-4)
+check_numeric_gradient(sym=test, location=[x1, x2],
+   grad_nodes={'data':'add', 'rois':'null'},
+   numeric_eps=1e-4, rtol=1e-1, atol=1E-4)
+
 
 Review comment:
   Yeah, will add a forward result check soon.  




[GitHub] zhreshold commented on a change in pull request #10852: [MXNET-411] Add ROI Align

2018-05-16 Thread GitBox
zhreshold commented on a change in pull request #10852: [MXNET-411] Add ROI 
Align
URL: https://github.com/apache/incubator-mxnet/pull/10852#discussion_r188774784
 
 

 ##
 File path: src/operator/contrib/roi_align.cu
 ##
 @@ -0,0 +1,496 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+/*!
+ * Copyright (c) 2018 by Contributors
+ * \file roi_align.cu
+ * \brief roi align operator
+ * \author Hang Zhang
+ * Adapted from Caffe2
+*/
+#include "./roi_align-inl.h"
+
+
+namespace mxnet {
+namespace op {
+
+#define CUDA_1D_KERNEL_LOOP(i, n) \
+  for (size_t i = blockIdx.x * blockDim.x + threadIdx.x; i < (n); \
+   i += blockDim.x * gridDim.x)
+
+// The number of cuda threads to use. 512 is used for backward compatibility
+constexpr int ROI_CUDA_NUM_THREADS = 512;
 
 Review comment:
   Using mshadow::cuda::kMaxThreadsPerBlock might possibly provide better perf 
on newer GPUs?
   Use mshadow::cuda::CheckLaunchParam to help check the launch limits.




[GitHub] zhreshold commented on a change in pull request #10852: [MXNET-411] Add ROI Align

2018-05-16 Thread GitBox
zhreshold commented on a change in pull request #10852: [MXNET-411] Add ROI 
Align
URL: https://github.com/apache/incubator-mxnet/pull/10852#discussion_r188775054
 
 

 ##
 File path: src/operator/contrib/roi_align.cu
 ##
 @@ -0,0 +1,496 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+/*!
+ * Copyright (c) 2018 by Contributors
+ * \file roi_align.cu
+ * \brief roi align operator
+ * \author Hang Zhang
+ * Adapted from Caffe2
+*/
+#include "./roi_align-inl.h"
+
+
+namespace mxnet {
+namespace op {
+
+#define CUDA_1D_KERNEL_LOOP(i, n) \
+  for (size_t i = blockIdx.x * blockDim.x + threadIdx.x; i < (n); \
+   i += blockDim.x * gridDim.x)
+
+// The number of cuda threads to use. 512 is used for backward compatibility
+constexpr int ROI_CUDA_NUM_THREADS = 512;
+// The maximum number of blocks to use in the default kernel call.
+constexpr int ROI_MAXIMUM_NUM_BLOCKS = 4096;
+
+/**
+ * @brief Compute the number of blocks needed to run N threads.
+ */
+inline int ROI_GET_BLOCKS(const int N) {
+  return std::max(
+  std::min(
+  (N + ROI_CUDA_NUM_THREADS - 1) / ROI_CUDA_NUM_THREADS,
+  ROI_MAXIMUM_NUM_BLOCKS),
+  // Use at least 1 block, since CUDA does not allow empty block
+  1);
+}
+
+
+template <typename T>
+__device__ T bilinear_interpolate(
+const T* bottom_data,
+const int height,
+const int width,
+T y,
+T x,
+const int index /* index for debug only*/) {
+  // deal with cases that inverse elements are out of feature map boundary
+  if (y < -1.0 || y > height || x < -1.0 || x > width) {
+// empty
+return 0;
+  }
+
+  if (y <= 0) {
+y = 0;
+  }
+  if (x <= 0) {
+x = 0;
+  }
+
+  int y_low = static_cast<int>(y);
+  int x_low = static_cast<int>(x);
+  int y_high;
+  int x_high;
+
+  if (y_low >= height - 1) {
+y_high = y_low = height - 1;
+y = (T)y_low;
+  } else {
+y_high = y_low + 1;
+  }
+
+  if (x_low >= width - 1) {
+x_high = x_low = width - 1;
+x = (T)x_low;
+  } else {
+x_high = x_low + 1;
+  }
+
+  T ly = y - y_low;
+  T lx = x - x_low;
+  T hy = 1. - ly, hx = 1. - lx;
+  // do bilinear interpolation
+  T v1 = bottom_data[y_low * width + x_low];
+  T v2 = bottom_data[y_low * width + x_high];
+  T v3 = bottom_data[y_high * width + x_low];
+  T v4 = bottom_data[y_high * width + x_high];
+  T w1 = hy * hx, w2 = hy * lx, w3 = ly * hx, w4 = ly * lx;
+
+  T val = (w1 * v1 + w2 * v2 + w3 * v3 + w4 * v4);
+
+  return val;
+}
+
+template <typename T>
+__global__ void RoIAlignForwardKernel(
+const int nthreads,
+const T* bottom_data,
+const T spatial_scale,
+const int channels,
+const int height,
+const int width,
+const int pooled_height,
+const int pooled_width,
+const int sampling_ratio,
+const T* bottom_rois,
+T* top_data) {
+  CUDA_1D_KERNEL_LOOP(index, nthreads) {
+// (n, c, ph, pw) is an element in the pooled output
+int pw = index % pooled_width;
+int ph = (index / pooled_width) % pooled_height;
+int c = (index / pooled_width / pooled_height) % channels;
+int n = index / pooled_width / pooled_height / channels;
+
+const T* offset_bottom_rois = bottom_rois + n * 5;
+int roi_batch_ind = offset_bottom_rois[0];
+
+// Do not use rounding; this implementation detail is critical
+T roi_start_w = offset_bottom_rois[1] * spatial_scale;
+T roi_start_h = offset_bottom_rois[2] * spatial_scale;
+T roi_end_w = offset_bottom_rois[3] * spatial_scale;
+T roi_end_h = offset_bottom_rois[4] * spatial_scale;
+// T roi_start_w = round(offset_bottom_rois[1] * spatial_scale);
+// T roi_start_h = round(offset_bottom_rois[2] * spatial_scale);
+// T roi_end_w = round(offset_bottom_rois[3] * spatial_scale);
+// T roi_end_h = round(offset_bottom_rois[4] * spatial_scale);
+
+// Force malformed ROIs to be 1x1
+T roi_width = max(roi_end_w - roi_start_w, (T)1.);
+T roi_height = max(roi_end_h - roi_start_h, (T)1.);
+T bin_size_h = static_cast<T>(roi_height) / static_cast<T>(pooled_height);
+T bin_size_w = static_cast<T>(roi_width) / static_cast<T>(pooled_width);
+
+const T* offset_bottom_data =
+bottom_data + (roi_batch_ind * channels + 
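
   For reference, the launch math above works out as, e.g., 
ROI_GET_BLOCKS(100000) = min((100000 + 511) / 512, 4096) = 196 blocks; the 
grid-stride loop in CUDA_1D_KERNEL_LOOP covers the remainder whenever the 
clamp at 4096 kicks in. Below is a host-side NumPy sketch mirroring the 
`bilinear_interpolate` device function (single channel; an illustration, not 
part of the diff):

   ```python
   import numpy as np

   def bilinear_interpolate(data, y, x):
       """NumPy mirror of the __device__ function above (single channel)."""
       height, width = data.shape
       # Samples outside the feature map contribute zero, as in the kernel.
       if y < -1.0 or y > height or x < -1.0 or x > width:
           return 0.0
       y, x = max(y, 0.0), max(x, 0.0)
       y_low, x_low = int(y), int(x)
       if y_low >= height - 1:
           y_high = y_low = height - 1
           y = float(y_low)
       else:
           y_high = y_low + 1
       if x_low >= width - 1:
           x_high = x_low = width - 1
           x = float(x_low)
       else:
           x_high = x_low + 1
       ly, lx = y - y_low, x - x_low
       hy, hx = 1.0 - ly, 1.0 - lx
       return (hy * hx * data[y_low, x_low] + hy * lx * data[y_low, x_high] +
               ly * hx * data[y_high, x_low] + ly * lx * data[y_high, x_high])

   data = np.arange(16, dtype=np.float64).reshape(4, 4)
   print(bilinear_interpolate(data, 1.5, 1.5))  # 7.5, the mean of 5, 6, 9, 10
   ```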

[GitHub] zhreshold commented on a change in pull request #10852: [MXNET-411] Add ROI Align

2018-05-16 Thread GitBox
zhreshold commented on a change in pull request #10852: [MXNET-411] Add ROI 
Align
URL: https://github.com/apache/incubator-mxnet/pull/10852#discussion_r188777217
 
 

 ##
 File path: tests/python/unittest/test_operator.py
 ##
 @@ -6018,6 +6019,24 @@ def test_context_num_gpus():
 if str(e).find("CUDA") == -1:
 raise e
 
+
+@with_seed()
+def test_op_roi_align():
+ctx=default_context()
+data = mx.symbol.Variable(name='data')
+rois = mx.symbol.Variable(name='rois')
+test = mx.symbol.contrib.ROIAlign(data=data, rois=rois, pooled_size=(4, 4), spatial_scale=1)
+
+x1 = np.random.rand(4, 1, 12, 12).astype('float64')
+x2 = np.array([[0, 1.1, 1.1, 6.2, 6.2], [2, 6.1, 2.1, 8.2, 11.2], [1, 3.1, 1.1, 5.2, 10.2]], dtype='float64')
+
+check_numeric_gradient(sym=test, location=[x1, x2],
+   grad_nodes={'data':'write', 'rois':'null'},
+   numeric_eps=1e-4, rtol=1e-1, atol=1e-4)
+check_numeric_gradient(sym=test, location=[x1, x2],
+   grad_nodes={'data':'add', 'rois':'null'},
+   numeric_eps=1e-4, rtol=1e-1, atol=1E-4)
+
 
 Review comment:
   Need a forward result check in addition to the gradient check.
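
   For illustration, such a check could reuse `test`, `x1` and `x2` from the 
snippet above; the sketch below only validates output shape and finiteness, 
while a full result check would additionally compare against a NumPy 
reference implementation of ROIAlign:

   ```python
   import numpy as np
   import mxnet as mx

   # Bind the symbol and run one forward pass on CPU.
   exe = test.bind(mx.cpu(), {'data': mx.nd.array(x1), 'rois': mx.nd.array(x2)})
   out = exe.forward(is_train=False)[0].asnumpy()

   # ROIAlign output layout is (num_rois, channels, pooled_h, pooled_w).
   assert out.shape == (x2.shape[0], x1.shape[1], 4, 4)
   assert np.isfinite(out).all()
   ```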


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] ankkhedia commented on issue #10969: Mxnet dynamic graphs achieve scoure code

2018-05-16 Thread GitBox
ankkhedia commented on issue #10969: Mxnet dynamic graphs achieve scoure code 
URL: 
https://github.com/apache/incubator-mxnet/issues/10969#issuecomment-389655622
 
 
   Hi @qichaotang 
   Could you please confirm whether the query got answered, and close the issue.
   @sandeep-krishnamurthy 
   Please tag this as question and pending requester info


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] marcoabreu commented on issue #10961: [MXNET-10896] [MXNET-10901] Fix test_sparse_mathematical_core sensitivity to scipy v1.1

2018-05-16 Thread GitBox
marcoabreu commented on issue #10961: [MXNET-10896] [MXNET-10901] Fix 
test_sparse_mathematical_core sensitivity to scipy v1.1
URL: https://github.com/apache/incubator-mxnet/pull/10961#issuecomment-389663152
 
 
   Sorry, on the phone right now. It's in CI/docker/install and then in CentOS
   and Ubuntu Python
   
   Dick Carter wrote on Wed., May 16, 2018, 21:52:
   
   > A search for scipy in the MXNet repo shows many pip installs that don't
   > set the version number, particularly in the examples docs. I think it
   > better to be immune to the varying scipy behavior than be stuck using v1.0
   > forever, plus have to handle false bug reports from users that have
   > mistakenly upgraded to v1.1.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] ankkhedia commented on issue #10948: get stuck with subprocess in multithread

2018-05-16 Thread GitBox
ankkhedia commented on issue #10948: get stuck with subprocess in multithread
URL: 
https://github.com/apache/incubator-mxnet/issues/10948#issuecomment-389662914
 
 
   @sandeep-krishnamurthy Could you please tag issue as python and bug


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] ankkhedia commented on issue #10964: linalg_potri() Error: linalg_impl.h:606: Check failed: ret == 0 (3 vs. 0) spotri failed in lapack on cpu.

2018-05-16 Thread GitBox
ankkhedia commented on issue #10964: linalg_potri() Error: linalg_impl.h:606: 
Check failed: ret == 0 (3 vs. 0) spotri failed in lapack on cpu.
URL: 
https://github.com/apache/incubator-mxnet/issues/10964#issuecomment-389662341
 
 
   @sxksxy Hope your issue got resolved.
   @sandeep-krishnamurthy Could you please close the issue and label it as 
operator.
   
   @sxksxy Please feel free to re-open the issue in case it was closed in error.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] ankkhedia commented on issue #10952: The params can't be saved as expected when using --model-prefix for all the training script belonging to example/image-classification

2018-05-16 Thread GitBox
ankkhedia commented on issue #10952: The params can't be saved as expected when 
using --model-prefix for all the training script belonging to 
example/image-classification
URL: 
https://github.com/apache/incubator-mxnet/issues/10952#issuecomment-389661435
 
 
   @sandeep-krishnamurthy Could you please tag this as python and bug


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] ankkhedia commented on issue #10950: The negative samples shall be other classes?

2018-05-16 Thread GitBox
ankkhedia commented on issue #10950: The negative samples shall be other 
classes?
URL: 
https://github.com/apache/incubator-mxnet/issues/10950#issuecomment-389660473
 
 
   @sandeep-krishnamurthy Could you please tag this as Call for Contribution


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] ankkhedia commented on issue #10967: Is the doc for launch.py parameters outdated?

2018-05-16 Thread GitBox
ankkhedia commented on issue #10967: Is the doc for launch.py parameters 
outdated?
URL: 
https://github.com/apache/incubator-mxnet/issues/10967#issuecomment-389657260
 
 
   @sandeep-krishnamurthy Could you please tag this as Question and 
Documentation


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] ankkhedia commented on issue #10968: Discuss about NDArray::Save and NDArray::Load

2018-05-16 Thread GitBox
ankkhedia commented on issue #10968: Discuss about NDArray::Save and 
NDArray::Load
URL: 
https://github.com/apache/incubator-mxnet/issues/10968#issuecomment-389656689
 
 
   @sandeep-krishnamurthy Could you please tag this issue as Discussion and 
NDArray


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] marcoabreu commented on issue #10975: avoid importing docker_cache if feature is not used

2018-05-16 Thread GitBox
marcoabreu commented on issue #10975: avoid importing docker_cache if feature 
is not used
URL: https://github.com/apache/incubator-mxnet/pull/10975#issuecomment-389656578
 
 
   Please fill in the template 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] reminisce commented on a change in pull request #10433: [MXNET-290] MKLDNN support for model quantization

2018-05-16 Thread GitBox
reminisce commented on a change in pull request #10433: [MXNET-290] MKLDNN 
support for model quantization
URL: https://github.com/apache/incubator-mxnet/pull/10433#discussion_r188752161
 
 

 ##
 File path: include/mxnet/c_api.h
 ##
 @@ -1423,13 +1423,15 @@ MXNET_DLL int MXSymbolInferType(SymbolHandle sym,
  * \param excluded_symbols array of symbols to be excluded from being quantized
  * \param num_offline number of parameters that are quantized offline
  * \param offline_params array of c strings representing the names of params 
quantized offline
+ * \param dev_type device type 
  */
 MXNET_DLL int MXQuantizeSymbol(SymbolHandle sym_handle,
SymbolHandle *ret_sym_handle,
const mx_uint num_excluded_symbols,
const SymbolHandle *excluded_symbols,
const mx_uint num_offline,
-   const char **offline_params);
+   const char **offline_params,
+   int dev_type);
 
 Review comment:
   @jinhuang415 I think this design is clearer than before. I agree with adding 
config params to the interface. One minor suggestion: instead of defining a 
boolean value `use_uint8`, it is more general to define a param such as 
`quantized_dtype` defaulting to `int8`. This allows us to pass other 
low-precision data types to the backend.
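
   For illustration, the frontend could then thread a string dtype through to 
`MXQuantizeSymbol` instead of a boolean; a hypothetical sketch (the name 
`quantize_symbol` and its parameters follow the suggestion in this thread and 
are not an existing API):

   ```python
   # Hypothetical frontend: a string `quantized_dtype` (default 'int8')
   # replaces the boolean `use_uint8`, leaving room for other low-precision
   # dtypes later.
   def quantize_symbol(sym, excluded_symbols=(), offline_params=(),
                       quantized_dtype='int8'):
       assert quantized_dtype in ('int8', 'uint8'), 'unsupported quantized dtype'
       # ... forward quantized_dtype to the C API ...
   ```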


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] DickJC123 commented on issue #10961: [MXNET-10896] [MXNET-10901] Fix test_sparse_mathematical_core sensitivity to scipy v1.1

2018-05-16 Thread GitBox
DickJC123 commented on issue #10961: [MXNET-10896] [MXNET-10901] Fix 
test_sparse_mathematical_core sensitivity to scipy v1.1
URL: https://github.com/apache/incubator-mxnet/pull/10961#issuecomment-389644489
 
 
   A search for scipy in the MXNet repo shows many pip installs that don't set 
the version number, particularly in the examples docs.  I think it better to be 
immune to the varying scipy behavior than be stuck using v1.0 forever, plus 
have to handle false bug reports from users that have mistakenly upgraded to 
v1.1.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] DickJC123 opened a new issue #10977: test_kvstore_gpu.py:test_rsp_push_pull fails when run on a system with only one GPU

2018-05-16 Thread GitBox
DickJC123 opened a new issue #10977: test_kvstore_gpu.py:test_rsp_push_pull 
fails when run on a system with only one GPU
URL: https://github.com/apache/incubator-mxnet/issues/10977
 
 
   ## Description
   The test test_rsp_push_pull fails when it attempts to create and use context 
mx.gpu(1) even if only one GPU is present.  The test should ideally perform a 
reduced level of testing in the absence of 2 GPUs, rather than just fail.
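   
   One way to get that graceful degradation is to gate the device list on the 
number of visible GPUs; a minimal sketch (relying on `mx.context.num_gpus()`, 
which the test suite already exercises via `test_context_num_gpus`):
   
   ```python
   import mxnet as mx

   # Fall back to a reduced level of testing instead of failing outright.
   n_gpus = mx.context.num_gpus()
   if n_gpus >= 2:
       ctxs = [mx.gpu(0), mx.gpu(1)]   # full two-GPU test
   elif n_gpus == 1:
       ctxs = [mx.gpu(0)]              # reduced single-GPU test
   else:
       ctxs = [mx.cpu()]               # CPU-only smoke test
   ```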
   
   ## Environment info (Required)
   
   ```
   --Python Info--
   ('Version  :', '2.7.12')
   ('Compiler :', 'GCC 5.4.0 20160609')
   ('Build:', ('default', 'Dec  4 2017 14:50:18'))
   ('Arch :', ('64bit', 'ELF'))
   Pip Info---
   ('Version  :', '10.0.1')
   ('Directory:', '/home/dcarter/.local/lib/python2.7/site-packages/pip')
   --MXNet Info---
   /home/dcarter/mxnet_dev/dgx/mxnet/python/mxnet/optimizer.py:136: 
UserWarning: WARNING: New optimizer mxnet.optimizer.NAG is overriding existing 
optimizer mxnet.optimizer.NAG
 Optimizer.opt_registry[name].__name__))
   ('Version  :', '1.1.0')
   ('Directory:', '/home/dcarter/mxnet_dev/dgx/mxnet/python/mxnet')
   Hashtag not found. Not installed from pre-built package.
   --System Info--
   ('Platform :', 'Linux-4.4.0-121-generic-x86_64-with-Ubuntu-16.04-xenial')
   ('system   :', 'Linux')
   ('node :', 'DCARTER-DT')
   ('release  :', '4.4.0-121-generic')
   ('version  :', '#145-Ubuntu SMP Fri Apr 13 13:47:23 UTC 2018')
   --Hardware Info--
   ('machine  :', 'x86_64')
   ('processor:', 'x86_64')
   Architecture:  x86_64
   CPU op-mode(s):32-bit, 64-bit
   Byte Order:Little Endian
   CPU(s):12
   On-line CPU(s) list:   0-11
   Thread(s) per core:2
   Core(s) per socket:6
   Socket(s): 1
   NUMA node(s):  1
   Vendor ID: GenuineIntel
   CPU family:6
   Model: 63
   Model name:Intel(R) Core(TM) i7-5930K CPU @ 3.50GHz
   Stepping:  2
   CPU MHz:   3693.867
   CPU max MHz:   3700.
   CPU min MHz:   1200.
   BogoMIPS:  6996.26
   Virtualization:VT-x
   L1d cache: 32K
   L1i cache: 32K
   L2 cache:  256K
   L3 cache:  15360K
   NUMA node0 CPU(s): 0-11
   Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge 
mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx 
pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology 
nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 
ssse3 sdbg fma cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic movbe popcnt 
tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm epb invpcid_single 
retpoline kaiser tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 
avx2 smep bmi2 erms invpcid cqm xsaveopt cqm_llc cqm_occup_llc dtherm ida arat 
pln pts
   
   ```
   
   ## Error Message:
   Visible in the repro example below:
   
   ## Minimum reproducible example
   This error can be reproduced on a multi-gpu system by restricting the 
visibility of the GPUs:
   ```
   $ CUDA_VISIBLE_DEVICES=0 nosetests --verbose -s 
tests/python/gpu/test_kvstore_gpu.py:test_rsp_push_pull
   /home/dcarter/mxnet_dev/dgx/mxnet/python/mxnet/optimizer.py:136: 
UserWarning: WARNING: New optimizer mxnet.optimizer.NAG is overriding existing 
optimizer mxnet.optimizer.NAG
 Optimizer.opt_registry[name].__name__))
   [INFO] Setting module np/mx/python random seeds, use 
MXNET_MODULE_SEED=1277946489 to reproduce.
   test_kvstore_gpu.test_rsp_push_pull ... terminate called after throwing an 
instance of 'dmlc::Error'
 what():  [12:09:48] 
/home/dcarter/mxnet_dev/dgx/mxnet/mshadow/mshadow/./tensor_gpu-inl.h:35: Check 
failed: e == cudaSuccess CUDA: invalid device ordinal
   
   Stack trace returned 9 entries:
   [bt] (0) 
/home/dcarter/mxnet_dev/dgx/mxnet/lib/libmxnet.so(dmlc::StackTrace[abi:cxx11]()+0x5a)
 [0x7f37871a164a]
   [bt] (1) 
/home/dcarter/mxnet_dev/dgx/mxnet/lib/libmxnet.so(dmlc::LogMessageFatal::~LogMessageFatal()+0x28)
 [0x7f37871a21e8]
   [bt] (2) /home/dcarter/mxnet_dev/dgx/mxnet/lib/libmxnet.so(void 
mshadow::SetDevice(int)+0xd0) [0x7f3789d2e140]
   [bt] (3) /home/dcarter/mxnet_dev/dgx/mxnet/lib/libmxnet.so(void 
mxnet::engine::ThreadedEnginePerDevice::GPUWorker<(dmlc::ConcurrentQueueType)0>(mxnet::Context,
 bool, 
mxnet::engine::ThreadedEnginePerDevice::ThreadWorkerBlock<(dmlc::ConcurrentQueueType)0>*,
 std::shared_ptr const&)+0x87) 
[0x7f3789d38117]
   [bt] (4) 
/home/dcarter/mxnet_dev/dgx/mxnet/lib/libmxnet.so(std::_Function_handler), 
mxnet::engine::ThreadedEnginePerDevice::PushToExecute(mxnet::engine::OprBlock*, 
bool)::{lambda()#3}::operator()() 
const::{lambda(std::shared_ptr)#1}>::_M_invoke(std::_Any_data
 

[GitHub] mbaijal commented on a change in pull request #10938: Add Apachev2 License for contrib

2018-05-16 Thread GitBox
mbaijal commented on a change in pull request #10938: Add Apachev2 License for 
contrib
URL: https://github.com/apache/incubator-mxnet/pull/10938#discussion_r188735823
 
 

 ##
 File path: src/operator/contrib/psroi_pooling-inl.h
 ##
 @@ -1,7 +1,23 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
 /*!
- * Copyright (c) 2017 by Contributors
- * Copyright (c) 2017 Microsoft
 
 Review comment:
   I am not sure about removing this. 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] szha commented on issue #10972: Gluon's PReLU is very slow and a fix to it

2018-05-16 Thread GitBox
szha commented on issue #10972: Gluon's PReLU is very slow and a fix to it
URL: 
https://github.com/apache/incubator-mxnet/issues/10972#issuecomment-389624639
 
 
   These two implementations are different in terms of the number of 
parameters. The performance hit likely comes from the broadcast operation.
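
   To see where the broadcast enters, one can compare a per-channel PReLU 
composed from broadcast ops against the fused scalar LeakyReLU; a rough 
sketch for benchmarking (operator names from the 1.x NDArray API, timing 
loop omitted):

   ```python
   import mxnet as mx

   x = mx.nd.random.uniform(-1, 1, shape=(32, 64, 56, 56))
   alpha = mx.nd.full((1, 64, 1, 1), 0.25)  # one learned slope per channel

   # Per-channel PReLU: broadcast_mul touches the whole tensor once more.
   y_prelu = mx.nd.maximum(x, 0) + mx.nd.broadcast_mul(alpha, mx.nd.minimum(x, 0))

   # Fused scalar leaky ReLU for comparison (single slope, no broadcast).
   y_leaky = mx.nd.LeakyReLU(x, act_type='leaky', slope=0.25)
   mx.nd.waitall()  # force execution before measuring
   ```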


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] sxksxy commented on issue #10964: linalg_potri() Error: linalg_impl.h:606: Check failed: ret == 0 (3 vs. 0) spotri failed in lapack on cpu.

2018-05-16 Thread GitBox
sxksxy commented on issue #10964: linalg_potri() Error: linalg_impl.h:606: 
Check failed: ret == 0 (3 vs. 0) spotri failed in lapack on cpu.
URL: 
https://github.com/apache/incubator-mxnet/issues/10964#issuecomment-389623976
 
 
   thanks for your reply! @asmushetzel @piiswrong 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] ThomasDelteil commented on a change in pull request #10955: [MXNET-422] Distributed training tutorial

2018-05-16 Thread GitBox
ThomasDelteil commented on a change in pull request #10955: [MXNET-422] 
Distributed training tutorial
URL: https://github.com/apache/incubator-mxnet/pull/10955#discussion_r188728964
 
 

 ##
 File path: example/distributed_training/README.md
 ##
 @@ -0,0 +1,231 @@
+# Distributed Training using Gluon
+
+Deep learning models are usually trained using GPUs because GPUs can do a lot 
more computations in parallel that CPUs. But even with the modern GPUs, it 
could take several days to train big models. Training can be done faster by 
using multiple GPUs like described in 
[this](https://gluon.mxnet.io/chapter07_distributed-learning/multiple-gpus-gluon.html)
 tutorial. However only a certain number of GPUs can be attached to one host 
(typically 8 or 16). To make the training even faster, we can use multiple GPUs 
attached to multiple hosts.
+
+In this tutorial, we will show how to train a model faster using multi-host 
distributed training.
+
+![Multiple GPUs connected to multiple 
hosts](https://raw.githubusercontent.com/dmlc/web-data/master/mxnet/example/distributed_training/distributed_training.png)
+
+We will use data parallelism to distribute the training which involves 
splitting the training data across GPUs attached to multiple hosts. Since the 
hosts are working with different subset of the training data in parallel, the 
training completes lot faster.
+
+In this tutorial, we will train a LeNet network using MNIST dataset using two 
hosts each having four GPUs.
 
 Review comment:
   could we use CIFAR10 instead? Because multi host multi gpu for mnist is a 
bit overkill since it trains in 2-3 seconds on a single GPU already?


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] ThomasDelteil commented on a change in pull request #10955: [MXNET-422] Distributed training tutorial

2018-05-16 Thread GitBox
ThomasDelteil commented on a change in pull request #10955: [MXNET-422] 
Distributed training tutorial
URL: https://github.com/apache/incubator-mxnet/pull/10955#discussion_r188730608
 
 

 ##
 File path: example/distributed_training/README.md
 ##
 @@ -0,0 +1,231 @@
+# Distributed Training using Gluon
+
+Deep learning models are usually trained using GPUs because GPUs can do a lot 
more computations in parallel that CPUs. But even with the modern GPUs, it 
could take several days to train big models. Training can be done faster by 
using multiple GPUs like described in 
[this](https://gluon.mxnet.io/chapter07_distributed-learning/multiple-gpus-gluon.html)
 tutorial. However only a certain number of GPUs can be attached to one host 
(typically 8 or 16). To make the training even faster, we can use multiple GPUs 
attached to multiple hosts.
+
+In this tutorial, we will show how to train a model faster using multi-host 
distributed training.
+
+![Multiple GPUs connected to multiple 
hosts](https://raw.githubusercontent.com/dmlc/web-data/master/mxnet/example/distributed_training/distributed_training.png)
+
+We will use data parallelism to distribute the training which involves 
splitting the training data across GPUs attached to multiple hosts. Since the 
hosts are working with different subset of the training data in parallel, the 
training completes lot faster.
+
+In this tutorial, we will train a LeNet network using MNIST dataset using two 
hosts each having four GPUs.
+
+## Distributed Training Architecture:
+
+Multihost distributed training involves working with three different types of 
processes - worker, parameter server and scheduler.
+
+![Distributed training 
architecture](https://raw.githubusercontent.com/dmlc/web-data/master/mxnet/example/distributed_training/dist_train_arch.png)
+
+### Parameter Server:
+The parameters of the model needs to be shared with all hosts since multiple 
hosts are working together to train one model. To make this sharing efficient, 
the parameters are split across multiple hosts. A parameter server in each host 
stores a subset of parameters. In the figure above, parameters are split evenly 
between the two hosts. At the end of every iteration, each host communicates 
with every other host to update all parameters of the model.
+
+### Worker:
+Each host has a worker process which in each iteration fetches a batch of 
data, runs forward and backward pass on all GPUs in the host, computes the 
parameter updates and sends those updates to the parameter servers in each 
host. Since we have multiple workers to train the model, each worker only needs 
to process 1/N part of the training data where N is the number of workers.
+
+### Scheduler:
+Scheduler is responsible for scheduling the workers and parameter servers. 
There is only one scheduler in the entire cluster.
+
+## Moving to distributed training:
+
+[mnist_dist.py](mnist_dist.py) contains code that trains a LeNet network using 
distributed training. In this section we'll walk through parts of that file 
that are unique to distributed training.
+
+### Step 1: Use a distributed key-value store:
+
+Like mentioned above, in distributed training, parameters are split into N 
parts and distributed across N hosts. This is done automatically by the 
[distributed key-value 
store](https://mxnet.incubator.apache.org/tutorials/python/kvstore.html). User 
only needs to create the distributed kv store and ask the `Trainer` to use the 
created store.
+
+```python
+store = mxnet.kv.create('dist')
+```
+
+It is the job of the trainer to take the gradients computed in the backward 
pass and update the parameters of the model. We'll tell the trainer to store 
and update the parameters in the distributed kv store we just created instead 
of doing it in GPU of CPU memory. For example,
+
+```python
+trainer = gluon.Trainer(net.collect_params(),
+'sgd', {'learning_rate': .1},
+kvstore=store)
+```
+
+## Step 2: Split the training data:
+
+In distributed training (using data parallelism), training data is split into 
equal parts across all workers and each worker uses its subset of the training 
data for training. For example, if we had two machines, each running a worker, 
each worker managing four GPUs we'll split the data like shown below. Note that 
we don't split the data depending on the number of GPUs but split it depending 
on the number of workers.
+
+![Splitting 
data](https://raw.githubusercontent.com/dmlc/web-data/master/mxnet/example/distributed_training/split_data.png)
+
+Each worker can find out the total number of workers in the cluster and its 
own rank which is an integer between 0 and N-1 where N is the number of workers.
+
+```python
+store = kv.create('dist')
+print("Total number of workers: %d" % store.num_workers)
+print("This worker's rank: %d" % store.rank)
+```
+
+Knowing the number of workers and a particular worker's rank, it is easy to 
split the 

[GitHub] ThomasDelteil commented on a change in pull request #10955: [MXNET-422] Distributed training tutorial

2018-05-16 Thread GitBox
ThomasDelteil commented on a change in pull request #10955: [MXNET-422] 
Distributed training tutorial
URL: https://github.com/apache/incubator-mxnet/pull/10955#discussion_r188728690
 
 

 ##
 File path: example/distributed_training/README.md
 ##
 @@ -0,0 +1,231 @@
+# Distributed Training using Gluon
+
+Deep learning models are usually trained using GPUs because GPUs can do a lot 
more computations in parallel that CPUs. But even with the modern GPUs, it 
could take several days to train big models. Training can be done faster by 
using multiple GPUs like described in 
[this](https://gluon.mxnet.io/chapter07_distributed-learning/multiple-gpus-gluon.html)
 tutorial. However only a certain number of GPUs can be attached to one host 
(typically 8 or 16). To make the training even faster, we can use multiple GPUs 
attached to multiple hosts.
+
+In this tutorial, we will show how to train a model faster using multi-host 
distributed training.
+
+![Multiple GPUs connected to multiple 
hosts](https://raw.githubusercontent.com/dmlc/web-data/master/mxnet/example/distributed_training/distributed_training.png)
+
+We will use data parallelism to distribute the training which involves 
splitting the training data across GPUs attached to multiple hosts. Since the 
hosts are working with different subset of the training data in parallel, the 
training completes lot faster.
 
 Review comment:
   a lot faster*


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] szha closed pull request #10940: [WIP] Fused LSTM Cell

2018-05-16 Thread GitBox
szha closed pull request #10940: [WIP] Fused LSTM Cell
URL: https://github.com/apache/incubator-mxnet/pull/10940
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/python/mxnet/gluon/rnn/rnn_cell.py 
b/python/mxnet/gluon/rnn/rnn_cell.py
index 281aba45257..c5c929fb8a0 100644
--- a/python/mxnet/gluon/rnn/rnn_cell.py
+++ b/python/mxnet/gluon/rnn/rnn_cell.py
@@ -297,6 +297,63 @@ def __init__(self, prefix=None, params=None):
 def hybrid_forward(self, F, x, *args, **kwargs):
 raise NotImplementedError
 
+class _FusedBaseRNNCell(HybridRecurrentCell): # pylint: disable=abstract-method
+"""Implementation of recurrent layers."""
+def __init__(self, hidden_size, input_size,
+ i2h_weight_initializer, h2h_weight_initializer,
+ i2h_bias_initializer, h2h_bias_initializer,
+ mode, **kwargs):
+super(_FusedBaseRNNCell, self).__init__(**kwargs)
+self._hidden_size = hidden_size
+self._input_size = input_size
+self._mode = mode
+num_gates = {'rnn_relu': 1, 'rnn_tanh': 1, 'lstm': 4, 'gru': 3}[mode]
+self._gates = num_gates
+self.i2h_weight = self.params.get('i2h_weight', shape=(num_gates*hidden_size, input_size),
+  init=i2h_weight_initializer,
+  allow_deferred_init=True)
+self.h2h_weight = self.params.get('h2h_weight', shape=(num_gates*hidden_size, hidden_size),
+  init=h2h_weight_initializer,
+  allow_deferred_init=True)
+self.i2h_bias = self.params.get('i2h_bias', shape=(num_gates*hidden_size,),
+init=i2h_bias_initializer,
+allow_deferred_init=True)
+self.h2h_bias = self.params.get('h2h_bias', shape=(num_gates*hidden_size,),
+init=h2h_bias_initializer,
+allow_deferred_init=True)
+
+def hybrid_forward(self, F, inputs, states, i2h_weight,
+   h2h_weight, i2h_bias, h2h_bias):
+prefix = 't%d_'%self._counter
+params = F.concat(i2h_weight.reshape((-1,)),
+  h2h_weight.reshape((-1,)),
+  i2h_bias.reshape((-1,)),
+  h2h_bias.reshape((-1,)), dim=0)
+states = [s.reshape((-4, 1, -1, 0)) for s in states]
+rnn = F.RNN(inputs.reshape((-4, 1, -1, 0)), params, *states, state_size=self._hidden_size,
+num_layers=1, state_outputs=True, mode=self._mode, name=prefix+'fused')
+
+if self._mode == 'lstm':
+name_suffix = ['out', 'out', 'state']
+rnn = [rnn[i].reshape((-3, -2), name=prefix+suffix) for i, suffix
+   in zip(range(3), name_suffix)]
+outputs, states = rnn[0], [rnn[1], rnn[2]]
+else:
+name_suffix = ['out', 'out']
+rnn = [rnn[i].reshape((-3, -2), name=prefix+suffix) for i, suffix
+   in zip(range(2), name_suffix)]
+outputs, states = rnn[0], [rnn[1]]
+
+return outputs, states
+
+def __repr__(self):
+s = '{name}({mapping})'
+shape = self.i2h_weight.shape
+mapping = '{0} -> {1}'.format(shape[1] if shape[1] else None, shape[0] // self._gates)
+return s.format(name=self.__class__.__name__,
+mapping=mapping,
+**self.__dict__)
+
 
 class RNNCell(HybridRecurrentCell):
 r"""Elman RNN recurrent neural network cell.
@@ -398,7 +455,7 @@ def hybrid_forward(self, F, inputs, states, i2h_weight,
 return output, [output]
 
 
-class LSTMCell(HybridRecurrentCell):
+class LSTMCell(_FusedBaseRNNCell):
 r"""Long-Short Term Memory (LSTM) network cell.
 
 Each call computes the following function:
@@ -457,22 +514,13 @@ def __init__(self, hidden_size,
  i2h_weight_initializer=None, h2h_weight_initializer=None,
  i2h_bias_initializer='zeros', h2h_bias_initializer='zeros',
  input_size=0, prefix=None, params=None):
-super(LSTMCell, self).__init__(prefix=prefix, params=params)
-
-self._hidden_size = hidden_size
-self._input_size = input_size
-self.i2h_weight = self.params.get('i2h_weight', shape=(4*hidden_size, input_size),
-  init=i2h_weight_initializer,
-  allow_deferred_init=True)
-self.h2h_weight = self.params.get('h2h_weight', shape=(4*hidden_size, hidden_size),
-  

[GitHub] ThomasDelteil commented on a change in pull request #10955: [MXNET-422] Distributed training tutorial

2018-05-16 Thread GitBox
ThomasDelteil commented on a change in pull request #10955: [MXNET-422] 
Distributed training tutorial
URL: https://github.com/apache/incubator-mxnet/pull/10955#discussion_r188729724
 
 

 ##
 File path: example/distributed_training/README.md
 ##
 @@ -0,0 +1,231 @@
+# Distributed Training using Gluon
+
+Deep learning models are usually trained using GPUs because GPUs can do a lot 
more computations in parallel that CPUs. But even with the modern GPUs, it 
could take several days to train big models. Training can be done faster by 
using multiple GPUs like described in 
[this](https://gluon.mxnet.io/chapter07_distributed-learning/multiple-gpus-gluon.html)
 tutorial. However only a certain number of GPUs can be attached to one host 
(typically 8 or 16). To make the training even faster, we can use multiple GPUs 
attached to multiple hosts.
+
+In this tutorial, we will show how to train a model faster using multi-host 
distributed training.
+
+![Multiple GPUs connected to multiple 
hosts](https://raw.githubusercontent.com/dmlc/web-data/master/mxnet/example/distributed_training/distributed_training.png)
+
+We will use data parallelism to distribute the training which involves 
splitting the training data across GPUs attached to multiple hosts. Since the 
hosts are working with different subset of the training data in parallel, the 
training completes lot faster.
+
+In this tutorial, we will train a LeNet network using MNIST dataset using two 
hosts each having four GPUs.
+
+## Distributed Training Architecture:
+
+Multihost distributed training involves working with three different types of 
processes - worker, parameter server and scheduler.
+
+![Distributed training 
architecture](https://raw.githubusercontent.com/dmlc/web-data/master/mxnet/example/distributed_training/dist_train_arch.png)
+
+### Parameter Server:
+The parameters of the model needs to be shared with all hosts since multiple 
hosts are working together to train one model. To make this sharing efficient, 
the parameters are split across multiple hosts. A parameter server in each host 
stores a subset of parameters. In the figure above, parameters are split evenly 
between the two hosts. At the end of every iteration, each host communicates 
with every other host to update all parameters of the model.
+
+### Worker:
+Each host has a worker process which in each iteration fetches a batch of 
data, runs forward and backward pass on all GPUs in the host, computes the 
parameter updates and sends those updates to the parameter servers in each 
host. Since we have multiple workers to train the model, each worker only needs 
to process 1/N part of the training data where N is the number of workers.
+
+### Scheduler:
+Scheduler is responsible for scheduling the workers and parameter servers. 
There is only one scheduler in the entire cluster.
+
+## Moving to distributed training:
+
+[mnist_dist.py](mnist_dist.py) contains code that trains a LeNet network using 
distributed training. In this section we'll walk through parts of that file 
that are unique to distributed training.
+
+### Step 1: Use a distributed key-value store:
+
+Like mentioned above, in distributed training, parameters are split into N 
parts and distributed across N hosts. This is done automatically by the 
[distributed key-value 
store](https://mxnet.incubator.apache.org/tutorials/python/kvstore.html). User 
only needs to create the distributed kv store and ask the `Trainer` to use the 
created store.
+
+```python
+store = mxnet.kv.create('dist')
+```
+
+It is the job of the trainer to take the gradients computed in the backward 
pass and update the parameters of the model. We'll tell the trainer to store 
and update the parameters in the distributed kv store we just created instead 
of doing it in GPU of CPU memory. For example,
+
+```python
+trainer = gluon.Trainer(net.collect_params(),
+'sgd', {'learning_rate': .1},
+kvstore=store)
+```
+
+## Step 2: Split the training data:
+
+In distributed training (using data parallelism), training data is split into 
equal parts across all workers and each worker uses its subset of the training 
data for training. For example, if we had two machines, each running a worker, 
each worker managing four GPUs we'll split the data like shown below. Note that 
we don't split the data depending on the number of GPUs but split it depending 
on the number of workers.
+
+![Splitting 
data](https://raw.githubusercontent.com/dmlc/web-data/master/mxnet/example/distributed_training/split_data.png)
+
+Each worker can find out the total number of workers in the cluster and its 
own rank which is an integer between 0 and N-1 where N is the number of workers.
+
+```python
+store = kv.create('dist')
+print("Total number of workers: %d" % store.num_workers)
+print("This worker's rank: %d" % store.rank)
 
 Review comment:
   it would be nice to have an example of the output of these functions in the 

[GitHub] ThomasDelteil commented on a change in pull request #10955: [MXNET-422] Distributed training tutorial

2018-05-16 Thread GitBox
ThomasDelteil commented on a change in pull request #10955: [MXNET-422] 
Distributed training tutorial
URL: https://github.com/apache/incubator-mxnet/pull/10955#discussion_r188728190
 
 

 ##
 File path: example/distributed_training/README.md
 ##
 @@ -0,0 +1,231 @@
+# Distributed Training using Gluon
 
 Review comment:
   Please move this to `docs/tutorials` and change this `README.md` to refer 
to this file instead.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] ThomasDelteil commented on a change in pull request #10955: [MXNET-422] Distributed training tutorial

2018-05-16 Thread GitBox
ThomasDelteil commented on a change in pull request #10955: [MXNET-422] 
Distributed training tutorial
URL: https://github.com/apache/incubator-mxnet/pull/10955#discussion_r188730509
 
 

 ##
 File path: example/distributed_training/README.md
 ##
 @@ -0,0 +1,231 @@
+# Distributed Training using Gluon
+
+Deep learning models are usually trained using GPUs because GPUs can do a lot 
more computations in parallel that CPUs. But even with the modern GPUs, it 
could take several days to train big models. Training can be done faster by 
using multiple GPUs like described in 
[this](https://gluon.mxnet.io/chapter07_distributed-learning/multiple-gpus-gluon.html)
 tutorial. However only a certain number of GPUs can be attached to one host 
(typically 8 or 16). To make the training even faster, we can use multiple GPUs 
attached to multiple hosts.
+
+In this tutorial, we will show how to train a model faster using multi-host 
distributed training.
+
+![Multiple GPUs connected to multiple 
hosts](https://raw.githubusercontent.com/dmlc/web-data/master/mxnet/example/distributed_training/distributed_training.png)
+
+We will use data parallelism to distribute the training which involves 
splitting the training data across GPUs attached to multiple hosts. Since the 
hosts are working with different subset of the training data in parallel, the 
training completes lot faster.
+
+In this tutorial, we will train a LeNet network using MNIST dataset using two 
hosts each having four GPUs.
+
+## Distributed Training Architecture:
+
+Multihost distributed training involves working with three different types of 
processes - worker, parameter server and scheduler.
+
+![Distributed training 
architecture](https://raw.githubusercontent.com/dmlc/web-data/master/mxnet/example/distributed_training/dist_train_arch.png)
+
+### Parameter Server:
+The parameters of the model needs to be shared with all hosts since multiple 
hosts are working together to train one model. To make this sharing efficient, 
the parameters are split across multiple hosts. A parameter server in each host 
stores a subset of parameters. In the figure above, parameters are split evenly 
between the two hosts. At the end of every iteration, each host communicates 
with every other host to update all parameters of the model.
+
+### Worker:
+Each host has a worker process which in each iteration fetches a batch of 
data, runs forward and backward pass on all GPUs in the host, computes the 
parameter updates and sends those updates to the parameter servers in each 
host. Since we have multiple workers to train the model, each worker only needs 
to process 1/N part of the training data where N is the number of workers.
+
+### Scheduler:
+Scheduler is responsible for scheduling the workers and parameter servers. 
There is only one scheduler in the entire cluster.
+
+## Moving to distributed training:
+
+[mnist_dist.py](mnist_dist.py) contains code that trains a LeNet network using 
distributed training. In this section we'll walk through parts of that file 
that are unique to distributed training.
+
+### Step 1: Use a distributed key-value store:
+
+Like mentioned above, in distributed training, parameters are split into N 
parts and distributed across N hosts. This is done automatically by the 
[distributed key-value 
store](https://mxnet.incubator.apache.org/tutorials/python/kvstore.html). User 
only needs to create the distributed kv store and ask the `Trainer` to use the 
created store.
+
+```python
+store = mxnet.kv.create('dist')
+```
+
+It is the job of the trainer to take the gradients computed in the backward 
pass and update the parameters of the model. We'll tell the trainer to store 
and update the parameters in the distributed kv store we just created instead 
of doing it in GPU of CPU memory. For example,
+
+```python
+trainer = gluon.Trainer(net.collect_params(),
+'sgd', {'learning_rate': .1},
+kvstore=store)
+```
+
+## Step 2: Split the training data:
+
+In distributed training (using data parallelism), training data is split into 
equal parts across all workers and each worker uses its subset of the training 
data for training. For example, if we had two machines, each running a worker, 
each worker managing four GPUs we'll split the data like shown below. Note that 
we don't split the data depending on the number of GPUs but split it depending 
on the number of workers.
+
+![Splitting 
data](https://raw.githubusercontent.com/dmlc/web-data/master/mxnet/example/distributed_training/split_data.png)
+
+Each worker can find out the total number of workers in the cluster and its 
own rank which is an integer between 0 and N-1 where N is the number of workers.
+
+```python
+store = kv.create('dist')
+print("Total number of workers: %d" % store.num_workers)
+print("This worker's rank: %d" % store.rank)
+```
+
+Knowing the number of workers and a particular worker's rank, it is easy to 
split the 

[GitHub] ThomasDelteil commented on a change in pull request #10955: [MXNET-422] Distributed training tutorial

2018-05-16 Thread GitBox
ThomasDelteil commented on a change in pull request #10955: [MXNET-422] 
Distributed training tutorial
URL: https://github.com/apache/incubator-mxnet/pull/10955#discussion_r188727936
 
 

 ##
 File path: docs/tutorials/index.md
 ##
 @@ -38,6 +38,7 @@ Select API:
 * [Visual Question Answering](http://gluon.mxnet.io/chapter08_computer-vision/visual-question-answer.html) <img src="https://upload.wikimedia.org/wikipedia/commons/6/6a/External_link_font_awesome.svg" alt="External link" height="15px" style="margin: 0px 0px 3px 3px;"/>
 * Practitioner Guides
 * [Multi-GPU training](http://gluon.mxnet.io/chapter07_distributed-learning/multiple-gpus-gluon.html) <img src="https://upload.wikimedia.org/wikipedia/commons/6/6a/External_link_font_awesome.svg" alt="External link" height="15px" style="margin: 0px 0px 3px 3px;"/>
+* [Distributed Training](https://github.com/apache/incubator-mxnet/tree/master/example/distributed_training)
 
 Review comment:
   Not a big fan of linking to the GitHub repo. I understand the tutorial is not 
runnable, but I would prefer it to be hosted under docs/tutorials/gluon to 
avoid fragmentation. Add the right import statements, and mark snippets that 
should run as ```python and snippets that are not meant to be run as plain 
fences:
   
   runnable:
   ```python
   store = mxnet.kv.create('dist')
   ```
   
   not meant to be run:
   ```
   for batch in train_data:
       train_batch(batch, ctx, net, trainer)
   ```
   
   That way the code you are using is tested against the CI for correctness.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] szha commented on issue #10940: [WIP] Fused LSTM Cell

2018-05-16 Thread GitBox
szha commented on issue #10940: [WIP] Fused LSTM Cell
URL: https://github.com/apache/incubator-mxnet/pull/10940#issuecomment-389623496
 
 
   Considering the trade-off with flexibility, I think the performance gain 
(even if I fix the loss of performance on parameter concatenation) will be 
marginal. I'm dropping this change.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] asmushetzel commented on issue #10964: linalg_potri() Error: linalg_impl.h:606: Check failed: ret == 0 (3 vs. 0) spotri failed in lapack on cpu.

2018-05-16 Thread GitBox
asmushetzel commented on issue #10964: linalg_potri() Error: linalg_impl.h:606: 
Check failed: ret == 0 (3 vs. 0) spotri failed in lapack on cpu.
URL: 
https://github.com/apache/incubator-mxnet/issues/10964#issuecomment-389622020
 
 
   potri computes the inverse based on the Cholesky decomposition (see the 
documentation of potri). So the input must be a positive definite matrix, say 
"A". L = potrf(A) then computes the Cholesky decomposition, where A = L*L^T. 
potri(L) will compute the inverse of A.
   You have to ensure that the initial matrix A is indeed positive definite 
when feeding it into potrf().
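
   A minimal sketch of that recipe (using the `mx.nd.linalg` namespace of 
MXNet 1.x; A is positive definite by construction):

   ```python
   import mxnet as mx

   B = mx.nd.random.uniform(shape=(4, 4))
   # A = B * B^T + eps * I is symmetric positive definite by construction.
   A = mx.nd.linalg.gemm2(B, B, transpose_b=True) + 0.1 * mx.nd.eye(4)

   L = mx.nd.linalg.potrf(A)      # Cholesky factor, A = L * L^T
   A_inv = mx.nd.linalg.potri(L)  # inverse of A, computed from L

   # Sanity check: A * A_inv should be close to the 4x4 identity.
   print(mx.nd.linalg.gemm2(A, A_inv))
   ```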


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] marcoabreu closed issue #10960: CI bug R-CPU no space left on device

2018-05-16 Thread GitBox
marcoabreu closed issue #10960: CI bug R-CPU no space left on device
URL: https://github.com/apache/incubator-mxnet/issues/10960
 
 
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] marcoabreu commented on issue #10960: CI bug R-CPU no space left on device

2018-05-16 Thread GitBox
marcoabreu commented on issue #10960: CI bug R-CPU no space left on device
URL: 
https://github.com/apache/incubator-mxnet/issues/10960#issuecomment-389619596
 
 
   This was before autoscaling. This is unrelated - our docker images filled up 
the disk. Should be solved now


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] ThomasDelteil commented on a change in pull request #10900: [MXNET-414] Tutorial on visualizing CNN decisions using Grad-CAM

2018-05-16 Thread GitBox
ThomasDelteil commented on a change in pull request #10900: [MXNET-414] 
Tutorial on visualizing CNN decisions using Grad-CAM
URL: https://github.com/apache/incubator-mxnet/pull/10900#discussion_r188721635
 
 

 ##
 File path: docs/tutorials/vision/cnn_visualization.md
 ##
 @@ -0,0 +1,250 @@
+# Visualizing Decisions of Convolutional Neural Networks
+
+Convolutional Neural Networks have made a lot of progress in Computer Vision. 
Their accuracy is as good as humans in some tasks. However it remains hard to 
explain the predictions of convolutional neural networks.
+
+It is often helpful to be able to explain why a model made the prediction it 
made. For example when a model misclassifies an image, it is hard to say why 
without visualizing the network's decision.
+
+<img src="https://raw.githubusercontent.com/dmlc/web-data/master/mxnet/example/cnn_visualization/volcano_barn_spider.png" alt="Explaining the misclassification of volcano as spider" width=500px/>
+
+Visualizations also help build confidence about the predictions of a model. 
For example, even if a model correctly predicts birds as birds, we would want 
to confirm that the model bases its decision on the features of bird and not on 
the features of some other object that might occur together with birds in the 
dataset (like leaves).
+
+In this tutorial, we show how to visualize the predictions made by 
convolutional neural networks using Gradient-weighted Class Activation Mapping. 
Unlike many other visualization methods, Grad-CAM can be used on a wide variety 
of CNN model families - CNNs with fully connected layers, CNNs used for 
structural outputs (e.g. captioning), CNNs used in tasks with multi-model input 
(e.g. VQA) or reinforcement learning without architectural changes or 
re-training.
+
+In the rest of this notebook, we will explain how to visualize predictions 
made by [VGG-16](https://arxiv.org/abs/1409.1556). We begin by importing the 
required dependencies. `gradcam` module contains the implementation of 
visualization techniques used in this notebook.
+
+```python
+from __future__ import print_function
+
+import mxnet as mx
+from mxnet import gluon
+
+from matplotlib import pyplot as plt
+import numpy as np
+import cv2
+
+gradcam_file = "gradcam.py" 
+base_url = "https://raw.githubusercontent.com/indhub/mxnet/cnnviz/example/cnn_visualization/{}?raw=true"
+mx.test_utils.download(base_url.format(gradcam_file), fname=gradcam_file)
+import gradcam
+```
+
+## Building the network to visualize
+
+Next, we build the network we want to visualize. For this example, we will use 
the [VGG-16](https://arxiv.org/abs/1409.1556) network. This code was taken from 
the Gluon [model 
zoo](https://github.com/apache/incubator-mxnet/blob/master/python/mxnet/gluon/model_zoo/vision/alexnet.py)
 and refactored to make it easy to switch between `gradcam`'s and Gluon's 
implementation of ReLU and Conv2D. Same code can be used for both training and 
visualization with a minor (one line) change.
+
+Notice that we import ReLU and Conv2D from `gradcam` module instead of 
mxnet.gluon.nn.
+- We use a modified ReLU because we use guided backpropagation for 
visualization and guided backprop requires ReLU layer to block the backward 
flow of negative gradients corresponding to the neurons which decrease the 
activation of the higher layer unit we aim to visualize. Check 
[this](https://arxiv.org/abs/1412.6806) paper to learn more about guided 
backprop.
+- We use a modified Conv2D (a wrapper on top of Gluon's Conv2D) because we 
want to capture the output of a given convolutional layer and its gradients. 
This is needed to implement Grad-CAM. Check 
[this](https://arxiv.org/abs/1610.02391) paper to learn more about Grad-CAM.
+
+When you train the network, you could just import `Activation` and `Conv2D` 
from `gluon.nn` instead. No other part of the code needs any change to switch 
between training and visualization.
+
+```python
+import os
+from mxnet.gluon.model_zoo import model_store
+
+from mxnet.initializer import Xavier
+from mxnet.gluon.nn import MaxPool2D, Flatten, Dense, Dropout, BatchNorm
+from gradcam import Activation, Conv2D
+
+class VGG(mx.gluon.HybridBlock):
+def __init__(self, layers, filters, classes=1000, batch_norm=False, **kwargs):
+super(VGG, self).__init__(**kwargs)
+assert len(layers) == len(filters)
+with self.name_scope():
+self.features = self._make_features(layers, filters, batch_norm)
+self.features.add(Dense(4096, activation='relu',
+   weight_initializer='normal',
+   bias_initializer='zeros'))
+self.features.add(Dropout(rate=0.5))
+self.features.add(Dense(4096, activation='relu',
+   weight_initializer='normal',
+   bias_initializer='zeros'))
+self.features.add(Dropout(rate=0.5))
+

[GitHub] ThomasDelteil commented on issue #10900: [MXNET-414] Tutorial on visualizing CNN decisions using Grad-CAM

2018-05-16 Thread GitBox
ThomasDelteil commented on issue #10900: [MXNET-414] Tutorial on visualizing 
CNN decisions using Grad-CAM
URL: https://github.com/apache/incubator-mxnet/pull/10900#issuecomment-389617143
 
 
   Awesome tutorial @indhub! Makes me wonder if the visualization could make 
its way to the contrib package at some point rather than example folder :)


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] ThomasDelteil commented on a change in pull request #10900: [MXNET-414] Tutorial on visualizing CNN decisions using Grad-CAM

2018-05-16 Thread GitBox
ThomasDelteil commented on a change in pull request #10900: [MXNET-414] 
Tutorial on visualizing CNN decisions using Grad-CAM
URL: https://github.com/apache/incubator-mxnet/pull/10900#discussion_r188720295
 
 

 ##
 File path: docs/tutorials/vision/cnn_visualization.md
 ##
 @@ -0,0 +1,250 @@
+# Visualizing Decisions of Convolutional Neural Networks
+
+Convolutional Neural Networks have made a lot of progress in Computer Vision. 
Their accuracy is as good as humans in some tasks. However it remains hard to 
explain the predictions of convolutional neural networks.
+
+It is often helpful to be able to explain why a model made the prediction it 
made. For example when a model misclassifies an image, it is hard to say why 
without visualizing the network's decision.
+
+<img src="https://raw.githubusercontent.com/dmlc/web-data/master/mxnet/example/cnn_visualization/volcano_barn_spider.png" alt="Explaining the misclassification of volcano as spider" width=500px/>
+
+Visualizations also help build confidence about the predictions of a model. 
For example, even if a model correctly predicts birds as birds, we would want 
to confirm that the model bases its decision on the features of bird and not on 
the features of some other object that might occur together with birds in the 
dataset (like leaves).
+
+In this tutorial, we show how to visualize the predictions made by convolutional neural networks using Gradient-weighted Class Activation Mapping (Grad-CAM). Unlike many other visualization methods, Grad-CAM can be used on a wide variety of CNN model families: CNNs with fully connected layers, CNNs used for structured outputs (e.g. captioning), and CNNs used in tasks with multi-modal input (e.g. VQA) or reinforcement learning, without architectural changes or re-training.
+
+In the rest of this notebook, we will explain how to visualize predictions made by [VGG-16](https://arxiv.org/abs/1409.1556). We begin by importing the required dependencies. The `gradcam` module contains the implementations of the visualization techniques used in this notebook.
+
+```python
+from __future__ import print_function
+
+import mxnet as mx
+from mxnet import gluon
+
+from matplotlib import pyplot as plt
+import numpy as np
+import cv2
+
+gradcam_file = "gradcam.py"
+base_url = "https://raw.githubusercontent.com/indhub/mxnet/cnnviz/example/cnn_visualization/{}?raw=true"
+mx.test_utils.download(base_url.format(gradcam_file), fname=gradcam_file)
+import gradcam
+```
+
+## Building the network to visualize
+
+Next, we build the network we want to visualize. For this example, we will use the [VGG-16](https://arxiv.org/abs/1409.1556) network. This code was taken from the Gluon [model zoo](https://github.com/apache/incubator-mxnet/blob/master/python/mxnet/gluon/model_zoo/vision/alexnet.py) and refactored to make it easy to switch between `gradcam`'s and Gluon's implementations of ReLU and Conv2D. The same code can be used for both training and visualization, with a minor (one line) change.
+
+Notice that we import `Activation` (ReLU) and `Conv2D` from the `gradcam` module instead of from `mxnet.gluon.nn`:
+- We use a modified ReLU because we use guided backpropagation for visualization, and guided backprop requires the ReLU layer to block the backward flow of negative gradients corresponding to the neurons which decrease the activation of the higher-layer unit we aim to visualize. Check [this](https://arxiv.org/abs/1412.6806) paper to learn more about guided backprop.
+- We use a modified Conv2D (a wrapper on top of Gluon's Conv2D) because we want to capture the output of a given convolutional layer and its gradients, which is needed to implement Grad-CAM. Check [this](https://arxiv.org/abs/1610.02391) paper to learn more about Grad-CAM. A sketch of both modified layers follows this list.
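
To make the two modifications concrete, here is a minimal, hypothetical sketch of such layers in Gluon. It is not the tutorial's actual `gradcam.py`; the names `GuidedReLU`, `CapturingConv2D`, and `captured`, as well as the capture mechanics, are illustrative assumptions.

```python
import mxnet as mx
from mxnet import nd, autograd

class GuidedReLU(mx.autograd.Function):
    # Forward pass: an ordinary ReLU.
    # Backward pass: propagate a gradient only where BOTH the forward
    # activation and the incoming gradient are positive (the guided
    # backpropagation rule).
    def forward(self, x):
        y = nd.maximum(x, nd.zeros_like(x))
        self.save_for_backward(y)
        return y

    def backward(self, dy):
        y, = self.saved_tensors
        return (y > 0) * nd.maximum(dy, nd.zeros_like(dy))

class CapturingConv2D(mx.gluon.nn.Conv2D):
    # Stores its most recent output and attaches a gradient buffer to it,
    # so that after a backward pass the feature maps (self.captured) and
    # their gradients (self.captured.grad) can be read to compute Grad-CAM.
    def __call__(self, x):
        out = super(CapturingConv2D, self).__call__(x)
        if autograd.is_recording():
            out.attach_grad()  # assumed capture mechanism for the sketch
        self.captured = out
        return out
```

For example, `y = GuidedReLU()(x)` behaves like `nd.relu(x)` in the forward pass but applies the guided rule during `backward()`, and a `CapturingConv2D` can stand in for the corresponding `gluon.nn.Conv2D` without touching the rest of the network.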
+
+When you train the network, you could just import `Activation` and `Conv2D` 
from `gluon.nn` instead. No other part of the code needs any change to switch 
between training and visualization.
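
In other words, only one import line toggles between the two modes:

```python
# Visualization (this notebook): wrapped layers from the gradcam module.
from gradcam import Activation, Conv2D

# Plain training: use the stock Gluon layers instead.
# from mxnet.gluon.nn import Activation, Conv2D
```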
+
+```python
+import os
+from mxnet.gluon.model_zoo import model_store
+
+from mxnet.initializer import Xavier
+from mxnet.gluon.nn import MaxPool2D, Flatten, Dense, Dropout, BatchNorm
+from gradcam import Activation, Conv2D
+
+class VGG(mx.gluon.HybridBlock):
+    def __init__(self, layers, filters, classes=1000, batch_norm=False, **kwargs):
+        super(VGG, self).__init__(**kwargs)
+        assert len(layers) == len(filters)
+        with self.name_scope():
+            self.features = self._make_features(layers, filters, batch_norm)
+            self.features.add(Dense(4096, activation='relu',
+                                    weight_initializer='normal',
+                                    bias_initializer='zeros'))
+            self.features.add(Dropout(rate=0.5))
+            self.features.add(Dense(4096, activation='relu',
+                                    weight_initializer='normal',
+                                    bias_initializer='zeros'))
+            self.features.add(Dropout(rate=0.5))
+
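
The quoted diff is truncated at this point. For orientation, here is roughly what the missing `_make_features` helper looks like in the Gluon model zoo VGG from which the tutorial says the code was adapted; treat it as an approximate sketch, not the tutorial's exact code.

```python
    def _make_features(self, layers, filters, batch_norm):
        # Stacks of 3x3 convolutions (Conv2D here is the gradcam wrapper),
        # optionally batch-normalized; each stack ends with 2x2 max pooling.
        featurizer = mx.gluon.nn.HybridSequential(prefix='')
        for i, num in enumerate(layers):
            for _ in range(num):
                featurizer.add(Conv2D(filters[i], kernel_size=3, padding=1,
                                      weight_initializer=Xavier(rnd_type='gaussian',
                                                                factor_type='out',
                                                                magnitude=2),
                                      bias_initializer='zeros'))
                if batch_norm:
                    featurizer.add(BatchNorm())
                featurizer.add(Activation('relu'))
            featurizer.add(MaxPool2D(strides=2))
        return featurizer
```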

[GitHub] ThomasDelteil commented on a change in pull request #10900: [MXNET-414] Tutorial on visualizing CNN decisions using Grad-CAM

2018-05-16 Thread GitBox
ThomasDelteil commented on a change in pull request #10900: [MXNET-414] 
Tutorial on visualizing CNN decisions using Grad-CAM
URL: https://github.com/apache/incubator-mxnet/pull/10900#discussion_r188724209
 
 

 ##
 File path: docs/tutorials/vision/cnn_visualization.md
 ##
 @@ -0,0 +1,250 @@

[GitHub] ThomasDelteil commented on a change in pull request #10900: [MXNET-414] Tutorial on visualizing CNN decisions using Grad-CAM

2018-05-16 Thread GitBox
ThomasDelteil commented on a change in pull request #10900: [MXNET-414] 
Tutorial on visualizing CNN decisions using Grad-CAM
URL: https://github.com/apache/incubator-mxnet/pull/10900#discussion_r188717579
 
 

 ##
 File path: docs/tutorials/vision/cnn_visualization.md
 ##
 @@ -0,0 +1,250 @@
+In the rest of this notebook, we will explain how to visualize predictions made by [VGG-16](https://arxiv.org/abs/1409.1556). We begin by importing the required dependencies. The `gradcam` module contains the implementations of the visualization techniques used in this notebook.
 
 Review comment:
   VGG-16 is about 500MB; this can be a big drag on the CI. It would be helpful if you moved it to ResNet-18, MobileNet, or DenseNet, which are <50MB.
   However, I am not sure how much data you are downloading since you are picking specific layers. If it is <100MB then I think it is fine to keep VGG.
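
   For scale, a lighter pretrained model is a one-line swap in the Gluon model zoo. A quick sketch; the sizes in the comments are rough figures, not measured numbers:

```python
from mxnet.gluon.model_zoo import vision

# Pretrained weights are downloaded on first use.
net = vision.resnet18_v1(pretrained=True)  # roughly 45MB of weights
# net = vision.vgg16(pretrained=True)      # roughly 500MB of weights
```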


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] ThomasDelteil commented on a change in pull request #10900: [MXNET-414] Tutorial on visualizing CNN decisions using Grad-CAM

2018-05-16 Thread GitBox
ThomasDelteil commented on a change in pull request #10900: [MXNET-414] 
Tutorial on visualizing CNN decisions using Grad-CAM
URL: https://github.com/apache/incubator-mxnet/pull/10900#discussion_r188720460
 
 

 ##
 File path: docs/tutorials/vision/cnn_visualization.md
 ##
 @@ -0,0 +1,250 @@

[GitHub] ThomasDelteil commented on a change in pull request #10900: [MXNET-414] Tutorial on visualizing CNN decisions using Grad-CAM

2018-05-16 Thread GitBox
ThomasDelteil commented on a change in pull request #10900: [MXNET-414] 
Tutorial on visualizing CNN decisions using Grad-CAM
URL: https://github.com/apache/incubator-mxnet/pull/10900#discussion_r188721774
 
 

 ##
 File path: docs/tutorials/vision/cnn_visualization.md
 ##
 @@ -0,0 +1,250 @@

[GitHub] ThomasDelteil commented on a change in pull request #10900: [MXNET-414] Tutorial on visualizing CNN decisions using Grad-CAM

2018-05-16 Thread GitBox
ThomasDelteil commented on a change in pull request #10900: [MXNET-414] 
Tutorial on visualizing CNN decisions using Grad-CAM
URL: https://github.com/apache/incubator-mxnet/pull/10900#discussion_r188716879
 
 

 ##
 File path: docs/tutorials/vision/cnn_visualization.md
 ##
 @@ -0,0 +1,250 @@
+# Visualizing Decisions of Convolutional Neural Networks
+
+Convolutional Neural Networks have made a lot of progress in Computer Vision. 
Their accuracy is as good as humans in some tasks. However it remains hard to 
explain the predictions of convolutional neural networks.
 
 Review comment:
   Suggesting: *However it remains hard to explain the predictions of convolutional neural networks*, as they lack the interpretability offered by other models, for example decision trees.
   
   To help ground the reader in the subject we are talking about.




[GitHub] ThomasDelteil commented on a change in pull request #10900: [MXNET-414] Tutorial on visualizing CNN decisions using Grad-CAM

2018-05-16 Thread GitBox
ThomasDelteil commented on a change in pull request #10900: [MXNET-414] 
Tutorial on visualizing CNN decisions using Grad-CAM
URL: https://github.com/apache/incubator-mxnet/pull/10900#discussion_r188722441
 
 

 ##
 File path: docs/tutorials/vision/cnn_visualization.md
 ##
 @@ -0,0 +1,250 @@

[GitHub] ThomasDelteil commented on a change in pull request #10900: [MXNET-414] Tutorial on visualizing CNN decisions using Grad-CAM

2018-05-16 Thread GitBox
ThomasDelteil commented on a change in pull request #10900: [MXNET-414] 
Tutorial on visualizing CNN decisions using Grad-CAM
URL: https://github.com/apache/incubator-mxnet/pull/10900#discussion_r188723058
 
 

 ##
 File path: docs/tutorials/vision/cnn_visualization.md
 ##
 @@ -0,0 +1,250 @@

[GitHub] ThomasDelteil commented on a change in pull request #10900: [MXNET-414] Tutorial on visualizing CNN decisions using Grad-CAM

2018-05-16 Thread GitBox
ThomasDelteil commented on a change in pull request #10900: [MXNET-414] 
Tutorial on visualizing CNN decisions using Grad-CAM
URL: https://github.com/apache/incubator-mxnet/pull/10900#discussion_r188718174
 
 

 ##
 File path: docs/tutorials/vision/cnn_visualization.md
 ##
 @@ -0,0 +1,250 @@
+gradcam_file = "gradcam.py"
+base_url = "https://raw.githubusercontent.com/indhub/mxnet/cnnviz/example/cnn_visualization/{}?raw=true"
 
 Review comment:
   remember to switch that to `apache/incubator-mxnet` after it is merged, or consider creating a separate PR to merge this file first.
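
   Assuming the file keeps the same path once merged to master, the URL would presumably become something like:

```python
# Hypothetical post-merge location; the exact branch and path are assumptions.
base_url = "https://raw.githubusercontent.com/apache/incubator-mxnet/master/example/cnn_visualization/{}?raw=true"
```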




  1   2   >