[GitHub] masahi commented on issue #6773: Deadlock and crashes during shutdown

2017-12-09 Thread GitBox
masahi commented on issue #6773: Deadlock and crashes during shutdown
URL: https://github.com/apache/incubator-mxnet/pull/6773#issuecomment-350529830
 
 
   @lialie @cjolivier01 We are seeing the same problem with Windows 7. 
   On Win 8 and 10, there is no issue during shutdown.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] reminisce commented on issue #9007: float16 argmax breaks on negative inputs

2017-12-09 Thread GitBox
reminisce commented on issue #9007: float16 argmax breaks on negative inputs
URL: https://github.com/apache/incubator-mxnet/issues/9007#issuecomment-350529028
 
 
   Unary reduce ops have a problem handling float16 correctly. For example:
   ```python
   import mxnet as mx
   import numpy as np

   a = mx.nd.array([-2, 0], dtype=np.float16)
   print(mx.nd.max(a))
   # prints [  6.10351562e-05]
   ```
   
   I'm guessing it might be related to setting the initial minimum value for
the float16 type.
   
https://github.com/dmlc/mshadow/blob/2d7780c3f2eefe4453fa419862d1b2089bedb8d5/mshadow/extension/reduce_with_axis.h#L121
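For reference, the spurious value above equals the smallest positive normal float16 (2^-14), which is consistent with the guess that the reduction is initialized with the type's minimum positive value instead of its lowest representable value. A quick NumPy check (my own sketch, not the mshadow code):

```python
import numpy as np

# Smallest positive normal float16: 2**-14 == 6.103515625e-05,
# matching the bogus max() result reported above.
print(np.finfo(np.float16).tiny)

# A max-reduction should instead start from the lowest representable
# float16 value (or -inf):
print(np.finfo(np.float16).min)  # -65504.0
```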




[GitHub] reminisce commented on issue #9007: float16 argmax breaks on negative inputs

2017-12-09 Thread GitBox
reminisce commented on issue #9007: float16 argmax breaks on negative inputs
URL: https://github.com/apache/incubator-mxnet/issues/9007#issuecomment-350527843
 
 
   Looks like a bug in the `argmax` operator itself, not in `simple_bind`,
because the following test of the `maximum` op generates the correct result.
Will dig deeper.
   ```python
   import mxnet as mx
   import numpy as np

   data1 = mx.sym.Variable('data1')
   data2 = mx.sym.Variable('data2')
   sym = mx.sym.maximum(data1, data2)
   exe = sym.simple_bind(ctx=mx.cpu(), data1=(1,),
                         type_dict={'data1': np.float16, 'data2': np.float16})
   exe.forward(is_train=True, data1=np.array([-3], dtype=np.float16),
               data2=np.array([-4], dtype=np.float16))
   print(exe.arg_dict['data1'].dtype)
   print(exe.arg_dict['data2'].dtype)
   print(exe.outputs[0])
   ```




[GitHub] anirudh2290 commented on a change in pull request #8938: Add operator for dot(dns, csr) = csr

2017-12-09 Thread GitBox
anirudh2290 commented on a change in pull request #8938: Add operator for 
dot(dns, csr) = csr
URL: https://github.com/apache/incubator-mxnet/pull/8938#discussion_r155939068
 
 

 ##
 File path: src/operator/tensor/dot-inl.h
 ##
 @@ -811,6 +891,94 @@ inline void DotCsrRspRspImpl(const OpContext& ctx,
   });
 }
 
+/*
+ * \brief CPU Impl of dot(dns, csr) = csr
+ */
+inline void DotDnsCsrCsrImpl(const OpContext& ctx, const cpu& cpu_dev,
+                             const TBlob& lhs, const NDArray& rhs,
+                             const OpReqType req, NDArray* ret) {
+  if (kNullOp == req) return;
+  CHECK_EQ(rhs.storage_type(), kCSRStorage);
+  if (!rhs.storage_initialized()) return;
 
 Review comment:
   Fixed!




[GitHub] anirudh2290 commented on a change in pull request #8938: Add operator for dot(dns, csr) = csr

2017-12-09 Thread GitBox
anirudh2290 commented on a change in pull request #8938: Add operator for 
dot(dns, csr) = csr
URL: https://github.com/apache/incubator-mxnet/pull/8938#discussion_r155939071
 
 

 ##
 File path: src/operator/tensor/dot-inl.h
 ##
 @@ -231,6 +231,12 @@ inline bool DotForwardInferStorageType(const nnvm::NodeAttrs& attrs,
     dispatched = storage_type_assign(_stype, kDefaultStorage,
                                      dispatch_mode, DispatchMode::kFComputeEx);
   }
+  if (!dispatched && lhs_stype == kDefaultStorage && rhs_stype == kCSRStorage &&
 
 Review comment:
   I have added a check for CPU. It will fall back to default storage for GPU.




[GitHub] anirudh2290 commented on a change in pull request #8938: Add operator for dot(dns, csr) = csr

2017-12-09 Thread GitBox
anirudh2290 commented on a change in pull request #8938: Add operator for 
dot(dns, csr) = csr
URL: https://github.com/apache/incubator-mxnet/pull/8938#discussion_r155939067
 
 

 ##
 File path: src/operator/tensor/dot-inl.h
 ##
 @@ -811,6 +891,94 @@ inline void DotCsrRspRspImpl(const OpContext& ctx,
   });
 }
 
+/*
+ * \brief CPU Impl of dot(dns, csr) = csr
+ */
+inline void DotDnsCsrCsrImpl(const OpContext& ctx, const cpu& cpu_dev,
+                             const TBlob& lhs, const NDArray& rhs,
+                             const OpReqType req, NDArray* ret) {
+  if (kNullOp == req) return;
+  CHECK_EQ(rhs.storage_type(), kCSRStorage);
+  if (!rhs.storage_initialized()) return;
+
+  using namespace mshadow;
+  using namespace mshadow::expr;
+  using nnvm::dim_t;
+
+  /* Initialize data structures */
+  mshadow::Stream<cpu>* s = ctx.get_stream<cpu>();
+  const NDArray& out = *ret;
+  const TBlob data_l = lhs;
+  const TBlob data_r = rhs.data();
+  const TBlob indptr_r = rhs.aux_data(csr::kIndPtr);
+  const TBlob col_idx_r = rhs.aux_data(csr::kIdx);
+
+  MSHADOW_SGL_DBL_TYPE_SWITCH(data_r.type_flag_, DType, {     // data type
+    MSHADOW_IDX_TYPE_SWITCH(indptr_r.type_flag_, IType, {     // indptr type
+      MSHADOW_IDX_TYPE_SWITCH(col_idx_r.type_flag_, CType, {  // colidx type
+        /* Allocate workspace */
+        CType num_cols_out = out.shape()[1];
+        CType rhs_data_size = static_cast<CType>(col_idx_r.shape_.Size());
+        size_t workspace_size = 2 * num_cols_out * sizeof(CType);
+        Tensor<cpu, 1, char> workspace =
+            ctx.requested[0].get_space_typed<cpu, 1, char>(
+                Shape1(workspace_size), s);
+        CType* col_flg = reinterpret_cast<CType*>(workspace.dptr_);
+
+        CType* prefix_sum = col_flg;
+        CType* nnc_idx = prefix_sum + num_cols_out;
+
+        /* Set the column flags for nnz columns */
+        mxnet_op::Kernel::Launch(s, num_cols_out, col_flg);
+        mxnet_op::Kernel::Launch(
+            s, rhs_data_size, col_flg, col_idx_r.dptr<CType>());
+
+        /* 1. Calculate prefix sum from col flgs
+         * 2. Store all non-zero column indexes in nnc_idx
+         */
+        CType cur = 0;
+        prefix_sum[0] = col_flg[0];
+        if (prefix_sum[0]) nnc_idx[cur++] = 0;
+        for (CType i = 1; i < num_cols_out; i++) {
+          prefix_sum[i] = prefix_sum[i - 1] + col_flg[i];
+          if (prefix_sum[i] > prefix_sum[i - 1]) nnc_idx[cur++] = i;
+        }
+
+        /* Allocate aux data for out */
+        IType num_rows_l = lhs.shape_[0];
+        dim_t nnc = prefix_sum[num_cols_out - 1];
+        dim_t nnz = nnc * num_rows_l;
+        out.CheckAndAllocAuxData(csr::kIndPtr, Shape1(num_rows_l + 1));
+        out.CheckAndAllocAuxData(csr::kIdx, Shape1(nnz));
+        out.CheckAndAllocData(Shape1(nnz));
+
+        /* Set csr indptr and index according to nnc_idx */
+        IType* indptr_out = out.aux_data(csr::kIndPtr).dptr<IType>();
+        CType* col_idx_out = out.aux_data(csr::kIdx).dptr<CType>();
+        DType* data_out = out.data().dptr<DType>();
+        mxnet_op::Kernel::Launch(
+            s, num_rows_l, nnc_idx, indptr_out, col_idx_out, nnc, num_rows_l);
+        mxnet_op::Kernel::Launch(s, nnz, data_out);
+
+        if (nnc == 0) {
 
 Review comment:
   Why should nnc never be 0? It can be 0 when the number of non-zero columns
in the rhs is zero (a matrix of all zeros). In this case we return the output
correctly.
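The column-flag and prefix-sum step quoted above can be sketched in NumPy (a rough stand-in for the operator's logic, not the actual kernel code; the input values are made up):

```python
import numpy as np

# Hypothetical rhs CSR column indices (csr::kIdx) and output column count.
col_idx_r = np.array([0, 2, 2, 5])  # columns holding non-zero entries
num_cols_out = 6

# Flag each column that contains at least one non-zero.
col_flg = np.zeros(num_cols_out, dtype=np.int64)
col_flg[col_idx_r] = 1

# Prefix sum gives the count of non-zero columns up to each position;
# nnc_idx lists the non-zero column indices in order.
prefix_sum = np.cumsum(col_flg)
nnc_idx = np.flatnonzero(col_flg)

nnc = int(prefix_sum[-1])     # number of non-zero columns
print(nnc, nnc_idx.tolist())  # 3 [0, 2, 5]
```

With an all-zero rhs, `col_flg` stays zero, so `nnc` (and hence `nnz`) is 0, which is the case the review comment defends.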




[GitHub] cjolivier01 commented on issue #8972: Profiling enhancements, python API, vtune and chrome tracing objects, etc.

2017-12-09 Thread GitBox
cjolivier01 commented on issue #8972: Profiling enhancements, python API, vtune 
and chrome tracing objects, etc.
URL: https://github.com/apache/incubator-mxnet/pull/8972#issuecomment-350523576
 
 
   In the latest commits, all of the edits in the cpp unit test files are due
to conflicts while rebasing from master. No functional changes.




[GitHub] astonzhang commented on a change in pull request #8763: [WIP] Add text apis

2017-12-09 Thread GitBox
astonzhang commented on a change in pull request #8763: [WIP] Add text apis
URL: https://github.com/apache/incubator-mxnet/pull/8763#discussion_r155937321
 
 

 ##
 File path: python/mxnet/text/glossary.py
 ##
 @@ -0,0 +1,654 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+"""Read text files and load embeddings."""
+from __future__ import absolute_import
+from __future__ import print_function
+
+import logging
+import os
+import tarfile
+import zipfile
+
+from ..gluon.utils import check_sha1
+from ..gluon.utils import download
+from .. import ndarray as nd
+
+from tqdm import tqdm
+
+
+class Glossary(object):
+    """Indexing and embedding for text and special tokens in a glossary.
+
+    For each indexed text or special token (e.g., an unknown token) in a
+    glossary, an embedding vector will be associated with the token. Such
+    embedding vectors can be loaded from externally pre-trained embeddings,
+    such as via mxnet.text.Embedding instances.
+
+
+    Parameters
+    ----------
+    counter : collections.Counter
+        Counts text token frequencies in the text data.
+    top_k_freq : None or int, default None
+        The number of top frequent tokens in the keys of `counter` that will
+        be indexed. If None, all the tokens in the keys of `counter` will be
+        indexed.
+    min_freq : int, default 1
+        The minimum frequency required for a token in the keys of `counter`
+        to be indexed.
+    specials : list of strs, default ['<unk>']
+        A list of special tokens to be indexed. It must be a non-empty list
+        whose first element is the string representation for unknown tokens,
+        such as '<unk>'. It cannot contain any token from the keys of
+        `counter`.
+    embeds : an mxnet.text.Embedding instance, a list of mxnet.text.Embedding
+        instances, or None, default None
+        Pre-trained embeddings to load. If None, there is nothing to load.
+
+
+    Properties
+    ----------
+    counter : collections.Counter
+        Counts text and special token frequencies in the text data, where
+        special token frequency is cleared to zero.
+    token_to_idx : dict mapping str to int
+        A dict mapping each token to its index integer.
+    idx_to_token : list of strs
+        A list of indexed tokens where the list indices and the token indices
+        are aligned.
+    idx_to_vec : mxnet.ndarray.NDArray
+        For all the indexed tokens in this glossary, this NDArray maps each
+        token's index to an embedding vector.
+    vec_len : int
+        The length of the embedding vector for any token.
+    specials : list of strs
+        A list of special tokens to be indexed. It is a non-empty list whose
+        first element is the string representation for unknown tokens, such
+        as '<unk>'. It excludes all the tokens from the keys of `counter`.
+    """
+    def __init__(self, counter, top_k_freq=None, min_freq=1,
+                 specials=['<unk>'], embeds=None):
+        # Sanity checks.
+        assert min_freq > 0, '`min_freq` must be set to a positive value.'
+        assert len(specials) > 0, \
+            '`specials` must be a non-empty list whose first element is ' \
+            'the string representation for unknown tokens, such as "<unk>".'
+
+        self._init_attrs(counter, specials)
+        self._set_idx_and_token(counter, specials, top_k_freq, min_freq)
+
+        if embeds is not None:
+            self.set_idx_to_vec(embeds)
+
+    def _init_attrs(self, counter, specials):
+        """Initiates class attributes."""
+        self._counter = counter.copy()
+        self._token_to_idx = {token: idx for idx, token in enumerate(specials)}
+        self._idx_to_token = specials.copy()
+        self._idx_to_vec = None
+        self._vec_len = 0
+        self._specials = specials.copy()
+
+    def _set_idx_and_token(self, counter, specials, top_k_freq, min_freq):
+        """Indexes tokens according to specified frequency thresholds."""
+        # Update _counter to include special tokens, such as '<unk>'.
+        self._counter.update({token: 0 for token in specials})
+        assert len(self._counter) == len(counter) + len(specials), \
+            'specials cannot contain any token from the keys of `counter`.'
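The frequency-threshold indexing the docstring describes (specials first, then frequent tokens filtered by `top_k_freq` and `min_freq`) can be sketched with a plain `Counter`; `token_index` below is a hypothetical standalone helper, not the PR's `Glossary` class:

```python
from collections import Counter

def token_index(counter, top_k_freq=None, min_freq=1, specials=('<unk>',)):
    """Sketch: specials get the lowest indices, then counter's tokens in
    descending-frequency order, keeping only those with freq >= min_freq
    among the top_k_freq most frequent (all of them if top_k_freq is None)."""
    idx_to_token = list(specials)
    candidates = counter.most_common(top_k_freq)
    idx_to_token.extend(tok for tok, freq in candidates if freq >= min_freq)
    return {tok: idx for idx, tok in enumerate(idx_to_token)}

counter = Counter(['a', 'b', 'b', 'c', 'c', 'c'])
print(token_index(counter, min_freq=2))
# {'<unk>': 0, 'c': 1, 'b': 2}
```

Here `'a'` is dropped because its frequency (1) is below `min_freq=2`, while the unknown-token special keeps index 0.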

[GitHub] astonzhang commented on a change in pull request #8763: [WIP] Add text apis

2017-12-09 Thread GitBox
astonzhang commented on a change in pull request #8763: [WIP] Add text apis
URL: https://github.com/apache/incubator-mxnet/pull/8763#discussion_r155937212
 
 

 ##
 File path: python/mxnet/text/glossary.py
 ##
 @@ -0,0 +1,654 @@
 (The quoted hunk is identical to the one in the first #8763 review comment above; omitted.)

[GitHub] cjolivier01 commented on issue #8972: Profiling enhancements, python API, vtune and chrome tracing objects, etc.

2017-12-09 Thread GitBox
cjolivier01 commented on issue #8972: Profiling enhancements, python API, vtune 
and chrome tracing objects, etc.
URL: https://github.com/apache/incubator-mxnet/pull/8972#issuecomment-350520637
 
 
   Ok, I thought of a way to simplify some of the APIs.




[GitHub] ArmageddonKnight opened a new issue #9010: get_next_state parameter in FusedRNN constructor

2017-12-09 Thread GitBox
ArmageddonKnight opened a new issue #9010: get_next_state parameter in FusedRNN 
constructor
URL: https://github.com/apache/incubator-mxnet/issues/9010
 
 
   Sorry, but I am confused about the `get_next_state` parameter in
https://github.com/apache/incubator-mxnet/blob/3ee1ca272964a8bfc1ca40a65c3341c58352f4a4/python/mxnet/rnn/rnn_cell.py#L536
   It says:
   
   get_next_state : bool, default False
       Whether to return the states that can be used as starting states
       next time.
   
   Could someone please explain under what circumstances we would set this
parameter to `True`? Thank you.
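One common circumstance is processing a long sequence in chunks (e.g., truncated BPTT): with `get_next_state=True`, the final states of one unrolled chunk can seed `begin_state` of the next. A toy pure-Python sketch of the pattern (a made-up recurrence, not MXNet code):

```python
# Stand-in for an RNN cell step: next_state = f(state, input).
def step(state, x):
    return 0.5 * state + x  # hypothetical recurrence

def unroll(inputs, begin_state=0.0):
    """Unroll the cell over inputs; also return the final state,
    analogous to get_next_state=True."""
    state = begin_state
    outputs = []
    for x in inputs:
        state = step(state, x)
        outputs.append(state)
    return outputs, state

# Processing one long sequence in two chunks only matches a single
# unroll if the final state of chunk 1 seeds chunk 2.
full, _ = unroll([1, 2, 3, 4])
part1, s = unroll([1, 2])
part2, _ = unroll([3, 4], begin_state=s)
print(full == part1 + part2)  # True
```

Without carrying the state (i.e., restarting chunk 2 from the default initial state), the two-chunk outputs would diverge from the single unroll.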




[GitHub] astonzhang commented on a change in pull request #8763: [WIP] Add text apis

2017-12-09 Thread GitBox
astonzhang commented on a change in pull request #8763: [WIP] Add text apis
URL: https://github.com/apache/incubator-mxnet/pull/8763#discussion_r155936371
 
 

 ##
 File path: python/mxnet/text/glossary.py
 ##
 @@ -0,0 +1,654 @@
 (The quoted hunk is identical to the one in the first #8763 review comment above; omitted.)

[GitHub] astonzhang commented on a change in pull request #8763: [WIP] Add text apis

2017-12-09 Thread GitBox
astonzhang commented on a change in pull request #8763: [WIP] Add text apis
URL: https://github.com/apache/incubator-mxnet/pull/8763#discussion_r155936352
 
 

 ##
 File path: python/mxnet/text/glossary.py
 ##
 @@ -0,0 +1,654 @@
 (The quoted hunk is identical to the one in the first #8763 review comment above; omitted.)

[GitHub] astonzhang commented on a change in pull request #8763: [WIP] Add text apis

2017-12-09 Thread GitBox
astonzhang commented on a change in pull request #8763: [WIP] Add text apis
URL: https://github.com/apache/incubator-mxnet/pull/8763#discussion_r155936277
 
 

 ##
 File path: python/mxnet/text/glossary.py
 ##
 @@ -0,0 +1,654 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+"""Read text files and load embeddings."""
+from __future__ import absolute_import
+from __future__ import print_function
+
+import logging
+import os
+import tarfile
+import zipfile
+
+from ..gluon.utils import check_sha1
+from ..gluon.utils import download
+from .. import ndarray as nd
+
+from tqdm import tqdm
+
+
+class Glossary(object):
+"""Indexing and embedding for text and special tokens in a glossary.
+
+For each indexed text or special token (e.g., an unknown token) in a
+glossary, an embedding vector will be associated with the token. Such
+embedding vectors can be loaded from externally pre-trained embeddings,
+such as via mxnet.text.Embedding instances.
+
+
+Parameters
+--
+counter : collections.Counter
+Counts text token frequencies in the text data.
+top_k_freq : None or int, default None
+The number of top frequent tokens in the keys of `counter` that will be
+indexed. If None, all the tokens in the keys of `counter` will be
+indexed.
+min_freq : int, default 1
+The minimum frequency required for a token in the keys of `counter` to
+be indexed.
+specials : list of strs, default ['']
+A list of special tokens to be indexed. It must be an non-empty list
+whose first element is the string representation for unknown tokens,
+such as ''. It cannot contain any token from the keys of 
`counter`.
+embeds : an mxnet.text.Embedding instance, a list of mxnet.text.Embedding
+ instances, or None, default None
+Pre-trained embeddings to load. If None, there is nothing to load.
+
+
+Properties
+--
+counter : collections.Counter
+Counts text and special token frequencies in the text data, where
+special token frequency is clear to zero.
+token_to_idx : dict mapping str to int
+A dict mapping each token to its index integer.
+idx_to_token : list of strs
+A list of indexed tokens where the list indices and the token indices
+are aligned.
+idx_to_vec : mxnet.ndarray.NDArray
+For all the indexed tokens in this glossary, this NDArray maps each
+token's index to an embedding vector.
+vec_len : int
+The length of the embedding vector for any token.
+specials: list of strs
+A list of special tokens to be indexed. It is a non-empty list whose
+first element is the string representation for unknown tokens, such as
+'<unk>'. It excludes all the tokens from the keys of `counter`.
+"""
+def __init__(self, counter, top_k_freq=None, min_freq=1,
+ specials=['<unk>'], embeds=None):
+# Sanity checks.
+assert min_freq > 0, '`min_freq` must be set to a positive value.'
+assert len(specials) > 0, \
+'`specials` must be a non-empty list whose first element is the ' \
+'string representation for unknown tokens, such as "<unk>".'
+
+self._init_attrs(counter, specials)
+self._set_idx_and_token(counter, specials, top_k_freq, min_freq)
+
+if embeds is not None:
+self.set_idx_to_vec(embeds)
+
+def _init_attrs(self, counter, specials):
+"""Initiates class attributes."""
+self._counter = counter.copy()
+self._token_to_idx = {token: idx for idx, token in enumerate(specials)}
+self._idx_to_token = specials.copy()
+self._idx_to_vec = None
+self._vec_len = 0
+self._specials = specials.copy()
+
+def _set_idx_and_token(self, counter, specials, top_k_freq, min_freq):
+"""Indexes tokens according to specified frequency thresholds."""
+# Update _counter to include special tokens, such as '<unk>'.
+self._counter.update({token: 0 for token in specials})
+assert len(self._counter) == len(counter) + len(specials), 'specials ' \
+'cannot contain any 
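[Editor's note] The indexing logic in the quoted diff above (reserve the special tokens first, then index remaining tokens by frequency, subject to `top_k_freq` and `min_freq`) can be sketched in plain Python. The function and variable names below are illustrative only, not the PR's actual API:

```python
from collections import Counter

def build_index(counter, specials=('<unk>',), top_k_freq=None, min_freq=1):
    """Index special tokens first, then counter keys by descending frequency."""
    idx_to_token = list(specials)
    # Sort by frequency, breaking ties alphabetically for determinism.
    pairs = sorted(counter.items(), key=lambda kv: (-kv[1], kv[0]))
    if top_k_freq is not None:
        pairs = pairs[:top_k_freq]
    for token, freq in pairs:
        if freq >= min_freq and token not in specials:
            idx_to_token.append(token)
    token_to_idx = {tok: i for i, tok in enumerate(idx_to_token)}
    return token_to_idx, idx_to_token

counter = Counter(['a', 'b', 'b', 'c', 'c', 'c'])
token_to_idx, idx_to_token = build_index(counter, min_freq=2)
# '<unk>' takes index 0; 'c' (freq 3) and 'b' (freq 2) follow; 'a' falls below min_freq.
```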

[GitHub] astonzhang commented on a change in pull request #8763: [WIP] Add text apis

2017-12-09 Thread GitBox
astonzhang commented on a change in pull request #8763: [WIP] Add text apis
URL: https://github.com/apache/incubator-mxnet/pull/8763#discussion_r155936254
 
 

 ##
 File path: python/mxnet/text/glossary.py
 ##
 @@ -0,0 +1,654 @@

[GitHub] astonzhang commented on a change in pull request #8763: [WIP] Add text apis

2017-12-09 Thread GitBox
astonzhang commented on a change in pull request #8763: [WIP] Add text apis
URL: https://github.com/apache/incubator-mxnet/pull/8763#discussion_r155936128
 
 

 ##
 File path: python/mxnet/text/glossary.py
 ##
 @@ -0,0 +1,654 @@
 
 Review comment:
   resolved.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] astonzhang commented on a change in pull request #8763: [WIP] Add text apis

2017-12-09 Thread GitBox
astonzhang commented on a change in pull request #8763: [WIP] Add text apis
URL: https://github.com/apache/incubator-mxnet/pull/8763#discussion_r155936104
 
 

 ##
 File path: python/mxnet/text/glossary.py
 ##
 @@ -0,0 +1,654 @@
 
 Review comment:
   resolved.




[GitHub] cjolivier01 commented on a change in pull request #8989: Symbol __getitem__ using list_outputs() is too expensive

2017-12-09 Thread GitBox
cjolivier01 commented on a change in pull request #8989: Symbol __getitem__ 
using list_outputs() is too expensive
URL: https://github.com/apache/incubator-mxnet/pull/8989#discussion_r155935864
 
 

 ##
 File path: include/mxnet/c_api.h
 ##
 @@ -1051,6 +1051,16 @@ MXNET_DLL int MXSymbolListArguments(SymbolHandle symbol,
 MXNET_DLL int MXSymbolListOutputs(SymbolHandle symbol,
   mx_uint *out_size,
   const char ***out_str_array);
+
+/*!
+ * \brief Get number of outputs of the symbol.
+ * \param symbol The symbol
+ * \param out_size number of outputs
+ * \return 0 when success, -1 when failure happens
+ */
+MXNET_DLL int MXSymbolGetOutputCount(SymbolHandle symbol,
 
 Review comment:
   done




[GitHub] cjolivier01 commented on a change in pull request #8989: Symbol __getitem__ using list_outputs() is too expensive

2017-12-09 Thread GitBox
cjolivier01 commented on a change in pull request #8989: Symbol __getitem__ 
using list_outputs() is too expensive
URL: https://github.com/apache/incubator-mxnet/pull/8989#discussion_r155935848
 
 

 ##
 File path: python/mxnet/symbol/symbol.py
 ##
 @@ -745,6 +747,25 @@ def list_outputs(self):
 self.handle, ctypes.byref(size), ctypes.byref(sarr)))
 return [py_str(sarr[i]) for i in range(size.value)]
 
+def output_count(self):
 
 Review comment:
   I suppose one could use that convention. I can make it that.




[GitHub] szha commented on issue #7804: Enhance For DataLoad Directly from folder

2017-12-09 Thread GitBox
szha commented on issue #7804: Enhance For DataLoad Directly from folder
URL: 
https://github.com/apache/incubator-mxnet/issues/7804#issuecomment-350515880
 
 
   This issue is closed due to lack of activity in the last 90 days. Feel free 
to ping me to reopen if this is still an active issue. Thanks!
   Also, do please check out our [forum](https://discuss.mxnet.io/) (and 
[Chinese version](https://discuss.gluon.ai/)) for general "how-to" questions.




[GitHub] szha commented on issue #7824: How to partition a computing graph among multiple machines?

2017-12-09 Thread GitBox
szha commented on issue #7824: How to partition a computing graph among 
multiple machines?
URL: 
https://github.com/apache/incubator-mxnet/issues/7824#issuecomment-350515881
 
 
   This issue is closed due to lack of activity in the last 90 days. Feel free 
to ping me to reopen if this is still an active issue. Thanks!
   Also, do please check out our [forum](https://discuss.mxnet.io/) (and 
[Chinese version](https://discuss.gluon.ai/)) for general "how-to" questions.




[GitHub] szha commented on issue #6186: cudnn_deconvolution-inl.h

2017-12-09 Thread GitBox
szha commented on issue #6186: cudnn_deconvolution-inl.h
URL: 
https://github.com/apache/incubator-mxnet/issues/6186#issuecomment-350515878
 
 
   This issue is closed due to lack of activity in the last 90 days. Feel free 
to ping me to reopen if this is still an active issue. Thanks!
   Also, do please check out our [forum](https://discuss.mxnet.io/) (and 
[Chinese version](https://discuss.gluon.ai/)) for general "how-to" questions.




[GitHub] szha closed issue #7804: Enhance For DataLoad Directly from folder

2017-12-09 Thread GitBox
szha closed issue #7804: Enhance For DataLoad Directly from folder
URL: https://github.com/apache/incubator-mxnet/issues/7804
 
 
   




[GitHub] szha closed issue #7824: How to partition a computing graph among multiple machines?

2017-12-09 Thread GitBox
szha closed issue #7824: How to partition a computing graph among multiple 
machines?
URL: https://github.com/apache/incubator-mxnet/issues/7824
 
 
   




[GitHub] szha closed issue #6186: cudnn_deconvolution-inl.h

2017-12-09 Thread GitBox
szha closed issue #6186: cudnn_deconvolution-inl.h
URL: https://github.com/apache/incubator-mxnet/issues/6186
 
 
   




[GitHub] bradcar commented on issue #2856: PReLU problem

2017-12-09 Thread GitBox
bradcar commented on issue #2856: PReLU problem
URL: 
https://github.com/apache/incubator-mxnet/issues/2856#issuecomment-350512780
 
 
   Hi taoari, I have been unable to find out how to properly use prelu in 
MXNet: 
   x = F.LeakyReLU(self.conv4(x), act_type='prelu')
   I get this error:
   [11:04:52] 
/Users/travis/build/dmlc/mxnet-distro/mxnet-build/dmlc-core/include/dmlc/logging.h:308:
 [11:04:52] src/c_api/c_api_ndarray.cc:76: Check failed: num_inputs == 
infered_num_inputs (1 vs. 2) Operator LeakyReLU expects 2 inputs, but got 1 
instead.
   
   any insights?
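   [Editor's note] The check failure quoted above ("expects 2 inputs, but got 1") most likely arises because with `act_type='prelu'` the LeakyReLU operator takes a second, learnable input: the negative-side slope gamma. The activation itself is simple; here is a plain-Python sketch of the math (illustrative only, not the MXNet API):

```python
def prelu(x, gamma):
    """PReLU: identity for positive inputs, learnable slope gamma for negatives."""
    return x if x > 0 else gamma * x

print(prelu(2.0, 0.25))   # positive input passes through unchanged
print(prelu(-4.0, 0.25))  # negative input scaled by gamma
```

   In the symbolic/imperative call, gamma therefore has to be supplied (or bound) alongside the data, which is why calling LeakyReLU with the data alone fails.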
   
   




[GitHub] srochel opened a new pull request #9009: Delete recommendation_systems.md

2017-12-09 Thread GitBox
srochel opened a new pull request #9009: Delete recommendation_systems.md
URL: https://github.com/apache/incubator-mxnet/pull/9009
 
 
   Tutorial for recommender system to be provided with Zach's book.
   
   ## Description ##
   remove file. Tutorial for recommender system to be provided with Zach's book.
   
   ## Checklist ##
   ### Essentials ###
   - [ ] Passed code style checking (`make lint`)
   - [ ] Changes are complete (i.e. I finished coding on this PR)
   - [ ] All changes have test coverage:
   - Unit tests are added for small changes to verify correctness (e.g. adding 
a new operator)
   - Nightly tests are added for complicated/long-running ones (e.g. changing 
distributed kvstore)
   - Build tests will be added for build configuration changes (e.g. adding a 
new build option with NCCL)
   - [ ] Code is well-documented: 
   - For user-facing API changes, API doc string has been updated. 
   - For new C++ functions in header files, their functionalities and arguments 
are documented. 
   - For new examples, README.md is added to explain what the example does, 
the source of the dataset, expected performance on the test set and a reference to 
the original paper if applicable
   - [ ] To the best of my knowledge, examples are either not affected by this 
change, or have been fixed to be compatible with this change
   
   ### Changes ###
   - [ ] Feature1, tests, (and when applicable, API doc)
   - [ ] Feature2, tests, (and when applicable, API doc)
   
   ## Comments ##
   - If this change is a backward incompatible change, why must this change be 
made.
   - Interesting edge cases to note here
   




[GitHub] cjolivier01 commented on issue #8972: Profiling enhancements, python API, vtune and chrome tracing objects, etc.

2017-12-09 Thread GitBox
cjolivier01 commented on issue #8972: Profiling enhancements, python API, vtune 
and chrome tracing objects, etc.
URL: https://github.com/apache/incubator-mxnet/pull/8972#issuecomment-350503085
 
 
   I mean, it can get as compact as being a couple of calls that take
   enumerated and optional arguments and then internally all of the
   possibilities get dispatched. Is this preferred? So far I haven't seen a
   convention of squeezing down APIs like this in mxnet, but I have no problem
   doing it; it's not hard to do. I was going for API usability and
   readability here.
   
   On Sat, Dec 9, 2017 at 11:52 AM Chris Olivier  wrote:
   
   > I suppose create start and stop can take an enumeration on type, but that
   > gets sort of cryptic.
   >
   > On Sat, Dec 9, 2017 at 10:50 AM Eric Junyuan Xie 
   > wrote:
   >
   >> This adds too many CAPIs is there a better approach?
   >>
   >> ?
   >> You are receiving this because you authored the thread.
   >> Reply to this email directly, view it on GitHub
   >> 
,
   >> or mute the thread
   >> 

   >> .
   >>
   >
   




[GitHub] cjolivier01 commented on a change in pull request #8972: Profiling enhancements, python API, vtune and chrome tracing objects, etc.

2017-12-09 Thread GitBox
cjolivier01 commented on a change in pull request #8972: Profiling 
enhancements, python API, vtune and chrome tracing objects, etc.
URL: https://github.com/apache/incubator-mxnet/pull/8972#discussion_r155930949
 
 

 ##
 File path: python/mxnet/profiler.py
 ##
 @@ -56,3 +69,237 @@ def dump_profile():
 """Dump profile and stop profiler. Use this to save profile
 in advance in case your program cannot exit normally."""
 check_call(_LIB.MXDumpProfile())
+
+def create_domain(name):
+  domain_handle = ProfileDomainHandle()
+  check_call(_LIB.MXProfileCreateDomain(c_str(name), ctypes.byref(domain_handle)))
+  return domain_handle
+
+def create_task(domain_handle, name):
+task_handle = ProfileTaskHandle()
+check_call(_LIB.MXProfileCreateTask(domain_handle,
+c_str(name),
+ctypes.byref(task_handle)))
+return task_handle
+
+def destroy_task(task_handle):
+check_call(_LIB.MXProfileDestroyTask(task_handle))
+
+def task_start(task_handle):
+check_call(_LIB.MXProfileTaskStart(task_handle))
+
+def task_stop(task_handle):
+check_call(_LIB.MXProfileTaskStop(task_handle))
+
+def create_frame(domain_handle, name):
+frame_handle = ProfileFrameHandle()
+check_call(_LIB.MXProfileCreateFrame(domain_handle,
+c_str(name),
+ctypes.byref(frame_handle)))
+return frame_handle
+
+def destroy_frame(frame_handle):
+check_call(_LIB.MXProfileDestroyFrame(frame_handle))
+
+def frame_start(frame_handle):
+check_call(_LIB.MXProfileFrameStart(frame_handle))
+
+def frame_stop(frame_handle):
+check_call(_LIB.MXProfileFrameStop(frame_handle))
+
+def create_event(name):
+event_handle = ProfileEventHandle()
+check_call(_LIB.MXProfileCreateEvent(c_str(name), ctypes.byref(event_handle)))
+return event_handle
+
+def destroy_event(event_handle):
+check_call(_LIB.MXProfileDestroyEvent(event_handle))
+
+def event_start(event_handle):
+check_call(_LIB.MXProfileEventStart(event_handle))
+
+def event_stop(event_handle):
+check_call(_LIB.MXProfileEventStop(event_handle))
+
+def tune_pause():
+check_call(_LIB.MXProfileTunePause())
+
+def tune_resume():
+check_call(_LIB.MXProfileTuneResume())
+
+def create_counter(domain_handle, name, value=None):
+counter_handle = ProfileCounterHandle()
+check_call(_LIB.MXProfileCreateCounter(domain_handle,
+   c_str(name),
+   ctypes.byref(counter_handle)))
+if value is not None:
+set_counter(counter_handle, value)
+return counter_handle
+
+def destroy_counter(counter_handle):
+check_call(_LIB.MXProfileDestroyCounter(counter_handle))
+
+def set_counter(counter_handle, value):
+check_call(_LIB.MXProfileSetCounter(counter_handle, int(value)))
+
+def increment_counter(counter_handle, by_value):
+check_call(_LIB.MXProfileAdjustCounter(counter_handle, int(by_value)))
+
+def decrement_counter(counter_handle, by_value):
+check_call(_LIB.MXProfileAdjustCounter(counter_handle, -int(by_value)))
+
+def set_append_mode(mode):
+  if mode is False:
+mode = 0
+  else:
+mode = 1
+  check_call(_LIB.MXSetDumpProfileAppendMode(int(mode)))
+
+def set_continuous_dump(continuous_dump=True, delay_in_seconds=1.0):
+  if continuous_dump is False:
+cd = 0
+  else:
+cd = 1
+  ds = float(delay_in_seconds)
+  check_call(_LIB.MXSetContinuousProfileDump(ctypes.c_int(cd), ctypes.c_float(ds)))
+
+def set_instant_marker(domain_handle, name, scope='process'):
+marker_scope2int = { 'global': 1, 'process': 2, 'thread': 3, 'task': 4, 'marker': 5 }
+scope_int = marker_scope2int[scope]
+check_call(_LIB.MXProfileSetInstantMarker(domain_handle, c_str(name), scope_int))
+
+
+class Domain:
+"""Profiling domain, used to group sub-objects like tasks, counters, etc. into categories.
+Serves as part of 'categories' for chrome://tracing
+Note: Domain handles are never destroyed
+"""
+def __init__(self, name):
+self.name = name
+self.handle = create_domain(name)
+
+def __str__(self):
+return self.name
+
+
+class Task:
+"""Profiling Task class
+A task is a logical unit of work performed by a particular thread.
+Tasks can nest; thus, tasks typically correspond to functions, scopes, or a case block
+in a switch statement.
+You can use the Task API to assign tasks to threads
+"""
+def __init__(self, domain, name):
+self.domain = domain
+self.name = name
+self.handle = create_task(domain.handle, name)
+
+def start(self):
+task_start(self.handle)
+
+def stop(self):
+task_stop(self.handle)
+
+def __str__(self):
+return self.name
+
+def __del__(self):
+if self.handle is not None:
+destroy_task(self.handle)
+
+

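[Editor's note] The Task, Frame, and Event classes in the diff above all follow the same create/start/stop/destroy handle pattern. A generic context-manager wrapper makes that pattern harder to misuse, since stop is guaranteed even when the profiled body raises. The sketch below is illustrative only and uses a stand-in object, not the PR's handles:

```python
from contextlib import contextmanager

class FakeTask:
    """Stand-in for a profiler task; records start/stop calls for inspection."""
    def __init__(self, name):
        self.name = name
        self.events = []
    def start(self):
        self.events.append('start')
    def stop(self):
        self.events.append('stop')

@contextmanager
def profiled(task):
    # Guarantee stop() runs even if the profiled body raises.
    task.start()
    try:
        yield task
    finally:
        task.stop()

task = FakeTask('forward-pass')
with profiled(task):
    pass  # work to be profiled goes here
# task.events == ['start', 'stop']
```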
[GitHub] cjolivier01 commented on issue #8972: Profiling enhancements, python API, vtune and chrome tracing objects, etc.

2017-12-09 Thread GitBox
cjolivier01 commented on issue #8972: Profiling enhancements, python API, vtune 
and chrome tracing objects, etc.
URL: https://github.com/apache/incubator-mxnet/pull/8972#issuecomment-350501238
 
 
   I suppose create start and stop can take an enumeration on type, but that
   gets sort of cryptic.
   
   On Sat, Dec 9, 2017 at 10:50 AM Eric Junyuan Xie 
   wrote:
   
   > This adds too many CAPIs is there a better approach?
   >
   > ?
   > You are receiving this because you authored the thread.
   > Reply to this email directly, view it on GitHub
   > 
,
   > or mute the thread
   > 

   > .
   >
   




[GitHub] szha commented on a change in pull request #8763: [WIP] Add text apis

2017-12-09 Thread GitBox
szha commented on a change in pull request #8763: [WIP] Add text apis
URL: https://github.com/apache/incubator-mxnet/pull/8763#discussion_r155930401
 
 

 ##
 File path: python/mxnet/text/glossary.py
 ##
 @@ -0,0 +1,654 @@

[GitHub] szha commented on a change in pull request #8763: [WIP] Add text apis

2017-12-09 Thread GitBox
szha commented on a change in pull request #8763: [WIP] Add text apis
URL: https://github.com/apache/incubator-mxnet/pull/8763#discussion_r155930408
 
 

 ##
 File path: python/mxnet/text/glossary.py
 ##
 @@ -0,0 +1,654 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+"""Read text files and load embeddings."""
+from __future__ import absolute_import
+from __future__ import print_function
+
+import logging
+import os
+import tarfile
+import zipfile
+
+from ..gluon.utils import check_sha1
+from ..gluon.utils import download
+from .. import ndarray as nd
+
+from tqdm import tqdm
+
+
+class Glossary(object):
+    """Indexing and embedding for text and special tokens in a glossary.
+
+    For each indexed text or special token (e.g., an unknown token) in a
+    glossary, an embedding vector will be associated with the token. Such
+    embedding vectors can be loaded from externally pre-trained embeddings,
+    such as via mxnet.text.Embedding instances.
+
+
+    Parameters
+    ----------
+    counter : collections.Counter
+        Counts text token frequencies in the text data.
+    top_k_freq : None or int, default None
+        The number of most frequent tokens in the keys of `counter` that
+        will be indexed. If None, all the tokens in the keys of `counter`
+        will be indexed.
+    min_freq : int, default 1
+        The minimum frequency required for a token in the keys of `counter`
+        to be indexed.
+    specials : list of strs, default ['<unk>']
+        A list of special tokens to be indexed. It must be a non-empty list
+        whose first element is the string representation for unknown tokens,
+        such as '<unk>'. It cannot contain any token from the keys of
+        `counter`.
+    embeds : an mxnet.text.Embedding instance, a list of mxnet.text.Embedding
+        instances, or None, default None
+        Pre-trained embeddings to load. If None, there is nothing to load.
+
+
+    Properties
+    ----------
+    counter : collections.Counter
+        Counts text and special token frequencies in the text data, where
+        each special token's frequency is cleared to zero.
+    token_to_idx : dict mapping str to int
+        A dict mapping each token to its index integer.
+    idx_to_token : list of strs
+        A list of indexed tokens where the list indices and the token
+        indices are aligned.
+    idx_to_vec : mxnet.ndarray.NDArray
+        For all the indexed tokens in this glossary, this NDArray maps each
+        token's index to an embedding vector.
+    vec_len : int
+        The length of the embedding vector for any token.
+    specials : list of strs
+        A list of special tokens to be indexed. It is a non-empty list whose
+        first element is the string representation for unknown tokens, such
+        as '<unk>'. It excludes all the tokens from the keys of `counter`.
+    """
+    def __init__(self, counter, top_k_freq=None, min_freq=1,
+                 specials=['<unk>'], embeds=None):
+        # Sanity checks.
+        assert min_freq > 0, '`min_freq` must be set to a positive value.'
+        assert len(specials) > 0, \
+            '`specials` must be a non-empty list whose first element is ' \
+            'the string representation for unknown tokens, such as "<unk>".'
+
+        self._init_attrs(counter, specials)
+        self._set_idx_and_token(counter, specials, top_k_freq, min_freq)
+
+        if embeds is not None:
+            self.set_idx_to_vec(embeds)
+
+    def _init_attrs(self, counter, specials):
+        """Initiates class attributes."""
+        self._counter = counter.copy()
+        self._token_to_idx = {token: idx for idx, token in enumerate(specials)}
+        self._idx_to_token = specials.copy()
+        self._idx_to_vec = None
+        self._vec_len = 0
+        self._specials = specials.copy()
+
+    def _set_idx_and_token(self, counter, specials, top_k_freq, min_freq):
+        """Indexes tokens according to specified frequency thresholds."""
+        # Update _counter to include special tokens, such as '<unk>'.
+        self._counter.update({token: 0 for token in specials})
+        assert len(self._counter) == len(counter) + len(specials), \
+            'specials cannot contain any token from the keys of `counter`.'
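The quoted constructor indexes tokens from a `collections.Counter` after reserving the special tokens, filtering by `min_freq` and truncating to `top_k_freq`. A minimal standalone sketch of that frequency-based indexing logic (the `build_index` helper is hypothetical, not part of the MXNet API under review):

```python
from collections import Counter

def build_index(counter, top_k_freq=None, min_freq=1, specials=('<unk>',)):
    """Sketch of frequency-based token indexing, mirroring the quoted
    Glossary constructor: specials first, then frequent tokens."""
    # Specials come first, so the unknown token always gets index 0.
    idx_to_token = list(specials)
    # Sort by frequency descending, breaking ties alphabetically so the
    # output is deterministic.
    pairs = sorted(counter.items(), key=lambda kv: (-kv[1], kv[0]))
    if top_k_freq is not None:
        pairs = pairs[:top_k_freq]
    for token, freq in pairs:
        if freq >= min_freq and token not in specials:
            idx_to_token.append(token)
    token_to_idx = {token: idx for idx, token in enumerate(idx_to_token)}
    return token_to_idx, idx_to_token

counter = Counter(['a', 'b', 'b', 'c', 'c', 'c'])
token_to_idx, idx_to_token = build_index(counter, min_freq=2)
print(idx_to_token)  # ['<unk>', 'c', 'b']
```

Here the token 'a' is dropped because its frequency (1) is below `min_freq`, while 'c' and 'b' are indexed in descending frequency order after the unknown token.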

[GitHub] szha commented on a change in pull request #8763: [WIP] Add text apis

2017-12-09 Thread GitBox
szha commented on a change in pull request #8763: [WIP] Add text apis
URL: https://github.com/apache/incubator-mxnet/pull/8763#discussion_r155930401
 
 

 ##
 File path: python/mxnet/text/glossary.py
 ##

[GitHub] szha commented on a change in pull request #8763: [WIP] Add text apis

2017-12-09 Thread GitBox
szha commented on a change in pull request #8763: [WIP] Add text apis
URL: https://github.com/apache/incubator-mxnet/pull/8763#discussion_r155930353
 
 

 ##
 File path: python/mxnet/text/glossary.py
 ##

[GitHub] szha commented on a change in pull request #8763: [WIP] Add text apis

2017-12-09 Thread GitBox
szha commented on a change in pull request #8763: [WIP] Add text apis
URL: https://github.com/apache/incubator-mxnet/pull/8763#discussion_r155930318
 
 

 ##
 File path: python/mxnet/text/glossary.py
 ##

[GitHub] szha commented on a change in pull request #8763: [WIP] Add text apis

2017-12-09 Thread GitBox
szha commented on a change in pull request #8763: [WIP] Add text apis
URL: https://github.com/apache/incubator-mxnet/pull/8763#discussion_r155930259
 
 

 ##
 File path: python/mxnet/text/glossary.py
 ##

[GitHub] szha commented on a change in pull request #8763: [WIP] Add text apis

2017-12-09 Thread GitBox
szha commented on a change in pull request #8763: [WIP] Add text apis
URL: https://github.com/apache/incubator-mxnet/pull/8763#discussion_r155930205
 
 

 ##
 File path: python/mxnet/text/glossary.py
 ##

[GitHub] piiswrong commented on issue #9007: float16 argmax breaks on negative inputs

2017-12-09 Thread GitBox
piiswrong commented on issue #9007: float16 argmax breaks on negative inputs
URL: 
https://github.com/apache/incubator-mxnet/issues/9007#issuecomment-350498289
 
 
   @reminisce 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] piiswrong commented on a change in pull request #8989: Symbol __getitem__ using list_outputs() is too expensive

2017-12-09 Thread GitBox
piiswrong commented on a change in pull request #8989: Symbol __getitem__ using 
list_outputs() is too expensive
URL: https://github.com/apache/incubator-mxnet/pull/8989#discussion_r155929444
 
 

 ##
 File path: python/mxnet/symbol/symbol.py
 ##
 @@ -745,6 +747,25 @@ def list_outputs(self):
 self.handle, ctypes.byref(size), ctypes.byref(sarr)))
 return [py_str(sarr[i]) for i in range(size.value)]
 
+def output_count(self):
 
 Review comment:
   Isn't this just `__len__`?
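The suggestion is that a symbol's output count belongs in the `__len__` protocol rather than a bespoke `output_count()` method, so that `len(sym)` works naturally. A toy illustration with a hypothetical stand-in class (not the real `mxnet.symbol.Symbol`):

```python
class FakeSymbol:
    """Hypothetical stand-in showing the reviewed suggestion: expose the
    output count through __len__ instead of a separate method."""
    def __init__(self, outputs):
        self._outputs = list(outputs)

    def __len__(self):
        # Cheap count: avoids materializing the list of output name
        # strings the way list_outputs() would.
        return len(self._outputs)

sym = FakeSymbol(['out0', 'out1', 'out2'])
print(len(sym))  # 3
```

Supporting `__len__` also lets the object participate in idioms like `for i in range(len(sym))` without any new API surface.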


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] piiswrong commented on a change in pull request #8989: Symbol __getitem__ using list_outputs() is too expensive

2017-12-09 Thread GitBox
piiswrong commented on a change in pull request #8989: Symbol __getitem__ using 
list_outputs() is too expensive
URL: https://github.com/apache/incubator-mxnet/pull/8989#discussion_r155929437
 
 

 ##
 File path: include/mxnet/c_api.h
 ##
 @@ -1051,6 +1051,16 @@ MXNET_DLL int MXSymbolListArguments(SymbolHandle symbol,
 MXNET_DLL int MXSymbolListOutputs(SymbolHandle symbol,
   mx_uint *out_size,
   const char ***out_str_array);
+
+/*!
+ * \brief Get number of outputs of the symbol.
+ * \param symbol The symbol
+ * \param out_size number of outputs
+ * \return 0 when success, -1 when failure happens
+ */
+MXNET_DLL int MXSymbolGetOutputCount(SymbolHandle symbol,
 
 Review comment:
   GetNumOutputs


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] piiswrong commented on issue #8972: Profiling enhancements, python API, vtune and chrome tracing objects, etc.

2017-12-09 Thread GitBox
piiswrong commented on issue #8972: Profiling enhancements, python API, vtune 
and chrome tracing objects, etc.
URL: https://github.com/apache/incubator-mxnet/pull/8972#issuecomment-350497570
 
 
   This adds too many C APIs; is there a better approach?


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] piiswrong commented on a change in pull request #8972: Profiling enhancements, python API, vtune and chrome tracing objects, etc.

2017-12-09 Thread GitBox
piiswrong commented on a change in pull request #8972: Profiling enhancements, python API, vtune and chrome tracing objects, etc.
URL: https://github.com/apache/incubator-mxnet/pull/8972#discussion_r155929416
 
 

 ##
 File path: python/mxnet/profiler.py
 ##
 @@ -56,3 +69,237 @@ def dump_profile():
 """Dump profile and stop profiler. Use this to save profile
 in advance in case your program cannot exit normally."""
 check_call(_LIB.MXDumpProfile())
+
+def create_domain(name):
+  domain_handle = ProfileDomainHandle()
+  check_call(_LIB.MXProfileCreateDomain(c_str(name), 
ctypes.byref(domain_handle)))
+  return domain_handle
+
+def create_task(domain_handle, name):
+task_handle = ProfileTaskHandle()
+check_call(_LIB.MXProfileCreateTask(domain_handle,
+c_str(name),
+ctypes.byref(task_handle)))
+return task_handle
+
+def destroy_task(task_handle):
+check_call(_LIB.MXProfileDestroyTask(task_handle))
+
+def task_start(task_handle):
+check_call(_LIB.MXProfileTaskStart(task_handle))
+
+def task_stop(task_handle):
+check_call(_LIB.MXProfileTaskStop(task_handle))
+
+def create_frame(domain_handle, name):
+frame_handle = ProfileFrameHandle()
+check_call(_LIB.MXProfileCreateFrame(domain_handle,
+c_str(name),
+ctypes.byref(frame_handle)))
+return frame_handle
+
+def destroy_frame(frame_handle):
+check_call(_LIB.MXProfileDestroyFrame(frame_handle))
+
+def frame_start(frame_handle):
+check_call(_LIB.MXProfileFrameStart(frame_handle))
+
+def frame_stop(frame_handle):
+check_call(_LIB.MXProfileFrameStop(frame_handle))
+
+def create_event(name):
+event_handle = ProfileEventHandle()
+check_call(_LIB.MXProfileCreateEvent(c_str(name), 
ctypes.byref(event_handle)))
+return event_handle
+
+def destroy_event(event_handle):
+check_call(_LIB.MXProfileDestroyEvent(event_handle))
+
+def event_start(event_handle):
+check_call(_LIB.MXProfileEventStart(event_handle))
+
+def event_stop(event_handle):
+check_call(_LIB.MXProfileEventStop(event_handle))
+
+def tune_pause():
+check_call(_LIB.MXProfileTunePause())
+
+def tune_resume():
+check_call(_LIB.MXProfileTuneResume())
+
+def create_counter(domain_handle, name, value=None):
+counter_handle = ProfileCounterHandle()
+check_call(_LIB.MXProfileCreateCounter(domain_handle,
+   c_str(name),
+   ctypes.byref(counter_handle)))
+if value is not None:
+set_counter(counter_handle, value)
+return counter_handle
+
+def destroy_counter(counter_handle):
+check_call(_LIB.MXProfileDestroyCounter(counter_handle))
+
+def set_counter(counter_handle, value):
+check_call(_LIB.MXProfileSetCounter(counter_handle, int(value)))
+
+def increment_counter(counter_handle, by_value):
+check_call(_LIB.MXProfileAdjustCounter(counter_handle, int(by_value)))
+
+def decrement_counter(counter_handle, by_value):
+check_call(_LIB.MXProfileAdjustCounter(counter_handle, -int(by_value)))
+
+def set_append_mode(mode):
+  if mode is False:
+mode = 0
+  else:
+mode = 1
+  check_call(_LIB.MXSetDumpProfileAppendMode(int(mode)))
+
+def set_continuous_dump(continuous_dump=True, delay_in_seconds=1.0):
+  if continuous_dump is False:
+cd = 0
+  else:
+cd = 1
+  ds = float(delay_in_seconds)
+  check_call(_LIB.MXSetContinuousProfileDump(ctypes.c_int(cd), 
ctypes.c_float(ds)))
+
+def set_instant_marker(domain_handle, name, scope='process'):
+marker_scope2int = { 'global': 1, 'process': 2, 'thread': 3, 'task': 4, 
'marker': 5 }
+scope_int = marker_scope2int[scope]
+check_call(_LIB.MXProfileSetInstantMarker(domain_handle, c_str(name), 
scope_int))
+
+
+class Domain:
+"""Profiling domain, used to group sub-objects like tasks, counters, etc 
into categories
+Serves as part of 'categories' for chrome://tracing
+Note: Domain handles are never destroyed
+"""
+def __init__(self, name):
+self.name = name
+self.handle = create_domain(name)
+
+def __str__(self):
+return self.name
+
+
+class Task:
+"""Profiling Task class
+A task is a logical unit of work performed by a particular thread.
+Tasks can nest; thus, tasks typically correspond to functions, scopes, or 
a case block
+in a switch statement.
+You can use the Task API to assign tasks to threads
+"""
+def __init__(self, domain, name):
+self.domain = domain
+self.name = name
+self.handle = create_task(domain.handle, name)
+
+def start(self):
+task_start(self.handle)
+
+def stop(self):
+task_stop(self.handle)
+
+def __str__(self):
+return self.name
+
+def __del__(self):
+if self.handle is not None:
+destroy_task(self.handle)
+
+