aaronpmishkin opened a new issue #15958: Potential Bug using nd.tile Between Convolutional Layers
URL: https://github.com/apache/incubator-mxnet/issues/15958

## Description

I've encountered an issue using `mxnet.ndarray.tile` to upscale CNN embeddings so that they can be used as input features to another CNN. First, embeddings are computed for a context set using a CNN; these embeddings are then tiled back up to the spatial size of the original images, appended as channels to a query set, and passed through a second CNN. Cross-posted on the forum [here](https://discuss.mxnet.io/t/potential-bug-using-nd-tile-after-convolutional-layers/4688).

## Environment info (Required)

```
----------Python Info----------
Version      : 3.6.8
Compiler     : GCC 4.2.1 Compatible Clang 4.0.1 (tags/RELEASE_401/final)
Build        : ('default', 'Dec 29 2018 19:04:46')
Arch         : ('64bit', '')
------------Pip Info-----------
Version      : 19.2.1
Directory    : /Users/user/virtenvs/venv/lib/python3.6/site-packages/pip
----------MXNet Info-----------
/Users/user/virtenvs/venv/lib/python3.6/site-packages/sklearn/externals/joblib/externals/cloudpickle/cloudpickle.py:47: DeprecationWarning: the imp module is deprecated in favour of importlib; see the module's documentation for alternative uses
  import imp
Version      : 1.3.1
Directory    : /Users/user/virtenvs/venv/lib/python3.6/site-packages/mxnet
Commit Hash  : 19c501680183237d52a862e6ae1dc4ddc296305b
Library      : ['/Users/user/virtenvs/venv/lib/python3.6/site-packages/mxnet/libmxnet.so']
Build features:
No runtime build feature info available
----------System Info----------
Platform     : Darwin-17.7.0-x86_64-i386-64bit
system       : Darwin
node         : 38f9d36df1e1.ant.amazon.com
release      : 17.7.0
version      : Darwin Kernel Version 17.7.0: Sun Jun 2 20:31:42 PDT 2019; root:xnu-4570.71.46~1/RELEASE_X86_64
----------Hardware Info----------
machine      : x86_64
processor    : i386
b'machdep.cpu.brand_string: Intel(R) Core(TM) i5-7360U CPU @ 2.30GHz'
b'machdep.cpu.features: FPU VME DE PSE TSC MSR PAE MCE CX8 APIC SEP MTRR PGE MCA CMOV PAT PSE36 CLFSH DS ACPI MMX FXSR SSE SSE2 SS HTT TM PBE SSE3 PCLMULQDQ DTES64 MON DSCPL VMX SMX EST TM2 SSSE3 FMA CX16 TPR PDCM SSE4.1 SSE4.2 x2APIC MOVBE POPCNT AES PCID XSAVE OSXSAVE SEGLIM64 TSCTMR AVX1.0 RDRAND F16C'
b'machdep.cpu.leaf7_features: SMEP ERMS RDWRFSGS TSC_THREAD_OFFSET BMI1 HLE AVX2 BMI2 INVPCID RTM SMAP RDSEED ADX IPT SGX FPU_CSDS MPX CLFSOPT MD_CLEAR TSXFA IBRS STIBP L1DF SSBD'
b'machdep.cpu.extfeatures: SYSCALL XD 1GBPAGE EM64T LAHF LZCNT PREFETCHW RDTSCP TSCI'
----------Network Test----------
Setting timeout: 10
Timing for MXNet: https://github.com/apache/incubator-mxnet, DNS: 0.0023 sec, LOAD: 0.7636 sec.
Timing for Gluon Tutorial(en): http://gluon.mxnet.io, DNS: 0.0007 sec, LOAD: 0.8339 sec.
Timing for Gluon Tutorial(cn): https://zh.gluon.ai, DNS: 0.0007 sec, LOAD: 0.8172 sec.
Timing for FashionMNIST: https://apache-mxnet.s3-accelerate.dualstack.amazonaws.com/gluon/dataset/fashion-mnist/train-labels-idx1-ubyte.gz, DNS: 0.0005 sec, LOAD: 0.8144 sec.
Timing for PYPI: https://pypi.python.org/pypi/pip, DNS: 0.0007 sec, LOAD: 0.2943 sec.
Timing for Conda: https://repo.continuum.io/pkgs/free/, DNS: 0.0006 sec, LOAD: 0.0722 sec.
```

## Error Message:

```
---------------------------------------------------------------------------
MXNetError                                Traceback (most recent call last)
<ipython-input-7-793535261150> in <module>
----> 1 loss.asscalar()

~/virtenvs/venv_syne/lib/python3.6/site-packages/mxnet/ndarray/ndarray.py in asscalar(self)
   1988         if self.shape != (1,):
   1989             raise ValueError("The current array is not a scalar")
-> 1990         return self.asnumpy()[0]
   1991
   1992     def astype(self, dtype, copy=True):

~/virtenvs/venv_syne/lib/python3.6/site-packages/mxnet/ndarray/ndarray.py in asnumpy(self)
   1970             self.handle,
   1971             data.ctypes.data_as(ctypes.c_void_p),
-> 1972             ctypes.c_size_t(data.size)))
   1973         return data
   1974

~/virtenvs/venv_syne/lib/python3.6/site-packages/mxnet/base.py in check_call(ret)
    249     """
    250     if ret != 0:
--> 251         raise MXNetError(py_str(_LIB.MXGetLastError()))
    252
    253

MXNetError: [10:21:09] src/operator/nn/../tensor/broadcast_reduce_op.h:408: Too many reduction axes from [100,1,1,6,2,14,2,14] to [1,1,1,6,1,14,1,14]

Stack trace returned 10 entries:
[bt] (0) 0   libmxnet.so   0x000000010cd59b90 libmxnet.so + 15248
[bt] (1) 1   libmxnet.so   0x000000010cd5993f libmxnet.so + 14655
[bt] (2) 2   libmxnet.so   0x000000010cd59569 libmxnet.so + 13673
[bt] (3) 3   libmxnet.so   0x000000010cfc7afb libmxnet.so + 2562811
[bt] (4) 4   libmxnet.so   0x000000010d18bdf9 libmxnet.so + 4414969
[bt] (5) 5   libmxnet.so   0x000000010e05fd05 libmxnet.so + 19963141
[bt] (6) 6   libmxnet.so   0x000000010e2ae9fd MXNDListFree + 561325
[bt] (7) 7   libmxnet.so   0x000000010e235904 MXNDListFree + 65460
[bt] (8) 8   libmxnet.so   0x000000010e237fd8 MXNDListFree + 75400
[bt] (9) 9   libmxnet.so   0x000000010e23b371 MXNDListFree + 88609
```

## Minimum reproducible example

```
## Setup ##
import numpy as np
import mxnet as mx
import mxnet.gluon as gluon
import mxnet.gluon.nn as nn

# make a small CNN to embed the "context"
embedding_block = nn.Sequential()
embedding_block.add(nn.Conv2D(channels=6, kernel_size=5, strides=1, activation='relu', padding=(2,2)))
embedding_block.add(nn.AvgPool2D(pool_size=(2,2), strides=2))

# make a CNN classifier for the query set
query_block = nn.Sequential()
query_block.add(nn.Conv2D(channels=6, kernel_size=5, strides=1, activation='relu'))
query_block.add(nn.AvgPool2D(pool_size=(2,2), strides=2))
query_block.add(nn.Dense(units=1))

embedding_block.collect_params().initialize()
query_block.collect_params().initialize()

## Data Generation ##
# create a simple squared-loss problem
w = np.random.normal(size=(28,28))
# features should be multi-channel images
features = np.random.normal(size=(200, 6, 28, 28))
temp = np.sum(w * features, axis=(1,2,3))
targets = np.sign(np.add(temp[:100], temp[100:]))

context_features = mx.nd.array(features[:100])
query_features = mx.nd.array(features[100:])
targets = mx.nd.array(targets)

## Loss Computation ##
loss = 0.
with mx.autograd.record():
    # add features via nd.tile
    context_embedding = mx.nd.sum(embedding_block(context_features), axis=0)
    channel = context_embedding.tile((100, 1, 2, 2))

    # append new channel to image features
    task_features = mx.nd.concat(query_features, channel)
    preds = query_block(task_features)

    loss = loss + mx.nd.sum(mx.nd.square(mx.nd.squeeze(preds) - targets))

loss.backward()
loss.asscalar()
```

## Steps to reproduce

(Paste the commands you ran that produced the error.)

1. The error is thrown by the above code when `loss.asscalar()` is called.

## What have you tried to solve it?

1. As far as I know, this error is only thrown when `context_embedding` is computed using convolutional layers.
   I initially tried `nd.tile` on a pre-defined `nd.array` and was not able to replicate the issue; see the sketch after this list.
2. The error is also not thrown if the context embeddings are not tiled (e.g. when they are already the same size as the query image channels).
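
For reference, the check from point 1 can be distilled into the sketch below. This is my reconstruction rather than code from the original report: the exact arrays used in that test are not shown, so the shapes simply mirror the reproducible example above. The forward-only `tile` on a plain array evaluates cleanly, while the same `tile` applied to a summed convolutional embedding under `autograd.record()` is the pattern that, per the report, produces the error in the traceback.

```
import mxnet as mx
import mxnet.gluon.nn as nn

# Case 1: nd.tile on a pre-defined NDArray (forward only) -- this did not
# reproduce the error; the tiled result evaluates without complaint.
x = mx.nd.random.normal(shape=(6, 14, 14))
tiled = x.tile((100, 1, 2, 2))       # shape (100, 6, 28, 28)
print(tiled.sum().asscalar())        # evaluates fine

# Case 2: nd.tile on an embedding produced by convolutional layers, recorded
# for autograd -- a trimmed-down version of the reproducible example above.
# Per the report, evaluating the loss after backward() raises
# "Too many reduction axes".
block = nn.Sequential()
block.add(nn.Conv2D(channels=6, kernel_size=5, strides=1, activation='relu', padding=(2, 2)))
block.add(nn.AvgPool2D(pool_size=(2, 2), strides=2))
block.collect_params().initialize()

context = mx.nd.random.normal(shape=(100, 6, 28, 28))
with mx.autograd.record():
    embedding = mx.nd.sum(block(context), axis=0)   # shape (6, 14, 14)
    channel = embedding.tile((100, 1, 2, 2))        # shape (100, 6, 28, 28)
    loss = mx.nd.sum(mx.nd.square(channel))
loss.backward()
loss.asscalar()   # <- MXNetError expected here, as in the traceback above
```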
