e8035669 opened a new issue #14184: Cannot infer RNN symbol with mode=RNNMode::kRnn_tanh
URL: https://github.com/apache/incubator-mxnet/issues/14184

## Description

I'm using the C++ interface. When I create an RNN symbol with the C++ factory function in `cpp-package/include/mxnet-cpp/op.h`, the generated function always wires all 4 inputs into the operator. But when I infer the symbol's shapes, the RNN operator expects exactly 3 inputs with `mode=RNNMode::kRnn_tanh`, so creating the `Executor` always fails.

I can create the RNN symbol with the `Operator` class directly, which works around the problem for now.

I think the shape check is here:
https://github.com/apache/incubator-mxnet/blob/5adb6fcfb3ea50c76d387e63ce367bbe8c3f18d3/src/operator/rnn-inl.h#L673-L677

And the factory function in `op.h` is here:

``` cpp
inline Symbol RNN(const std::string& symbol_name,
                  Symbol data,
                  Symbol parameters,
                  Symbol state,
                  Symbol state_cell,
                  uint32_t state_size,
                  uint32_t num_layers,
                  RNNMode mode,
                  bool bidirectional = false,
                  mx_float p = 0,
                  bool state_outputs = false,
                  dmlc::optional<int> projection_size = dmlc::optional<int>(),
                  dmlc::optional<double> lstm_state_clip_min = dmlc::optional<double>(),
                  dmlc::optional<double> lstm_state_clip_max = dmlc::optional<double>(),
                  bool lstm_state_clip_nan = false) {
  static const char *RNNModeValues[] = {
    "gru",
    "lstm",
    "rnn_relu",
    "rnn_tanh"
  };
  return Operator("RNN")
      .SetParam("state_size", state_size)
      .SetParam("num_layers", num_layers)
      .SetParam("mode", RNNModeValues[int(mode)])
      .SetParam("bidirectional", bidirectional)
      .SetParam("p", p)
      .SetParam("state_outputs", state_outputs)
      .SetParam("projection_size", projection_size)
      .SetParam("lstm_state_clip_min", lstm_state_clip_min)
      .SetParam("lstm_state_clip_max", lstm_state_clip_max)
      .SetParam("lstm_state_clip_nan", lstm_state_clip_nan)
      .SetInput("data", data)
      .SetInput("parameters", parameters)
      .SetInput("state", state)
      .SetInput("state_cell", state_cell)
      .CreateSymbol(symbol_name);
}
```

## Environment info (Required)

Oops, the output contained Chinese; the `lscpu` fields are translated to English below.

``` text
----------Python Info----------
Version : 3.6.7
Compiler : GCC 8.2.0
Build : ('default', 'Oct 22 2018 11:32:17')
Arch : ('64bit', 'ELF')
------------Pip Info-----------
Version : 9.0.1
Directory : /usr/lib/python3/dist-packages/pip
----------MXNet Info-----------
No MXNet installed.
----------System Info----------
Platform : Linux-4.15.0-45-generic-x86_64-with-Ubuntu-18.04-bionic
system : Linux
node : jeff-X555LJ
release : 4.15.0-45-generic
version : #48-Ubuntu SMP Tue Jan 29 16:28:13 UTC 2019
----------Hardware Info----------
machine : x86_64
processor : x86_64
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 4
On-line CPU(s) list: 0-3
Thread(s) per core: 2
Core(s) per socket: 2
Socket(s): 1
NUMA node(s): 1
Vendor ID: GenuineIntel
CPU family: 6
Model: 61
Model name: Intel(R) Core(TM) i5-5200U CPU @ 2.20GHz
Stepping: 4
CPU MHz: 1690.799
CPU max MHz: 2700.0000
CPU min MHz: 500.0000
BogoMIPS: 4393.55
Virtualization: VT-x
L1d cache: 32K
L1i cache: 32K
L2 cache: 256K
L3 cache: 3072K
NUMA node0 CPU(s): 0-3
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch cpuid_fault epb invpcid_single pti ssbd ibrs ibpb stibp tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid rdseed adx smap intel_pt xsaveopt dtherm ida arat pln pts flush_l1d
----------Network Test----------
Setting timeout: 10
Timing for MXNet: https://github.com/apache/incubator-mxnet, DNS: 0.0582 sec, LOAD: 1.3309 sec.
Timing for Gluon Tutorial(en): http://gluon.mxnet.io, DNS: 0.2522 sec, LOAD: 1.0342 sec.
Timing for Gluon Tutorial(cn): https://zh.gluon.ai, DNS: 0.3641 sec, LOAD: 1.1093 sec.
Timing for FashionMNIST: https://apache-mxnet.s3-accelerate.dualstack.amazonaws.com/gluon/dataset/fashion-mnist/train-labels-idx1-ubyte.gz, DNS: 0.3570 sec, LOAD: 1.2316 sec.
Timing for PYPI: https://pypi.python.org/pypi/pip, DNS: 0.0492 sec, LOAD: 1.5305 sec.
Timing for Conda: https://repo.continuum.io/pkgs/free/, DNS: 0.0237 sec, LOAD: 0.4429 sec.
```

Package used (Python/R/Scala/Julia): I'm using C++.

## Build info (Required if built from source)

Compiler (gcc/clang/mingw/visual studio): gcc 7.3.0

MXNet commit hash: eebdd5f644e953da76ac35a898992ef5c83d3f30

Build config:

``` make
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements.  See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership.  The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License.  You may obtain a copy of the License at
#
#   http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied.  See the License for the
# specific language governing permissions and limitations
# under the License.

#-------------------------------------------------------------------------------
#  Template configuration for compiling mxnet
#
#  If you want to change the configuration, please use the following
#  steps. Assume you are on the root directory of mxnet. First copy the this
#  file so that any local changes will be ignored by git
#
#  $ cp make/config.mk .
#
#  Next modify the according entries, and then compile by
#
#  $ make
#
#  or build in parallel with 8 threads
#
#  $ make -j8
#-------------------------------------------------------------------------------

#---------------------
# choice of compiler
#--------------------
ifndef CC
export CC = gcc
endif
ifndef CXX
export CXX = g++
endif
ifndef NVCC
export NVCC = nvcc
endif

# whether compile with options for MXNet developer
DEV = 0

# whether compile with debug
DEBUG = 0

# whether to turn on segfault signal handler to log the stack trace
USE_SIGNAL_HANDLER =

# the additional link flags you want to add
ADD_LDFLAGS =

# the additional compile flags you want to add
ADD_CFLAGS =

#---------------------------------------------
# matrix computation libraries for CPU/GPU
#---------------------------------------------

# whether use CUDA during compile
USE_CUDA = 1

# add the path to CUDA library to link and compile flag
# if you have already add them to environment variable, leave it as NONE
USE_CUDA_PATH = /usr/local/cuda-9.2
# USE_CUDA_PATH = NONE

# whether to enable CUDA runtime compilation
ENABLE_CUDA_RTC = 1

# whether use CuDNN R3 library
USE_CUDNN = 1

#whether to use NCCL library
USE_NCCL = 0
#add the path to NCCL library
USE_NCCL_PATH = NONE

# whether use opencv during compilation
# you can disable it, however, you will not able to use
# imbin iterator
USE_OPENCV = 1

#whether use libjpeg-turbo for image decode without OpenCV wrapper
USE_LIBJPEG_TURBO = 0
#add the path to libjpeg-turbo library
USE_LIBJPEG_TURBO_PATH = NONE

# use openmp for parallelization
USE_OPENMP = 1

# whether use MKL-DNN library: 0 = disabled, 1 = enabled
# if USE_MKLDNN is not defined, MKL-DNN will be enabled by default on x86 Linux.
# you can disable it explicity with USE_MKLDNN = 0
USE_MKLDNN = 0

# whether use NNPACK library
USE_NNPACK = 0

# choose the version of blas you want to use
# can be: mkl, blas, atlas, openblas
# in default use atlas for linux while apple for osx
UNAME_S := $(shell uname -s)
ifeq ($(UNAME_S), Darwin)
USE_BLAS = apple
else
USE_BLAS = openblas
endif

# whether use lapack during compilation
# only effective when compiled with blas versions openblas/apple/atlas/mkl
USE_LAPACK = 1

# path to lapack library in case of a non-standard installation
USE_LAPACK_PATH = /usr/lib/x86_64-linux-gnu/openblas

# add path to intel library, you may need it for MKL, if you did not add the path
# to environment variable
USE_INTEL_PATH = NONE

# If use MKL only for BLAS, choose static link automatically to allow python wrapper
ifeq ($(USE_BLAS), mkl)
USE_STATIC_MKL = 1
else
USE_STATIC_MKL = NONE
endif

#----------------------------
# Settings for power and arm arch
#----------------------------
ARCH := $(shell uname -a)
ifneq (,$(filter $(ARCH), armv6l armv7l powerpc64le ppc64le aarch64))
	USE_SSE=0
	USE_F16C=0
else
	USE_SSE=1
endif

#----------------------------
# F16C instruction support for faster arithmetic of fp16 on CPU
#----------------------------
# For distributed training with fp16, this helps even if training on GPUs
# If left empty, checks CPU support and turns it on.
# For cross compilation, please check support for F16C on target device and turn off if necessary.
USE_F16C =

#----------------------------
# distributed computing
#----------------------------

# whether or not to enable multi-machine supporting
USE_DIST_KVSTORE = 0

# whether or not allow to read and write HDFS directly. If yes, then hadoop is
# required
USE_HDFS = 0

# path to libjvm.so. required if USE_HDFS=1
LIBJVM=$(JAVA_HOME)/jre/lib/amd64/server

# whether or not allow to read and write AWS S3 directly. If yes, then
# libcurl4-openssl-dev is required, it can be installed on Ubuntu by
# sudo apt-get install -y libcurl4-openssl-dev
USE_S3 = 0

#----------------------------
# performance settings
#----------------------------
# Use operator tuning
USE_OPERATOR_TUNING = 1

# Use gperftools if found
USE_GPERFTOOLS = 1

# path to gperftools (tcmalloc) library in case of a non-standard installation
USE_GPERFTOOLS_PATH =

# Link gperftools statically
USE_GPERFTOOLS_STATIC =

# Use JEMalloc if found, and not using gperftools
USE_JEMALLOC = 1

# path to jemalloc library in case of a non-standard installation
USE_JEMALLOC_PATH =

# Link jemalloc statically
USE_JEMALLOC_STATIC =

#----------------------------
# additional operators
#----------------------------

# path to folders containing projects specific operators that you don't want to put in src/operators
EXTRA_OPERATORS =

#----------------------------
# other features
#----------------------------

# Create C++ interface package
USE_CPP_PACKAGE = 1

#----------------------------
# plugins
#----------------------------

# whether to use caffe integration. This requires installing caffe.
# You also need to add CAFFE_PATH/build/lib to your LD_LIBRARY_PATH
# CAFFE_PATH = $(HOME)/caffe
# MXNET_PLUGINS += plugin/caffe/caffe.mk

# WARPCTC_PATH = $(HOME)/warp-ctc
# MXNET_PLUGINS += plugin/warpctc/warpctc.mk

# whether to use sframe integration. This requires build sframe
# [email protected]:dato-code/SFrame.git
# SFRAME_PATH = $(HOME)/SFrame
# MXNET_PLUGINS += plugin/sframe/plugin.mk
```

## Error Message:
``` text
terminate called after throwing an instance of 'dmlc::Error'
  what():  [10:55:44] /usr/local/include/mxnet-cpp/symbol.hpp:219: Check failed: MXSymbolInferShape(GetHandle(), keys.size(), keys.data(), arg_ind_ptr.data(), arg_shape_data.data(), &in_shape_size, &in_shape_ndim, &in_shape_data, &out_shape_size, &out_shape_ndim, &out_shape_data, &aux_shape_size, &aux_shape_ndim, &aux_shape_data, &complete) == 0 (-1 vs. 0)

Stack trace returned 7 entries:
[bt] (0) ./install/bin/issue_report(dmlc::StackTrace[abi:cxx11]()+0x54) [0x557702064b6c]
[bt] (1) ./install/bin/issue_report(dmlc::LogMessageFatal::~LogMessageFatal()+0x40) [0x557702064e68]
[bt] (2) ./install/bin/issue_report(mxnet::cpp::Symbol::InferShape(std::map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::vector<unsigned int, std::allocator<unsigned int> >, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::vector<unsigned int, std::allocator<unsigned int> > > > > const&, std::vector<std::vector<unsigned int, std::allocator<unsigned int> >, std::allocator<std::vector<unsigned int, std::allocator<unsigned int> > > >*, std::vector<std::vector<unsigned int, std::allocator<unsigned int> >, std::allocator<std::vector<unsigned int, std::allocator<unsigned int> > > >*, std::vector<std::vector<unsigned int, std::allocator<unsigned int> >, std::allocator<std::vector<unsigned int, std::allocator<unsigned int> > > >*) const+0x377) [0x55770206$2af]
[bt] (3) ./install/bin/issue_report(mxnet::cpp::Symbol::InferArgsMap(mxnet::cpp::Context const&, std::map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, mxnet::cpp::NDArray, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, mxnet::cpp::NDArray> > >*, std::map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, mxnet::cpp::NDArray, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, mxnet::cpp::NDArray> > > const&) const+0x1dc) [0x5577020680a0]
[bt] (4) ./install/bin/issue_report(main+0x459) [0x557702063614]
[bt] (5) /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xe7) [0x7fcfeb88db97]
[bt] (6) ./install/bin/issue_report(_start+0x2a) [0x557702062afa]
```

## Minimum reproducible example

``` cpp
#include <iostream>
#include "mxnet-cpp/MxNetCpp.h"

using namespace std;
using namespace mxnet::cpp;

int main(int argc, char** argv) {
  Symbol input = Symbol::Variable("input");
  Symbol rnnParam = Symbol::Variable("rnn_param");
  Symbol rnnState = Symbol::Variable("rnn_state");
  Symbol dummySym = Symbol::Variable();
  const int numRnnHidden = 3, numRnnLayer = 1;

#define FAIL_CODE
#ifdef FAIL_CODE
  Symbol rnn = RNN("rnn", input, rnnParam, rnnState, dummySym,
                   numRnnHidden, numRnnLayer, RNNMode::kRnn_tanh);
#else
  Symbol rnn = Operator("RNN")
                   .SetParam("state_size", numRnnHidden)
                   .SetParam("num_layers", numRnnLayer)
                   .SetParam("mode", "rnn_tanh")
                   .SetInput("data", input)
                   .SetInput("parameters", rnnParam)
                   .SetInput("state", rnnState)
                   .CreateSymbol("rnn");
#endif

  Symbol output = rnn[0];
  Context ctx = Context::cpu();
  map<string, NDArray> args;
  args["input"] = NDArray(Shape(1, 1, 4), ctx);
  rnn.InferArgsMap(ctx, &args, args);
  Executor* exe = rnn.SimpleBind(ctx, args);

  cout << "Parameters: " << endl;
  for (auto& arg : args) {
    cout << "  " << arg.first << ": (";
    for (auto& i : arg.second.GetShape()) {
      cout << i << " ";
    }
    cout << ")" << endl;
  }

  cout << "Output size: ";
  for (auto& out : exe->outputs) {
    cout << "(";
    for (auto& i : out.GetShape()) {
      cout << i << " ";
    }
    cout << ")";
  }
  cout << endl;

  vector<mx_float> nums({0.2, 0.3, 0.4, 0.5});
  args["input"].SyncCopyFromCPU(nums.data(), nums.size());
  cout << "Input" << args["input"] << endl;
  exe->Forward(false);
  NDArray::WaitAll();
  cout << "Output " << exe->outputs[0] << endl;

  MXNotifyShutdown();
  return 0;
}
```

## Steps to reproduce

1. Compile and run the code above. It fails at run time.
2. Check the code guarded by `#define FAIL_CODE`.

## What have you tried to solve it?

1. I use the `Operator` class to create the symbol directly:

``` cpp
Symbol rnn = Operator("RNN")
                 .SetParam("state_size", numRnnHidden)
                 .SetParam("num_layers", numRnnLayer)
                 .SetParam("mode", "rnn_tanh")
                 .SetInput("data", input)
                 .SetInput("parameters", rnnParam)
                 .SetInput("state", rnnState)
                 .CreateSymbol("rnn");
```
