jwfromm opened a new issue #9842: Custom Function Shape Inference
URL: https://github.com/apache/incubator-mxnet/issues/9842

## Description

Custom functions seem to be unable to do deferred shape inference, or the methods that enable deferred inference are non-obvious and need examples.

## Environment info (Required)

```
----------Python Info----------
Version      : 3.6.4
Compiler     : GCC 7.2.0
Build        : ('default', 'Jan 16 2018 18:10:19')
Arch         : ('64bit', '')
------------Pip Info-----------
Version      : 9.0.1
Directory    : /opt/conda/envs/pytorch-py3.6/lib/python3.6/site-packages/pip
----------MXNet Info-----------
Version      : 1.1.0
Directory    : /opt/conda/envs/pytorch-py3.6/lib/python3.6/site-packages/mxnet
Commit Hash  : 07a83a0325a3d782513a04f47d711710972cb144
----------System Info----------
Platform     : Linux-4.13.0-32-generic-x86_64-with-debian-stretch-sid
system       : Linux
node         : 243bb3cedee3
release      : 4.13.0-32-generic
version      : #35~16.04.1-Ubuntu SMP Thu Jan 25 10:13:43 UTC 2018
----------Hardware Info----------
machine      : x86_64
processor    : x86_64
Architecture:          x86_64
CPU op-mode(s):        32-bit, 64-bit
Byte Order:            Little Endian
CPU(s):                12
On-line CPU(s) list:   0-11
Thread(s) per core:    2
Core(s) per socket:    6
Socket(s):             1
NUMA node(s):          1
Vendor ID:             GenuineIntel
CPU family:            6
Model:                 79
Model name:            Intel(R) Core(TM) i7-6850K CPU @ 3.60GHz
Stepping:              1
CPU MHz:               3600.001
CPU max MHz:           4000.0000
CPU min MHz:           1200.0000
BogoMIPS:              7200.00
Virtualization:        VT-x
Hypervisor vendor:     vertical
Virtualization type:   full
L1d cache:             32K
L1i cache:             32K
L2 cache:              256K
L3 cache:              15360K
NUMA node0 CPU(s):     0-11
Flags:                 fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch cpuid_fault epb cat_l3 cdp_l3 invpcid_single pti intel_ppin intel_pt tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm cqm rdt_a rdseed adx smap xsaveopt cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local dtherm ida arat pln pts
----------Network Test----------
Setting timeout: 10
Timing for MXNet: https://github.com/apache/incubator-mxnet, DNS: 0.0254 sec, LOAD: 0.5580 sec.
Timing for Gluon Tutorial(en): http://gluon.mxnet.io, DNS: 0.1541 sec, LOAD: 0.0660 sec.
Timing for Gluon Tutorial(cn): https://zh.gluon.ai, DNS: 0.1920 sec, LOAD: 0.1901 sec.
Timing for FashionMNIST: https://apache-mxnet.s3-accelerate.dualstack.amazonaws.com/gluon/dataset/fashion-mnist/train-labels-idx1-ubyte.gz, DNS: 0.1562 sec, LOAD: 1.5483 sec.
Timing for PYPI: https://pypi.python.org/pypi/pip, DNS: 0.0516 sec, LOAD: 0.1203 sec.
Timing for Conda: https://repo.continuum.io/pkgs/free/, DNS: 0.0299 sec, LOAD: 0.0624 sec.
```

Package used (Python/R/Scala/Julia): Python 3.6

## Build info (Required if built from source)

Pip install.

## Error Message:

```
/opt/conda/envs/pytorch-py3.6/lib/python3.6/site-packages/mxnet/gluon/block.py:512: UserWarning: Cannot decide shape for the following arguments (0s in shape means unknown dimensions). Consider providing them as input:
conv2_weight: (20, 0, 5, 5)
  **{i.name: getattr(j, attr) for i, j in zip(inputs, args)})
---------------------------------------------------------------------------
DeferredInitializationError               Traceback (most recent call last)
/opt/conda/envs/pytorch-py3.6/lib/python3.6/site-packages/mxnet/gluon/block.py in forward(self, x, *args)
    567                 return self._call_cached_op(x, *args)
--> 568             params = {i: j.data(ctx) for i, j in self._reg_params.items()}
    569         except DeferredInitializationError:

/opt/conda/envs/pytorch-py3.6/lib/python3.6/site-packages/mxnet/gluon/block.py in <dictcomp>(.0)
    567                 return self._call_cached_op(x, *args)
--> 568             params = {i: j.data(ctx) for i, j in self._reg_params.items()}
    569         except DeferredInitializationError:

/opt/conda/envs/pytorch-py3.6/lib/python3.6/site-packages/mxnet/gluon/parameter.py in data(self, ctx)
    388         """
--> 389         return self._check_and_get(self._data, ctx)
    390

/opt/conda/envs/pytorch-py3.6/lib/python3.6/site-packages/mxnet/gluon/parameter.py in _check_and_get(self, arr_list, ctx)
    182             "You can also avoid deferred initialization by specifying in_units, " \
--> 183             "num_features, etc., for network layers."%(self.name))
    184         raise RuntimeError(

DeferredInitializationError: Parameter conv2_weight has not been initialized yet because initialization was deferred. Actual initialization happens during the first forward pass. Please pass one batch of data through the network before accessing Parameters. You can also avoid deferred initialization by specifying in_units, num_features, etc., for network layers.

During handling of the above exception, another exception occurred:

TypeError                                 Traceback (most recent call last)
<ipython-input-10-5fdef27b91d5> in <module>()
----> 1 net(test)

/opt/conda/envs/pytorch-py3.6/lib/python3.6/site-packages/mxnet/gluon/block.py in __call__(self, *args)
    358     def __call__(self, *args):
    359         """Calls forward. Only accepts positional arguments."""
--> 360         return self.forward(*args)
    361
    362     def forward(self, *args):

/opt/conda/envs/pytorch-py3.6/lib/python3.6/site-packages/mxnet/gluon/nn/basic_layers.py in forward(self, x)
     51     def forward(self, x):
     52         for block in self._children:
---> 53             x = block(x)
     54         return x
     55

/opt/conda/envs/pytorch-py3.6/lib/python3.6/site-packages/mxnet/gluon/block.py in __call__(self, *args)
    358     def __call__(self, *args):
    359         """Calls forward. Only accepts positional arguments."""
--> 360         return self.forward(*args)
    361
    362     def forward(self, *args):

/opt/conda/envs/pytorch-py3.6/lib/python3.6/site-packages/mxnet/gluon/block.py in forward(self, x, *args)
    568             params = {i: j.data(ctx) for i, j in self._reg_params.items()}
    569         except DeferredInitializationError:
--> 570             self._finish_deferred_init(self._active, x, *args)
    571
    572         if self._active:

/opt/conda/envs/pytorch-py3.6/lib/python3.6/site-packages/mxnet/gluon/block.py in _finish_deferred_init(self, hybrid, *args)
    458
    459     def _finish_deferred_init(self, hybrid, *args):
--> 460         self.infer_shape(*args)
    461         if hybrid:
    462             for is_arg, i in self._cached_op_args:

/opt/conda/envs/pytorch-py3.6/lib/python3.6/site-packages/mxnet/gluon/block.py in infer_shape(self, *args)
    519     def infer_shape(self, *args):
    520         """Infers shape of Parameters from inputs."""
--> 521         self._infer_attrs('infer_shape', 'shape', *args)
    522
    523     def infer_type(self, *args):

/opt/conda/envs/pytorch-py3.6/lib/python3.6/site-packages/mxnet/gluon/block.py in _infer_attrs(self, infer_fn, attr, *args)
    511         arg_attrs, _, aux_attrs = getattr(out, infer_fn)(
    512             **{i.name: getattr(j, attr) for i, j in zip(inputs, args)})
--> 513         sdict = {i: j for i, j in zip(out.list_arguments(), arg_attrs)}
    514         sdict.update({name : attr for name, attr in \
    515             zip(out.list_auxiliary_states(), aux_attrs)})

TypeError: zip argument #2 must support iteration
```

## Minimum reproducible example

```
import mxnet as mx
from mxnet import gluon, autograd, nd
from mxnet.gluon.nn import Conv2D


class Binarize(mx.operator.CustomOp):
    def forward(self, is_train, req, in_data, out_data, aux):
        """Implements forward computation.

        is_train : bool, whether forwarding for training or testing.
        req : list of {'null', 'write', 'inplace', 'add'}, how to assign to
            out_data. 'null' means skip assignment, etc.
        in_data : list of NDArray, input data.
        out_data : list of NDArray, pre-allocated output buffers.
        aux : list of NDArray, mutable auxiliary states. Usually not used.
        """
        x = in_data[0]
        # y = nd.sign(x)
        y = x
        self.assign(out_data[0], req[0], y)

    def backward(self, req, out_grad, in_data, out_data, in_grad, aux):
        """Implements backward computation.

        req : list of {'null', 'write', 'inplace', 'add'}, how to assign to in_grad.
        out_grad : list of NDArray, gradient w.r.t. output data.
        in_grad : list of NDArray, gradient w.r.t. input data. This is the output buffer.
        """
        x = in_data[0]
        dy = out_grad[0]
        dx = dy * (nd.abs(x) <= 1)
        self.assign(in_grad[0], req[0], dx)


@mx.operator.register("binarize")  # register with name "binarize"
class BinarizeProp(mx.operator.CustomOpProp):
    def __init__(self):
        super(BinarizeProp, self).__init__(True)

    def list_arguments(self):
        # This can be omitted if you only have 1 input.
        return ['data']

    def list_outputs(self):
        # This can be omitted if you only have 1 output.
        return ['output']

    def infer_shape(self, in_shapes):
        """Calculate output shapes from input shapes.

        This can be omitted if all your inputs and outputs have the same shape.

        in_shapes : list of shape. Shape is described by a tuple of int.
        """
        data_shape = in_shapes[0]
        output_shape = data_shape
        # Return 3 lists representing input shapes, output shapes, and aux data shapes.
        return (data_shape,), (output_shape,), ()

    def create_operator(self, ctx, in_shapes, in_dtypes):
        # Create and return the CustomOp class.
        return Binarize()


class BinarizeBlock(gluon.HybridBlock):
    def hybrid_forward(self, F, x):
        return F.Custom(x, name='binarize', op_type='binarize')


class BinaryConv(Conv2D):
    def hybrid_forward(self, F, x, weight, bias=None):
        bin_w = F.Custom(weight, name='binarize', op_type='binarize')
        return super(BinaryConv, self).hybrid_forward(F, x, weight=bin_w, bias=bias)


net = gluon.nn.Sequential()
net.add(BinaryConv(channels=20, kernel_size=5, activation='relu'))
net.collect_params().initialize(mx.init.Xavier(magnitude=2.24), ctx=mx.cpu())
test = nd.random.normal(shape=(1, 3, 224, 224))
net(test)
```
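For reference, the failure seems reproducible at the symbol level too. Below is a minimal diagnostic sketch (my own guess at where things go wrong, not a confirmed analysis): it rebuilds the same graph with `mx.sym` and queries shape inference directly, assuming the `binarize` op registered above. Since `CustomOpProp.infer_shape` receives only input shapes, the weight shape that `Convolution` could determine apparently never propagates backwards through `Custom`:

```
import mxnet as mx

# Diagnostic sketch: assumes BinarizeProp above is already registered
# under op_type='binarize'.
data = mx.sym.Variable('data')
weight = mx.sym.Variable('weight')
bin_w = mx.sym.Custom(weight, op_type='binarize')
conv = mx.sym.Convolution(data=data, weight=bin_w, num_filter=20,
                          kernel=(5, 5), no_bias=True)

# Partial inference tolerates the unknown weight and reports 0s for the
# undecided dimensions, while full infer_shape gives up and returns
# (None, None, None) -- presumably the non-iterable arg_attrs that zip()
# trips over in gluon/block.py's _infer_attrs.
print(conv.infer_shape_partial(data=(1, 3, 224, 224)))
print(conv.infer_shape(data=(1, 3, 224, 224)))
```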
## Steps to reproduce

1. Simply run the above code.

## What have you tried to solve it?

1. Tried to find issues with shape inference in the custom function, but everything seems fine. Using `BinarizeBlock` in a `Sequential` does not cause an issue; the error appears only when the custom function is applied to the weight inside `BinaryConv`. I don't see an obvious reason for this, since the `infer_shape` method of `BinarizeProp` itself appears to work. I'd love to get this figured out! (A possible workaround is sketched below.)
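The hint in the `DeferredInitializationError` message itself suggests a possible workaround until shape inference through custom ops works: give the layer enough information to create its weight eagerly, so deferred initialization (and therefore `_infer_attrs`) is never triggered. A sketch, assuming `in_channels` is the only undecided dimension and reusing the `BinaryConv` defined in the example above:

```
import mxnet as mx
from mxnet import gluon, nd

# Workaround sketch: in_channels=3 makes the weight shape (20, 3, 5, 5)
# fully known at construction time, so initialization is not deferred and
# Gluon never has to infer shapes through the Custom op.
# BinaryConv is the subclass from the reproducible example above.
net = gluon.nn.Sequential()
net.add(BinaryConv(channels=20, kernel_size=5, in_channels=3,
                   activation='relu'))
net.collect_params().initialize(mx.init.Xavier(magnitude=2.24), ctx=mx.cpu())

test = nd.random.normal(shape=(1, 3, 224, 224))
print(net(test).shape)  # expected: (1, 20, 220, 220)
```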
