taliesinb commented on issue #8949: New layer: split_like. URL: https://github.com/apache/incubator-mxnet/pull/8949#issuecomment-392828806

@piiswrong

> write a custom operator in python

We can already make a fork of MXNet that has this custom operator implemented in C++, and in fact we have done that, because the performance benefit is so large. But that fork isn't exportable as ordinary MXNet, so customers can't use it to do efficient inference from a pure C++ / embedded context. And in C++ the problem is even more urgent, because there they *can't* unroll the network at all: they don't have access to Mathematica.

> create a symbol for each seq length/batch_size

Again, think of deployment: a side channel that carries special information about how to modify a symbol's parameters to make it sequence- or batch-size-flexible is complex and error-prone, and it turns deployed code from requiring a simple shape-inference pass into requiring something much more complicated. It is *so simple* to make ordinary shape inference solve this problem: you just need a way for a reshape to use a dimension taken from the shape of a second, reference tensor. `split_like` is one way of doing that, but there are others I can talk about, such as adding another special code to `Reshape`.

> implement RNN layer for CPU (on the schedule)

There are many, many architectures this doesn't help with, e.g. Deep Speech. The problem is a general one and needs a non-hacky solution. Maybe `split_like` is not the correct solution, but I would like a more productive response than "we don't think this problem is worth solving". It definitely is worth solving in a principled way, because unrolling is such a fundamental aspect of net compilation.
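For concreteness, here is a minimal NumPy sketch of the semantics this PR is asking for: the number of pieces is inferred from a reference tensor's shape at runtime, rather than being baked into the symbol as a fixed attribute. The `split_like` name matches the PR; the implementation below is purely illustrative and is not MXNet's code.

```python
import numpy as np

def split_like(data, ref, axis=0):
    """Hypothetical split_like: split `data` along `axis` into
    ref.shape[axis] equal pieces, so the split count comes from the
    reference tensor's shape instead of a compile-time attribute."""
    return np.split(data, ref.shape[axis], axis=axis)

# Unroll over a sequence dimension whose length is only known
# from the input tensor at inference time.
seq = np.arange(24).reshape(6, 4)   # (seq_len=6, features=4)
ref = np.zeros((6, 1))              # reference tensor carrying seq_len
steps = split_like(seq, ref, axis=0)
print(len(steps), steps[0].shape)   # 6 (1, 4)
```

Because the split count is read off `ref` during shape inference, the same symbol would serve any sequence length, which is exactly the flexibility a deployed C++ inference path needs.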
