OK, there is some miscommunication here, I guess. We only need a "canonicalization" step in the Python API that acts as a symbol-to-symbol translation layer. It can be done purely in Python; there is no need to go "down" into C++ for this.
For example, the current nnvm.from_mxnet API takes a Module or Gluon model and gives you back an nnvm/top graph in Python. All we are asking for is to decompose it into:

    def mxnet_to_onnx(module):
        nnvm_graph, params = nnvm_from_mxnet(module)
        onnx = nnvm_to_onnx(nnvm_graph, params)
        return onnx

This allows nnvm_from_mxnet to be reused for other purposes, like the compilation API that produces deployable modules.

Tianqi

On Wed, Oct 18, 2017 at 9:55 PM, Lupesko, Hagay <lupe...@gmail.com> wrote:

> Tianqi:
> Thanks for detailing the trends. I fully agree that ONNX is just a graph serialization format – nothing more, nothing less. I also think we all agree that this simple mechanism holds lots of value for DL users, since it allows them to move between frameworks easily (e.g. train with MXNet, deploy on a mobile device with Caffe2, or the other way around).
> As you said, an in-memory IR is different from a serialization format such as ONNX. In-memory IRs are designed to make runtime execution as efficient as possible, leveraging software and hardware optimizations. They are indeed complex, and that is where the "meat" is.
> (BTW, ONNX regards itself as an "IR" format, but not in the same sense as NNVM.)
>
> At the end of the day, Roshani is aiming to deliver simple functionality to MXNet users: (1) take an ONNX file and load it into MXNet, so you get a graph+weights you can work with; (2) given a trained model, save it as an ONNX file. Since MXNet users do not interact with NNVM directly, but rather with the MXNet API (MXNet Module), isn't the simplest thing to just construct the Module "on the fly" using the MXNet API? Taking the other approach, we will go from the top-level MXNet "load" API, go "down" to NNVM to construct the graph, and go back up to MXNet to expose it as a Module. This seems too complex and does not add any benefit.
> In whatever way we construct the MXNet Module object, NNVM will always be the underlying in-memory IR that is being executed, so why not take the simpler route?
>
> Hagay
>
> On 10/18/17, 19:42, "Tianqi Chen" <workc...@gmail.com on behalf of tqc...@cs.washington.edu> wrote:
>
> Hi Chris:
>
> There is no intention to move things away from mxnet. The reduction in lines of code comes from having a better design in general; you usually write less redundant code by benefiting from better design. As I may quote: "the best design is achieved not when there is nothing to add, but when there is nothing to be taken away."
>
> MXNet has always benefited from this philosophy and improves with new designs and proper modularization. For example, we saw such reduction and convenience when migrating from MXNet's legacy ops to NNVM's mechanism. The new mechanism now enables things like sparse-aware support and other features that would otherwise be much harder to support.
>
> The nnvm/tvm stack brings the same benefit (if not more) and will only add more features to MXNet itself: offering more hardware backends and optimizations, and allowing us to write less code and spend less time optimizing for each backend by going through TVM.
>
> Tianqi
>
> On Wed, Oct 18, 2017 at 7:15 PM, Chris Olivier <cjolivie...@gmail.com> wrote:
>
> > Reduce the code base of mxnet? By increasing the scope of the dmlc modules? Is the intent to make mxnet a thin language wrapper around a group of dmlc modules?
> >
> > On Wed, Oct 18, 2017 at 6:58 PM Tianqi Chen <tqc...@cs.washington.edu> wrote:
> >
> > > To better answer Hagay's question, I would like to dive a bit deeper into the relation between MXNet, NNVM, and model exchange formats like ONNX.
> > >
> > > There are two major trends in deep learning systems now:
> > >
> > > - Common serializable formats, like ONNX and CoreML, that define the model exchange format.
> > > - The in-memory graph IR for quick optimization and JIT. NNVM and TensorFlow's XLA fall into this category.
> > >
> > > The exchange formats are great: they only pose a layer of conversion, which is good for exchange. The real meat still comes from the compilation and JIT pipeline you have to offer. For that, we need an in-memory IR, because the cost of constructing and serializing exchange formats like protobuf could be high. And usually the exchange formats are designed in a minimalistic fashion, making it harder to attach the extra information needed for in-depth optimizations like automatic quantization and accelerator support.
> > >
> > > The current MXNet relies on NNVM for in-memory IR manipulation but does not contain a compilation component that compiles to the hardware backends. Exporting to an exchange format and then importing back into NNVM to run the compilation poses more of a burden than a JIT compiler can pay. Using the same in-memory graph IR as the compilation stack gives a much greater advantage here.
> > >
> > > The newly introduced nnvm/top and compiler offer in-memory graph optimization and compilation, and offer more hardware backends directly via TVM. We already see promising results in edge deployments with a much lower runtime overhead. We will quickly benefit further from the additional graph optimizations it has to offer.
> > >
> > > Building support around this new paradigm offers us the advantage of being future-compatible and takes full benefit of the points I mentioned above.
> > >
> > > Tianqi
> > >
> > > On Wed, Oct 18, 2017 at 4:57 PM, Lupesko, Hagay <lupe...@gmail.com> wrote:
> > >
> > > > Roshani – this is an exciting initiative; ONNX support in MXNet will enable more users to ramp up on MXNet, which is great.
> > > >
> > > > Tianqi – a few questions and thoughts about your note:
> > > > - "More hardware backends to mxnet" – MXNet users get the same benefit of HW support by implementing ONNX import on top of MXNet symbolic, right?
> > > > - "NNVM Compiler now received contributions from AWS, UW and many other folks in MXNet community." – agreed it is ramping up, but when you look at the data, it is clear that it is very early on for NNVM. Looking at the repo, it has 223 commits overall and 0 releases. Compare that to MXNet, with 6136 commits and 32 releases. It seems to be still early on for NNVM, and for a more reliable initial implementation, building the import on top of MXNet is easier, faster and safer. MXNet has lots of users already using the Symbolic API, which hopefully means it is a mature API that is not likely to have breaking changes or major issues.
> > > >
> > > > I'm supportive of option 1 proposed by Roshani (building serde on top of MXNet symbolic), but as an encapsulated implementation detail, so the implementation can be migrated to NNVM or another implementation in the future, if at that point it seems like the right thing to do.
> > > >
> > > > Interested in hearing other opinions though…
> > > >
> > > > Hagay
> > > >
> > > > On 10/18/17, 14:13, "Tianqi Chen" <workc...@gmail.com on behalf of tqc...@cs.washington.edu> wrote:
> > > >
> > > > I am strongly recommending going through nnvm/top. One major reason is that support for the nnvm/top layer means NOT ONLY compatibility of the model format with ONNX. These are the major benefits:
> > > >
> > > > - More hardware backends for mxnet, including OpenCL, Metal, Raspberry Pi, and the web browser.
> > > > These things are automatically enabled by going through this layer. In general, we designed the nnvm/tvm stack to resolve the challenge of current mxnet's weakness in deploying to more hardware backends.
> > > >
> > > > - More frontend capabilities: nnvm's Gluon-style IR now ingests from CoreML and ONNX, and in the future Keras. Supporting those will reduce the amount of engineering effort needed.
> > > >
> > > > - Future compatibility. We all agree that the future is migrating to Gluon's API. NNVM/top tries to look ahead by directly adopting the symbolic API to be Gluon's.
> > > >
> > > > I would also like to correct some of the mentioned facts with regard to the nnvm/tvm stack:
> > > >
> > > > 1. Nascent project with few contributors
> > > >
> > > > NNVM Compiler has now received contributions from AWS, UW and many other folks in the MXNet community. NNVM itself is already being used by MXNet. MXNet's internal IR is migrating toward Gluon, with its final form being nnvm/top.
> > > >
> > > > 2. Does not support all operators that exist in MXNet Symbolic API
> > > >
> > > > Neither NNVM/top nor ONNX supports all operators that exist in the mxnet symbolic API. The end goal here is mainly to make nnvm/top ONNX-compatible, which is a more reasonable goal.
> > > >
> > > > 3. No CI pipeline and test cases
> > > >
> > > > NNVM already contains a compiler with unit tests and CI-tested integration (https://github.com/dmlc/nnvm), with a CI pipeline that is well tested on CPU and GPU cases for the front-ends.
> > > >
> > > > Tianqi
> > > >
> > > > On Wed, Oct 18, 2017 at 1:41 PM, Roshani Nagmote <roshaninagmo...@gmail.com> wrote:
> > > >
> > > > > Hi guys,
> > > > >
> > > > > I am working on supporting ONNX <https://github.com/onnx/onnx> pre-trained models in Apache MXNet and would like to seek your opinion on the choice of implementation. I have also created a GitHub issue <https://github.com/apache/incubator-mxnet/issues/8319>. Supporting ONNX in MXNet will enable users to move between frameworks with their models; it will also enable the MXNet project to be a part of the ONNX open standard and steer the direction of ONNX.
> > > > >
> > > > > For those who don't know ONNX, ONNX is an open source format for AI models which enables models to be transferred between frameworks. Refer to https://github.com/onnx/onnx for more details.
> > > > >
> > > > > To implement the import/export functionality in MXNet, I propose to expose an MXNet Python module "serde" (name taken from the Apache Hive project) with the following methods supporting different formats:
> > > > >
> > > > > sym, params = mxnet.serde.import(other_format_file, other_format='onnx')
> > > > >
> > > > > other_format_file = mxnet.serde.export(mxnet_sym, mxnet_params, 'onnx')
> > > > >
> > > > > The implementation under the hood can be done in two ways:
> > > > >
> > > > > 1) Implement at the MXNet layer, by parsing the ONNX model (in protobuf format) and turning it into MXNet Symbolic operators to build the MXNet model directly. Similarly, I can convert the MXNet model to ONNX format at this layer.
> > > > >
> > > > > 2) The DMLC community has released the nnvm/tvm compiler and an intermediate representation of the models; refer to http://www.tvmlang.org/2017/10/06/nnvm-compiler-announcement.html
> > > > >
> > > > > Based on the conversation on the GitHub issue <https://github.com/apache/incubator-mxnet/issues/8319> I opened, Mu mentioned that MXNet would use nnvm/tvm as the backend in the future.
> > > > >
> > > > > We could hook into this layer to implement the import/export functionality. nnvm/tvm has ONNX 0.1 version import implemented.
> > > > >
> > > > > For import:
> > > > >
> > > > > 1. I will need to enhance nnvm/tvm's importer to support ONNX 0.2.
> > > > > 2. Implement nnvm/tvm->mxnet symbolic operators.
> > > > >
> > > > > For export:
> > > > >
> > > > > 1. mxnet->nnvm/tvm (nnvm/tvm provides this implementation already).
> > > > > 2. I will need to implement nnvm/tvm->onnx.
> > > > >
> > > > > These are the pros and cons I see in the above approaches:
> > > > >
> > > > > 1. Import/export at the mxnet layer
> > > > >
> > > > > Pros:
> > > > >
> > > > > 1. Stable APIs currently used by users.
> > > > > 2. Larger Apache MXNet community of contributors.
> > > > > 3. CI pipeline to catch bugs.
> > > > > 4. Comparatively less time to implement and put it in the hands of the users.
> > > > >
> > > > > Cons:
> > > > >
> > > > > 1. In the future we may have to reimplement at the nnvm/tvm layer, in case MXNet moves to the nnvm/tvm backend (assuming it will move).
> > > > >
> > > > > 2. Import/export at the nnvm/tvm layer
> > > > >
> > > > > Pros:
> > > > >
> > > > > 1. Less engineering work in case mxnet moves to nnvm/tvm.
> > > > > 2. nnvm/tvm would become a hub to convert to different formats.
> > > > > 3. nnvm operators are more in parity with mxnet's Gluon APIs; this could be useful in case Gluon becomes the only standard that MXNet will support.
> > > > >
> > > > > Cons:
> > > > >
> > > > > 1. Nascent project with few contributors.
> > > > > 2. Does not support all operators that exist in MXNet Symbolic API.
> > > > > 3. No CI pipeline.
> > > > > 4. Current Apache MXNet project does not use the nnvm/tvm backend.
> > > > > 5. mxnet->nnvm/tvm backend needs more testing and user feedback.
> > > > >
> > > > > Any suggestions on either of these approaches? From the user's perspective, this will be an implementation detail that is not exposed.
> > > > >
> > > > > Thanks,
> > > > >
> > > > > Roshani
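The Python-level decomposition Tianqi proposes at the top of the thread can be sketched as below. The converter bodies here are hypothetical stubs standing in for the real converters (which would live in the nnvm package); the point is only the shape of the API: mxnet_to_onnx is a composition of two independently reusable steps, so nnvm_from_mxnet also stays available for other purposes such as the compilation path to deployable modules.

```python
# Sketch of the proposed decomposition (stub bodies are placeholders,
# not the real nnvm implementations).

def nnvm_from_mxnet(module):
    # Real version: translate an MXNet Module / Gluon block into an
    # nnvm/top graph plus a dict of parameters.
    return "nnvm_graph(%s)" % module, {"params": None}

def nnvm_to_onnx(nnvm_graph, params):
    # Real version: serialize an nnvm/top graph + params to an ONNX model.
    return "onnx(%s)" % nnvm_graph

def mxnet_to_onnx(module):
    # The exporter is just the two steps chained together.
    nnvm_graph, params = nnvm_from_mxnet(module)
    return nnvm_to_onnx(nnvm_graph, params)

print(mxnet_to_onnx("my_module"))  # -> onnx(nnvm_graph(my_module))
```

Because neither step knows about the other, replacing the ONNX serializer (or reusing the nnvm graph for compilation instead) requires no change to nnvm_from_mxnet.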