- We can implement the ONNX importer directly through the Gluon API.
- For the exporter, it is recommended to canonicalize the operators to Gluon's op defs before converting into ONNX. Such a canonicalization utility is supported via nnvm/top.
- If we don't want to rely on the code in nnvm/top (which I recommend at least taking a look at) and want to do everything in MXNet, we can simply write a Python data structure that implements the spec and put it in MXNet, which I estimate would cost around 500 lines of Python.
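
A minimal sketch of what such a Python data structure could look like, assuming a flat list of operator nodes and a rule table keyed by op name. `Node`, `rename`, `CANONICAL_RULES` and `canonicalize` are hypothetical names used only for illustration, and the two table entries are illustrative assumptions rather than a verified op mapping:

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Node:
    """One operator in a graph: an op name plus its string attributes."""
    op: str
    attrs: Dict[str, str]


def rename(new_op: str, attr_map: Dict[str, str]) -> Callable[[Node], Node]:
    """Build a rule that renames an op and some of its attributes."""
    def rule(node: Node) -> Node:
        attrs = {attr_map.get(key, key): value for key, value in node.attrs.items()}
        return Node(new_op, attrs)
    return rule


# Hypothetical spec table: legacy op name -> rule producing the canonical op.
# The entries below are illustrative assumptions, not a verified mapping.
CANONICAL_RULES: Dict[str, Callable[[Node], Node]] = {
    "FullyConnected": rename("dense", {"num_hidden": "units"}),
    "Convolution": rename("conv2d", {"num_filter": "channels", "kernel": "kernel_size"}),
}


def canonicalize(nodes: List[Node]) -> List[Node]:
    """Rewrite known ops to canonical form; pass unknown ops through unchanged."""
    return [CANONICAL_RULES.get(node.op, lambda n: n)(node) for node in nodes]
```

An exporter built this way would only need to understand the canonical op names, and supporting another legacy op means adding one table entry.
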
Tianqi

On Thu, Oct 19, 2017 at 3:14 PM, Lupesko, Hagay <lupe...@gmail.com> wrote:

> Tianqi,
>
> Can you clarify your proposal to go through mxnet/gluon?
> - Are you proposing to implement “import ONNX” in Gluon by dynamically building the ONNX graph using Gluon API?
> - Gluon does not have a “save” API, you can only save weights, not the network. How can we export Gluon model to ONNX?
>
> Hagay
>
> On 10/19/17, 13:26, "Tianqi Chen" <workc...@gmail.com on behalf of tqc...@cs.washington.edu> wrote:
>
> Again my recommendation is to go through mxnet/gluon (which in that case is the core operator set of NNVM) for the following technical reasons:
>
> - Enjoy future compatibility and a compilation pipeline that other frameworks do not have
> - Articulate Apache MXNet's need for core operators clearly, to make Apache MXNet's position clear in influencing exchange format design.
> - We agreed on the same user-facing API to be in MXNet, and going through mxnet/gluon(nnvm) does not prevent that from happening.
>
> Thanks for the discussions. To move forward with a decision, we can call for a vote among the current committers on this issue.
>
> Tianqi
>
> On Thu, Oct 19, 2017 at 1:17 PM, Lupesko, Hagay <lupe...@gmail.com> wrote:
>
> > This thread is long and forked, but I just wanted to re-iterate my proposal to have ONNX import/export implemented in MXNet /contrib as experimental.
> > I think this is a good first step that hopefully allows MXNet users to easily leverage ONNX, but still leave a clear path to update the implementation later on if it makes sense.
> >
> > How do we move forward with a decision?
> >
> > On 10/19/17, 12:14, "Tianqi Chen" <workc...@gmail.com on behalf of tqc...@cs.washington.edu> wrote:
> >
> > Hi Hen:
> >
> > It is sad to think of DMLC adversarially in this matter. DMLC projects adopt the Apache way of doing things and we are planning to move more modules into Apache.
> >
> > All the discussion so far happens under the Apache manner and I do think that healthy discussion on critical design issues is important. It is unfair to say something is rotten just when there is a debate going on in terms of technical issues.
> >
> > They are merely based on our technical assessment of what is better for the project in general, instead of being political or chanting the detailed credits or ownership of the code.
> >
> >
> > Tianqi
> >
> > On Thu, Oct 19, 2017 at 12:03 PM, Hen <bay...@apache.org> wrote:
> >
> > > What I think I'm seeing here is that:
> > >
> > > * MXNet moved to Apache.
> > > * Some of the code it relied on (50% per the last release thread, but that may have been bombastic) remained at DMLC.
> > > * The MXNet community thinks one thing.
> > > * The DMLC community (which is a subset of the MXNet community that runs under different community rules) thinks another.
> > >
> > > Something is rotten.
> > >
> > > One solution: The MXNet community forks the DMLC code it relies on into the MXNet codebase and moves on without being tied down by the decisions of a non-compatible community.
> > >
> > > Hen
> > >
> > >
> > >
> > > On Thu, Oct 19, 2017 at 11:59 AM, Tianqi Chen <tqc...@cs.washington.edu> wrote:
> > >
> > > > Here are the detailed points (sorry for resending it over again)
> > > >
> > > > Technical Reasoning:
> > > >
> > > > - Model exchange formats like CoreML and ONNX are not lossless and complete. They are designed to contain a core set of the minimum operators to support necessary inference tasks like ResNet, etc. So you cannot rely on a bi-directional serialization with this format for all MXNet models. As a simple example, broadcast add/mul is simply not supported in onnx.
> > > >
> > > > - Same problem goes for compilation and in-memory IR, a core set of most interesting primitives are effectively supported.
> > > >
> > > > - Either in the case of supporting exchange format, or in-memory IR, we need to make the decision on what core set of operators we are interested in supporting. We cannot simply say let us support everything from the beginning due to the limitations of the exchange format.
> > > >
> > > > - It is crucial for us to articulate what is the core set of operators we care about in MXNet. Either in terms of providing guidelines to the community, or influencing the design of the model exchange formats themselves to move in favor of MXNet.
> > > >
> > > > - nnvm/top is that initial core set of operators for both compiler support and exchange purposes. It is modeled under numpy and gluon, under the supervision of Eric, Me and Mu. It can be bi-directionally exchanged with a current mxnet operator without loss of information.
> > > >
> > > > The Effort of Engineering:
> > > >
> > > > - Because nnvm/top is modeled with numpy and gluon, mxnet<->nnvm/top is quite easy, and we already have one direction done. I would be very happy to answer any questions on another. No information loss will happen with this path.
> > > >
> > > > - mxnet/symbol or nnvm/symbol (they are essentially the same thing with a bit different op defs) <- onnx is harder. There has been already enough effort to support onnx 0.1 as Roshani mentioned, which is contributed by Zhi Zhang, another Apache MXNet committer. Zhi already provided code to alleviate this process. Building code on the existing effort would actually make the problem easier.
> > > >
> > > > On Thu, Oct 19, 2017 at 11:55 AM, Tianqi Chen <tqc...@cs.washington.edu> wrote:
> > > >
> > > > > As for where the code should sit, we have seen onnx's support for caffe2 sitting on a separate repo. My suggestion would be to put the code under nnvm/top and migrate into mxnet eventually when the top components get into MXNet, hopefully by end of next month.
> > > > >
> > > > > I have elaborated my point in the last email thread. This (going through nnvm/top) is an important design decision both technically (compilation, more hardware) and strategically (articulate our core set of operators and influence the model exchange format).
> > > > >
> > > > > I am glad to see the discussion happening and surely there is doubt, as with every big step of changes. But with the rapidly changing pace of deep learning systems, this is the direction that we thought is most promising. We can call for a vote if necessary among the committers for the design decision if there is still debate on this issue. Or we can keep the discussion open and start some effort around nnvm/top to see how it goes
> > > > >
> > > > > Tianqi
> > > > >
> > > > > On Thu, Oct 19, 2017 at 11:15 AM, Lupesko, Hagay <lupe...@gmail.com> wrote:
> > > > >
> > > > >> Mu,
> > > > >>
> > > > >> You’re mentioning plans for a new model format and compiler, but I don’t recall seeing it shared/discussed on the dev list. Can you share these, so it is more accessible to folks to understand the plan and vision?
> > > > >>
> > > > >> Personally, I think it will be a shame to add ONNX support to MXNet, and have it implemented outside of MXNet.
> > > > >> At the end of the day, it makes things difficult for MXNet users.
> > > > >>
> > > > >> Hagay
> > > > >>
> > > > >> On 10/19/17, 10:01, "Mu Li" <limu...@gmail.com on behalf of muli....@gmail.com> wrote:
> > > > >>
> > > > >> I'm speaking under my "MXNet contributor" hat.
> > > > >>
> > > > >> It will be sad that our new model format and compiler is not supported by our own contributors. It puts us in a bad position to reach out to outside to ask for support.
> > > > >>
> > > > >> If you really want to do it with the onnx <-> mxnet way, I suggest putting the code under https://github.com/aws.
> > > > >>
> > > > >> Best
> > > > >> Mu
> > > > >>
> > > > >> On Thu, Oct 19, 2017 at 9:51 AM, Lupesko, Hagay <lupe...@gmail.com> wrote:
> > > > >>
> > > > >> > Since there seems to be a difficulty to reach a consensus here, and this is a new area, maybe a good compromise would be to contribute this under /contrib as experimental, with whatever way Roshani thinks makes sense.
> > > > >> > Once there is code in place, and MXNet users and contributors are able to check it out, we can consider future steps.
> > > > >> >
> > > > >> > Does this proposal make sense to folks?
> > > > >> >
> > > > >> > On 10/18/17, 23:01, "Tianqi Chen" <workc...@gmail.com on behalf of tqc...@cs.washington.edu> wrote:
> > > > >> >
> > > > >> > I want to offer one last thing in terms of technical details. I mentioned two trends in the deep learning systems. There is one last thing that is omitted. How should we build a good deploy end for deep learning models.
> > > > >> >
> > > > >> > There is always a paradox to this problem:
> > > > >> >
> > > > >> > - On one hand, the deployment end needs to be lightweight and portable.
> > > > >> > - We want a lot of optimizations (memory layout compute) and feature support, this makes the project big.
> > > > >> >
> > > > >> > All the existing systems suffer from this problem. The solution is simple: separate the optimization part from the actual runtime and compile the things down to a bare metal module. And this is the solution the nnvm/top compiler pipeline offers, which I believe will become a standard practice of deployment and where all systems go to
> > > > >> >
> > > > >> > Tianqi
> > > > >> >
> > > > >> > On Wed, Oct 18, 2017 at 10:03 PM, Tianqi Chen <tqc...@cs.washington.edu> wrote:
> > > > >> >
> > > > >> > > OK, there is some miscommunication in here I guess. We only need to do a "canonization" step in the python API that goes through a symbol-to-symbol translation layer. It can be done purely in python, and there is no need for going "down" into c++ to do this.
> > > > >> > >
> > > > >> > > For example, the current nnvm.from_mxnet API takes a Module or Gluon module and gets you back an nnvm/top graph in python.
> > > > >> > >
> > > > >> > > All we are asking for is to decompose it into
> > > > >> > >
> > > > >> > > def mxnet_to_onnx(module):
> > > > >> > >     nnvm_graph, params = nnvm_from_mxnet(module)
> > > > >> > >     onnx = nnvm_to_onnx(nnvm_graph, params)
> > > > >> > >     return onnx
> > > > >> > >
> > > > >> > > This allows nnvm_from_mxnet to be reused for other purposes, like compiling API to deployable modules
> > > > >> > >
> > > > >> > > Tianqi
> > > > >> > >
> > > > >> > > On Wed, Oct 18, 2017 at 9:55 PM, Lupesko, Hagay <lupe...@gmail.com> wrote:
> > > > >> > >
> > > > >> > >> Tianqi:
> > > > >> > >> Thanks for detailing the trends. I fully agree that ONNX is just a graph serialization format – nothing more, nothing less. I also think we all agree that this simple mechanism holds lots of value to DL users since it allows them to move between frameworks easily (e.g. train with MXNet, deploy on a mobile device with Caffe2, or the other way around).
> > > > >> > >> As you said, In Memory IR is different than serialization formats such as ONNX. They are designed to make the runtime execution as efficient as possible, leveraging software and hardware optimizations. They are indeed complex, and where the “meat” is.
> > > > >> > >> (BTW ONNX regards itself as an “IR” format, but not in the same sense as NNVM).
> > > > >> > >>
> > > > >> > >> At the end of the day, Roshani is aiming to deliver a simple functionality to MXNet users: (1) take an ONNX file, and load it into MXNet so you get a graph+weights you can work with (2) Given a trained model, save it as an ONNX file. Since MXNet users do not interact with NNVM directly, but rather interact with MXNet API (MXNet Module), isn’t the simplest thing to do just to construct the Module “on the fly” using MXNet API? Taking the other approach, we will go from the top level MXNet “load” API, go “down” to NNVM to construct the graph, go back up to MXNet to expose it as a Module. This seems too complex and does not add any benefit. In whatever way we construct the MXNet Module object, NNVM will always be the underlying in memory IR that is being executed, so why not take the simpler route?
> > > > >> > >>
> > > > >> > >> Hagay
> > > > >> > >>
> > > > >> > >> On 10/18/17, 19:42, "Tianqi Chen" <workc...@gmail.com on behalf of tqc...@cs.washington.edu> wrote:
> > > > >> > >>
> > > > >> > >> Hi Chris:
> > > > >> > >>
> > > > >> > >> There is no intention to move things away from mxnet. The reduction of lines of code comes from having a better design in general, and usually, you write less redundant code by benefiting from better design.
> > > > >> > >> As I may quote: "the best design is achieved not when there is nothing to add, but when there is nothing to be taken away."
> > > > >> > >>
> > > > >> > >> MXNet has always benefited from this philosophy and improves with the new designs and proper modularization. For example, we see such reduction and convenience happening when migrating from MXNet's legacy op to the NNVM's mechanism. The new mechanism now enables things like sparse aware support and other stuff which would be much harder to support.
> > > > >> > >>
> > > > >> > >> The nnvm/tvm stack brings the same benefit (if not more) and it will only add more features to MXNet itself. Offering more hardware backends and optimization, allowing us to write less code and spend less time to optimize for each backend by going through TVM
> > > > >> > >>
> > > > >> > >> Tianqi
> > > > >> > >>
> > > > >> > >> On Wed, Oct 18, 2017 at 7:15 PM, Chris Olivier <cjolivie...@gmail.com> wrote:
> > > > >> > >>
> > > > >> > >> > Reduce code base of mxnet? By increasing scope of the dmlc modules? Is the intent to make mxnet a thin language wrapper around a group of dmlc modules?
> > > > >> > >> >
> > > > >> > >> >
> > > > >> > >> > On Wed, Oct 18, 2017 at 6:58 PM Tianqi Chen <tqc...@cs.washington.edu> wrote:
> > > > >> > >> >
> > > > >> > >> > > To better answer Hagay's question, I would like to dive down a bit deeper on the relation between MXNet, NNVM and model exchange format like ONNX.
> > > > >> > >> > >
> > > > >> > >> > > There are two major trends in deep learning systems now:
> > > > >> > >> > >
> > > > >> > >> > > - Common serializable formats, like ONNX and CoreML, that define the model exchange format.
> > > > >> > >> > > - The in-memory graph IR for quick optimization and JIT. NNVM, Tensorflow's XLA fall into this category.
> > > > >> > >> > >
> > > > >> > >> > > The exchange formats are great, it only poses a layer of conversion, which is good for exchange. The real meat still comes from the compilation and JIT pipeline you have to offer. For that, we will need an in-memory IR, because the cost of constructing and serializing could be high for the exchange formats like protobuf.
> > > > >> > >> > > And usually, the exchange formats are designed in a minimalistic fashion, making it less easy to extend more information to support in-depth optimization like automatic quantization, accelerator support.
> > > > >> > >> > >
> > > > >> > >> > > The current MXNet relies on NNVM for in-memory IR manipulation but does not contain a compilation component that compiles to the hardware backends. Doing export to an exchange format and then back into NNVM to run the compilation poses more burden than a JIT compiler could pay. Using the same in-memory graph IR as the compilation stack gives much more advantage in terms of this.
> > > > >> > >> > >
> > > > >> > >> > > The newly introduced nnvm/top and compiler offers in-memory graph optimization and compilation and offers more hardware backends directly via TVM. We already see promising results in edge deployments with a much lower overhead of runtime. We will further benefit quickly from more graph optimizations that it has to offer.
> > > > >> > >> > >
> > > > >> > >> > > Building support around this new paradigm offers us advantage of being future compatible and takes full benefit of the points I mentioned above
> > > > >> > >> > >
> > > > >> > >> > > Tianqi
> > > > >> > >> > >
> > > > >> > >> > >
> > > > >> > >> > >
> > > > >> > >> > > On Wed, Oct 18, 2017 at 4:57 PM, Lupesko, Hagay <lupe...@gmail.com> wrote:
> > > > >> > >> > >
> > > > >> > >> > > > Roshani – this is an exciting initiative, ONNX support on MXNet will enable more users to ramp up on MXNet, which is great.
> > > > >> > >> > > >
> > > > >> > >> > > > Tianqi – a few questions and thoughts about your note:
> > > > >> > >> > > > - “More hardware backends to mxnet” – MXNet users get the same benefit of HW support implementing ONNX import on top of MXNet symbolic, right?
> > > > >> > >> > > > - “NNVM Compiler now received contributions from AWS, UW and many other folks in MXNet community.” – agreed it is ramping up, but when you look at the data, it is clear that it is very early on for NNVM. Looking at the repo, it has overall 223 commits, 0 releases. Compare it to MXNet with 6136 commits and 32 releases.
> > > > >> > >> > > > It seems to be still early on for NNVM, and for a more reliable initial implementation building the import on top of MXNet is easier, faster and safer. MXNet has lots of users already using the Symbolic API which hopefully means that it is a mature API that is not likely to have breaking changes or major issues.
> > > > >> > >> > > >
> > > > >> > >> > > > I’m supportive of option 1 proposed by Roshani (building serde on top of MXNet symbolic), but to do it as an encapsulated implementation detail, so the implementation can be migrated to NNVM or another implementation in the future, if at that point it seems like the right thing to do.
> > > > >> > >> > > >
> > > > >> > >> > > > Interested in hearing other opinions though…
> > > > >> > >> > > >
> > > > >> > >> > > > Hagay
> > > > >> > >> > > >
> > > > >> > >> > > > On 10/18/17, 14:13, "Tianqi Chen" <workc...@gmail.com on behalf of tqc...@cs.washington.edu> wrote:
> > > > >> > >> > > >
> > > > >> > >> > > > I am strongly recommending going through the nnvm/top.
> > > > >> > >> > > > One major reason in here is that the support of nnvm/top layer NOT ONLY means compatibility of model format with onnx. These are the major benefits:
> > > > >> > >> > > >
> > > > >> > >> > > >
> > > > >> > >> > > > - More hardware backends to mxnet, including opencl, metal, Raspberry Pi, web browser. These things are automatically enabled by going through this layer. In general, we design nnvm/tvm stack to resolve the challenge of current mxnet's weakness in terms of deploying to more hardware backends.
> > > > >> > >> > > >
> > > > >> > >> > > > - More frontend capabilities, nnvm's gluon style IR ingests now from CoreML, ONNX and in future keras. Supporting those will reduce the amount of engineering effort needed.
> > > > >> > >> > > >
> > > > >> > >> > > > - Future compatibility. We all agree that the future is migrating to gluon's API. NNVM/top tries to look ahead by directly adopting the symbolic API to be gluon.
> > > > >> > >> > > >
> > > > >> > >> > > >
> > > > >> > >> > > > I would also like to correct some of the mentioned facts with regard to nnvm/tvm stack
> > > > >> > >> > > >
> > > > >> > >> > > > 1. Nascent project with few contributors
> > > > >> > >> > > >
> > > > >> > >> > > > NNVM Compiler now received contributions from AWS, UW and many other folks in MXNet community. NNVM itself is already being used by MXNet. MXNet's internal IR is migrating toward gluon, and its final form being nnvm/top
> > > > >> > >> > > >
> > > > >> > >> > > > 3. Does not support all operators that exist in MXNet Symbolic API
> > > > >> > >> > > >
> > > > >> > >> > > > Neither NNVM/top nor onnx supports all operators that exist in mxnet symbolic API. The end goal here is mainly to make nnvm/top onnx compatible, which is a more reasonable goal.
> > > > >> > >> > > >
> > > > >> > >> > > > 4. No CI Pipeline and testcases
> > > > >> > >> > > >
> > > > >> > >> > > > NNVM already contains a compiler with unit tests, CI-tested with integration at https://github.com/dmlc/nnvm, with a CI pipeline that is well tested on CPU and GPU cases for front-ends.
> > > > >> > >> > > >
> > > > >> > >> > > > Tianqi
> > > > >> > >> > > >
> > > > >> > >> > > >
> > > > >> > >> > > > On Wed, Oct 18, 2017 at 1:41 PM, Roshani Nagmote <roshaninagmo...@gmail.com> wrote:
> > > > >> > >> > > >
> > > > >> > >> > > > > Hi guys,
> > > > >> > >> > > > >
> > > > >> > >> > > > >
> > > > >> > >> > > > > I am working on supporting ONNX <https://github.com/onnx/onnx> pre-trained models in Apache MXNet and would like to seek your opinion on the choice of implementation. I also have created a GitHub issue <https://github.com/apache/incubator-mxnet/issues/8319>. Supporting ONNX in MXNet will enable users to move between frameworks with their models, this will also enable MXNet project to be a part of the ONNX open standard and steer the direction of ONNX.
> > > > >> > >> > > > >
> > > > >> > >> > > > >
> > > > >> > >> > > > > For those who don’t know ONNX, ONNX is an open source format for AI models which enables models to be transferred between frameworks. Refer to https://github.com/onnx/onnx for more details.
> > > > >> > >> > > > >
> > > > >> > >> > > > >
> > > > >> > >> > > > > To implement the import/export functionality in MXNet, I propose to expose a MXNet python module “serde” (name taken from Apache Hive project) with the following methods supporting different formats:
> > > > >> > >> > > > >
> > > > >> > >> > > > > sym, params = mxnet.serde.import(other_format_file, other_format='onnx')
> > > > >> > >> > > > >
> > > > >> > >> > > > > other_format_file = mxnet.serde.export(mxnet_sym, mxnet_params, 'onnx')
> > > > >> > >> > > > >
> > > > >> > >> > > > >
> > > > >> > >> > > > > The implementation under the hood can be done in two ways:
> > > > >> > >> > > > >
> > > > >> > >> > > > >
> > > > >> > >> > > > > 1) Implement at the MXNet layer by parsing the ONNX model (in protobuf format) and turn into MXNet Symbolic operators and build MXNet model directly. Similarly, I can convert the MXNet model to ONNX format at this layer.
> > > > >> > >> > > > >
> > > > >> > >> > > > >
> > > > >> > >> > > > > 2) The DMLC community has released the nnvm/tvm compiler and an intermediate representation of the models, refer: http://www.tvmlang.org/2017/10/06/nnvm/tvm-compiler-announcement.html <http://www.tvmlang.org/2017/10/06/nnvm-compiler-announcement.html>
> > > > >> > >> > > > >
> > > > >> > >> > > > > Based on the conversation on the GitHub issue <https://github.com/apache/incubator-mxnet/issues/8319> I opened, Mu mentioned that MXNet would use nnvm/tvm as the backend in the future.
> > > > >> > >> > > > >
> > > > >> > >> > > > >
> > > > >> > >> > > > > We could hook into this layer to implement the import/export functionality. nnvm/tvm has ONNX 0.1 version import implemented.
> > > > >> > >> > > > >
> > > > >> > >> > > > > For import:
> > > > >> > >> > > > >
> > > > >> > >> > > > > 1. I will need to enhance nnvm/tvm’s importer to support ONNX 0.2
> > > > >> > >> > > > > 2. Implement nnvm/tvm->mxnet symbolic operators.
> > > > >> > >> > > > >
> > > > >> > >> > > > > For export:
> > > > >> > >> > > > >
> > > > >> > >> > > > > 1. mxnet->nnvm/tvm (nnvm/tvm provides this implementation already)
> > > > >> > >> > > > > 2. I will need to implement nnvm/tvm->onnx.
> > > > >> > >> > > > >
> > > > >> > >> > > > >
> > > > >> > >> > > > > These are the pros and cons I see in the above approaches:
> > > > >> > >> > > > >
> > > > >> > >> > > > > 1. Import/export at mxnet layer
> > > > >> > >> > > > >
> > > > >> > >> > > > > Pros:
> > > > >> > >> > > > >
> > > > >> > >> > > > > 1. Stable APIs currently used by users.
> > > > >> > >> > > > > 2. Larger Apache MXNet community of contributors.
> > > > >> > >> > > > > 3. CI pipeline to catch bugs.
> > > > >> > >> > > > > 4. Comparatively less time to implement and put it in the hands of the users.
> > > > >> > >> > > > >
> > > > >> > >> > > > > Cons:
> > > > >> > >> > > > >
> > > > >> > >> > > > > 1. In the future we may have to reimplement at the nnvm/tvm layer, in case MXNet moves to the nnvm/tvm backend (assuming it will move).
> > > > >> > >> > > > >
> > > > >> > >> > > > >
> > > > >> > >> > > > > 2. Import/export at nnvm/tvm layer
> > > > >> > >> > > > >
> > > > >> > >> > > > > Pros:
> > > > >> > >> > > > >
> > > > >> > >> > > > > 1. Less engineering work in case mxnet moves to nnvm/tvm
> > > > >> > >> > > > > 2. nnvm/tvm would become a hub to convert to different formats.
> > > > >> > >> > > > > 3. nnvm operators are more in parity with mxnet’s gluon APIs; this could be useful in case Gluon becomes the only standard that MXNet will support.
> > > > >> > >> > > > >
> > > > >> > >> > > > > Cons:
> > > > >> > >> > > > >
> > > > >> > >> > > > > 1. Nascent project with few contributors
> > > > >> > >> > > > > 2. Does not support all operators that exist in MXNet Symbolic API
> > > > >> > >> > > > > 3. No CI Pipeline
> > > > >> > >> > > > > 4. Current Apache MXNet project does not use nnvm/tvm backend
> > > > >> > >> > > > > 5. mxnet->nnvm/tvm backend needs more testing and user feedback.
> > > > >> > >> > > > >
> > > > >> > >> > > > >
> > > > >> > >> > > > > Any suggestions on both of these approaches? From user's perspective, this will be an implementation detail that is not exposed.
> > > > >> > >> > > > > > > > > >> > >> > > > > Thanks, > > > > >> > >> > > > > > > > > >> > >> > > > > Roshani > > > > >> > >> > > > > > > > > >> > >> > > > > > > > >> > >> > > > > > > > >> > >> > > > > > > > >> > >> > > > > > > > >> > >> > > > > > > >> > >> > > > > > >> > >> > > > > >> > >> > > > > >> > >> > > > > >> > >> > > > > >> > > > > > > >> > > > > > >> > > > > > >> > > > > > >> > > > > > >> > > > > >> > > > > >> > > > > >> > > > > > > > > > > > > > > > > > > > > > > > > > >
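
For illustration, here is a minimal sketch of how a "serde"-style facade could keep the choice between the two approaches (MXNet layer vs. nnvm/tvm layer) hidden from users, as the proposal intends. It is not MXNet code: `register_format`, `import_model`, and `export_model` are hypothetical names, and the converters are stand-in callables rather than real ONNX parsers. One detail worth noting is that `import` is a reserved word in Python, so a function literally named `mxnet.serde.import` would not parse; the sketch therefore assumes a different name.

```python
from typing import Any, Callable, Dict, Tuple

# Hypothetical registry: format name -> (importer, exporter).
# Whether a converter works at the MXNet layer or goes through nnvm/tvm
# is hidden behind these two callables.
_FORMATS: Dict[str, Tuple[Callable[[str], Any], Callable[[Any, Any], Any]]] = {}


def register_format(name: str,
                    importer: Callable[[str], Any],
                    exporter: Callable[[Any, Any], Any]) -> None:
    """Register the converter pair for one external format (e.g. 'onnx')."""
    _FORMATS[name.lower()] = (importer, exporter)


def _lookup(other_format: str):
    """Return the (importer, exporter) pair or fail with a clear message."""
    try:
        return _FORMATS[other_format.lower()]
    except KeyError:
        known = ", ".join(sorted(_FORMATS)) or "none"
        raise ValueError(
            "unsupported format %r (registered: %s)" % (other_format, known))


def import_model(other_format_file: str, other_format: str = "onnx"):
    """Load an external model file; the importer decides what it returns,
    here assumed to be a (sym, params) pair as in the proposal."""
    importer, _ = _lookup(other_format)
    return importer(other_format_file)


def export_model(sym: Any, params: Any, other_format: str = "onnx"):
    """Convert an in-memory model to the external format."""
    _, exporter = _lookup(other_format)
    return exporter(sym, params)
```

With this shape, swapping an MXNet-layer converter for an nnvm/tvm-based one later would only change what is passed to `register_format`, not the calls users make.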