What I think I'm seeing here is that:

* MXNet moved to Apache.
* Some of the code it relied on (50% per the last release thread, but that
may have been bombastic) remained at DMLC.
* The MXNet community thinks one thing.
* The DMLC community (which is a subset of the MXNet community that runs
under different community rules) thinks another.

Something is rotten.

One solution: The MXNet community forks the DMLC code it relies on into the
MXNet codebase and moves on without being tied down by the decisions of a
non-compatible community.

Hen



On Thu, Oct 19, 2017 at 11:59 AM, Tianqi Chen <[email protected]>
wrote:

> Here are the detailed points (sorry for re-sending them again):
>
> Technical Reasoning:
>
> - Model exchange formats like CoreML and ONNX are not lossless and
> complete. They are designed to contain a core set of the minimum
> operators needed to support necessary inference tasks like ResNet, etc.
> So you cannot rely on bi-directional serialization with these formats for
> all MXNet models.  As a simple example, broadcast add/mul is simply not
> supported in ONNX.
>
> - The same problem goes for compilation and in-memory IR: only a core set
> of the most interesting primitives is effectively supported.
>
> - Whether we are supporting an exchange format or an in-memory IR, we
> need to decide which core set of operators we are interested in
> supporting.  We cannot simply say "let us support everything" from the
> beginning, due to the limitations of the exchange format.
>
> - It is crucial for us to articulate the core set of operators we care
> about in MXNet, either in terms of providing guidelines to the community,
> or to influence the design of the model exchange formats themselves to
> move in favor of MXNet.
>
> - nnvm/top is that initial core set of operators, for both compiler
> support and exchange purposes. It is modeled on numpy and gluon, under the
> supervision of Eric, me and Mu.  It can be bi-directionally exchanged with
> current mxnet operators without loss of information.
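[Editor's note: the coverage concern above (e.g. broadcast add/mul missing from ONNX) can be made concrete with a small sketch. The operator table and function below are a hypothetical toy, not the real converter tables of any framework.]

```python
# Sketch: an exporter must check operator coverage before serializing,
# because exchange formats only define a minimal core op set.
# SUPPORTED is a toy MXNet-name -> target-format-name mapping (illustrative).
SUPPORTED = {
    "Convolution": "Conv",
    "FullyConnected": "Gemm",
    "Activation": "Relu",
}

def check_exportable(graph_ops):
    """Return the ops that cannot be expressed in the target format."""
    return [op for op in graph_ops if op not in SUPPORTED]

# A graph using broadcast_add cannot round-trip through this toy format:
missing = check_exportable(["Convolution", "broadcast_add"])
```

A real converter would either reject such a graph or lower the unsupported op into supported primitives, which is exactly why a bi-directional, lossless exchange cannot be guaranteed.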
>
> The Effort of Engineering:
>
> - Because nnvm/top is modeled on numpy and gluon, mxnet <-> nnvm/top is
> quite easy, and we already have one direction done. I would be very happy
> to answer any questions on the other. No information loss will happen with
> this path.
>
> - mxnet/symbol or nnvm/symbol (they are essentially the same thing with
> slightly different op defs) <- onnx is harder. There has already been
> enough effort to support onnx 0.1, as Roshani mentioned, contributed by
> Zhi Zhang, another Apache MXNet committer. Zhi already provided code to
> alleviate this process. Building on the existing effort would actually
> make the problem easier.
>
> On Thu, Oct 19, 2017 at 11:55 AM, Tianqi Chen <[email protected]>
> wrote:
>
> > As for where the code should sit, we have seen onnx's support for caffe2
> > sitting in a separate repo.  My suggestion would be to put the code under
> > nnvm/top and migrate it into mxnet eventually when the top components get
> > into MXNet, hopefully by the end of next month.
> >
> > I have elaborated my point in the last email thread. This (going through
> > nnvm/top) is an important design decision both technically (compilation,
> > more hardware) and strategically (articulate our core set of operators
> and
> > influence the model exchange format).
> >
> > I am glad to see the discussion happening, and surely there is doubt, as
> > with every big step of change.  But with the rapidly changing pace of
> > deep learning systems, this is the direction that we think is most
> > promising. We can call for a vote among the committers on the design
> > decision if necessary, if there is still debate on this issue. Or we can
> > keep the discussion open and start some effort around nnvm/top to see how
> > it goes.
> > Tianqi
> >
> > On Thu, Oct 19, 2017 at 11:15 AM, Lupesko, Hagay <[email protected]>
> > wrote:
> >
> >> Mu,
> >>
> >> You’re mentioning plans for a new model format and compiler, but I don’t
> >> recall seeing them shared/discussed on the dev list. Can you share these,
> >> so the plan and vision are more accessible to folks?
> >>
> >> Personally, I think it will be a shame to add ONNX support to MXNet, and
> >> have it implemented outside of MXNet. At the end of the day, it makes
> >> things difficult for MXNet users.
> >>
> >> Hagay
> >>
> >> On 10/19/17, 10:01, "Mu Li" <[email protected] on behalf of
> >> [email protected]> wrote:
> >>
> >>     I'm speaking under my "MXNet contributor" hat.
> >>
> >>     It will be sad if our new model format and compiler are not
> >>     supported by our own contributors. It puts us in a bad position when
> >>     reaching outside to ask for support.
> >>
> >>     If you really want to do it the onnx <-> mxnet way, I suggest
> >>     putting the code under https://github.com/aws.
> >>
> >>     Best
> >>     Mu
> >>
> >>     On Thu, Oct 19, 2017 at 9:51 AM, Lupesko, Hagay <[email protected]>
> >> wrote:
> >>
> >>     > Since there seems to be difficulty reaching a consensus here, and
> >>     > this is a new area, maybe a good compromise would be to contribute
> >>     > this under /contrib as experimental, in whatever way Roshani thinks
> >>     > makes sense. Once there is code in place, and MXNet users and
> >>     > contributors are able to check it out, we can consider future
> >>     > steps.
> >>     >
> >>     > Does this proposal make sense to folks?
> >>     >
> >>     > On 10/18/17, 23:01, "Tianqi Chen" <[email protected] on behalf
> of
> >>     > [email protected]> wrote:
> >>     >
> >>     >     I want to offer one last thing in terms of technical details.
> >>     >     I mentioned two trends in deep learning systems. There is one
> >>     >     last thing that was omitted: how should we build a good
> >>     >     deployment end for deep learning models?
> >>     >
> >>     >     There is always a paradox to this problem:
> >>     >
> >>     >     - On one hand, the deployment end needs to be lightweight
> >>     >     and portable.
> >>     >     - On the other hand, we want a lot of optimizations (memory
> >>     >     layout, compute) and feature support, which makes the project
> >>     >     big.
> >>     >
> >>     >     All the existing systems suffer from this problem. The
> >>     >     solution is simple: separate the optimization part from the
> >>     >     actual runtime and compile things down to a bare-metal module.
> >>     >     This is the solution the nnvm/top compiler pipeline offers,
> >>     >     which I believe will become a standard practice of deployment
> >>     >     and where all systems will go.
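[Editor's note: the "separate optimization from runtime" idea described above can be sketched in miniature. The functions and the toy instruction list below are illustrative inventions, not the nnvm/top API.]

```python
# Sketch: heavy graph optimization happens at compile time; the deployable
# artifact is only a tiny closure over the precomputed result.

def optimize(graph):
    # Compile-time pass: fold a chain of ("add_const", c) instructions
    # into a single constant. This stands in for real graph rewriting.
    return sum(c for op, c in graph if op == "add_const")

def compile_to_runtime(graph):
    const = optimize(graph)   # the big optimizer lives here, not on device
    def run(x):               # minimal "bare metal" runtime: one addition
        return x + const
    return run

# "Deploy" a module whose whole graph was folded down at compile time:
module = compile_to_runtime([("add_const", 1), ("add_const", 2)])
```

The point of the sketch is the asymmetry: everything above `run` can be arbitrarily large and stays on the build machine, while the shipped runtime carries only the folded constant.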
> >>     >
> >>     >     Tianqi
> >>     >
> >>     >     On Wed, Oct 18, 2017 at 10:03 PM, Tianqi Chen <
> >>     > [email protected]>
> >>     >     wrote:
> >>     >
> >>     >     > OK, there is some miscommunication here, I guess.  We only
> >>     >     > need to do a "canonicalization" step in the python API: a
> >>     >     > symbol-to-symbol translation layer. It can be done purely in
> >>     >     > python, and there is no need to go "down" into c++ to do
> >>     >     > this.
> >>     >     >
> >>     >     > For example, the current nnvm.from_mxnet API takes a Module
> >>     >     > or Gluon model and gets you back an nnvm/top graph in python.
> >>     >     >
> >>     >     > All we are asking for is to decompose it into
> >>     >     >
> >>     >     > def mxnet_to_onnx(module):
> >>     >     >     nnvm_graph, params = nnvm_from_mxnet(module)
> >>     >     >     onnx_model = nnvm_to_onnx(nnvm_graph, params)
> >>     >     >     return onnx_model
> >>     >     >
> >>     >     > This allows nnvm_from_mxnet to be reused for other purposes,
> >>     >     > like the compilation API for deployable modules.
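[Editor's note: the reuse argument above can be sketched with stand-in converters. All function bodies and return shapes here are illustrative stubs, not the real nnvm implementations.]

```python
# Sketch: the first stage (mxnet -> nnvm) is shared between ONNX export
# and hardware compilation, which is the point of the decomposition.

def nnvm_from_mxnet(module):
    # Stand-in for the real symbol-to-symbol translation.
    return {"ops": module["ops"]}, module["params"]

def nnvm_to_onnx(graph, params):
    return ("onnx", graph["ops"], params)

def nnvm_compile(graph, params, target):
    return ("deployable", target, graph["ops"])

def mxnet_to_onnx(module):
    graph, params = nnvm_from_mxnet(module)   # reused stage
    return nnvm_to_onnx(graph, params)

def mxnet_compile(module, target="rasp"):
    graph, params = nnvm_from_mxnet(module)   # same reused stage
    return nnvm_compile(graph, params, target)
```

Writing the converter this way means a later compilation path gets the mxnet-to-nnvm stage for free, instead of each exporter re-implementing it.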
> >>     >     >
> >>     >     > Tianqi
> >>     >     >
> >>     >     > On Wed, Oct 18, 2017 at 9:55 PM, Lupesko, Hagay <
> >> [email protected]>
> >>     > wrote:
> >>     >     >
> >>     >     >> Tianqi:
> >>     >     >> Thanks for detailing the trends. I fully agree that ONNX
> >>     >     >> is just a graph serialization format – nothing more,
> >>     >     >> nothing less. I also think we all agree that this simple
> >>     >     >> mechanism holds lots of value for DL users, since it allows
> >>     >     >> them to move between frameworks easily (e.g. train with
> >>     >     >> MXNet, deploy on a mobile device with Caffe2, or the other
> >>     >     >> way around).
> >>     >     >> As you said, an in-memory IR is different from
> >>     >     >> serialization formats such as ONNX. In-memory IRs are
> >>     >     >> designed to make runtime execution as efficient as
> >>     >     >> possible, leveraging software and hardware optimizations.
> >>     >     >> They are indeed complex, and where the “meat” is.
> >>     >     >> (BTW ONNX regards itself as an “IR” format, but not in the
> >>     >     >> same sense as NNVM).
> >>     >     >>
> >>     >     >> At the end of the day, Roshani is aiming to deliver simple
> >>     >     >> functionality to MXNet users: (1) take an ONNX file and
> >>     >     >> load it into MXNet, so you get a graph+weights you can work
> >>     >     >> with; (2) given a trained model, save it as an ONNX file.
> >>     >     >> Since MXNet users do not interact with NNVM directly, but
> >>     >     >> rather with the MXNet API (MXNet Module), isn’t the
> >>     >     >> simplest thing to do just to construct the Module “on the
> >>     >     >> fly” using the MXNet API? Taking the other approach, we
> >>     >     >> would go from the top-level MXNet “load” API, go “down” to
> >>     >     >> NNVM to construct the graph, then go back up to MXNet to
> >>     >     >> expose it as a Module. This seems too complex and does not
> >>     >     >> add any benefit. In whatever way we construct the MXNet
> >>     >     >> Module object, NNVM will always be the underlying in-memory
> >>     >     >> IR that is being executed, so why not take the simpler
> >>     >     >> route?
> >>     >     >>
> >>     >     >> Hagay
> >>     >     >>
> >>     >     >> On 10/18/17, 19:42, "Tianqi Chen" <[email protected] on
> >> behalf of
> >>     >     >> [email protected]> wrote:
> >>     >     >>
> >>     >     >>     Hi Chris:
> >>     >     >>
> >>     >     >>     There is no intention to move things away from mxnet.
> >>     >     >>     Lines of code are reduced by having a better design in
> >>     >     >>     general; usually, you write less redundant code by
> >>     >     >>     benefiting from better design. As I may quote: "the
> >>     >     >>     best design is achieved not when there is nothing to
> >>     >     >>     add, but when there is nothing to be taken away."
> >>     >     >>
> >>     >     >>     MXNet has always benefited from this philosophy and
> >>     >     >>     improves with new designs and proper modularization.
> >>     >     >>     For example, we saw such reduction and convenience when
> >>     >     >>     migrating from MXNet's legacy ops to the NNVM
> >>     >     >>     mechanism. The new mechanism now enables things like
> >>     >     >>     sparse-aware support and other features which would
> >>     >     >>     otherwise be much harder to support.
> >>     >     >>
> >>     >     >>     The nnvm/tvm stack brings the same benefit (if not
> >>     >     >>     more) and will only add more features to MXNet itself:
> >>     >     >>     offering more hardware backends and optimizations, and
> >>     >     >>     allowing us to write less code and spend less time
> >>     >     >>     optimizing for each backend by going through TVM.
> >>     >     >>     Tianqi
> >>     >     >>
> >>     >     >>     On Wed, Oct 18, 2017 at 7:15 PM, Chris Olivier <
> >>     > [email protected]
> >>     >     >> >
> >>     >     >>     wrote:
> >>     >     >>
> >>     >     >>     > Reduce code base of mxnet? By increasing scope of the
> >> dmlc
> >>     > modules?
> >>     >     >> Is the
> >>     >     >>     > intent to make mxnet a thin language wrapper around a
> >> group
> >>     > of dmlc
> >>     >     >>     > modules?
> >>     >     >>     >
> >>     >     >>     >
> >>     >     >>     > On Wed, Oct 18, 2017 at 6:58 PM Tianqi Chen <
> >>     >     >> [email protected]>
> >>     >     >>     > wrote:
> >>     >     >>     >
> >>     >     >>     > > To better answer Hagay's question, I would like to
> >> dive
> >>     > down a
> >>     >     >> bit deeper
> >>     >     >>     > > on the relation between MXNet, NNVM and model
> >> exchange
> >>     > format
> >>     >     >> like ONNX.
> >>     >     >>     > >
> >>     >     >>     > > There are two major trends in deep learning systems
> >> now:
> >>     >     >>     > >
> >>     >     >>     > > - Common serializable formats, like ONNX and
> CoreML,
> >> that
> >>     > defines
> >>     >     >> the
> >>     >     >>     > model
> >>     >     >>     > > exchange format.
> >>     >     >>     > > - The in-memory graph IR for quick optimization and
> >> JIT.
> >>     > NNVM,
> >>     >     >>     > Tensorflow's
> >>     >     >>     > > XLA falls into this category.
> >>     >     >>     > >
> >>     >     >>     > > The exchange formats are great: they only pose a
> >>     >     >>     > > layer of conversion, which is good for exchange.
> >>     >     >>     > > The real meat still comes from the compilation and
> >>     >     >>     > > JIT pipeline you have to offer. For that, we will
> >>     >     >>     > > need an in-memory IR, because the cost of
> >>     >     >>     > > constructing and serializing exchange formats like
> >>     >     >>     > > protobuf could be high.  And usually, the exchange
> >>     >     >>     > > formats are designed in a minimalistic fashion,
> >>     >     >>     > > making it less easy to extend them with more
> >>     >     >>     > > information to support in-depth optimizations like
> >>     >     >>     > > automatic quantization and accelerator support.
> >>     >     >>     > >
> >>     >     >>     > > The current MXNet relies on NNVM for in-memory IR
> >>     >     >>     > > manipulation but does not contain a compilation
> >>     >     >>     > > component that compiles to the hardware backends.
> >>     >     >>     > > Exporting to an exchange format and then importing
> >>     >     >>     > > back into NNVM to run the compilation poses more
> >>     >     >>     > > burden than a JIT compiler can pay. Using the same
> >>     >     >>     > > in-memory graph IR as the compilation stack gives
> >>     >     >>     > > much more advantage in this regard.
> >>     >     >>     > >
> >>     >     >>     > > The newly introduced nnvm/top and compiler offer
> >>     >     >>     > > in-memory graph optimization and compilation, and
> >>     >     >>     > > offer more hardware backends directly via TVM. We
> >>     >     >>     > > already see promising results in edge deployments
> >>     >     >>     > > with a much lower runtime overhead. We will quickly
> >>     >     >>     > > benefit further from the additional graph
> >>     >     >>     > > optimizations it has to offer.
> >>     >     >>     > >
> >>     >     >>     > > Building support around this new paradigm offers us
> >>     >     >>     > > the advantage of being future compatible and takes
> >>     >     >>     > > full benefit of the points I mentioned above.
> >>     >     >>     > >
> >>     >     >>     > > Tianqi
> >>     >     >>     > >
> >>     >     >>     > >
> >>     >     >>     > >
> >>     >     >>     > > On Wed, Oct 18, 2017 at 4:57 PM, Lupesko, Hagay <
> >>     >     >> [email protected]>
> >>     >     >>     > wrote:
> >>     >     >>     > >
> >>     >     >>     > > > Roshani – this is an exciting initiative, ONNX
> >> support on
> >>     > MXNet
> >>     >     >> will
> >>     >     >>     > > > enable more users to ramp up on MXNet, which is
> >> great.
> >>     >     >>     > > >
> >>     >     >>     > > > Tianqi – a few questions and thoughts about your
> >> note:
> >>     >     >>     > > > - “More hardware backends to mxnet” – MXNet
> >>     >     >>     > > > users get the same benefit of HW support by
> >>     >     >>     > > > implementing ONNX import on top of MXNet
> >>     >     >>     > > > symbolic, right?
> >>     >     >>     > > > - “NNVM Compiler now received contributions from
> >>     >     >>     > > > AWS, UW and many other folks in MXNet community.”
> >>     >     >>     > > > – agreed it is ramping up, but when you look at
> >>     >     >>     > > > the data, it is clear that it is very early on
> >>     >     >>     > > > for NNVM. Looking at the repo, it has overall 223
> >>     >     >>     > > > commits and 0 releases. Compare that to MXNet,
> >>     >     >>     > > > with 6136 commits and 32 releases. It seems to be
> >>     >     >>     > > > still early on for NNVM, and for a more reliable
> >>     >     >>     > > > initial implementation, building the import on
> >>     >     >>     > > > top of MXNet is easier, faster and safer. MXNet
> >>     >     >>     > > > has lots of users already using the Symbolic API,
> >>     >     >>     > > > which hopefully means it is a mature API that is
> >>     >     >>     > > > not likely to have breaking changes or major
> >>     >     >>     > > > issues.
> >>     >     >>     > > >
> >>     >     >>     > > > I’m supportive of option 1 proposed by Roshani
> >>     >     >>     > > > (building serde on top of MXNet symbolic), but
> >>     >     >>     > > > done as an encapsulated implementation detail, so
> >>     >     >>     > > > the implementation can be migrated to NNVM or
> >>     >     >>     > > > another implementation in the future, if at that
> >>     >     >>     > > > point it seems like the right thing to do.
> >>     >     >>     > > >
> >>     >     >>     > > > Interested in hearing other opinions though…
> >>     >     >>     > > >
> >>     >     >>     > > > Hagay
> >>     >     >>     > > >
> >>     >     >>     > > > On 10/18/17, 14:13, "Tianqi Chen" <
> >> [email protected] on
> >>     >     >> behalf of
> >>     >     >>     > > > [email protected]> wrote:
> >>     >     >>     > > >
> >>     >     >>     > > >     I am strongly recommending going through
> >>     >     >>     > > >     nnvm/top. One major reason is that support
> >>     >     >>     > > >     for the nnvm/top layer means NOT ONLY model
> >>     >     >>     > > >     format compatibility with onnx. These are the
> >>     >     >>     > > >     major benefits:
> >>     >     >>     > > >
> >>     >     >>     > > >
> >>     >     >>     > > >     - More hardware backends for mxnet,
> >>     >     >>     > > >     including opencl, metal, Raspberry Pi and the
> >>     >     >>     > > >     web browser. These are automatically enabled
> >>     >     >>     > > >     by going through this layer. In general, we
> >>     >     >>     > > >     designed the nnvm/tvm stack to resolve the
> >>     >     >>     > > >     challenge of current mxnet's weakness in
> >>     >     >>     > > >     terms of deploying to more hardware backends.
> >>     >     >>     > > >
> >>     >     >>     > > >     - More frontend capabilities. nnvm's
> >>     >     >>     > > >     gluon-style IR now ingests from CoreML, ONNX
> >>     >     >>     > > >     and, in the future, keras. Supporting those
> >>     >     >>     > > >     will reduce the amount of engineering effort
> >>     >     >>     > > >     needed.
> >>     >     >>     > > >
> >>     >     >>     > > >     - Future compatibility. We all agree that the
> >>     >     >>     > > >     future is migrating to gluon's API. NNVM/top
> >>     >     >>     > > >     tries to look ahead by directly adopting the
> >>     >     >>     > > >     symbolic API to be gluon.
> >>     >     >>     > > >
> >>     >     >>     > > >
> >>     >     >>     > > >     I would also like to correct some of the
> >>     >     >>     > > >     mentioned facts with regard to the nnvm/tvm
> >>     >     >>     > > >     stack:
> >>     >     >>     > > >
> >>     >     >>     > > >     1.   Nascent project with few contributors
> >>     >     >>     > > >
> >>     >     >>     > > >     NNVM Compiler has now received contributions
> >>     >     >>     > > >     from AWS, UW and many other folks in the
> >>     >     >>     > > >     MXNet community. NNVM itself is already being
> >>     >     >>     > > >     used by MXNet. MXNet's internal IR is
> >>     >     >>     > > >     migrating toward gluon, with its final form
> >>     >     >>     > > >     being nnvm/top.
> >>     >     >>     > > >
> >>     >     >>     > > >     2.   Does not support all operators that
> >>     >     >>     > > >     exist in MXNet Symbolic API
> >>     >     >>     > > >
> >>     >     >>     > > >     Neither NNVM/top nor onnx supports all the
> >>     >     >>     > > >     operators that exist in the mxnet symbolic
> >>     >     >>     > > >     API. The end goal here is mainly to make
> >>     >     >>     > > >     nnvm/top onnx compatible, which is a more
> >>     >     >>     > > >     reasonable goal.
> >>     >     >>     > > >
> >>     >     >>     > > >     3.  No CI pipeline and testcases
> >>     >     >>     > > >
> >>     >     >>     > > >     NNVM already contains a compiler with
> >>     >     >>     > > >     unittests and CI integration
> >>     >     >>     > > >     (https://github.com/dmlc/nnvm), with a CI
> >>     >     >>     > > >     pipeline that is well tested on CPU and GPU
> >>     >     >>     > > >     cases for the front-ends.
> >>     >     >>     > > >
> >>     >     >>     > > >     Tianqi
> >>     >     >>     > > >
> >>     >     >>     > > >
> >>     >     >>     > > >     On Wed, Oct 18, 2017 at 1:41 PM, Roshani
> >> Nagmote <
> >>     >     >>     > > > [email protected]>
> >>     >     >>     > > >     wrote:
> >>     >     >>     > > >
> >>     >     >>     > > >     > Hi guys,
> >>     >     >>     > > >     >
> >>     >     >>     > > >     >
> >>     >     >>     > > >     > I am working on supporting ONNX
> >>     >     >>     > > >     > <https://github.com/onnx/onnx> pre-trained
> >>     >     >>     > > >     > models in Apache MXNet and would like to
> >>     >     >>     > > >     > seek your opinion on the choice of
> >>     >     >>     > > >     > implementation. I have also created a
> >>     >     >>     > > >     > GitHub issue
> >>     >     >>     > > >     > <https://github.com/apache/incubator-mxnet/issues/8319>.
> >>     >     >>     > > >     > Supporting ONNX in MXNet will enable users
> >>     >     >>     > > >     > to move between frameworks with their
> >>     >     >>     > > >     > models; it will also enable the MXNet
> >>     >     >>     > > >     > project to be a part of the ONNX open
> >>     >     >>     > > >     > standard and to steer the direction of
> >>     >     >>     > > >     > ONNX.
> >>     >     >>     > > >     >
> >>     >     >>     > > >     >
> >>     >     >>     > > >     > For those who don’t know ONNX, ONNX is an
> >> open
> >>     > source
> >>     >     >> format for
> >>     >     >>     > AI
> >>     >     >>     > > > models
> >>     >     >>     > > >     > which enables models to be transferred
> >> between
> >>     >     >> frameworks. Refer
> >>     >     >>     > to
> >>     >     >>     > > >     > https://github.com/onnx/onnx for more
> >> details.
> >>     >     >>     > > >     >
> >>     >     >>     > > >     >
> >>     >     >>     > > >     > To implement the import/export
> >>     >     >>     > > >     > functionality in MXNet, I propose to expose
> >>     >     >>     > > >     > an MXNet python module “serde” (name taken
> >>     >     >>     > > >     > from the Apache Hive project) with the
> >>     >     >>     > > >     > following methods, supporting different
> >>     >     >>     > > >     > formats:
> >>     >     >>     > > >     >
> >>     >     >>     > > >     > sym, params = mxnet.serde.import(other_format_file, other_format=‘onnx’)
> >>     >     >>     > > >     >
> >>     >     >>     > > >     > other_format_file = mxnet.serde.export(mxnet_sym, mxnet_params, ‘onnx’)
> >>     >     >>     > > >     >
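[Editor's note: the proposed serde surface could be sketched as a format registry that dispatches on the format string. Everything below is a hypothetical reading of the proposal, not existing MXNet code; and since `import` is a reserved word in Python, the sketch uses `import_model`/`export_model`.]

```python
# Hypothetical sketch of a serde dispatch layer: a registry maps format
# names to (importer, exporter) pairs so new formats plug in uniformly.

_REGISTRY = {}

def register(fmt, importer, exporter):
    _REGISTRY[fmt] = (importer, exporter)

def import_model(path, other_format="onnx"):
    importer, _ = _REGISTRY[other_format]
    return importer(path)          # -> (sym, params)

def export_model(sym, params, other_format="onnx"):
    _, exporter = _REGISTRY[other_format]
    return exporter(sym, params)   # -> path of the written file

# Toy onnx converters standing in for the real implementation:
register("onnx",
         lambda path: ("sym:" + path, {}),
         lambda sym, params: sym + ".onnx")
```

With this shape, whether the converters go through MXNet symbolic or through nnvm/top stays an implementation detail behind the registry, which is the encapsulation Hagay asks for later in the thread.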
> >>     >     >>     > > >     >
> >>     >     >>     > > >     > The implementation under the hood can be
> >> done in
> >>     > two ways:
> >>     >     >>     > > >     >
> >>     >     >>     > > >     >
> >>     >     >>     > > >     > 1) Implement at the MXNet layer by parsing
> >>     >     >>     > > >     > the ONNX model (in protobuf format),
> >>     >     >>     > > >     > turning it into MXNet Symbolic operators
> >>     >     >>     > > >     > and building the MXNet model directly.
> >>     >     >>     > > >     > Similarly, I can convert the MXNet model to
> >>     >     >>     > > >     > ONNX format at this layer.
> >>     >     >>     > > >     >
> >>     >     >>     > > >     >
> >>     >     >>     > > >     > 2) The DMLC community has released the
> >>     >     >>     > > >     > nnvm/tvm compiler and an intermediate
> >>     >     >>     > > >     > representation of the models; refer to:
> >>     >     >>     > > >     > <http://www.tvmlang.org/2017/10/06/nnvm-compiler-announcement.html>
> >>     >     >>     > > >     >
> >>     >     >>     > > >     > Based on the conversation on the GitHub
> >>     >     >>     > > >     > issue
> >>     >     >>     > > >     > <https://github.com/apache/incubator-mxnet/issues/8319>
> >>     >     >>     > > >     > I opened, Mu mentioned that MXNet would use
> >>     >     >>     > > >     > nnvm/tvm as the backend in the future.
> >>     >     >>     > > >     >
> >>     >     >>     > > >     > We could hook into this layer to implement
> >>     >     >>     > > >     > the import/export functionality. nnvm/tvm
> >>     >     >>     > > >     > has an ONNX 0.1 import implemented.
> >>     >     >>     > > >     >
> >>     >     >>     > > >     > For import:
> >>     >     >>     > > >     >
> >>     >     >>     > > >     >    1. I will need to enhance nnvm/tvm’s
> >>     >     >>     > > >     >    importer to support ONNX 0.2.
> >>     >     >>     > > >     >    2. Implement nnvm/tvm -> mxnet symbolic
> >>     >     >>     > > >     >    operators.
> >>     >     >>     > > >     >
> >>     >     >>     > > >     > For export:
> >>     >     >>     > > >     >
> >>     >     >>     > > >     >    1. mxnet -> nnvm/tvm (nnvm/tvm provides
> >>     >     >>     > > >     >    this implementation already).
> >>     >     >>     > > >     >    2. I will need to implement
> >>     >     >>     > > >     >    nnvm/tvm -> onnx.
> >>     >     >>     > > >     >
> >>     >     >>     > > >     >
> >>     >     >>     > > >     > These are the pros and cons I see in the
> >>     >     >>     > > >     > above approaches:
> >>     >     >>     > > >     >
> >>     >     >>     > > >     > 1. Import/export at the mxnet layer
> >>     >     >>     > > >     >
> >>     >     >>     > > >     > Pros:
> >>     >     >>     > > >     >
> >>     >     >>     > > >     >    1. Stable APIs currently used by users.
> >>     >     >>     > > >     >    2. Larger Apache MXNet community of
> >>     >     >>     > > >     >    contributors.
> >>     >     >>     > > >     >    3. CI pipeline to catch bugs.
> >>     >     >>     > > >     >    4. Comparatively less time to implement
> >>     >     >>     > > >     >    and put it in the hands of the users.
> >>     >     >>     > > >     >
> >>     >     >>     > > >     > Cons:
> >>     >     >>     > > >     >
> >>     >     >>     > > >     >    1. In the future we may have to
> >>     >     >>     > > >     >    reimplement at the nnvm/tvm layer, in
> >>     >     >>     > > >     >    case MXNet moves to the nnvm/tvm
> >>     >     >>     > > >     >    backend (assuming it will move).
> >>     >     >>     > > >     >
> >>     >     >>     > > >     > 2. Import/export at the nnvm/tvm layer
> >>     >     >>     > > >     >
> >>     >     >>     > > >     > Pros:
> >>     >     >>     > > >     >
> >>     >     >>     > > >     >    1. Less engineering work in case mxnet
> >>     >     >>     > > >     >    moves to nnvm/tvm.
> >>     >     >>     > > >     >    2. nnvm/tvm would become a hub to
> >>     >     >>     > > >     >    convert to different formats.
> >>     >     >>     > > >     >    3. nnvm operators are more in parity
> >>     >     >>     > > >     >    with mxnet’s gluon APIs; this could be
> >>     >     >>     > > >     >    useful in case Gluon becomes the only
> >>     >     >>     > > >     >    standard that MXNet will support.
> >>     >     >>     > > >     >
> >>     >     >>     > > >     > Cons:
> >>     >     >>     > > >     >
> >>     >     >>     > > >     >    1. Nascent project with few
> >>     >     >>     > > >     >    contributors.
> >>     >     >>     > > >     >    2. Does not support all operators that
> >>     >     >>     > > >     >    exist in MXNet Symbolic API.
> >>     >     >>     > > >     >    3. No CI pipeline.
> >>     >     >>     > > >     >    4. Current Apache MXNet project does
> >>     >     >>     > > >     >    not use the nnvm/tvm backend.
> >>     >     >>     > > >     >    5. mxnet -> nnvm/tvm backend needs more
> >>     >     >>     > > >     >    testing and user feedback.
> >>     >     >>     > > >     >
> >>     >     >>     > > >     >
> >>     >     >>     > > >     > Any suggestions on either of these
> >>     >     >>     > > >     > approaches? From a user's perspective, this
> >>     >     >>     > > >     > will be an implementation detail that is
> >>     >     >>     > > >     > not exposed.
> >>     >     >>     > > >     >
> >>     >     >>     > > >     > Thanks,
> >>     >     >>     > > >     >
> >>     >     >>     > > >     > Roshani
> >>     >     >>     > > >     >
> >>     >     >>     > > >
> >>     >     >>     > > >
> >>     >     >>     > > >
> >>     >     >>     > > >
> >>     >     >>     > >
> >>     >     >>     >
> >>     >     >>
> >>     >     >>
> >>     >     >>
> >>     >     >>
> >>     >     >
> >>     >
> >>     >
> >>     >
> >>     >
> >>
> >>
> >>
> >>
> >
>
