OK, there is some miscommunication here, I guess. We only need a "canonicalization" step in the Python API that acts as a symbol-to-symbol translation layer. It can be done purely in Python; there is no need to go "down" into C++ for this.
For example, the current nnvm.from_mxnet API takes a Module or Gluon model and gives you back an nnvm/top graph in Python. All we are asking for is to decompose it into:

    def mxnet_to_onnx(module):
        nnvm_graph, params = nnvm_from_mxnet(module)
        onnx = nnvm_to_onnx(nnvm_graph, params)
        return onnx

This allows nnvm_from_mxnet to be reused for other purposes, like the compilation API that produces deployable modules.

Tianqi

On Wed, Oct 18, 2017 at 9:55 PM, Lupesko, Hagay <lupe...@gmail.com> wrote:

> Tianqi:
> Thanks for detailing the trends. I fully agree that ONNX is just a graph serialization format – nothing more, nothing less. I also think we all agree that this simple mechanism holds lots of value for DL users, since it allows them to move between frameworks easily (e.g. train with MXNet, deploy on a mobile device with Caffe2, or the other way around).
> As you said, an in-memory IR is different from a serialization format such as ONNX. In-memory IRs are designed to make runtime execution as efficient as possible, leveraging software and hardware optimizations. They are indeed complex, and that is where the "meat" is.
> (BTW, ONNX regards itself as an "IR" format, but not in the same sense as NNVM.)
>
> At the end of the day, Roshani is aiming to deliver simple functionality to MXNet users: (1) take an ONNX file and load it into MXNet, so you get a graph+weights you can work with; (2) given a trained model, save it as an ONNX file. Since MXNet users do not interact with NNVM directly, but rather with the MXNet API (MXNet Module), isn't the simplest thing to just construct the Module "on the fly" using the MXNet API? Taking the other approach, we will go from the top-level MXNet "load" API, go "down" to NNVM to construct the graph, and go back up to MXNet to expose it as a Module. This seems too complex and does not add any benefit.
> In whatever way we construct the MXNet Module object, NNVM will always be the underlying in-memory IR that is being executed, so why not take the simpler route?
>
> Hagay
>
> On 10/18/17, 19:42, "Tianqi Chen" <workc...@gmail.com on behalf of tqc...@cs.washington.edu> wrote:
>
> Hi Chris:
>
> There is no intention to move things away from mxnet. The reduction in lines of code comes from having a better design in general; you usually write less redundant code by benefiting from better design. As I may quote: "the best design is achieved not when there is nothing to add, but when there is nothing to be taken away."
>
> MXNet has always benefited from this philosophy and improves with new designs and proper modularization. For example, we saw such reduction and convenience when migrating from MXNet's legacy ops to NNVM's mechanism. The new mechanism now enables things like sparse-aware support and other features that would otherwise be much harder to support.
>
> The nnvm/tvm stack brings the same benefit (if not more) and will only add more features to MXNet itself: offering more hardware backends and optimizations, and allowing us to write less code and spend less time optimizing for each backend by going through TVM.
>
> Tianqi
>
> On Wed, Oct 18, 2017 at 7:15 PM, Chris Olivier <cjolivie...@gmail.com> wrote:
>
> > Reduce the code base of mxnet? By increasing the scope of the dmlc modules? Is the intent to make mxnet a thin language wrapper around a group of dmlc modules?
> >
> > On Wed, Oct 18, 2017 at 6:58 PM Tianqi Chen <tqc...@cs.washington.edu> wrote:
> >
> > > To better answer Hagay's question, I would like to dive a bit deeper into the relation between MXNet, NNVM, and model exchange formats like ONNX.
> > >
> > > There are two major trends in deep learning systems now:
> > >
> > > - Common serializable formats, like ONNX and CoreML, that define the model exchange format.
> > > - The in-memory graph IR for quick optimization and JIT. NNVM and TensorFlow's XLA fall into this category.
> > >
> > > The exchange formats are great: they only pose a layer of conversion, which is good for exchange. The real meat still comes from the compilation and JIT pipeline you have to offer. For that, we need an in-memory IR, because the cost of constructing and serializing exchange formats like protobuf could be high. And usually the exchange formats are designed in a minimalistic fashion, making it harder to attach the extra information needed for in-depth optimizations like automatic quantization and accelerator support.
> > >
> > > The current MXNet relies on NNVM for in-memory IR manipulation but does not contain a compilation component that compiles to the hardware backends. Exporting to an exchange format and then importing back into NNVM to run the compilation poses more of a burden than a JIT compiler can pay. Using the same in-memory graph IR as the compilation stack gives a much greater advantage here.
> > >
> > > The newly introduced nnvm/top and compiler offer in-memory graph optimization and compilation, and offer more hardware backends directly via TVM. We already see promising results in edge deployments with a much lower runtime overhead. We will quickly benefit further from the additional graph optimizations it has to offer.
> > >
> > > Building support around this new paradigm offers us the advantage of being future-compatible and takes full benefit of the points I mentioned above.
> > >
> > > Tianqi
> > >
> > > On Wed, Oct 18, 2017 at 4:57 PM, Lupesko, Hagay <lupe...@gmail.com> wrote:
> > >
> > > > Roshani – this is an exciting initiative; ONNX support in MXNet will enable more users to ramp up on MXNet, which is great.
> > > >
> > > > Tianqi – a few questions and thoughts about your note:
> > > > - "More hardware backends to mxnet" – MXNet users get the same benefit of HW support by implementing ONNX import on top of MXNet symbolic, right?
> > > > - "NNVM Compiler now received contributions from AWS, UW and many other folks in MXNet community." – agreed it is ramping up, but when you look at the data, it is clear that it is very early on for NNVM. Looking at the repo, it has 223 commits overall and 0 releases. Compare that to MXNet, with 6136 commits and 32 releases. It seems to be still early on for NNVM, and for a more reliable initial implementation, building the import on top of MXNet is easier, faster and safer. MXNet has lots of users already using the Symbolic API, which hopefully means it is a mature API that is not likely to have breaking changes or major issues.
> > > >
> > > > I'm supportive of option 1 proposed by Roshani (building serde on top of MXNet symbolic), but as an encapsulated implementation detail, so the implementation can be migrated to NNVM or another implementation in the future, if at that point it seems like the right thing to do.
> > > >
> > > > Interested in hearing other opinions though…
> > > >
> > > > Hagay
> > > >
> > > > On 10/18/17, 14:13, "Tianqi Chen" <workc...@gmail.com on behalf of tqc...@cs.washington.edu> wrote:
> > > >
> > > > I am strongly recommending going through nnvm/top. One major reason is that support for the nnvm/top layer means NOT ONLY compatibility of the model format with ONNX. These are the major benefits:
> > > >
> > > > - More hardware backends for mxnet, including OpenCL, Metal, Raspberry Pi, and the web browser.
> > > > These things are automatically enabled by going through this layer. In general, we designed the nnvm/tvm stack to resolve the challenge of current mxnet's weakness in deploying to more hardware backends.
> > > >
> > > > - More frontend capabilities: nnvm's Gluon-style IR now ingests from CoreML and ONNX, and in the future Keras. Supporting those will reduce the amount of engineering effort needed.
> > > >
> > > > - Future compatibility. We all agree that the future is migrating to Gluon's API. NNVM/top tries to look ahead by directly adopting the symbolic API to be Gluon's.
> > > >
> > > > I would also like to correct some of the mentioned facts with regard to the nnvm/tvm stack:
> > > >
> > > > 1. Nascent project with few contributors
> > > >
> > > > NNVM Compiler has now received contributions from AWS, UW and many other folks in the MXNet community. NNVM itself is already being used by MXNet. MXNet's internal IR is migrating toward Gluon, with its final form being nnvm/top.
> > > >
> > > > 2. Does not support all operators that exist in MXNet Symbolic API
> > > >
> > > > Neither NNVM/top nor ONNX supports all operators that exist in the mxnet symbolic API. The end goal here is mainly to make nnvm/top ONNX-compatible, which is a more reasonable goal.
> > > >
> > > > 3. No CI pipeline and test cases
> > > >
> > > > NNVM already contains a compiler with unit tests and CI-tested integration (https://github.com/dmlc/nnvm), with a CI pipeline that is well tested on CPU and GPU cases for the front-ends.
> > > >
> > > > Tianqi
> > > >
> > > > On Wed, Oct 18, 2017 at 1:41 PM, Roshani Nagmote <roshaninagmo...@gmail.com> wrote:
> > > >
> > > > > Hi guys,
> > > > >
> > > > > I am working on supporting ONNX <https://github.com/onnx/onnx> pre-trained models in Apache MXNet and would like to seek your opinion on the choice of implementation. I have also created a GitHub issue <https://github.com/apache/incubator-mxnet/issues/8319>. Supporting ONNX in MXNet will enable users to move between frameworks with their models; it will also enable the MXNet project to be a part of the ONNX open standard and steer the direction of ONNX.
> > > > >
> > > > > For those who don't know ONNX, ONNX is an open source format for AI models which enables models to be transferred between frameworks. Refer to https://github.com/onnx/onnx for more details.
> > > > >
> > > > > To implement the import/export functionality in MXNet, I propose to expose an MXNet Python module "serde" (name taken from the Apache Hive project) with the following methods supporting different formats:
> > > > >
> > > > > sym, params = mxnet.serde.import(other_format_file, other_format='onnx')
> > > > >
> > > > > other_format_file = mxnet.serde.export(mxnet_sym, mxnet_params, 'onnx')
> > > > >
> > > > > The implementation under the hood can be done in two ways:
> > > > >
> > > > > 1) Implement at the MXNet layer, by parsing the ONNX model (in protobuf format) and turning it into MXNet Symbolic operators to build the MXNet model directly. Similarly, I can convert the MXNet model to ONNX format at this layer.
> > > > >
> > > > > 2) The DMLC community has released the nnvm/tvm compiler and an intermediate representation of the models; refer to http://www.tvmlang.org/2017/10/06/nnvm-compiler-announcement.html
> > > > >
> > > > > Based on the conversation on the GitHub issue <https://github.com/apache/incubator-mxnet/issues/8319> I opened, Mu mentioned that MXNet would use nnvm/tvm as the backend in the future.
> > > > >
> > > > > We could hook into this layer to implement the import/export functionality. nnvm/tvm has ONNX 0.1 version import implemented.
> > > > >
> > > > > For import:
> > > > >
> > > > > 1. I will need to enhance nnvm/tvm's importer to support ONNX 0.2.
> > > > > 2. Implement nnvm/tvm->mxnet symbolic operators.
> > > > >
> > > > > For export:
> > > > >
> > > > > 1. mxnet->nnvm/tvm (nnvm/tvm provides this implementation already).
> > > > > 2. I will need to implement nnvm/tvm->onnx.
> > > > >
> > > > > These are the pros and cons I see in the above approaches:
> > > > >
> > > > > 1. Import/export at the mxnet layer
> > > > >
> > > > > Pros:
> > > > >
> > > > > 1. Stable APIs currently used by users.
> > > > > 2. Larger Apache MXNet community of contributors.
> > > > > 3. CI pipeline to catch bugs.
> > > > > 4. Comparatively less time to implement and put it in the hands of the users.
> > > > >
> > > > > Cons:
> > > > >
> > > > > 1. In the future we may have to reimplement at the nnvm/tvm layer, in case MXNet moves to the nnvm/tvm backend (assuming it will move).
> > > > >
> > > > > 2. Import/export at the nnvm/tvm layer
> > > > >
> > > > > Pros:
> > > > >
> > > > > 1. Less engineering work in case mxnet moves to nnvm/tvm.
> > > > > 2. nnvm/tvm would become a hub to convert to different formats.
> > > > > 3. nnvm operators are more in parity with mxnet's Gluon APIs; this could be useful in case Gluon becomes the only standard that MXNet will support.
> > > > >
> > > > > Cons:
> > > > >
> > > > > 1. Nascent project with few contributors.
> > > > > 2. Does not support all operators that exist in MXNet Symbolic API.
> > > > > 3. No CI pipeline.
> > > > > 4. Current Apache MXNet project does not use the nnvm/tvm backend.
> > > > > 5. mxnet->nnvm/tvm backend needs more testing and user feedback.
> > > > >
> > > > > Any suggestions on either of these approaches? From the user's perspective, this will be an implementation detail that is not exposed.
> > > > >
> > > > > Thanks,
> > > > >
> > > > > Roshani
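The Python-level decomposition Tianqi proposes at the top of the thread can be sketched as below. The converter bodies here are hypothetical stubs standing in for the real converters (which would live in the nnvm package); the point is only the shape of the API: mxnet_to_onnx is a composition of two independently reusable steps, so nnvm_from_mxnet also stays available for other purposes such as the compilation path to deployable modules.

```python
# Sketch of the proposed decomposition (stub bodies are placeholders,
# not the real nnvm implementations).

def nnvm_from_mxnet(module):
    # Real version: translate an MXNet Module / Gluon block into an
    # nnvm/top graph plus a dict of parameters.
    return "nnvm_graph(%s)" % module, {"params": None}

def nnvm_to_onnx(nnvm_graph, params):
    # Real version: serialize an nnvm/top graph + params to an ONNX model.
    return "onnx(%s)" % nnvm_graph

def mxnet_to_onnx(module):
    # The exporter is just the two steps chained together.
    nnvm_graph, params = nnvm_from_mxnet(module)
    return nnvm_to_onnx(nnvm_graph, params)

print(mxnet_to_onnx("my_module"))  # -> onnx(nnvm_graph(my_module))
```

Because neither step knows about the other, replacing the ONNX serializer (or reusing the nnvm graph for compilation instead) requires no change to nnvm_from_mxnet.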