RE: Design proposal - MXNet end to end models - Models with data transformations

2019-01-16 Thread Zhao, Patric
+1 for this great proposal. 

MXNet will be more flexible and portable with this new feature :)

Thanks,

--Patric


> -----Original Message-----
> From: sandeep krishnamurthy [mailto:sandeep.krishn...@gmail.com]
> Sent: Thursday, January 17, 2019 8:47 AM
> To: dev@mxnet.incubator.apache.org
> Subject: Design proposal - MXNet end to end models - Models with data
> transformations
> 
> Hello Community,
> 
> My fellow MXNet contributors (Jake
> <https://github.com/stu1130>, Karan <https://github.com/karan6181>) and I
> are working on the following problem:
> 1. Some of the data transformations used in training are also applicable
> during inference. Most commonly, the transformations applied to validation
> data are the same as those required during inference.
> 2. MXNet models do not contain data transformations as part of the graph,
> which makes recreating the data transformations during inference harder,
> time consuming, and a duplication of effort. This problem is most evident
> in cross-language use cases, e.g. training in Gluon (Python) and inference
> in Java/C++.
>
> After a few initial discussions with some MXNet contributors (Zhi
> <https://github.com/zhreshold>, Naveen <https://github.com/nswamy>,
> Sina <https://github.com/safrooze>), the design proposal, development plan,
> tasks, milestones, and more details are captured in this document:
> https://cwiki.apache.org/confluence/display/MXNET/MXNet+end+to+end+models
>
> Please do provide your feedback via comments in the document or on this
> e-mail thread. All contributions are welcome. I will be creating JIRA
> stories and issues for the initial tasks identified.
> 
> --
> Sandeep Krishnamurthy


Re: Design proposal - MXNet end to end models - Models with data transformations

2019-01-16 Thread Sheng Zha
Hi Sandeep,

Thanks for taking the initiative and sharing the proposal. It's great to
see the image operators being extended.

To summarize, the design for the first phase provides two alternatives:
  - D1: Use the existing approach of expressing the data transformation
pipeline as a HybridBlock, and use operators for the transformations to
achieve portability. Extend the operators for performance and usability.
  - D2 (called alternative approach 1 in the proposal): Extend the model
export API to express the concept of auxiliary graphs in the same JSON
symbol file.

First, on D2: the proposed addition of an auxiliary graph seems neither
sufficient on its own, nor necessary. This is because the additional field
relies on the operators and symbolic interface of MXNet. If one can use a
HybridBlock to express the data preprocessing logic, that HybridBlock can
already, without the addition of the field, be easily exported and then
imported as a separate symbol from the model symbol, and used in other
language bindings for data preprocessing. On the other hand, if the logic
cannot be expressed as a HybridBlock, then you still wouldn't be able to
put it in the auxiliary graph field anyway.
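
For illustration, a minimal sketch of that point, using only the to_tensor
and normalize operators that already exist today (the block and file names
below are just examples, not part of the proposal):

import mxnet as mx
from mxnet import nd
from mxnet.gluon import nn
from mxnet.gluon.data.vision import transforms

# Preprocessing expressed as a HybridBlock built from existing operators.
preprocess = nn.HybridSequential()
preprocess.add(transforms.ToTensor())
preprocess.add(transforms.Normalize(mean=(0.485, 0.456, 0.406),
                                    std=(0.229, 0.224, 0.225)))
preprocess.hybridize()

# One forward pass with a dummy HWC uint8 image records the graph.
dummy = nd.random.uniform(0, 255, shape=(224, 224, 3)).astype('uint8')
preprocess(dummy)

# Export the pipeline as its own symbol/params pair, separate from the
# model symbol; other language bindings can then load it for preprocessing.
preprocess.export('preprocess', epoch=0)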

For D1, extending the image operators and relying on them for portability
is definitely the right direction and the shortest path. Since this
approach comes from GluonCV and is already available as part of the export
helper [1], there's nothing new to review in the approach.
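
Roughly, using the GluonCV helper looks like the following (a sketch only;
the exact signature and defaults may differ across GluonCV versions):

import mxnet as mx
from gluoncv import model_zoo
from gluoncv.utils import export_block

# Export a pretrained model with the standard ImageNet mean/std
# normalization folded into the graph, so consumers in other language
# bindings can feed raw HWC uint8 images directly.
net = model_zoo.get_model('resnet18_v1', pretrained=True)
export_block('resnet18_v1', net, data_shape=(224, 224, 3),
             preprocess=True, layout='HWC')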

On the specific PRs listed as part of D1:
- It is great to see that stu1130@ is implementing the resize, center_crop,
and crop operators from scratch [2][3][4]. These features have long been
desired, kudos!
- It is very nice to see GPU and batch support being added to the to_tensor
and normalize operators in the PRs [5][6] that sandeep-krishnamurthy@ is
working on. Some minor issues:
  - These PRs seem to assume that these are not yet operators. But they
certainly are.
  - As a result of this assumption, they move the existing code to new
files. Generally we should minimize such no-op changes, as they cause
trouble in viewing the edit history, and make them only when absolutely
necessary.
(And a call for review to the community: if you love CV, help on these PRs
is much appreciated.)
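
For context, this is roughly what the existing operators already provide on
a single CPU image; the PRs above extend them with GPU and batch support
(the values below are just an illustration):

import mxnet as mx
from mxnet import nd

# A single HWC uint8 image, as produced by the image decoders.
img = nd.random.uniform(0, 255, shape=(224, 224, 3)).astype('uint8')

# to_tensor: HWC uint8 in [0, 255] -> CHW float32 in [0, 1)
chw = nd.image.to_tensor(img)

# normalize: per-channel (x - mean) / std
out = nd.image.normalize(chw,
                         mean=(0.485, 0.456, 0.406),
                         std=(0.229, 0.224, 0.225))
print(out.shape, out.dtype)  # (3, 224, 224), float32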

Finally, I have a suggestion on the review request. This proposal lists a
plan of four phases while only providing designs for the first one. In this
case, unless you have solutions to address them for the community to
review, the aspirational future phases may be better suited for a separate
roadmap discussion. As I spent quite some time going through the proposal
but found little in the way of a new approach to review, I'd suggest not
mixing them into a design proposal or review request next time.

Hope it helps.

-sz

[1]
https://github.com/dmlc/gluon-cv/blob/master/gluoncv/utils/export_helper.py
[2] https://github.com/apache/incubator-mxnet/pull/13611/files
[3] https://github.com/apache/incubator-mxnet/pull/13694/files
[4] https://github.com/apache/incubator-mxnet/pull/13679/files
[5] https://github.com/apache/incubator-mxnet/pull/13837/files
[6] https://github.com/apache/incubator-mxnet/pull/13802/files


On Wed, Jan 16, 2019 at 4:47 PM sandeep krishnamurthy <
sandeep.krishn...@gmail.com> wrote:

> Hello Community,
>
> My fellow MXNet contributors (Jake <https://github.com/stu1130>,
> Karan <https://github.com/karan6181>) and I are working on the following
> problem:
> 1. Some of the data transformations used in training are also applicable
> during inference. Most commonly, the transformations applied to validation
> data are the same as those required during inference.
> 2. MXNet models do not contain data transformations as part of the graph,
> which makes recreating the data transformations during inference harder,
> time consuming, and a duplication of effort. This problem is most evident
> in cross-language use cases, e.g. training in Gluon (Python) and inference
> in Java/C++.
>
> After a few initial discussions with some MXNet contributors (Zhi
> <https://github.com/zhreshold>, Naveen <https://github.com/nswamy>, Sina
> <https://github.com/safrooze>), the design proposal, development plan,
> tasks, milestones, and more details are captured in this document:
> https://cwiki.apache.org/confluence/display/MXNET/MXNet+end+to+end+models
>
> Please do provide your feedback via comments in the document or on this
> e-mail thread. All contributions are welcome. I will be creating JIRA
> stories and issues for the initial tasks identified.
>
> --
> Sandeep Krishnamurthy
>


Design proposal - MXNet end to end models - Models with data transformations

2019-01-16 Thread sandeep krishnamurthy
Hello Community,

My fellow MXNet contributors (Jake <https://github.com/stu1130>,
Karan <https://github.com/karan6181>) and I are working on the following
problem:
1. Some of the data transformations used in training are also applicable
during inference. Most commonly, the transformations applied to validation
data are the same as those required during inference.
2. MXNet models do not contain data transformations as part of the graph,
which makes recreating the data transformations during inference harder,
time consuming, and a duplication of effort. This problem is most evident
in cross-language use cases, e.g. training in Gluon (Python) and inference
in Java/C++.
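
To illustrate the problem, here is a small sketch of the status quo (the
specific transform values are just an example): the inference-time pipeline
is defined in Python and is not captured when the model is exported, so a
Java/C++ consumer has to re-implement it by hand.

from mxnet.gluon.data.vision import transforms

# Validation/inference-time pipeline, defined in the Python training script.
val_transform = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=(0.485, 0.456, 0.406),
                         std=(0.229, 0.224, 0.225)),
])

# net.export('model') saves only the network symbol and weights; the
# val_transform pipeline above is not part of the exported graph.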

After a few initial discussions with some MXNet contributors (Zhi
<https://github.com/zhreshold>, Naveen <https://github.com/nswamy>, Sina
<https://github.com/safrooze>), the design proposal, development plan,
tasks, milestones, and more details are captured in this document:
https://cwiki.apache.org/confluence/display/MXNET/MXNet+end+to+end+models

Please do provide your feedback via comments in the document or on this
e-mail thread. All contributions are welcome. I will be creating JIRA
stories and issues for the initial tasks identified.

-- 
Sandeep Krishnamurthy