Hi Sandeep,

Thanks for taking the initiative and sharing the proposal. It's great to see the image operators being extended.

To summarize, the design for the first phase provides two alternatives:
- D1: Use the existing approach of expressing the data transformation pipeline as a HybridBlock, and use operators for the transformations to achieve portability. Extend the operators for performance and usability.
- D2 (called "alternative approach 1" in the proposal): Extend the model export API to express the concept of auxiliary graphs in the same JSON symbol file.

First, on D2: the proposed addition of an auxiliary graph seems neither sufficient by itself nor necessary, because the additional field relies on the operators and the symbolic interface of MXNet. If one can use a HybridBlock to express the data preprocessing logic, this HybridBlock can already, without the additional field, be easily exported and then imported as a symbol separate from the model symbol, and used in other language bindings for data preprocessing. On the other hand, if the logic cannot be expressed as a HybridBlock, then you still wouldn't be able to put it in the auxiliary graph field anyway.

For D1, extending the image operators and relying on them for portability is definitely the right direction and the shortest path. Since this approach comes from GluonCV and is already available as part of the export helper [1], there's nothing new to review on the approach.

On the specific PRs listed as part of D1:
- It is great to see that stu1130@ is implementing the resize, center_crop, and crop operators from scratch [2][3][4]. These features have long been desired, kudos!
- It is very nice to see GPU and batch support being added in the to_tensor and normalize operator PRs [5][6] that sandeep-krishnamurthy@ is working on. Some minor issues:
  - These PRs seem to assume that these are not yet operators. But they certainly are.
  - As a result of this assumption, they move the existing code to new files. Generally we should minimize such no-op changes, as they cause trouble in viewing the edit history, and make them only when absolutely necessary.

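As a rough, library-free illustration of the D2 argument above: if a preprocessing pipeline is expressed purely in terms of named operators, it can be serialized on its own, next to the model graph, and replayed by any runtime that implements the same operators, with no extra field in the model file. This is not MXNet code; the operator registry, the JSON layout, and the helper names (OPS, run_graph, predict) are all made up for the sketch.

```python
import json

# Hypothetical operator registry shared by every "language binding".
# The operator names and signatures are illustrative only, not MXNet's.
OPS = {
    "to_tensor": lambda x: [v / 255.0 for v in x],
    "normalize": lambda x, mean, std: [(v - mean) / std for v in x],
    "scale": lambda x, factor: [v * factor for v in x],  # stand-in for a model
}


def run_graph(graph_json, data):
    """Replay a serialized operator pipeline on a flat list of values."""
    for node in json.loads(graph_json):
        data = OPS[node["op"]](data, **node.get("attrs", {}))
    return data


# The preprocessing pipeline is serialized as its own graph ...
preprocess_json = json.dumps([
    {"op": "to_tensor"},
    {"op": "normalize", "attrs": {"mean": 0.5, "std": 0.5}},
])

# ... separately from the (toy) model graph.
model_json = json.dumps([{"op": "scale", "attrs": {"factor": 2.0}}])


def predict(pixels):
    """Inference side: load both graphs independently and chain them."""
    return run_graph(model_json, run_graph(preprocess_json, pixels))


print(predict([0, 255]))
```

The sketch also shows the other half of the argument: a transformation that has no entry in the operator registry cannot be serialized in either form, whether as a separate graph or as an auxiliary field.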
(And a call for review to the community: if you love CV, help on the PRs above is much appreciated.)

Finally, I have a suggestion on the review request. This proposal lists a plan of four phases while only providing designs for the first one. In this case, unless you have solutions for the later phases for the community to review, the wishful future phases may be better suited for a separate roadmap discussion. As I spent quite some time going through the proposal but found few new approaches to review, I'd suggest not mixing the two in a design proposal or review request next time.

Hope it helps.

-sz

[1] https://github.com/dmlc/gluon-cv/blob/master/gluoncv/utils/export_helper.py
[2] https://github.com/apache/incubator-mxnet/pull/13611/files
[3] https://github.com/apache/incubator-mxnet/pull/13694/files
[4] https://github.com/apache/incubator-mxnet/pull/13679/files
[5] https://github.com/apache/incubator-mxnet/pull/13837/files
[6] https://github.com/apache/incubator-mxnet/pull/13802/files

On Wed, Jan 16, 2019 at 4:47 PM sandeep krishnamurthy <sandeep.krishn...@gmail.com> wrote:

> Hello Community,
>
> Me along with fellow MXNet contributors (Jake <https://github.com/stu1130>,
> Karan <https://github.com/karan6181>) are working on the following
> problem:
> 1. Some of the data transformations used in training is applicable during
> inference. Most commonly transformations on validation data is same as
> transformations required during inference.
> 2. MXNet models do not contain data transformations as part of the graph.
> Making it harder, time consuming and duplicated effort to re create data
> transformation during inference. This problem is more evident in cross
> language use cases. Training in Gluon (Python) and inference in Java/C++.
>
> After few initial discussions with some of MXNet contributors (Zhi
> <https://github.com/zhreshold>, Naveen <https://github.com/nswamy>, Sina
> <https://github.com/safrooze>), design proposal, development plan, tasks,
> milestones and more details are captured in this document.
> https://cwiki.apache.org/confluence/display/MXNET/MXNet+end+to+end+models
>
> Please do provide your feedback via comments in the document or on this
> e-mail. All contributions are welcome. I will be creating JIRA stories and
> issues for initial tasks identified.
>
> --
> Sandeep Krishnamurthy