Hi Yan Zhe,

I am very excited about Cambricon’s proposal to integrate with MXNet. The proposal is quite comprehensive, but one piece that I find missing is graph partitioning. In your proposal you mention that CNML may not support all MXNet operators, so some parts of the graph may run on the host device. Are you planning to do something similar to how TensorRT integrated with MXNet? They also have to do a “compile” step and convert parts of the graph to their own format for execution. Here is the PR with their recent changes supporting the subgraph API:
https://github.com/apache/incubator-mxnet/pull/14040

You can take a look at how they did it to get a better idea of whether the same can work for you.

I think the level of integration you propose is quite extensive and will touch all the underlying components of MXNet (and its dependencies in TVM/NNVM). So I’m not sure that’s the right approach, since it will take a lot of time to make all the changes and test that nothing has broken. Intel’s changes for MKLDNN touched quite a few pieces, but not as many as you’re proposing. I would like to see a retrospective from Intel at some point on how they think their integration went and whether we can make it better for the next time (i.e. Cambricon).

Thanks, Tao, for the Accelerator proposal plug! While I would like to see Cambricon’s accelerator use that work, I don’t want to hold up their integration into MXNet. We are just getting started with this work. Our next milestone is to put up an initial WIP PR and discuss it with the community, so I’m not sure that that is the preferred approach (since it hasn’t been accepted by the community yet, either).

Sam

On Jul 4, 2019, at 12:55 AM, 严哲 <[email protected]> wrote:

Hi Tao,

Thanks for your suggestions. I have read the proposal "Bring your own Accelerator". The feature it proposes is excellent. After reading the proposal, I have two questions:

1. Is our current design, shown in the proposal "Design of CNML/CNRT Integration", feasible?
2. If the approach proposed in "Bring your own Accelerator" is preferred, we will refactor our design in that direction. In that case, I would like to know whether the feature in "Bring your own Accelerator" has been approved and is in the plan.
-----Original Message-----
From: "Lv, Tao A" <[email protected]>
Sent: 2019-06-28 17:46:22 (Friday)
To: [email protected], [email protected]
Cc: [email protected]
Subject: RE: Proposal - Design of CNML/CNRT Integration

Hi Yan Zhe,

Thanks for the nice proposal. In case you didn't know, there is a meta proposal for bringing new accelerators to MXNet:
https://cwiki.apache.org/confluence/display/MXNET/Bring+your+own+Accelerator

Before reading in detail, I'm curious to know whether you have any performance data for the proposal, and what the validation plan is for a new hardware backend.

Thanks,
-tao

-----Original Message-----
From: 严哲 [mailto:[email protected]]
Sent: Friday, June 28, 2019 4:42 PM
To: [email protected]
Subject: Proposal - Design of CNML/CNRT Integration

Hello Community,

I am from Cambricon, a company that develops machine learning processors. I have written a proposal that introduces the design for integrating CNML (Cambricon Neuware Machine Learning Library) and CNRT (Cambricon Neuware Runtime Library) into MXNet. After the integration with CNML/CNRT, developers can work with MXNet on Cambricon machine learning processors.

I am looking forward to your valuable suggestions, especially suggestions that take into account the recent updates in MXNet version 1.5.0. Any feedback and help will be greatly appreciated.

Design proposal:
https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=120722127

Thanks & Best Regards,
Yan Zhe

------------------------------
Thanks & Best Regards,
Yan Zhe
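
For readers following the graph partitioning question raised in Sam's reply, the basic idea can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not MXNet's subgraph API and not CNML's interface: the `CNML_SUPPORTED` set, the choice of which operators are in it, and the `partition_by_support` helper are all hypothetical, and the graph is simplified to a topologically ordered list of operator names rather than a real DAG.

```python
from itertools import groupby

# Hypothetical: operators we pretend the accelerator library can run.
# The real set supported by CNML is not stated in this thread.
CNML_SUPPORTED = {"Convolution", "Activation", "Pooling", "FullyConnected"}


def partition_by_support(ops, supported):
    """Split an ordered list of operator names into maximal contiguous runs.

    Each run is tagged "accelerator" if every op in it is in `supported`,
    otherwise "host". A backend could then compile each accelerator run
    into its own format and leave the host runs to the default executor.
    """
    segments = []
    for on_accel, run in groupby(ops, key=lambda op: op in supported):
        device = "accelerator" if on_accel else "host"
        segments.append((device, list(run)))
    return segments


if __name__ == "__main__":
    # Hypothetical model, already in execution order.
    model_ops = [
        "Convolution", "Activation", "Custom",
        "Pooling", "FullyConnected", "SoftmaxOutput",
    ]
    for device, run in partition_by_support(model_ops, CNML_SUPPORTED):
        print(device, run)
```

In this simplified form, every unsupported operator forces a switch back to the host, so the number of segments gives a rough sense of how many device transfers a given operator coverage would cost.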
