Re: [Announcement] New Committer - Hao Jin
Congrats Hao!

On Tue, Apr 30, 2019 at 10:53 PM Steffen Rochel wrote:

> Congratulations Hao!
>
> On Tue, Apr 30, 2019 at 8:05 AM MiraiWK WKCN wrote:
>
> > Congrats Hao! Welcome!
> >
> > From: Lv, Tao A
> > Sent: Tuesday, April 30, 2019 11:00:33 PM
> > To: dev@mxnet.incubator.apache.org
> > Subject: RE: [Announcement] New Committer - Hao Jin
> >
> > Congratulations Hao!
> >
> > -----Original Message-----
> > From: Jun Wu [mailto:wujun@gmail.com]
> > Sent: Tuesday, April 30, 2019 12:29 PM
> > To: dev@mxnet.incubator.apache.org
> > Subject: [Announcement] New Committer - Hao Jin
> >
> > Please join me in welcoming Hao Jin (https://github.com/haojin2) from AWS as a new committer.
> >
> > Hao has designed and implemented many sophisticated algorithms for tensor operations. His work has greatly expanded the coverage of the MXNet operator inventory and enhanced the performance of many operators that are hard to optimize. Beyond that, Hao has been active in advocating for MXNet by providing high-quality translations of quite a few technical articles and blog posts.
Re: [Announcement] New Committer - Hao Jin
Congratulations Hao!

On Tue, Apr 30, 2019 at 8:05 AM MiraiWK WKCN wrote:

> Congrats Hao! Welcome!
>
> From: Lv, Tao A
> Sent: Tuesday, April 30, 2019 11:00:33 PM
> To: dev@mxnet.incubator.apache.org
> Subject: RE: [Announcement] New Committer - Hao Jin
>
> Congratulations Hao!
>
> -----Original Message-----
> From: Jun Wu [mailto:wujun@gmail.com]
> Sent: Tuesday, April 30, 2019 12:29 PM
> To: dev@mxnet.incubator.apache.org
> Subject: [Announcement] New Committer - Hao Jin
>
> Please join me in welcoming Hao Jin (https://github.com/haojin2) from AWS as a new committer.
>
> Hao has designed and implemented many sophisticated algorithms for tensor operations. His work has greatly expanded the coverage of the MXNet operator inventory and enhanced the performance of many operators that are hard to optimize. Beyond that, Hao has been active in advocating for MXNet by providing high-quality translations of quite a few technical articles and blog posts.
Podling Report Reminder - May 2019
Dear podling,

This email was sent by an automated system on behalf of the Apache Incubator PMC. It is an initial reminder to give you plenty of time to prepare your quarterly board report.

The board meeting is scheduled for Wed, 15 May 2019, 10:30 am PDT. The report for your podling will form a part of the Incubator PMC report. The Incubator PMC requires your report to be submitted 2 weeks before the board meeting, to allow sufficient time for review and submission (Wed, May 01).

Please submit your report with sufficient time to allow the Incubator PMC, and subsequently board members, to review and digest. Again, the very latest you should submit your report is 2 weeks prior to the board meeting.

Candidate names should not be made public before people are actually elected, so please do not include the names of potential committers or PPMC members in your report.

Thanks,

The Apache Incubator PMC

Submitting your Report
----------------------

Your report should contain the following:

* Your project name
* A brief description of your project, which assumes no knowledge of the project or necessarily of its field
* A list of the three most important issues to address in the move towards graduation
* Any issues that the Incubator PMC or ASF Board might wish/need to be aware of
* How the community has developed since the last report
* How the project has developed since the last report
* How the podling rates its own maturity

This should be appended to the Incubator Wiki page at:

https://cwiki.apache.org/confluence/INCUBATOR/May2019

Note: This is manually populated. You may need to wait a little before this page is created from a template.

Mentors
-------

Mentors should review reports for their project(s) and sign them off on the Incubator wiki page. Signing off reports shows that you are following the project - projects that are not signed off may raise alarms for the Incubator PMC.

Incubator PMC
Re: [Announcement] New Committer - Zhennan Qin
Congrats, Zhennan! Well deserved.

Lin

On Tue, Apr 30, 2019 at 3:07 PM Zhao, Patric wrote:

> Congrats, Zhennan.
>
> Really great work, and it makes the MXNet quantization flow stand out worldwide!
>
> > -----Original Message-----
> > From: Lv, Tao A [mailto:tao.a...@intel.com]
> > Sent: Tuesday, April 30, 2019 11:01 PM
> > To: dev@mxnet.incubator.apache.org
> > Subject: RE: [Announcement] New Committer - Zhennan Qin
> >
> > Congratulations Zhennan!
> >
> > -----Original Message-----
> > From: Jun Wu [mailto:wujun@gmail.com]
> > Sent: Tuesday, April 30, 2019 12:29 PM
> > To: dev@mxnet.incubator.apache.org
> > Subject: [Announcement] New Committer - Zhennan Qin
> >
> > Please join me in welcoming Zhennan Qin (https://github.com/ZhennanQin) from Intel as a new committer.
> >
> > Zhennan is the main author of the work accelerating MXNet/MKLDNN inference through operator fusion and model quantization. His work has placed MXNet in an advantageous position for inference workloads on Intel CPUs compared with other DL frameworks.
RE: [Announcement] New Committer - Zhennan Qin
Congrats, Zhennan.

Really great work, and it makes the MXNet quantization flow stand out worldwide!

> -----Original Message-----
> From: Lv, Tao A [mailto:tao.a...@intel.com]
> Sent: Tuesday, April 30, 2019 11:01 PM
> To: dev@mxnet.incubator.apache.org
> Subject: RE: [Announcement] New Committer - Zhennan Qin
>
> Congratulations Zhennan!
>
> -----Original Message-----
> From: Jun Wu [mailto:wujun@gmail.com]
> Sent: Tuesday, April 30, 2019 12:29 PM
> To: dev@mxnet.incubator.apache.org
> Subject: [Announcement] New Committer - Zhennan Qin
>
> Please join me in welcoming Zhennan Qin (https://github.com/ZhennanQin) from Intel as a new committer.
>
> Zhennan is the main author of the work accelerating MXNet/MKLDNN inference through operator fusion and model quantization. His work has placed MXNet in an advantageous position for inference workloads on Intel CPUs compared with other DL frameworks.
Re: Proposal for Conversion from FP32 to Mixed Precision Models
Hi Tao,

I covered in the doc that it is specifically about inference. I can add another section in the FAQ to mention why INT8 quantization is not included.

Anirudh

On Tue, Apr 30, 2019 at 7:59 AM Lv, Tao A wrote:

> Thank you Anirudh! I'm just a little surprised that when we talk about a mixed precision model we don't talk about training, and when we talk about inference, INT8 quantization is not mentioned.
>
> -----Original Message-----
> From: Anirudh Subramanian [mailto:anirudh2...@gmail.com]
> Sent: Tuesday, April 30, 2019 8:27 PM
> To: dev@mxnet.incubator.apache.org
> Subject: Re: Proposal for Conversion from FP32 to Mixed Precision Models
>
> Hi Zach,
>
> I checked the QuantizeGraph pass and I think it can probably benefit from a CSE pass to eliminate additional quantize/quantize_v2 nodes. Having said that, I think it may still be overkill to add another NNVM pass for a generic common subexpression elimination. Currently, this elimination logic takes only an additional 3 to 6 lines of code in each of the two NNVM passes. Also, a generic common subexpression elimination pass has its own associated maintenance costs. I think it is better to continue with the current approach and revisit this need in the future as we add more NNVM passes.
>
> Anirudh
>
> On Mon, Apr 29, 2019 at 2:22 PM Anirudh Subramanian wrote:
>
> > Hi Zach,
> >
> > You raise an interesting point. Thank you for the pointer!
> >
> > Incorporating a CSE pass comes with its own cost, and the advantage it brings is to make the ReducePrecision NNVM pass more lightweight. Since the amortized cost of the ReducePrecision pass is O(1), it shouldn't matter much from a performance point of view whether we add it or not.
> >
> > From a maintenance point of view, I would agree that separating these two pieces of logic can be helpful if we have other such workflows which require the original pass followed by the CSE pass. Currently, as far as I know, only the ReducePrecision pass is using it. I will check to see whether a CSE pass can benefit other NNVM passes besides ReducePrecision, such as the quantization pass, and will get back.
> >
> > Anirudh
> >
> > On Mon, Apr 29, 2019 at 11:18 AM Zach Kimberg wrote:
> >
> > > I have one suggestion. In the current design, there are the additional maps from each input entry to each target casted entry dtype in order to avoid creating duplicate casts. Instead of creating these, another option is to use a general-purpose Common Subexpression Elimination (CSE) [1] pass applied afterwards. So, you would run the mixed precision pass, which creates the duplicates, and then the CSE pass, which would remove all duplicates.
> > >
> > > This design is common in existing compilers like LLVM because maintaining and testing the passes is much easier when they are kept as simple as possible. The CSE pass can also be reused as necessary for other passes that could create duplicates, or to remove duplicate expressions in general. This tutorial [2] talks about it a bit.
> > >
> > > Zach
> > >
> > > [1] - https://en.wikipedia.org/wiki/Common_subexpression_elimination
> > > [2] - https://blog.regehr.org/archives/1603
> > >
> > > On Mon, Apr 29, 2019 at 9:26 AM Anirudh Subramanian <anirudh2...@gmail.com> wrote:
> > >
> > > > Hi Tao,
> > > >
> > > > Thanks for raising this question! I thought about the existing quantization workflow and whether it can be included with the AMP API. Although quantization can be considered as mixed precision, there are differences. For example, only a small number of operators can be quantized compared to the operators that can run in FP16 precision. Thus, overriding the operators to run in the original dtype vs. the target dtype doesn't make much sense for quantization.
> > > >
> > > > Also, the quantization workflow may require a calibration dataset to calibrate the min and max and calib_mode. Arriving at a common API for quantization with calibration and mixed precision inference (FP16 and BF16) may make the API too complicated and not very easy to use. I understand that this may cause some confusion, as people may try to use a target_dtype of int8, but I think that is still better than causing user confusion with the API usage.
> > > >
> > > > Also, when we move the quantize_model APIs outside contrib we can consider adding them under the AMP namespace. The challenge would then be to educate users on the difference between "quantize" and "convert".
> > > >
> > > > Anirudh
> > > >
> > > > On Mon, Apr 29, 2019 at 7:45 AM Lv, Tao A wrote:
> > > >
> > > > > Thank you for the explanation. Sorry, I didn't realize the proposal is for inference only.
> > > > >
> > > > > Then how do you think the amp_cast and amp_multicast in this proposal can work with the existing INT8 quantization workflow wh
Re: [Announcement] New Committer - Zhennan Qin
Congrats Zhennan! Welcome!

From: Lv, Tao A
Sent: Tuesday, April 30, 2019 11:01:01 PM
To: dev@mxnet.incubator.apache.org
Subject: RE: [Announcement] New Committer - Zhennan Qin

Congratulations Zhennan!

-----Original Message-----
From: Jun Wu [mailto:wujun@gmail.com]
Sent: Tuesday, April 30, 2019 12:29 PM
To: dev@mxnet.incubator.apache.org
Subject: [Announcement] New Committer - Zhennan Qin

Please join me in welcoming Zhennan Qin (https://github.com/ZhennanQin) from Intel as a new committer.

Zhennan is the main author of the work accelerating MXNet/MKLDNN inference through operator fusion and model quantization. His work has placed MXNet in an advantageous position for inference workloads on Intel CPUs compared with other DL frameworks.
Re: [Announcement] New Committer - Hao Jin
Congrats Hao! Welcome!

From: Lv, Tao A
Sent: Tuesday, April 30, 2019 11:00:33 PM
To: dev@mxnet.incubator.apache.org
Subject: RE: [Announcement] New Committer - Hao Jin

Congratulations Hao!

-----Original Message-----
From: Jun Wu [mailto:wujun@gmail.com]
Sent: Tuesday, April 30, 2019 12:29 PM
To: dev@mxnet.incubator.apache.org
Subject: [Announcement] New Committer - Hao Jin

Please join me in welcoming Hao Jin (https://github.com/haojin2) from AWS as a new committer.

Hao has designed and implemented many sophisticated algorithms for tensor operations. His work has greatly expanded the coverage of the MXNet operator inventory and enhanced the performance of many operators that are hard to optimize. Beyond that, Hao has been active in advocating for MXNet by providing high-quality translations of quite a few technical articles and blog posts.
RE: [Announcement] New Committer - Zhennan Qin
Congratulations Zhennan!

-----Original Message-----
From: Jun Wu [mailto:wujun@gmail.com]
Sent: Tuesday, April 30, 2019 12:29 PM
To: dev@mxnet.incubator.apache.org
Subject: [Announcement] New Committer - Zhennan Qin

Please join me in welcoming Zhennan Qin (https://github.com/ZhennanQin) from Intel as a new committer.

Zhennan is the main author of the work accelerating MXNet/MKLDNN inference through operator fusion and model quantization. His work has placed MXNet in an advantageous position for inference workloads on Intel CPUs compared with other DL frameworks.
RE: [Announcement] New Committer - Hao Jin
Congratulations Hao!

-----Original Message-----
From: Jun Wu [mailto:wujun@gmail.com]
Sent: Tuesday, April 30, 2019 12:29 PM
To: dev@mxnet.incubator.apache.org
Subject: [Announcement] New Committer - Hao Jin

Please join me in welcoming Hao Jin (https://github.com/haojin2) from AWS as a new committer.

Hao has designed and implemented many sophisticated algorithms for tensor operations. His work has greatly expanded the coverage of the MXNet operator inventory and enhanced the performance of many operators that are hard to optimize. Beyond that, Hao has been active in advocating for MXNet by providing high-quality translations of quite a few technical articles and blog posts.
RE: Proposal for Conversion from FP32 to Mixed Precision Models
Thank you Anirudh! I'm just a little surprised that when we talk about a mixed precision model we don't talk about training, and when we talk about inference, INT8 quantization is not mentioned.

-----Original Message-----
From: Anirudh Subramanian [mailto:anirudh2...@gmail.com]
Sent: Tuesday, April 30, 2019 8:27 PM
To: dev@mxnet.incubator.apache.org
Subject: Re: Proposal for Conversion from FP32 to Mixed Precision Models

Hi Zach,

I checked the QuantizeGraph pass and I think it can probably benefit from a CSE pass to eliminate additional quantize/quantize_v2 nodes. Having said that, I think it may still be overkill to add another NNVM pass for a generic common subexpression elimination. Currently, this elimination logic takes only an additional 3 to 6 lines of code in each of the two NNVM passes. Also, a generic common subexpression elimination pass has its own associated maintenance costs. I think it is better to continue with the current approach and revisit this need in the future as we add more NNVM passes.

Anirudh

On Mon, Apr 29, 2019 at 2:22 PM Anirudh Subramanian wrote:

> Hi Zach,
>
> You raise an interesting point. Thank you for the pointer!
>
> Incorporating a CSE pass comes with its own cost, and the advantage it brings is to make the ReducePrecision NNVM pass more lightweight. Since the amortized cost of the ReducePrecision pass is O(1), it shouldn't matter much from a performance point of view whether we add it or not.
>
> From a maintenance point of view, I would agree that separating these two pieces of logic can be helpful if we have other such workflows which require the original pass followed by the CSE pass. Currently, as far as I know, only the ReducePrecision pass is using it. I will check to see whether a CSE pass can benefit other NNVM passes besides ReducePrecision, such as the quantization pass, and will get back.
>
> Anirudh
>
> On Mon, Apr 29, 2019 at 11:18 AM Zach Kimberg wrote:
>
> > I have one suggestion. In the current design, there are the additional maps from each input entry to each target casted entry dtype in order to avoid creating duplicate casts. Instead of creating these, another option is to use a general-purpose Common Subexpression Elimination (CSE) [1] pass applied afterwards. So, you would run the mixed precision pass, which creates the duplicates, and then the CSE pass, which would remove all duplicates.
> >
> > This design is common in existing compilers like LLVM because maintaining and testing the passes is much easier when they are kept as simple as possible. The CSE pass can also be reused as necessary for other passes that could create duplicates, or to remove duplicate expressions in general. This tutorial [2] talks about it a bit.
> >
> > Zach
> >
> > [1] - https://en.wikipedia.org/wiki/Common_subexpression_elimination
> > [2] - https://blog.regehr.org/archives/1603
> >
> > On Mon, Apr 29, 2019 at 9:26 AM Anirudh Subramanian <anirudh2...@gmail.com> wrote:
> >
> > > Hi Tao,
> > >
> > > Thanks for raising this question! I thought about the existing quantization workflow and whether it can be included with the AMP API. Although quantization can be considered as mixed precision, there are differences. For example, only a small number of operators can be quantized compared to the operators that can run in FP16 precision. Thus, overriding the operators to run in the original dtype vs. the target dtype doesn't make much sense for quantization.
> > >
> > > Also, the quantization workflow may require a calibration dataset to calibrate the min and max and calib_mode. Arriving at a common API for quantization with calibration and mixed precision inference (FP16 and BF16) may make the API too complicated and not very easy to use. I understand that this may cause some confusion, as people may try to use a target_dtype of int8, but I think that is still better than causing user confusion with the API usage.
> > >
> > > Also, when we move the quantize_model APIs outside contrib we can consider adding them under the AMP namespace. The challenge would then be to educate users on the difference between "quantize" and "convert".
> > >
> > > Anirudh
> > >
> > > On Mon, Apr 29, 2019 at 7:45 AM Lv, Tao A wrote:
> > >
> > > > Thank you for the explanation. Sorry, I didn't realize the proposal is for inference only.
> > > >
> > > > Then how do you think the amp_cast and amp_multicast in this proposal can work with the existing INT8 quantization workflow, which I think should also be considered as 'mixed precision'.
> > > >
> > > > -----Original Message-----
> > > > From: Anirudh Subramanian [mailto:anirudh2...@gmail.com]
> > > > Sent: Monday, April 29, 2019 10:25 PM
> > > > To: dev@mxnet.incubator.apache.org
> > > > Subject: Re: Proposal for Conversion from FP32 to Mixed Precision Models
> > > >
> > > > Hi Tao,
> > > >
> > > > T
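The "3 to 6 lines" of in-pass elimination logic mentioned in the thread boils down to memoizing casts in a map keyed by (source entry, target dtype), so each input is cast at most once per dtype. A minimal sketch of that idea, with hypothetical names rather than the real ReducePrecision pass code:

```python
# Sketch of in-pass cast deduplication: before emitting a cast node,
# look it up in a memo keyed by (source entry, target dtype).
# All names here are illustrative, not MXNet's actual pass internals.
def get_cast(entry, target_dtype, cast_memo, make_cast_node):
    key = (entry, target_dtype)
    if key not in cast_memo:                 # first consumer: create the cast
        cast_memo[key] = make_cast_node(entry, target_dtype)
    return cast_memo[key]                    # later consumers reuse it

cast_memo = {}
made = []  # track how many cast nodes were actually created

def make_cast_node(entry, dtype):
    node = (entry, dtype, "amp_cast")        # stand-in for a real graph node
    made.append(node)
    return node

# Two consumers of the same FP32 entry request an FP16 cast:
a = get_cast("data0", "float16", cast_memo, make_cast_node)
b = get_cast("data0", "float16", cast_memo, make_cast_node)
assert a is b and len(made) == 1             # only one cast emitted
```

This is the alternative to a separate CSE pass: duplicates are never created in the first place, at the cost of carrying the memo through the precision pass.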
Re: Proposal for Conversion from FP32 to Mixed Precision Models
Hi Zach,

I checked the QuantizeGraph pass and I think it can probably benefit from a CSE pass to eliminate additional quantize/quantize_v2 nodes. Having said that, I think it may still be overkill to add another NNVM pass for a generic common subexpression elimination. Currently, this elimination logic takes only an additional 3 to 6 lines of code in each of the two NNVM passes. Also, a generic common subexpression elimination pass has its own associated maintenance costs. I think it is better to continue with the current approach and revisit this need in the future as we add more NNVM passes.

Anirudh

On Mon, Apr 29, 2019 at 2:22 PM Anirudh Subramanian wrote:

> Hi Zach,
>
> You raise an interesting point. Thank you for the pointer!
>
> Incorporating a CSE pass comes with its own cost, and the advantage it brings is to make the ReducePrecision NNVM pass more lightweight. Since the amortized cost of the ReducePrecision pass is O(1), it shouldn't matter much from a performance point of view whether we add it or not.
>
> From a maintenance point of view, I would agree that separating these two pieces of logic can be helpful if we have other such workflows which require the original pass followed by the CSE pass. Currently, as far as I know, only the ReducePrecision pass is using it. I will check to see whether a CSE pass can benefit other NNVM passes besides ReducePrecision, such as the quantization pass, and will get back.
>
> Anirudh
>
> On Mon, Apr 29, 2019 at 11:18 AM Zach Kimberg wrote:
>
> > I have one suggestion. In the current design, there are the additional maps from each input entry to each target casted entry dtype in order to avoid creating duplicate casts. Instead of creating these, another option is to use a general-purpose Common Subexpression Elimination (CSE) [1] pass applied afterwards. So, you would run the mixed precision pass, which creates the duplicates, and then the CSE pass, which would remove all duplicates.
> >
> > This design is common in existing compilers like LLVM because maintaining and testing the passes is much easier when they are kept as simple as possible. The CSE pass can also be reused as necessary for other passes that could create duplicates, or to remove duplicate expressions in general. This tutorial [2] talks about it a bit.
> >
> > Zach
> >
> > [1] - https://en.wikipedia.org/wiki/Common_subexpression_elimination
> > [2] - https://blog.regehr.org/archives/1603
> >
> > On Mon, Apr 29, 2019 at 9:26 AM Anirudh Subramanian <anirudh2...@gmail.com> wrote:
> >
> > > Hi Tao,
> > >
> > > Thanks for raising this question! I thought about the existing quantization workflow and whether it can be included with the AMP API. Although quantization can be considered as mixed precision, there are differences. For example, only a small number of operators can be quantized compared to the operators that can run in FP16 precision. Thus, overriding the operators to run in the original dtype vs. the target dtype doesn't make much sense for quantization.
> > >
> > > Also, the quantization workflow may require a calibration dataset to calibrate the min and max and calib_mode. Arriving at a common API for quantization with calibration and mixed precision inference (FP16 and BF16) may make the API too complicated and not very easy to use. I understand that this may cause some confusion, as people may try to use a target_dtype of int8, but I think that is still better than causing user confusion with the API usage.
> > >
> > > Also, when we move the quantize_model APIs outside contrib we can consider adding them under the AMP namespace. The challenge would then be to educate users on the difference between "quantize" and "convert".
> > >
> > > Anirudh
> > >
> > > On Mon, Apr 29, 2019 at 7:45 AM Lv, Tao A wrote:
> > >
> > > > Thank you for the explanation. Sorry, I didn't realize the proposal is for inference only.
> > > >
> > > > Then how do you think the amp_cast and amp_multicast in this proposal can work with the existing INT8 quantization workflow, which I think should also be considered as 'mixed precision'.
> > > >
> > > > -----Original Message-----
> > > > From: Anirudh Subramanian [mailto:anirudh2...@gmail.com]
> > > > Sent: Monday, April 29, 2019 10:25 PM
> > > > To: dev@mxnet.incubator.apache.org
> > > > Subject: Re: Proposal for Conversion from FP32 to Mixed Precision Models
> > > >
> > > > Hi Tao,
> > > >
> > > > The APIs proposed, "convert_model" and "convert_block", are mainly for inference use cases, where customers bring an FP32 model to convert it to a mixed precision model to get improved performance while not losing out on accuracy. The PR: https://github.com/apache/incubator-mxnet/pull/14173 is supposed to handle the training use cases, and this proposal doesn't cover the AMP feature added in that PR. I think ptrendx@ and canoerst@ are better