Re: Proposal to make MKLDNN as default CPU backend
Hi,

As the respective PR [1] has been open for a while and there has been no follow-up to Patric's mail, I suggest merging it once CI passes after Tao's conflict resolution earlier today. This gives community members time to test for regressions prior to the 1.7 release. If any are found, we can reconsider the decision.

Best regards,
Leonard

[1]: https://github.com/apache/incubator-mxnet/pull/16899

On Wed, 2019-11-20 at 05:27 +, Zhao, Patric wrote:
> Thanks for all of the great suggestions.
>
> Regarding the binary release, including the build w/o MKLDNN, I have
> summarized a table (see attachment).
>
> - Major changes in the Python packages: see the attached table.
> - Switch on MKLDNN for the binary without the mkl suffix in release 1.7
>   (red check mark).
> - Add a new mxnet-native build w/o MKLDNN and cuDNN (yellow background);
>   track usage/downloads for 1-2 releases and then decide whether we need
>   it long term.
> - Drop all mkl-suffix binaries in the next major release, v2.x.
>
> Thanks,
>
> --Patric
RE: Proposal to make MKLDNN as default CPU backend
Thanks for all of the great suggestions.

Regarding the binary release, including the build w/o MKLDNN, I have summarized a table (see attachment).

- Major changes in the Python packages: see the attached table.
- Switch on MKLDNN for the binary without the mkl suffix in release 1.7 (red check mark).
- Add a new mxnet-native build w/o MKLDNN and cuDNN (yellow background); track usage/downloads for 1-2 releases and then decide whether we need it long term.
- Drop all mkl-suffix binaries in the next major release, v2.x.

Thanks,

--Patric

> -Original Message-
> From: Lin Yuan
> Sent: Wednesday, November 20, 2019 5:40 AM
> To: dev@mxnet.incubator.apache.org
> Cc: Tao Lv
> Subject: Re: Proposal to make MKLDNN as default CPU backend
>
> Also per Sam's suggestion, we could still release a build without MKLDNN
> (name it mxnet-nomkldnn?) and track the usage/downloads for one or two
> releases. If there is no usage, we could drop that build in the future.
>
> Best,
>
> Lin
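Once the suffixes change, users will want an easy way to tell which variant they actually installed. MXNet exposes its compile-time feature flags at runtime; the sketch below is a minimal, hedged example (it assumes the `mxnet.runtime.Features` API available in MXNet 1.5+, and degrades to `None` when MXNet is not importable):

```python
def mkldnn_enabled():
    """Report whether the installed MXNet wheel was built with MKLDNN.

    Returns True/False when MXNet is importable, or None when it is not,
    so the check is safe to run in any environment.
    """
    try:
        # mxnet.runtime.Features lists the compile-time feature flags
        from mxnet.runtime import Features
    except ImportError:
        return None
    return bool(Features().is_enabled("MKLDNN"))

if __name__ == "__main__":
    print("MKLDNN build:", mkldnn_enabled())
```

Under the plan above, the plain `mxnet` 1.7 wheel would report `True` and the proposed `mxnet-native` wheel `False`, which gives a quick way to confirm which variant ended up installed.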
Re: Proposal to make MKLDNN as default CPU backend
Also per Sam's suggestion, we could still release a build without MKLDNN (name it mxnet-nomkldnn?) and track the usage/downloads for one or two releases. If there is no usage, we could drop that build in the future.

Best,

Lin

On Tue, Nov 19, 2019 at 1:23 PM Lin Yuan wrote:

> Just to summarize, based on the concerns Marco raised and discussed above:
>
> - AMD CPU (it should work with MKLDNN:
>   https://cwiki.apache.org/confluence/display/MXNET/MXNet+with+Intel+MKL-DNN+-+Performance+Benchmarking
>   )
> - ARM CPU (we don't have it today w/o MKLDNN either)
> - Windows (Windows support is there regardless of MKLDNN)
> - GPU with MKLDNN enabled (already supported)
> - Fully reproducible results (the medical and financial sectors requested
>   that, and we have some flags for CUDA) (The nondeterminism exists even
>   today w/o MKLDNN; we should address it regardless of MKLDNN.)
>
> Marco, please let us know if your concerns are properly addressed.
>
> Given that MKLDNN gives a significant performance speedup on CPU, I am
> inclined to make it the default in the pip build.
>
> Best,
>
> Lin
Re: Proposal to make MKLDNN as default CPU backend
Just to summarize, based on the concerns Marco raised and discussed above:

- AMD CPU (it should work with MKLDNN: https://cwiki.apache.org/confluence/display/MXNET/MXNet+with+Intel+MKL-DNN+-+Performance+Benchmarking )
- ARM CPU (we don't have it today w/o MKLDNN either)
- Windows (Windows support is there regardless of MKLDNN)
- GPU with MKLDNN enabled (already supported)
- Fully reproducible results (the medical and financial sectors requested that, and we have some flags for CUDA) (The nondeterminism exists even today w/o MKLDNN; we should address it regardless of MKLDNN.)

Marco, please let us know if your concerns are properly addressed.

Given that MKLDNN gives a significant performance speedup on CPU, I am inclined to make it the default in the pip build.

Best,

Lin

On Tue, Nov 19, 2019 at 8:08 AM Chris Olivier wrote:

> Thanks, Patric. I was just trying to point out that there was currently
> no guarantee of deterministic results without MKL, so there's not
> necessarily an expectation of determinism with MKL (i.e., the requirement
> isn't relaxed).
Re: Proposal to make MKLDNN as default CPU backend
Thanks, Patric. I was just trying to point out that there was currently no guarantee of deterministic results without MKL, so there's not necessarily an expectation of determinism with MKL (i.e., the requirement isn't relaxed).

On Mon, Nov 18, 2019 at 9:38 PM Zhao, Patric wrote:

> It may be a concern, but a little noise can't affect the final results if
> the algorithm is numerically stable.
> The MKLDNN backend in mxnet-mkl has been used for 2 years and we didn't
> see convergence issues caused by multithreading.
> In other words, the GPU programming model works well for training even
> though the same multi-thread nondeterminism exists there.
>
> Some training accuracy numbers were posted in the first PR when MKLDNN
> was integrated:
> https://github.com/apache/incubator-mxnet/pull/8302#issuecomment-359674818
>
> In conclusion, it may happen with very low probability. I believe we can
> find a solution in case it happens someday.
>
> Thanks,
>
> --Patric
RE: Proposal to make MKLDNN as default CPU backend
It may be a concern, but a little noise can't affect the final results if the algorithm is numerically stable.
The MKLDNN backend in mxnet-mkl has been used for 2 years and we didn't see convergence issues caused by multithreading.
In other words, the GPU programming model works well for training even though the same multi-thread nondeterminism exists there.

Some training accuracy numbers were posted in the first PR when MKLDNN was integrated:
https://github.com/apache/incubator-mxnet/pull/8302#issuecomment-359674818

In conclusion, it may happen with very low probability. I believe we can find a solution in case it happens someday.

Thanks,

--Patric

> -Original Message-
> From: Chris Olivier
> Sent: Tuesday, November 19, 2019 11:51 AM
> To: dev@mxnet.incubator.apache.org
> Cc: Tao Lv
> Subject: Re: Proposal to make MKLDNN as default CPU backend
>
> (for non-MKL dropout, for instance)
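Patric's point that "a little noise can't affect the final results if the algorithm is numerically stable" is about exactly this kind of noise: a multithreaded reduction can change the order in which floating-point values are summed, and float addition is not associative. A tiny self-contained illustration in plain Python (nothing MXNet-specific):

```python
# Floating-point addition is not associative, so a reduction partitioned
# differently across threads can produce (slightly) different sums.
a, b, c = 0.1, 0.2, 0.3

left_to_right = (a + b) + c   # one thread sums in order
regrouped = a + (b + c)       # a different partition of the same work

print(left_to_right)  # 0.6000000000000001
print(regrouped)      # 0.6
assert left_to_right != regrouped

# The discrepancy is on the order of one ULP: harmless for a numerically
# stable training algorithm, but enough to break bitwise reproducibility.
assert abs(left_to_right - regrouped) < 1e-15
```

This is why the thread is careful to distinguish statistical stability (accuracy converges the same way) from exact bitwise reproducibility, which neither the MKLDNN nor the non-MKLDNN backend guarantees under varying thread counts.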
Re: Proposal to make MKLDNN as default CPU backend
(for non-MKL dropout, for instance)

On Mon, Nov 18, 2019 at 7:50 PM Chris Olivier wrote:

> To address the determinism item: I know for a fact that training will not
> be deterministic in some cases where the "parallel random" class is
> utilized in parallel threads (such as OMP) if the number of cores is
> different, even with the same seed, because threads are seeded
> independently, and a different number of threads will end up generating
> different random number sequences. The Dropout operator is an example.
Re: Proposal to make MKLDNN as default CPU backend
To address the deterministic item, I know for a fact that training will not be deterministic in some cases where the “parallel random” class is utilized in parallel threads, such as OMP, if the number of cores is different, even with the same seed, because threads are seeded independently and different number of threads will end up generating different random number sequences. Dropout operator being an example. On Mon, Nov 18, 2019 at 6:39 PM Alfredo Luque wrote: > For AMD CPUs, you’d want to perform validation because now MKL-DNN would be > enabled by default. Historically, other intel libraries (along with the ICC > compiler) have had performance issues on AMD CPUs. It’s just worth double > checking to make sure that’s not the case here. Perhaps some MKL-DNN > authors can chime in though. It’s not sufficient to double check that an > AVX2 package passes tests. > > Agreed in the case we’re not releasing ARM binaries. > > The reproducibility argument is around the results being numerically > reproducible. That is, eg; if I train a model with some fixed set of data, > some random seed, etc. and then run inference on it do I get the exact same > floating point values for the weights and results? Does MxNet already offer > this without MKL-DNN? > > On November 18, 2019 at 6:32:07 PM, Tao Lv (mutou...@gmail.com) wrote: > > Regarding the cases listed by Marco: > - AMD CPU > From my architecture knowledge, what works on C4 instances (with AVX2 > support) should also work well on m5a, right? I think mxnet-mkl and > mxnet-cuxxmkl packages have been fully validated on AVX2 machines. > Also, we didn't perform any validation on AMD CPU before, why we need do > that for this time? > > - ARM CPU > I don't know we're releasing any convenience binaries for ARM CPU. This > proposal mainly targets those pypi packages. > > - Windows > Already validated by CI. We're also releasing mxnet-mkl packages for Win. 
> > - GPU and MKLDNN enabled > Already validated by CI and mxnet-cuxxmkl packages have been released for > several versions. > > - Fully reproducible results (medical and financial sector requested that > and we have some flags for cuda) > Not sure I understand this case. We already have MKL-DNN backend for a > while. Functionality and correctness of it have been verified by MXNet > users. > > -tao > > On Tue, Nov 19, 2019 at 4:41 AM Marco de Abreu > wrote: > > > Sorry, my intent with the "non-standard" phrase was not about general > MXNet > > but rather from MKLDNNs point of view, considering that it's being > > developed by Intel, I assumed that MKLDNN might consider non-intel > > use-cases non standard. > > > > -Marco > > > > Skalicky, Sam schrieb am Mo., 18. Nov. > 2019, > > 21:34: > > > > > Thanks Alfredo, if you can create a GitHub issue with notes/steps we > can > > > add this to the todo list for integrating with the MXNet CI to test on > > m5a > > > instances too. Then we can start tracking this on a regular basis. It > > would > > > be great to actually test on ARM instances now that AWS has A1 > instances > > > too…..ill add it to the wish list ;-D > > > > > > Sam > > > > > > > On Nov 18, 2019, at 12:32 PM, Alfredo Luque < > alfredo.lu...@airbnb.com > > .INVALID> > > > wrote: > > > > > > > > Happy to run some benchmarks on an AWS m5a instance (Epyc) and first > > > > generation AMD Threadripper Gen 1 if someone has something easy to > run > > > and > > > > representative. > > > > > > > > On November 18, 2019 at 12:29:31 PM, Skalicky, Sam ( > > > > sska...@amazon.com.invalid) wrote: > > > > > > > > Thanks a good idea Alfredo, are you able to help test on AMD CPUs? Or > > is > > > > there someone else in the mxnet dev@ community who can help? 
> > > > > > > > Sam > > > > > > > >> On Nov 18, 2019, at 12:27 PM, Alfredo Luque > > > > wrote: > > > >> > > > >> Verifying that there isn’t a slowdown on AMD CPUs (eg; Ryzen / Epyc) > > > > would > > > >> definitely make sense as a requirement. It seems odd to classify > that > > as > > > > a > > > >> “nonstandard” use case. > > > >> > > > >> On November 18, 2019 at 12:20:33 PM, Skalicky, Sam ( > > > >> sska...@amazon.com.invalid) wrote: > > > >> > > > >> Thanks Patric & team for your work over the years to make MXNet fast > > > with > > > >> MKLDNN! > > > >> > > > >> I think it would be great to make MKLDNN enabled by default. We will > > > need > > > >> to continue producing variants without MKLDNN for those who don’t > want > > > it > > > >> (Marco enumerated some use cases). How do you propose to identify > the > > > pip > > > >> wheels with/without MKLDNN? Previously we had: mxnet-mkl and > > > > mxnet-cu101mkl > > > >> with MKLDNN. If the plain “mxnet” pip wheel now contains MKLDNN what > > do > > > > you > > > >> propose we call the build without MKLDNN? mxnet-nomkl? > > > >> > > > >> Thanks! > > > >> Sam > > > >> > > > >>> On Nov 18, 2019, at 11:08 AM, Marco de Abreu < > > marco.g.ab...@gmail.com> > > > >> wrote: > > > >>> > > > >>> Hi Patric, > > > >>> > >
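Chris's point about thread-count-dependent randomness can be sketched in a few lines. The snippet below is plain Python with a hypothetical `parallel_random` helper, not MXNet's actual Dropout or OMP implementation: if each worker seeds its own generator from the base seed plus its thread id, the merged stream depends on how many workers there are, even though the base seed is fixed.

```python
import random

def parallel_random(seed, num_threads, draws_per_thread):
    # Hypothetical model of per-thread RNG: each worker gets its own
    # generator seeded from the base seed plus its thread id.
    gens = [random.Random(seed + t) for t in range(num_threads)]
    # Round-robin merge of the per-thread streams, as a parallel
    # dropout-mask generation might produce.
    return [g.random() for _ in range(draws_per_thread) for g in gens]

# Same base seed, different core counts -> different merged sequences.
a = parallel_random(seed=42, num_threads=2, draws_per_thread=4)
b = parallel_random(seed=42, num_threads=4, draws_per_thread=2)
print(a == b)  # False: the "dropout mask" differs across machine sizes
```

This is why the same seed does not guarantee identical training results once the number of threads changes, independent of whether MKLDNN is enabled.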
RE: Proposal to make MKLDNN as default CPU backend
Thanks for all the great inputs. Regarding AMD, there is some data in the wiki: https://cwiki.apache.org/confluence/display/MXNET/MXNet+with+Intel+MKL-DNN+-+Performance+Benchmarking > -Original Message- > From: Alfredo Luque > Sent: Tuesday, November 19, 2019 10:40 AM > To: Tao Lv ; dev@mxnet.incubator.apache.org > Subject: Re: Proposal to make MKLDNN as default CPU backend > > For AMD CPUs, you’d want to perform validation because now MKL-DNN > would be enabled by default. Historically, other intel libraries (along with > the > ICC > compiler) have had performance issues on AMD CPUs. It’s just worth double > checking to make sure that’s not the case here. Perhaps some MKL-DNN > authors can chime in though. It’s not sufficient to double check that an > AVX2 package passes tests. > > Agreed in the case we’re not releasing ARM binaries. > > The reproducibility argument is around the results being numerically > reproducible. That is, eg; if I train a model with some fixed set of data, > some > random seed, etc. and then run inference on it do I get the exact same > floating point values for the weights and results? Does MxNet already offer > this without MKL-DNN? > > On November 18, 2019 at 6:32:07 PM, Tao Lv (mutou...@gmail.com) wrote: > > Regarding the cases listed by Marco: > - AMD CPU > From my architecture knowledge, what works on C4 instances (with AVX2 > support) should also work well on m5a, right? I think mxnet-mkl and mxnet- > cuxxmkl packages have been fully validated on AVX2 machines. > Also, we didn't perform any validation on AMD CPU before, why we need do > that for this time? > > - ARM CPU > I don't know we're releasing any convenience binaries for ARM CPU. This > proposal mainly targets those pypi packages. > > - Windows > Already validated by CI. We're also releasing mxnet-mkl packages for Win. > > - GPU and MKLDNN enabled > Already validated by CI and mxnet-cuxxmkl packages have been released for > several versions.
> > - Fully reproducible results (medical and financial sector requested that and > we have some flags for cuda) Not sure I understand this case. We already > have MKL-DNN backend for a while. Functionality and correctness of it have > been verified by MXNet users. > > -tao > > On Tue, Nov 19, 2019 at 4:41 AM Marco de Abreu > > wrote: > > > Sorry, my intent with the "non-standard" phrase was not about general > MXNet > > but rather from MKLDNNs point of view, considering that it's being > > developed by Intel, I assumed that MKLDNN might consider non-intel > > use-cases non standard. > > > > -Marco > > > > Skalicky, Sam schrieb am Mo., 18. Nov. > > 2019, > > 21:34: > > > > > Thanks Alfredo, if you can create a GitHub issue with notes/steps we > can > > > add this to the todo list for integrating with the MXNet CI to test > > > on > > m5a > > > instances too. Then we can start tracking this on a regular basis. > > > It > > would > > > be great to actually test on ARM instances now that AWS has A1 > instances > > > too…..ill add it to the wish list ;-D > > > > > > Sam > > > > > > > On Nov 18, 2019, at 12:32 PM, Alfredo Luque > > > > > .INVALID> > > > wrote: > > > > > > > > Happy to run some benchmarks on an AWS m5a instance (Epyc) and > > > > first generation AMD Threadripper Gen 1 if someone has something > > > > easy to > run > > > and > > > > representative. > > > > > > > > On November 18, 2019 at 12:29:31 PM, Skalicky, Sam ( > > > > sska...@amazon.com.invalid) wrote: > > > > > > > > Thanks a good idea Alfredo, are you able to help test on AMD CPUs? > > > > Or > > is > > > > there someone else in the mxnet dev@ community who can help? > > > > > > > > Sam > > > > > > > >> On Nov 18, 2019, at 12:27 PM, Alfredo Luque > > > > wrote: > > > >> > > > >> Verifying that there isn’t a slowdown on AMD CPUs (eg; Ryzen / > > > >> Epyc) > > > > would > > > >> definitely make sense as a requirement. 
It seems odd to classify > that > > as > > > > a > > > >> “nonstandard” use case. > > > >> > > > >> On November 18, 2019 at 12:20:33 PM, Skalicky, Sam ( > > > >> sska...@amazon.com.invalid) wrote: > > > >> > > > >> Thanks Patric & team for your work over the years to make MXNet &
Re: Proposal to make MKLDNN as default CPU backend
For AMD CPUs, you’d want to perform validation because now MKL-DNN would be enabled by default. Historically, other intel libraries (along with the ICC compiler) have had performance issues on AMD CPUs. It’s just worth double checking to make sure that’s not the case here. Perhaps some MKL-DNN authors can chime in though. It’s not sufficient to double check that an AVX2 package passes tests. Agreed in the case we’re not releasing ARM binaries. The reproducibility argument is around the results being numerically reproducible. That is, eg; if I train a model with some fixed set of data, some random seed, etc. and then run inference on it do I get the exact same floating point values for the weights and results? Does MxNet already offer this without MKL-DNN? On November 18, 2019 at 6:32:07 PM, Tao Lv (mutou...@gmail.com) wrote: Regarding the cases listed by Marco: - AMD CPU From my architecture knowledge, what works on C4 instances (with AVX2 support) should also work well on m5a, right? I think mxnet-mkl and mxnet-cuxxmkl packages have been fully validated on AVX2 machines. Also, we didn't perform any validation on AMD CPU before, why we need do that for this time? - ARM CPU I don't know we're releasing any convenience binaries for ARM CPU. This proposal mainly targets those pypi packages. - Windows Already validated by CI. We're also releasing mxnet-mkl packages for Win. - GPU and MKLDNN enabled Already validated by CI and mxnet-cuxxmkl packages have been released for several versions. - Fully reproducible results (medical and financial sector requested that and we have some flags for cuda) Not sure I understand this case. We already have MKL-DNN backend for a while. Functionality and correctness of it have been verified by MXNet users.
-tao On Tue, Nov 19, 2019 at 4:41 AM Marco de Abreu wrote: > Sorry, my intent with the "non-standard" phrase was not about general MXNet > but rather from MKLDNNs point of view, considering that it's being > developed by Intel, I assumed that MKLDNN might consider non-intel > use-cases non standard. > > -Marco > > Skalicky, Sam schrieb am Mo., 18. Nov. 2019, > 21:34: > > > Thanks Alfredo, if you can create a GitHub issue with notes/steps we can > > add this to the todo list for integrating with the MXNet CI to test on > m5a > > instances too. Then we can start tracking this on a regular basis. It > would > > be great to actually test on ARM instances now that AWS has A1 instances > > too…..ill add it to the wish list ;-D > > > > Sam > > > > > On Nov 18, 2019, at 12:32 PM, Alfredo Luque .INVALID> > > wrote: > > > > > > Happy to run some benchmarks on an AWS m5a instance (Epyc) and first > > > generation AMD Threadripper Gen 1 if someone has something easy to run > > and > > > representative. > > > > > > On November 18, 2019 at 12:29:31 PM, Skalicky, Sam ( > > > sska...@amazon.com.invalid) wrote: > > > > > > Thanks a good idea Alfredo, are you able to help test on AMD CPUs? Or > is > > > there someone else in the mxnet dev@ community who can help? > > > > > > Sam > > > > > >> On Nov 18, 2019, at 12:27 PM, Alfredo Luque > > > wrote: > > >> > > >> Verifying that there isn’t a slowdown on AMD CPUs (eg; Ryzen / Epyc) > > > would > > >> definitely make sense as a requirement. It seems odd to classify that > as > > > a > > >> “nonstandard” use case. > > >> > > >> On November 18, 2019 at 12:20:33 PM, Skalicky, Sam ( > > >> sska...@amazon.com.invalid) wrote: > > >> > > >> Thanks Patric & team for your work over the years to make MXNet fast > > with > > >> MKLDNN! > > >> > > >> I think it would be great to make MKLDNN enabled by default. 
We will > > need > > >> to continue producing variants without MKLDNN for those who don’t want > > it > > >> (Marco enumerated some use cases). How do you propose to identify the > > pip > > >> wheels with/without MKLDNN? Previously we had: mxnet-mkl and > > > mxnet-cu101mkl > > >> with MKLDNN. If the plain “mxnet” pip wheel now contains MKLDNN what > do > > > you > > >> propose we call the build without MKLDNN? mxnet-nomkl? > > >> > > >> Thanks! > > >> Sam > > >> > > >>> On Nov 18, 2019, at 11:08 AM, Marco de Abreu < > marco.g.ab...@gmail.com> > > >> wrote: > > >>> > > >>> Hi Patric, > > >>> > > >>> First of all, thanks a lot to you and your team for all the effort on > > >> MXNet > > >>> and mkldnn! > > >>> > > >>> Generally I'm inclined towards your proposal, but I'm thinking about > > the > > >>> non-standard use cases: > > >>> - AMD CPU > > >>> - ARM CPU > > >>> - Windows > > >>> - GPU and MKLDNN enabled > > >>> - Fully reproducible results (medical and financial sector requested > > > that > > >>> and we have some flags for cuda) > > >>> > > >>> Is mkldnn fully compatible with these use cases? If not, what would > > >> happen? > > >>> If yes, do we have performance numbers? > > >>> > > >>> Best regards, > > >>> Marco > > >>> > > >>> Zhao, Patric schrieb am Mo., 18. Nov. 2019, > > >> 14:00: > > >>> > > Hi MXNet community, >
Re: Proposal to make MKLDNN as default CPU backend
Regarding the cases listed by Marco: - AMD CPU From my architecture knowledge, what works on C4 instances (with AVX2 support) should also work well on m5a, right? I think mxnet-mkl and mxnet-cuxxmkl packages have been fully validated on AVX2 machines. Also, we didn't perform any validation on AMD CPUs before; why do we need to do that this time? - ARM CPU I don't think we're releasing any convenience binaries for ARM CPU. This proposal mainly targets those pypi packages. - Windows Already validated by CI. We're also releasing mxnet-mkl packages for Win. - GPU and MKLDNN enabled Already validated by CI and mxnet-cuxxmkl packages have been released for several versions. - Fully reproducible results (medical and financial sector requested that and we have some flags for cuda) Not sure I understand this case. We have had the MKL-DNN backend for a while. Functionality and correctness of it have been verified by MXNet users. -tao On Tue, Nov 19, 2019 at 4:41 AM Marco de Abreu wrote: > Sorry, my intent with the "non-standard" phrase was not about general MXNet > but rather from MKLDNNs point of view, considering that it's being > developed by Intel, I assumed that MKLDNN might consider non-intel > use-cases non standard. > > -Marco > > Skalicky, Sam schrieb am Mo., 18. Nov. 2019, > 21:34: > > > Thanks Alfredo, if you can create a GitHub issue with notes/steps we can > > add this to the todo list for integrating with the MXNet CI to test on > m5a > > instances too. Then we can start tracking this on a regular basis. It > would > > be great to actually test on ARM instances now that AWS has A1 instances > > too…..ill add it to the wish list ;-D > > > > Sam > > > > > On Nov 18, 2019, at 12:32 PM, Alfredo Luque .INVALID> > > wrote: > > > > > > Happy to run some benchmarks on an AWS m5a instance (Epyc) and first > > > generation AMD Threadripper Gen 1 if someone has something easy to run > > and > > > representative.
> > > > > > On November 18, 2019 at 12:29:31 PM, Skalicky, Sam ( > > > sska...@amazon.com.invalid) wrote: > > > > > > Thanks a good idea Alfredo, are you able to help test on AMD CPUs? Or > is > > > there someone else in the mxnet dev@ community who can help? > > > > > > Sam > > > > > >> On Nov 18, 2019, at 12:27 PM, Alfredo Luque > > > wrote: > > >> > > >> Verifying that there isn’t a slowdown on AMD CPUs (eg; Ryzen / Epyc) > > > would > > >> definitely make sense as a requirement. It seems odd to classify that > as > > > a > > >> “nonstandard” use case. > > >> > > >> On November 18, 2019 at 12:20:33 PM, Skalicky, Sam ( > > >> sska...@amazon.com.invalid) wrote: > > >> > > >> Thanks Patric & team for your work over the years to make MXNet fast > > with > > >> MKLDNN! > > >> > > >> I think it would be great to make MKLDNN enabled by default. We will > > need > > >> to continue producing variants without MKLDNN for those who don’t want > > it > > >> (Marco enumerated some use cases). How do you propose to identify the > > pip > > >> wheels with/without MKLDNN? Previously we had: mxnet-mkl and > > > mxnet-cu101mkl > > >> with MKLDNN. If the plain “mxnet” pip wheel now contains MKLDNN what > do > > > you > > >> propose we call the build without MKLDNN? mxnet-nomkl? > > >> > > >> Thanks! > > >> Sam > > >> > > >>> On Nov 18, 2019, at 11:08 AM, Marco de Abreu < > marco.g.ab...@gmail.com> > > >> wrote: > > >>> > > >>> Hi Patric, > > >>> > > >>> First of all, thanks a lot to you and your team for all the effort on > > >> MXNet > > >>> and mkldnn! > > >>> > > >>> Generally I'm inclined towards your proposal, but I'm thinking about > > the > > >>> non-standard use cases: > > >>> - AMD CPU > > >>> - ARM CPU > > >>> - Windows > > >>> - GPU and MKLDNN enabled > > >>> - Fully reproducible results (medical and financial sector requested > > > that > > >>> and we have some flags for cuda) > > >>> > > >>> Is mkldnn fully compatible with these use cases? 
If not, what would > > >> happen? > > >>> If yes, do we have performance numbers? > > >>> > > >>> Best regards, > > >>> Marco > > >>> > > >>> Zhao, Patric schrieb am Mo., 18. Nov. 2019, > > >> 14:00: > > >>> > > Hi MXNet community, > > > > From the first MKLDNN backend integrated in release 1.2, the > community > > >> is > > continuously improving the quality and performance of MKLDNN CPU > > >> backend. > > Nowadays, the MKLDNN backend is widely used for the inference, > > >> especially > > for INT8 inference, and we got lots of very positive feedbacks from > > >> MXNet > > users. > > > > Achieved milestones as below: > > > > - MKLDNN integrated into Apache MXNet from release 1.2, Feb, 2018 > [1] > > - MKLDNN backend as default CPU backend from source building, Jan, > > 2019 > > >> [2] > > - MKLDNN subgraph optimization as default for the inference, Jul, > 2019 > > >> [3] > > - MKLDNN major version upgrade in release 1.6, Oct, 2019 [4] > > > > To make more successful and technical
Re: Proposal to make MKLDNN as default CPU backend
Sorry, my intent with the "non-standard" phrase was not about general MXNet but rather from MKLDNNs point of view, considering that it's being developed by Intel, I assumed that MKLDNN might consider non-intel use-cases non standard. -Marco Skalicky, Sam schrieb am Mo., 18. Nov. 2019, 21:34: > Thanks Alfredo, if you can create a GitHub issue with notes/steps we can > add this to the todo list for integrating with the MXNet CI to test on m5a > instances too. Then we can start tracking this on a regular basis. It would > be great to actually test on ARM instances now that AWS has A1 instances > too…..ill add it to the wish list ;-D > > Sam > > > On Nov 18, 2019, at 12:32 PM, Alfredo Luque > > > wrote: > > > > Happy to run some benchmarks on an AWS m5a instance (Epyc) and first > > generation AMD Threadripper Gen 1 if someone has something easy to run > and > > representative. > > > > On November 18, 2019 at 12:29:31 PM, Skalicky, Sam ( > > sska...@amazon.com.invalid) wrote: > > > > Thanks a good idea Alfredo, are you able to help test on AMD CPUs? Or is > > there someone else in the mxnet dev@ community who can help? > > > > Sam > > > >> On Nov 18, 2019, at 12:27 PM, Alfredo Luque > > wrote: > >> > >> Verifying that there isn’t a slowdown on AMD CPUs (eg; Ryzen / Epyc) > > would > >> definitely make sense as a requirement. It seems odd to classify that as > > a > >> “nonstandard” use case. > >> > >> On November 18, 2019 at 12:20:33 PM, Skalicky, Sam ( > >> sska...@amazon.com.invalid) wrote: > >> > >> Thanks Patric & team for your work over the years to make MXNet fast > with > >> MKLDNN! > >> > >> I think it would be great to make MKLDNN enabled by default. We will > need > >> to continue producing variants without MKLDNN for those who don’t want > it > >> (Marco enumerated some use cases). How do you propose to identify the > pip > >> wheels with/without MKLDNN? Previously we had: mxnet-mkl and > > mxnet-cu101mkl > >> with MKLDNN. 
If the plain “mxnet” pip wheel now contains MKLDNN what do > > you > >> propose we call the build without MKLDNN? mxnet-nomkl? > >> > >> Thanks! > >> Sam > >> > >>> On Nov 18, 2019, at 11:08 AM, Marco de Abreu > >> wrote: > >>> > >>> Hi Patric, > >>> > >>> First of all, thanks a lot to you and your team for all the effort on > >> MXNet > >>> and mkldnn! > >>> > >>> Generally I'm inclined towards your proposal, but I'm thinking about > the > >>> non-standard use cases: > >>> - AMD CPU > >>> - ARM CPU > >>> - Windows > >>> - GPU and MKLDNN enabled > >>> - Fully reproducible results (medical and financial sector requested > > that > >>> and we have some flags for cuda) > >>> > >>> Is mkldnn fully compatible with these use cases? If not, what would > >> happen? > >>> If yes, do we have performance numbers? > >>> > >>> Best regards, > >>> Marco > >>> > >>> Zhao, Patric schrieb am Mo., 18. Nov. 2019, > >> 14:00: > >>> > Hi MXNet community, > > From the first MKLDNN backend integrated in release 1.2, the community > >> is > continuously improving the quality and performance of MKLDNN CPU > >> backend. > Nowadays, the MKLDNN backend is widely used for the inference, > >> especially > for INT8 inference, and we got lots of very positive feedbacks from > >> MXNet > users. > > Achieved milestones as below: > > - MKLDNN integrated into Apache MXNet from release 1.2, Feb, 2018 [1] > - MKLDNN backend as default CPU backend from source building, Jan, > 2019 > >> [2] > - MKLDNN subgraph optimization as default for the inference, Jul, 2019 > >> [3] > - MKLDNN major version upgrade in release 1.6, Oct, 2019 [4] > > To make more successful and technical leadership for Apache MXNet in > > the > industry, I propose to make MKLDNN as default CPU backend in all > binary > distribution from the next release. 
> The new milestone includes: > > - Static link MKLDNN library in the binary avoiding the mismatch > > version > in the runtime [5] > - Make nightly build with MKLDNN default from master pre 1.7 release > - Binary distribution with MKLDNN default from 1.7 release. > > What will be changed: > > - mxnet and mxnet-cuXX binary will be built with MKLDNN=1 > - mxnet-mkl and mxnet-cuXXmkl will be not changed in the minor release > (1.x) and plan to remove in next major release (2.0) > > Suggestions and comments are highly appreciated. > > Thanks, > > --Patric > > > [1] https://github.com/apache/incubator-mxnet/pull/9677 > [2] > > >> > > > https://lists.apache.org/thread.html/bfeae6ee46374112eb4dff1470c262959101e4bffb19930926963535@%3Cdev.mxnet.apache.org%3E > [3] https://github.com/apache/incubator-mxnet/pull/15518 > [4] > > >> > > > https://lists.apache.org/thread.html/f46ab920f18795496eafe713e6e9e561c684e06189085cec17b401dc@%3Cdev.mxnet.apache.org%3E > [5]
Re: Proposal to make MKLDNN as default CPU backend
Thanks Alfredo, if you can create a GitHub issue with notes/steps, we can add this to the todo list for integrating with the MXNet CI to test on m5a instances too. Then we can start tracking this on a regular basis. It would be great to actually test on ARM instances now that AWS has A1 instances too… I'll add it to the wish list ;-D Sam > On Nov 18, 2019, at 12:32 PM, Alfredo Luque > wrote: > > Happy to run some benchmarks on an AWS m5a instance (Epyc) and first > generation AMD Threadripper Gen 1 if someone has something easy to run and > representative. > > On November 18, 2019 at 12:29:31 PM, Skalicky, Sam ( > sska...@amazon.com.invalid) wrote: > > Thanks a good idea Alfredo, are you able to help test on AMD CPUs? Or is > there someone else in the mxnet dev@ community who can help? > > Sam > >> On Nov 18, 2019, at 12:27 PM, Alfredo Luque > wrote: >> >> Verifying that there isn’t a slowdown on AMD CPUs (eg; Ryzen / Epyc) > would >> definitely make sense as a requirement. It seems odd to classify that as > a >> “nonstandard” use case. >> >> On November 18, 2019 at 12:20:33 PM, Skalicky, Sam ( >> sska...@amazon.com.invalid) wrote: >> >> Thanks Patric & team for your work over the years to make MXNet fast with >> MKLDNN! >> >> I think it would be great to make MKLDNN enabled by default. We will need >> to continue producing variants without MKLDNN for those who don’t want it >> (Marco enumerated some use cases). How do you propose to identify the pip >> wheels with/without MKLDNN? Previously we had: mxnet-mkl and > mxnet-cu101mkl >> with MKLDNN. If the plain “mxnet” pip wheel now contains MKLDNN what do > you >> propose we call the build without MKLDNN? mxnet-nomkl? >> >> Thanks! >> Sam >> >>> On Nov 18, 2019, at 11:08 AM, Marco de Abreu >> wrote: >>> >>> Hi Patric, >>> >>> First of all, thanks a lot to you and your team for all the effort on >> MXNet >>> and mkldnn!
>>> >>> Generally I'm inclined towards your proposal, but I'm thinking about the >>> non-standard use cases: >>> - AMD CPU >>> - ARM CPU >>> - Windows >>> - GPU and MKLDNN enabled >>> - Fully reproducible results (medical and financial sector requested > that >>> and we have some flags for cuda) >>> >>> Is mkldnn fully compatible with these use cases? If not, what would >> happen? >>> If yes, do we have performance numbers? >>> >>> Best regards, >>> Marco >>> >>> Zhao, Patric schrieb am Mo., 18. Nov. 2019, >> 14:00: >>> Hi MXNet community, From the first MKLDNN backend integrated in release 1.2, the community >> is continuously improving the quality and performance of MKLDNN CPU >> backend. Nowadays, the MKLDNN backend is widely used for the inference, >> especially for INT8 inference, and we got lots of very positive feedbacks from >> MXNet users. Achieved milestones as below: - MKLDNN integrated into Apache MXNet from release 1.2, Feb, 2018 [1] - MKLDNN backend as default CPU backend from source building, Jan, 2019 >> [2] - MKLDNN subgraph optimization as default for the inference, Jul, 2019 >> [3] - MKLDNN major version upgrade in release 1.6, Oct, 2019 [4] To make more successful and technical leadership for Apache MXNet in > the industry, I propose to make MKLDNN as default CPU backend in all binary distribution from the next release. The new milestone includes: - Static link MKLDNN library in the binary avoiding the mismatch > version in the runtime [5] - Make nightly build with MKLDNN default from master pre 1.7 release - Binary distribution with MKLDNN default from 1.7 release. What will be changed: - mxnet and mxnet-cuXX binary will be built with MKLDNN=1 - mxnet-mkl and mxnet-cuXXmkl will be not changed in the minor release (1.x) and plan to remove in next major release (2.0) Suggestions and comments are highly appreciated. 
Thanks, --Patric [1] https://github.com/apache/incubator-mxnet/pull/9677 [2] >> > https://lists.apache.org/thread.html/bfeae6ee46374112eb4dff1470c262959101e4bffb19930926963535@%3Cdev.mxnet.apache.org%3E [3] https://github.com/apache/incubator-mxnet/pull/15518 [4] >> > https://lists.apache.org/thread.html/f46ab920f18795496eafe713e6e9e561c684e06189085cec17b401dc@%3Cdev.mxnet.apache.org%3E [5] https://github.com/apache/incubator-mxnet/pull/16731 >> >> — >> Alfredo Luque >> Software Engineer >> Machine Learning Infrastructure >> Airbnb >> San Francisco, CA > > — > Alfredo Luque > Software Engineer > Machine Learning Infrastructure > Airbnb > San Francisco, CA
Re: Proposal to make MKLDNN as default CPU backend
Happy to run some benchmarks on an AWS m5a instance (Epyc) and first generation AMD Threadripper Gen 1 if someone has something easy to run and representative. On November 18, 2019 at 12:29:31 PM, Skalicky, Sam ( sska...@amazon.com.invalid) wrote: Thanks a good idea Alfredo, are you able to help test on AMD CPUs? Or is there someone else in the mxnet dev@ community who can help? Sam > On Nov 18, 2019, at 12:27 PM, Alfredo Luque wrote: > > Verifying that there isn’t a slowdown on AMD CPUs (eg; Ryzen / Epyc) would > definitely make sense as a requirement. It seems odd to classify that as a > “nonstandard” use case. > > On November 18, 2019 at 12:20:33 PM, Skalicky, Sam ( > sska...@amazon.com.invalid) wrote: > > Thanks Patric & team for your work over the years to make MXNet fast with > MKLDNN! > > I think it would be great to make MKLDNN enabled by default. We will need > to continue producing variants without MKLDNN for those who don’t want it > (Marco enumerated some use cases). How do you propose to identify the pip > wheels with/without MKLDNN? Previously we had: mxnet-mkl and mxnet-cu101mkl > with MKLDNN. If the plain “mxnet” pip wheel now contains MKLDNN what do you > propose we call the build without MKLDNN? mxnet-nomkl? > > Thanks! > Sam > >> On Nov 18, 2019, at 11:08 AM, Marco de Abreu > wrote: >> >> Hi Patric, >> >> First of all, thanks a lot to you and your team for all the effort on > MXNet >> and mkldnn! >> >> Generally I'm inclined towards your proposal, but I'm thinking about the >> non-standard use cases: >> - AMD CPU >> - ARM CPU >> - Windows >> - GPU and MKLDNN enabled >> - Fully reproducible results (medical and financial sector requested that >> and we have some flags for cuda) >> >> Is mkldnn fully compatible with these use cases? If not, what would > happen? >> If yes, do we have performance numbers? >> >> Best regards, >> Marco >> >> Zhao, Patric schrieb am Mo., 18. Nov. 
2019, > 14:00: >> >>> Hi MXNet community, >>> >>> From the first MKLDNN backend integrated in release 1.2, the community > is >>> continuously improving the quality and performance of MKLDNN CPU > backend. >>> Nowadays, the MKLDNN backend is widely used for the inference, > especially >>> for INT8 inference, and we got lots of very positive feedbacks from > MXNet >>> users. >>> >>> Achieved milestones as below: >>> >>> - MKLDNN integrated into Apache MXNet from release 1.2, Feb, 2018 [1] >>> - MKLDNN backend as default CPU backend from source building, Jan, 2019 > [2] >>> - MKLDNN subgraph optimization as default for the inference, Jul, 2019 > [3] >>> - MKLDNN major version upgrade in release 1.6, Oct, 2019 [4] >>> >>> To make more successful and technical leadership for Apache MXNet in the >>> industry, I propose to make MKLDNN as default CPU backend in all binary >>> distribution from the next release. >>> The new milestone includes: >>> >>> - Static link MKLDNN library in the binary avoiding the mismatch version >>> in the runtime [5] >>> - Make nightly build with MKLDNN default from master pre 1.7 release >>> - Binary distribution with MKLDNN default from 1.7 release. >>> >>> What will be changed: >>> >>> - mxnet and mxnet-cuXX binary will be built with MKLDNN=1 >>> - mxnet-mkl and mxnet-cuXXmkl will be not changed in the minor release >>> (1.x) and plan to remove in next major release (2.0) >>> >>> Suggestions and comments are highly appreciated. 
>>> >>> Thanks, >>> >>> --Patric >>> >>> >>> [1] https://github.com/apache/incubator-mxnet/pull/9677 >>> [2] >>> > https://lists.apache.org/thread.html/bfeae6ee46374112eb4dff1470c262959101e4bffb19930926963535@%3Cdev.mxnet.apache.org%3E >>> [3] https://github.com/apache/incubator-mxnet/pull/15518 >>> [4] >>> > https://lists.apache.org/thread.html/f46ab920f18795496eafe713e6e9e561c684e06189085cec17b401dc@%3Cdev.mxnet.apache.org%3E >>> [5] https://github.com/apache/incubator-mxnet/pull/16731 >>> > > — > Alfredo Luque > Software Engineer > Machine Learning Infrastructure > Airbnb > San Francisco, CA — Alfredo Luque Software Engineer Machine Learning Infrastructure Airbnb San Francisco, CA
Re: Proposal to make MKLDNN as default CPU backend
Thanks, good idea Alfredo. Are you able to help test on AMD CPUs? Or is there someone else in the mxnet dev@ community who can help? Sam > On Nov 18, 2019, at 12:27 PM, Alfredo Luque > wrote: > > Verifying that there isn’t a slowdown on AMD CPUs (eg; Ryzen / Epyc) would > definitely make sense as a requirement. It seems odd to classify that as a > “nonstandard” use case. > > On November 18, 2019 at 12:20:33 PM, Skalicky, Sam ( > sska...@amazon.com.invalid) wrote: > > Thanks Patric & team for your work over the years to make MXNet fast with > MKLDNN! > > I think it would be great to make MKLDNN enabled by default. We will need > to continue producing variants without MKLDNN for those who don’t want it > (Marco enumerated some use cases). How do you propose to identify the pip > wheels with/without MKLDNN? Previously we had: mxnet-mkl and mxnet-cu101mkl > with MKLDNN. If the plain “mxnet” pip wheel now contains MKLDNN what do you > propose we call the build without MKLDNN? mxnet-nomkl? > > Thanks! > Sam > >> On Nov 18, 2019, at 11:08 AM, Marco de Abreu > wrote: >> >> Hi Patric, >> >> First of all, thanks a lot to you and your team for all the effort on > MXNet >> and mkldnn! >> >> Generally I'm inclined towards your proposal, but I'm thinking about the >> non-standard use cases: >> - AMD CPU >> - ARM CPU >> - Windows >> - GPU and MKLDNN enabled >> - Fully reproducible results (medical and financial sector requested that >> and we have some flags for cuda) >> >> Is mkldnn fully compatible with these use cases? If not, what would > happen? >> If yes, do we have performance numbers? >> >> Best regards, >> Marco >> >> Zhao, Patric schrieb am Mo., 18. Nov. 2019, > 14:00: >> >>> Hi MXNet community, >>> >>> From the first MKLDNN backend integrated in release 1.2, the community > is >>> continuously improving the quality and performance of MKLDNN CPU > backend.
>>> Nowadays, the MKLDNN backend is widely used for the inference, > especially >>> for INT8 inference, and we got lots of very positive feedbacks from > MXNet >>> users. >>> >>> Achieved milestones as below: >>> >>> - MKLDNN integrated into Apache MXNet from release 1.2, Feb, 2018 [1] >>> - MKLDNN backend as default CPU backend from source building, Jan, 2019 > [2] >>> - MKLDNN subgraph optimization as default for the inference, Jul, 2019 > [3] >>> - MKLDNN major version upgrade in release 1.6, Oct, 2019 [4] >>> >>> To make more successful and technical leadership for Apache MXNet in the >>> industry, I propose to make MKLDNN as default CPU backend in all binary >>> distribution from the next release. >>> The new milestone includes: >>> >>> - Static link MKLDNN library in the binary avoiding the mismatch version >>> in the runtime [5] >>> - Make nightly build with MKLDNN default from master pre 1.7 release >>> - Binary distribution with MKLDNN default from 1.7 release. >>> >>> What will be changed: >>> >>> - mxnet and mxnet-cuXX binary will be built with MKLDNN=1 >>> - mxnet-mkl and mxnet-cuXXmkl will be not changed in the minor release >>> (1.x) and plan to remove in next major release (2.0) >>> >>> Suggestions and comments are highly appreciated. >>> >>> Thanks, >>> >>> --Patric >>> >>> >>> [1] https://github.com/apache/incubator-mxnet/pull/9677 >>> [2] >>> > https://lists.apache.org/thread.html/bfeae6ee46374112eb4dff1470c262959101e4bffb19930926963535@%3Cdev.mxnet.apache.org%3E >>> [3] https://github.com/apache/incubator-mxnet/pull/15518 >>> [4] >>> > https://lists.apache.org/thread.html/f46ab920f18795496eafe713e6e9e561c684e06189085cec17b401dc@%3Cdev.mxnet.apache.org%3E >>> [5] https://github.com/apache/incubator-mxnet/pull/16731 >>> > > — > Alfredo Luque > Software Engineer > Machine Learning Infrastructure > Airbnb > San Francisco, CA