+1 to nightly. Given the awesome results shown by Alex for AMD CPUs, I think MKLDNN would actually be something I'd use, even on my AMD machines. Kudos to Intel for releasing this lib, which works great on their hardware but still pretty well with AMD. The upshot of MKLDNN supporting AMD, to me, is that it makes me much more likely to support it as the default PyPI package (discussed in another thread). This is part of the reason I'd like to have a sanity test in CI somewhere for AMD hardware.
Unrelated note: regarding global warming, I chose eu-west-1 to host CI partly because it's carbon neutral. The cost of the CI is significant, and although it's donated by AWS, I'm glad the community is cognizant of that.

On Fri, Nov 30, 2018 at 9:54 AM Kumar, Vikas <[email protected]> wrote:

> I concur. +1 for nightly for pre-release suite.
>
> On 11/30/18, 9:49 AM, "Tianqi Chen" <[email protected]> wrote:
>
> +1 for nightly for pre-release suite, but not the CI that is triggered
> in every test. The best engineering practice is not to add things, but
> to remove things so that there is nothing left that can be removed.
>
> In terms of MKLDNN, since it is an Intel product, I doubt optimizing
> for AMD CPUs is its goal. Adding CI to guard against backward
> compatibility is even a bit overkill, since AMD CPU users would likely
> disable this feature and use the original CPU version of the project.
>
> At least we can contribute to reducing the carbon footprint and slow
> down global warming :)
>
> Tianqi
>
> On Fri, Nov 30, 2018 at 9:38 AM kellen sunderland <[email protected]> wrote:
>
> > Regarding cost, yes we could run this nightly, or simply make it run
> > an existing test suite where that makes sense rather than having it
> > duplicate a suite.
> >
> > On Fri, Nov 30, 2018 at 9:26 AM Kumar, Vikas <[email protected]> wrote:
> >
> > > I don't think there is any downside to this proposal. I think basic
> > > sanity CI testing on AMD processors will give an extra boost to our
> > > tests. This adds to developer productivity, and they have one less
> > > thing to worry about. Developers have spent time in the past where
> > > they had to manually test on AMD processors, MKLDNN being the
> > > recent instance. It's good to have those tests in the CI pipeline.
> > > All I see is benefit. If the $ cost is not too high for basic
> > > sanity testing, we should do this, until and unless some strong
> > > downside is called out.
> > >
> > > +1
> > >
> > > On 11/29/18, 5:37 PM, "Anirudh Subramanian" <[email protected]> wrote:
> > >
> > > Support for instruction set extensions like AVX2, AVX512, etc. can
> > > vary between AMD and Intel, and there can also be a time lag
> > > between when Intel supports it versus when AMD supports it.
> > > Also, in the future this setup may be useful in case MXNet supports
> > > AMD GPUs and AWS also happens to have support for it.
> > >
> > > Anirudh
> > >
> > > On Thu, Nov 29, 2018 at 4:29 PM Marco de Abreu <[email protected]> wrote:
> > >
> > > > I think it's worth a discussion to do a sanity check. While
> > > > generally these instructions are standardized, we also learned
> > > > from our experience with ARM that theory and reality sometimes
> > > > don't match. Thus, it's always good to check.
> > > >
> > > > In the next months we are going to refactor our slave creation
> > > > processes. Chance Bair has been working on rewriting Windows
> > > > slaves from scratch (we used images that haven't really been
> > > > updated for 2 years - we still don't know what was done on them)
> > > > and they'll be ready soon. In the following months, we will also
> > > > port our Ubuntu slaves to the new method (don't have a timeline
> > > > yet). Ideally, the integration of AMD instances will only be a
> > > > matter of running the same pipeline on a different instance type.
> > > > In that case, it should not be a big deal.
> > > >
> > > > If there are big differences, that's already a yellow flag for
> > > > compatibility, but that's unlikely. In that case, though, we
> > > > would have to do a more thorough time analysis and decide whether
> > > > it's worth the effort. Maybe somebody else could also lend us a
> > > > hand and help us with adding AMD support.
> > > >
> > > > -Marco
> > > >
> > > > On Fri, Nov 30, 2018, 01:22 Hao Jin <[email protected]> wrote:
> > > >
> > > > > f16c is also an instruction set supported by both brands'
> > > > > recent CPUs, just like x86, AVX, SSE etc., and any difference
> > > > > in behavior (highly unlikely, or it would be a major defect)
> > > > > would most likely be caused by the underlying hardware
> > > > > implementation, so still, adding AMD instances is not adding
> > > > > much value here.
> > > > > Hao
> > > > >
> > > > > On Thu, Nov 29, 2018 at 7:03 PM kellen sunderland <[email protected]> wrote:
> > > > >
> > > > > > Just looked at the mf16c work and wanted to mention Rahul
> > > > > > clearly _was_ thinking about AMD users in that PR.
> > > > > >
> > > > > > On Thu, Nov 29, 2018 at 3:46 PM kellen sunderland <[email protected]> wrote:
> > > > > >
> > > > > > > From my perspective we're developing a few features like
> > > > > > > mf16c and MKLDNN integration specifically for Intel CPUs.
> > > > > > > It wouldn't hurt to make sure those changes also run
> > > > > > > properly on AMD CPUs.
> > > > > > >
> > > > > > > On Thu, Nov 29, 2018, 3:38 PM Hao Jin <[email protected]> wrote:
> > > > > > >
> > > > > > >> I'm a bit confused about why we need extra functionality
> > > > > > >> tests just for AMD CPUs. Don't AMD CPUs support roughly
> > > > > > >> the same instruction sets as the Intel ones? In the very
> > > > > > >> unlikely case that something working on Intel CPUs does
> > > > > > >> not function on AMD CPUs (or vice versa), it would most
> > > > > > >> likely be related to the underlying hardware
> > > > > > >> implementation of the same ISA, for which we definitely
> > > > > > >> do not have a good solution. So I don't think performing
> > > > > > >> extra tests on the functional aspects of the system on
> > > > > > >> AMD CPUs adds any value.
> > > > > > >> Hao
> > > > > > >>
> > > > > > >> On Thu, Nov 29, 2018 at 5:50 PM Seth, Manu <[email protected]> wrote:
> > > > > > >>
> > > > > > >> > +1
> > > > > > >> >
> > > > > > >> > On 11/29/18, 2:39 PM, "Alex Zai" <[email protected]> wrote:
> > > > > > >> >
> > > > > > >> > What are people's thoughts on having AMD machines
> > > > > > >> > tested on the CI? AMD machines are now available on
> > > > > > >> > AWS.
> > > > > > >> >
> > > > > > >> > Best,
> > > > > > >> > Alex
