+1 for running this nightly or in a pre-release suite, but not in the CI that is triggered on every test run. The best engineering practice is not to add things, but to remove things until there is nothing left to remove.
In terms of MKLDNN, since it is an Intel product, I doubt that optimizing for AMD CPUs is its goal, so adding CI to guard against backward compatibility issues is a bit of an overkill. AMD CPU users would likely disable this feature and use the original CPU version of the project.

At least we can contribute to reducing the carbon footprint and slowing down global warming :)

Tianqi

On Fri, Nov 30, 2018 at 9:38 AM kellen sunderland <[email protected]> wrote:

> Regarding cost, yes we could run this nightly or simply make it run an
> existing test suite that would make sense rather than having it duplicate a
> suite.
>
> On Fri, Nov 30, 2018 at 9:26 AM Kumar, Vikas <[email protected]>
> wrote:
>
> > I don't think there is any downside to this proposal. I think a basic
> > sanity CI testing on AMD processors will give extra boost to our tests.
> > This adds to developer productivity and they have one less thing to worry
> > about. Developers have spent time in the past where they had to manually
> > test on AMD processors, MKLDNN being the recent instance. It's good to
> > have those tests in the CI pipeline.
> > All I see is benefit. If the $ cost is not too high for basic sanity
> > testing, we should do this, until and unless some strong downside is
> > called out.
> >
> > +1
> >
> >
> > On 11/29/18, 5:37 PM, "Anirudh Subramanian" <[email protected]>
> > wrote:
> >
> > Instruction set extensions support like AVX2, AVX512 etc. can vary
> > between AMD and Intel and there can also be a time lag between when
> > Intel supports it versus when AMD supports it.
> > Also, in the future this setup may be useful in case MXNet supports
> > AMD GPUs and AWS also happens to have support for it.
> >
> > Anirudh
> >
> >
> > On Thu, Nov 29, 2018 at 4:29 PM Marco de Abreu
> > <[email protected]> wrote:
> >
> > > I think it's worth a discussion to do a sanity check.
> > > While generally these instructions are standardized, we also learned
> > > from our experience with ARM that the theory and reality sometimes
> > > don't match. Thus, it's always good to check.
> > >
> > > In the next months we are going to refactor our slave creation
> > > processes. Chance Bair has been working on rewriting Windows slaves from
> > > scratch (we used images that haven't really been updated for 2 years -
> > > we still don't know what was done on them) and they'll be ready soon. In
> > > the following months, we will also port our Ubuntu slaves to the new
> > > method (don't have a timeline yet). Ideally, the integration of AMD
> > > instances will only be a matter of running the same pipeline on a
> > > different instance type. In that case, it should not be a big deal.
> > >
> > > If there are big differences, that's already a yellow flag for
> > > compatibility, but that's unlikely. But in that case, we would have to
> > > make a more thorough time analysis and decide whether it's worth the
> > > effort. Maybe somebody else could also lend us a hand and help us with
> > > adding AMD support.
> > >
> > > -Marco
> > >
> > > On Fri, Nov 30, 2018, 01:22 Hao Jin <[email protected]> wrote:
> > >
> > > > f16c is also an instruction set supported by both brands' recent CPUs
> > > > just like x86, AVX, SSE etc., and any difference in behaviors (quite
> > > > impossible to happen or it will be a major defect) would most likely
> > > > be caused by the underlying hardware implementation, so still, adding
> > > > AMD instances is not adding much value here.
> > > > Hao
> > > >
> > > > On Thu, Nov 29, 2018 at 7:03 PM kellen sunderland <
> > > > [email protected]> wrote:
> > > >
> > > > > Just looked at the mf16c work and wanted to mention Rahul clearly
> > > > > _was_ thinking about AMD users in that PR.
> > > > >
> > > > > On Thu, Nov 29, 2018 at 3:46 PM kellen sunderland <
> > > > > [email protected]> wrote:
> > > > >
> > > > > > From my perspective we're developing a few features like mf16c and
> > > > > > MKLDNN integration specifically for Intel CPUs. It wouldn't hurt
> > > > > > to make sure those changes also run properly on AMD cpus.
> > > > > >
> > > > > > On Thu, Nov 29, 2018, 3:38 PM Hao Jin <[email protected]> wrote:
> > > > > >
> > > > > >> I'm a bit confused about why we need extra functionality tests
> > > > > >> just for AMD CPUs, aren't AMD CPUs supporting roughly the same
> > > > > >> instruction sets as the Intel ones? In the very unlikely case
> > > > > >> that something working on Intel CPUs does not function on AMD
> > > > > >> CPUs (or vice versa), it would most likely be related to the
> > > > > >> underlying hardware implementation of the same ISA, to which we
> > > > > >> definitely do not have a good solution. So I don't think
> > > > > >> performing extra tests on the functional aspect of the system on
> > > > > >> AMD CPUs is adding any value.
> > > > > >> Hao
> > > > > >>
> > > > > >> On Thu, Nov 29, 2018 at 5:50 PM Seth, Manu <[email protected]>
> > > > > >> wrote:
> > > > > >>
> > > > > >> > +1
> > > > > >> >
> > > > > >> > On 11/29/18, 2:39 PM, "Alex Zai" <[email protected]> wrote:
> > > > > >> >
> > > > > >> > What are people's thoughts on having AMD machines tested on
> > > > > >> > the CI? AMD machines are now available on AWS.
> > > > > >> >
> > > > > >> > Best,
> > > > > >> > Alex
