I still think it is overkill to add AMD CPUs to the CI, given the additional cost it could bring and how little additional information we would get out of it.
A middle ground is to add AMD CPUs to a nightly build or to a final sweep before release. If we find a case where an AMD CPU really makes a difference, then we add it to the CI.

Tianqi

On Thu, Nov 29, 2018 at 6:29 PM Hao Jin <[email protected]> wrote:

> For CPUs, the supported instruction sets may also vary between the same
> manufacturer's different product lines of the same generation (Skylake-SP
> versus Skylake).
> For the same instruction set, the two manufacturers should both have a
> working version of the hardware implementation. If any of the
> implementations does not work, then the chip would not even be considered
> to be functioning properly.
> If some AMD CPUs only support up to AVX2 instruction sets, they would just
> function in the same way as an Intel CPU that supports up to AVX2
> instruction sets. The performance may vary, but the capability and behavior
> of the two chips would be the same when given the same machine code.
> For AMD GPUs it's a totally different story, as AMD GPUs do not share the
> same instruction sets with the NVIDIA ones, thus testing on AMD GPUs (if we
> do have support for them) would definitely add value.
> Hao
>
> On Thu, Nov 29, 2018 at 8:37 PM Anirudh Subramanian <[email protected]>
> wrote:
>
> > Support for instruction set extensions like AVX2, AVX512 etc. can vary
> > between AMD and Intel, and there can also be a time lag between when Intel
> > supports an extension and when AMD supports it.
> > Also, in the future this setup may be useful in case MXNet supports AMD
> > GPUs and AWS also happens to have support for it.
> >
> > Anirudh
> >
> >
> > On Thu, Nov 29, 2018 at 4:29 PM Marco de Abreu
> > <[email protected]> wrote:
> >
> > > I think it's worth a discussion to do a sanity check. While generally
> > > these instructions are standardized, our experience with ARM showed that
> > > theory and reality sometimes don't match. Thus, it's always good to
> > > check.
> > >
> > > In the next months we are going to refactor our slave creation processes.
> > > Chance Bair has been working on rewriting Windows slaves from scratch (we
> > > used images that haven't really been updated for 2 years - we still don't
> > > know what was done on them) and they'll be ready soon. In the following
> > > months, we will also port our Ubuntu slaves to the new method (don't have
> > > a timeline yet). Ideally, the integration of AMD instances will only be a
> > > matter of running the same pipeline on a different instance type. In that
> > > case, it should not be a big deal.
> > >
> > > If there are big differences, that's already a yellow flag for
> > > compatibility, but that's unlikely. But in that case, we would have to do
> > > a more thorough time analysis and decide whether it's worth the effort.
> > > Maybe somebody else could also lend us a hand and help us with adding AMD
> > > support.
> > >
> > > -Marco
> > >
> > > On Fri, Nov 30, 2018, 01:22 Hao Jin <[email protected]> wrote:
> > >
> > > > f16c is also an instruction set supported by both brands' recent CPUs
> > > > just like x86, AVX, SSE etc., and any difference in behavior (very
> > > > unlikely to happen, or it would be a major defect) would most likely be
> > > > caused by the underlying hardware implementation, so still, adding AMD
> > > > instances is not adding much value here.
> > > > Hao
> > > >
> > > > On Thu, Nov 29, 2018 at 7:03 PM kellen sunderland <
> > > > [email protected]> wrote:
> > > >
> > > > > Just looked at the mf16c work and wanted to mention Rahul clearly _was_
> > > > > thinking about AMD users in that PR.
> > > > >
> > > > > On Thu, Nov 29, 2018 at 3:46 PM kellen sunderland <
> > > > > [email protected]> wrote:
> > > > >
> > > > > > From my perspective we're developing a few features like mf16c and
> > > > > > MKLDNN integration specifically for Intel CPUs.
> > > > > > It wouldn't hurt to make sure those changes also run properly on
> > > > > > AMD CPUs.
> > > > > >
> > > > > > On Thu, Nov 29, 2018, 3:38 PM Hao Jin <[email protected]> wrote:
> > > > > >
> > > > > >> I'm a bit confused about why we need extra functionality tests just
> > > > > >> for AMD CPUs. Don't AMD CPUs support roughly the same instruction
> > > > > >> sets as the Intel ones? In the very unlikely case that something
> > > > > >> working on Intel CPUs does not function on AMD CPUs (or vice versa),
> > > > > >> it would most likely be related to the underlying hardware
> > > > > >> implementation of the same ISA, for which we definitely do not have
> > > > > >> a good solution. So I don't think performing extra tests on the
> > > > > >> functional aspect of the system on AMD CPUs adds any value.
> > > > > >> Hao
> > > > > >>
> > > > > >> On Thu, Nov 29, 2018 at 5:50 PM Seth, Manu <[email protected]>
> > > > > >> wrote:
> > > > > >>
> > > > > >> > +1
> > > > > >> >
> > > > > >> > On 11/29/18, 2:39 PM, "Alex Zai" <[email protected]> wrote:
> > > > > >> >
> > > > > >> >     What are people's thoughts on having AMD machines tested on
> > > > > >> >     the CI? AMD machines are now available on AWS.
> > > > > >> >
> > > > > >> >     Best,
> > > > > >> >     Alex
