Agree with Tianqi and Hao. Adding AMD brings no value and increases complexity and CI cost. The instruction sets are the same. For benchmarking it might make sense, though.
Pedro

> On 30. Nov 2018, at 18:19, Tianqi Chen <tqc...@cs.washington.edu> wrote:
>
> I still think it is overkill to add AMD CPU to the CI, given the additional cost it could bring and how little additional information we can get out of it.
>
> A middle ground is to add AMD CPU to a nightly build or a final sweep before release. If we find a case where AMD CPU really makes a difference, then we add it to the CI.
>
> Tianqi
>
>> On Thu, Nov 29, 2018 at 6:29 PM Hao Jin <hjjn.a...@gmail.com> wrote:
>>
>> For CPUs, the supported instruction sets may also vary between the same manufacturer's different product lines of the same generation (Skylake-SP versus Skylake).
>> For the same instruction set, the two manufacturers should both have a working version of the hardware implementation. If either implementation does not work, then the chip would not even be considered to be functioning properly.
>> If some AMD CPUs only support up to AVX2 instruction sets, they would just function in the same way as an Intel CPU that supports up to AVX2 instruction sets. The performance may vary, but the capability and behavior of the two chips would be the same when given the same machine code.
>> For AMD GPUs it's a totally different story, as AMD GPUs do not share the same instruction sets with the NVIDIA ones, thus testing on AMD GPUs (if we do have support for them) would definitely add value.
>> Hao
>>
>> On Thu, Nov 29, 2018 at 8:37 PM Anirudh Subramanian <anirudh2...@gmail.com> wrote:
>>
>>> Support for instruction set extensions like AVX2, AVX512 etc. can vary between AMD and Intel, and there can also be a time lag between when Intel supports an extension and when AMD supports it.
>>> Also, in the future this setup may be useful in case MXNet supports AMD GPUs and AWS also happens to have support for them.
>>>
>>> Anirudh
>>>
>>> On Thu, Nov 29, 2018 at 4:29 PM Marco de Abreu <marco.g.ab...@googlemail.com.invalid> wrote:
>>>
>>>> I think it's worth a discussion to do a sanity check. While these instructions are generally standardized, our experience with ARM also showed that theory and reality sometimes don't match. Thus, it's always good to check.
>>>>
>>>> In the next months we are going to refactor our slave creation processes. Chance Bair has been working on rewriting Windows slaves from scratch (we used images that haven't really been updated for 2 years - we still don't know what was done on them) and they will be ready soon. In the following months, we will also port our Ubuntu slaves to the new method (we don't have a timeline yet). Ideally, the integration of AMD instances will only be a matter of running the same pipeline on a different instance type. In that case, it should not be a big deal.
>>>>
>>>> If there are big differences, that's already a yellow flag for compatibility, but that's unlikely. In that case, though, we would have to do a more thorough time analysis and decide whether it's worth the effort. Maybe somebody else could also lend us a hand and help us with adding AMD support.
>>>>
>>>> -Marco
>>>>
>>>> On Fri, Nov 30, 2018, 01:22 Hao Jin <hjjn.a...@gmail.com> wrote:
>>>>
>>>>> f16c is also an instruction set supported by both brands' recent CPUs, just like x86, AVX, SSE etc., and any difference in behavior (very unlikely to happen, or it would be a major defect) would most likely be caused by the underlying hardware implementation, so still, adding AMD instances is not adding much value here.
>>>>> Hao
>>>>>
>>>>> On Thu, Nov 29, 2018 at 7:03 PM kellen sunderland <kellen.sunderl...@gmail.com> wrote:
>>>>>
>>>>>> Just looked at the mf16c work and wanted to mention Rahul clearly _was_ thinking about AMD users in that PR.
>>>>>>
>>>>>> On Thu, Nov 29, 2018 at 3:46 PM kellen sunderland <kellen.sunderl...@gmail.com> wrote:
>>>>>>
>>>>>>> From my perspective we're developing a few features like mf16c and MKLDNN integration specifically for Intel CPUs. It wouldn't hurt to make sure those changes also run properly on AMD CPUs.
>>>>>>>
>>>>>>> On Thu, Nov 29, 2018, 3:38 PM Hao Jin <hjjn.a...@gmail.com> wrote:
>>>>>>>
>>>>>>>> I'm a bit confused about why we need extra functionality tests just for AMD CPUs. Don't AMD CPUs support roughly the same instruction sets as the Intel ones? In the very unlikely case that something working on Intel CPUs does not function on AMD CPUs (or vice versa), it would most likely be related to the underlying hardware implementation of the same ISA, for which we definitely do not have a good solution. So I don't think performing extra functional tests of the system on AMD CPUs adds any value.
>>>>>>>> Hao
>>>>>>>>
>>>>>>>> On Thu, Nov 29, 2018 at 5:50 PM Seth, Manu <seth...@amazon.com.invalid> wrote:
>>>>>>>>
>>>>>>>>> +1
>>>>>>>>>
>>>>>>>>> On 11/29/18, 2:39 PM, "Alex Zai" <aza...@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>     What are people's thoughts on having AMD machines tested on the CI? AMD machines are now available on AWS.
>>>>>>>>>
>>>>>>>>>     Best,
>>>>>>>>>     Alex