I think just adding AMD is not the right abstraction level. Testing and benchmarking with different CPU flags / -march targets (e.g. AVX2, SSE2) brings value in my opinion. Just testing another vendor of a compatible CPU doesn't.
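
To make the "flags rather than vendor" idea concrete, here is a minimal sketch of classifying a CI host by the instruction-set flags it reports instead of by who manufactured it. It assumes an x86 Linux host that exposes a `flags` line in /proc/cpuinfo; the helper names (`parse_cpu_flags`, `missing_flags`, `read_host_cpuinfo`) and the default flag list are hypothetical and not part of MXNet or its CI.

```python
# Hypothetical helpers: describe a host by CPU feature flags, not by vendor.
# Assumes x86 Linux, where /proc/cpuinfo contains a "flags : ..." line.


def parse_cpu_flags(cpuinfo_text):
    """Return the set of feature flags found in /proc/cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    # No "flags" line (e.g. non-x86 host or empty input).
    return set()


def missing_flags(cpuinfo_text, required=("sse2", "avx2", "f16c")):
    """Return the required flags that the host does not report, sorted."""
    have = parse_cpu_flags(cpuinfo_text)
    return sorted(flag for flag in required if flag not in have)


def read_host_cpuinfo(path="/proc/cpuinfo"):
    """Read the local cpuinfo text, or return '' if it is unavailable."""
    try:
        with open(path) as handle:
            return handle.read()
    except OSError:
        return ""


if __name__ == "__main__":
    print("missing flags:", missing_flags(read_host_cpuinfo()))
```

With something like this, a test matrix could be keyed on flag sets (SSE2 only, up to AVX2, with F16C, ...) and any machine reporting the right flags would qualify for a given cell, whichever vendor made it.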
Pedro

> On 30. Nov 2018, at 19:32, kellen sunderland <[email protected]> wrote:
>
> Damn, knew I should have double-checked! Oh well it's also carbon neutral.
>
> On Fri, Nov 30, 2018 at 10:27 AM Pedro Larroy <[email protected]> wrote:
>
>> Agree with Tianqi and Hao. Adding AMD brings no value and increases complexity and CI cost. The instruction sets are the same. For benchmarking it might make sense though.
>>
>> Pedro
>>
>>> On 30. Nov 2018, at 18:19, Tianqi Chen <[email protected]> wrote:
>>>
>>> I still think it is overkill to add AMD CPU to the CI, given the additional cost it could bring and little additional information we can get out from it.
>>>
>>> A middle ground is to add AMD CPU to a nightly build or final sweep before release. If there is a case that we find that AMD CPU really makes a difference, then we add it to the CI.
>>>
>>> Tianqi
>>>
>>>> On Thu, Nov 29, 2018 at 6:29 PM Hao Jin <[email protected]> wrote:
>>>>
>>>> For CPUs, the supported instruction sets may also vary between the same manufacturer's different product lines of the same generation (Skylake-SP versus Skylake).
>>>> For the same instruction set, the two manufacturers should both have a working version of the hardware implementation. If any of the implementations does not work, then the chip would not even be considered functioning properly.
>>>> If some AMD CPUs only support up to AVX2 instruction sets, they would just function in the same way as an Intel CPU that supports up to AVX2 instruction sets. The performance may vary, but the capability and behavior of the two chips would be the same when given the same machine code.
>>>> For AMD GPUs it's a totally different story, as AMD GPUs do not share the same instruction sets with the NVIDIA ones, thus testing on AMD GPUs (if we do have support for them) would definitely add value.
>>>> Hao
>>>>
>>>> On Thu, Nov 29, 2018 at 8:37 PM Anirudh Subramanian <[email protected]> wrote:
>>>>
>>>>> Instruction set extension support like AVX2, AVX512 etc. can vary between AMD and Intel, and there can also be a time lag between when Intel supports it versus when AMD supports it.
>>>>> Also, in the future this setup may be useful in case MXNet supports AMD GPUs and AWS also happens to have support for it.
>>>>>
>>>>> Anirudh
>>>>>
>>>>> On Thu, Nov 29, 2018 at 4:29 PM Marco de Abreu <[email protected]> wrote:
>>>>>
>>>>>> I think it's worth a discussion to do a sanity check. While generally these instructions are standardized, our experience with ARM was that theory and reality sometimes don't match. Thus, it's always good to check.
>>>>>>
>>>>>> In the next months we are going to refactor our slave creation processes. Chance Bair has been working on rewriting Windows slaves from scratch (we used images that haven't really been updated for 2 years - we still don't know what was done on them) and they're ready soon. In the following months, we will also port our Ubuntu slaves to the new method (don't have a timeline yet). Ideally, the integration of AMD instances will only be a matter of running the same pipeline on a different instance type. In that case, it should not be a big deal.
>>>>>>
>>>>>> If there are big differences, that's already a yellow flag for compatibility, but that's unlikely. But in that case, we would have to make a more thorough time analysis and decide whether it's worth the effort. Maybe somebody else could also lend us a hand and help us with adding AMD support.
>>>>>>
>>>>>> -Marco
>>>>>>
>>>>>> On Fri, Nov 30, 2018, 01:22, Hao Jin <[email protected]> wrote:
>>>>>>
>>>>>>> f16c is also an instruction set supported by both brands' recent CPUs just like x86, AVX, SSE etc., and any difference in behavior (very unlikely to happen, or it would be a major defect) would most likely be caused by the underlying hardware implementation, so still, adding AMD instances is not adding much value here.
>>>>>>> Hao
>>>>>>>
>>>>>>> On Thu, Nov 29, 2018 at 7:03 PM kellen sunderland <[email protected]> wrote:
>>>>>>>
>>>>>>>> Just looked at the mf16c work and wanted to mention Rahul clearly _was_ thinking about AMD users in that PR.
>>>>>>>>
>>>>>>>> On Thu, Nov 29, 2018 at 3:46 PM kellen sunderland <[email protected]> wrote:
>>>>>>>>
>>>>>>>>> From my perspective we're developing a few features like mf16c and MKLDNN integration specifically for Intel CPUs. It wouldn't hurt to make sure those changes also run properly on AMD CPUs.
>>>>>>>>>
>>>>>>>>> On Thu, Nov 29, 2018, 3:38 PM Hao Jin <[email protected]> wrote:
>>>>>>>>>
>>>>>>>>>> I'm a bit confused about why we need extra functionality tests just for AMD CPUs. Aren't AMD CPUs supporting roughly the same instruction sets as the Intel ones? In the very unlikely case that something working on Intel CPUs does not function on AMD CPUs (or vice versa), it would most likely be related to the underlying hardware implementation of the same ISA, for which we definitely do not have a good solution. So I don't think performing extra tests on the functional aspect of the system on AMD CPUs is adding any value.
>>>>>>>>>> Hao
>>>>>>>>>>
>>>>>>>>>> On Thu, Nov 29, 2018 at 5:50 PM Seth, Manu <[email protected]> wrote:
>>>>>>>>>>
>>>>>>>>>>> +1
>>>>>>>>>>>
>>>>>>>>>>> On 11/29/18, 2:39 PM, "Alex Zai" <[email protected]> wrote:
>>>>>>>>>>>
>>>>>>>>>>>     What are people's thoughts on having AMD machines tested on the CI? AMD machines are now available on AWS.
>>>>>>>>>>>
>>>>>>>>>>>     Best,
>>>>>>>>>>>     Alex
