Re: Adding AMD CPU to CI

Pedro Larroy Fri, 30 Nov 2018 10:28:37 -0800

Agree with Tianqi and Hao. Adding AMD brings no value and increases complexity 
and CI cost. The instruction sets are the same. For benchmarking it might make 
sense, though.
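
To make the instruction-set argument in this thread concrete, here is a minimal sketch of how one could compare which extensions two hosts report before deciding whether a separate CI lane is warranted. It assumes x86 Linux, where /proc/cpuinfo contains a "flags" line; the two sample strings, the extension list, and all function names are hypothetical illustrations, not taken from the MXNet CI.

```python
from typing import Dict, List, Set

# Extensions mentioned in this thread, spelled the way Linux reports them
# in the "flags" line of /proc/cpuinfo (assumption: x86 Linux hosts).
EXTENSIONS = ("sse4_2", "avx", "avx2", "f16c", "avx512f")

# Hypothetical, heavily shortened cpuinfo excerpts used only for illustration.
HOST_A_SAMPLE = "model name\t: example cpu A\nflags\t\t: fpu sse4_2 avx avx2 f16c avx512f\n"
HOST_B_SAMPLE = "model name\t: example cpu B\nflags\t\t: fpu sse4_2 avx avx2 f16c\n"


def parse_cpu_flags(cpuinfo_text: str) -> Set[str]:
    """Return the feature flags from the first 'flags' line of cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()  # no 'flags' line found (e.g. non-x86 or empty input)


def supported_extensions(cpuinfo_text: str) -> Dict[str, bool]:
    """Map each extension of interest to whether the host reports it."""
    flags = parse_cpu_flags(cpuinfo_text)
    return {ext: ext in flags for ext in EXTENSIONS}


def differing_extensions(cpuinfo_a: str, cpuinfo_b: str) -> List[str]:
    """List the extensions that one host reports and the other does not."""
    a = supported_extensions(cpuinfo_a)
    b = supported_extensions(cpuinfo_b)
    return sorted(ext for ext in EXTENSIONS if a[ext] != b[ext])


def read_host_cpuinfo(path: str = "/proc/cpuinfo") -> str:
    """Read the real cpuinfo of the current host (Linux only)."""
    with open(path) as handle:
        return handle.read()


if __name__ == "__main__":
    print(differing_extensions(HOST_A_SAMPLE, HOST_B_SAMPLE))
```

If two instance types report the same set for the extensions the code base actually uses, they should execute the same machine code paths, which is the case Hao describes. A difference in the list would be the situation Anirudh raises.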


Pedro

> On 30. Nov 2018, at 18:19, Tianqi Chen <[email protected]> wrote:
> 
> I still think it is overkill to add AMD CPU to the CI, given the additional
> cost it could bring and little additional information we can get out from
> it.
> 
> A middle ground is to add AMD CPU to a nightly build or final sweep before
> release. If there is a case where we find that an AMD CPU really makes a
> difference, then we add it to the CI.
> 
> Tianqi
> 
>> On Thu, Nov 29, 2018 at 6:29 PM Hao Jin <[email protected]> wrote:
>> 
>> For CPUs, the supported instruction sets may also vary between the same
>> manufacturer's different product lines of the same generation (Skylake-SP
>> versus Skylake).
>> For the same instruction set, the two manufacturers should both have a
>> working version of the hardware implementation. If any of the
>> implementations does not work, then the chip would not even be considered
>> functioning properly.
>> If some AMD CPUs only support up to AVX2 instruction sets, they would just
>> function in the same way as an Intel CPU that supports up to AVX2
>> instruction sets. The performance may vary, but the capability and behavior
>> of the two chips would be the same when given the same machine code.
>> For AMD GPUs it's a totally different story, as AMD GPUs do not share the
>> same instruction sets with the NVIDIA ones, thus testing on AMD GPUs (if we
>> do have support for them) would definitely add value.
>> Hao
>> 
>> On Thu, Nov 29, 2018 at 8:37 PM Anirudh Subramanian <[email protected]>
>> wrote:
>> 
>>> Support for instruction set extensions like AVX2, AVX512, etc. can vary
>>> between AMD and Intel, and there can also be a time lag between when Intel
>>> supports an extension and when AMD supports it.
>>> Also, in the future this setup may be useful in case MXNet supports AMD
>>> GPUs and AWS also happens to have support for them.
>>> 
>>> Anirudh
>>> 
>>> 
>>> On Thu, Nov 29, 2018 at 4:29 PM Marco de Abreu
>>> <[email protected]> wrote:
>>> 
>>>> I think it's worth a discussion to do a sanity check. While these
>>>> instructions are generally standardized, our experience with ARM showed
>>>> that theory and reality sometimes don't match. Thus, it's always good to
>>>> check.
>>>> 
>>>> In the coming months we are going to refactor our slave creation
>>>> processes. Chance Bair has been working on rewriting the Windows slaves
>>>> from scratch (we used images that haven't really been updated for 2 years,
>>>> and we still don't know what was done on them) and they will be ready
>>>> soon. In the following months, we will also port our Ubuntu slaves to the
>>>> new method (we don't have a timeline yet). Ideally, the integration of AMD
>>>> instances will only be a matter of running the same pipeline on a
>>>> different instance type. In that case, it should not be a big deal.
>>>> 
>>>> If there are big differences, that's already a yellow flag for
>>>> compatibility, but that's unlikely. In that case, we would have to
>>>> analyze the time required more thoroughly and decide whether it's worth
>>>> the effort. Maybe somebody else could also lend us a hand and help us
>>>> with adding AMD support.
>>>> 
>>>> -Marco
>>>> 
>>>> On Fri, 30 Nov 2018, 01:22, Hao Jin <[email protected]> wrote:
>>>> 
>>>>> f16c is also an instruction set supported by both brands' recent CPUs,
>>>>> just like x86, AVX, SSE etc., and any difference in behavior (very
>>>>> unlikely to happen, or it would be a major defect) would most likely be
>>>>> caused by the underlying hardware implementation, so still, adding AMD
>>>>> instances is not adding much value here.
>>>>> Hao
>>>>> 
>>>>> On Thu, Nov 29, 2018 at 7:03 PM kellen sunderland <
>>>>> [email protected]> wrote:
>>>>> 
>>>>>> Just looked at the mf16c work and wanted to mention Rahul clearly _was_
>>>>>> thinking about AMD users in that PR.
>>>>>> 
>>>>>> On Thu, Nov 29, 2018 at 3:46 PM kellen sunderland <
>>>>>> [email protected]> wrote:
>>>>>> 
>>>>>>> From my perspective we're developing a few features like mf16c and
>>>>>>> MKLDNN integration specifically for Intel CPUs.  It wouldn't hurt to
>>>>>>> make sure those changes also run properly on AMD CPUs.
>>>>>>> 
>>>>>>> On Thu, Nov 29, 2018, 3:38 PM Hao Jin <[email protected]> wrote:
>>>>>>> 
>>>>>>>> I'm a bit confused about why we need extra functionality tests just
>>>>>>>> for AMD CPUs. Don't AMD CPUs support roughly the same instruction sets
>>>>>>>> as the Intel ones? In the very unlikely case that something working on
>>>>>>>> Intel CPUs does not function on AMD CPUs (or vice versa), it would
>>>>>>>> most likely be related to the underlying hardware implementation of
>>>>>>>> the same ISA, for which we definitely do not have a good solution. So
>>>>>>>> I don't think performing extra tests on the functional aspects of the
>>>>>>>> system on AMD CPUs adds any value.
>>>>>>>> Hao
>>>>>>>> 
>>>>>>>> On Thu, Nov 29, 2018 at 5:50 PM Seth, Manu <[email protected]>
>>>>>>>> wrote:
>>>>>>>> 
>>>>>>>>> +1
>>>>>>>>> 
>>>>>>>>> On 11/29/18, 2:39 PM, "Alex Zai" <[email protected]> wrote:
>>>>>>>>> 
>>>>>>>>>    What are people's thoughts on having AMD machines tested on the
>>>>>>>>>    CI? AMD machines are now available on AWS.
>>>>>>>>> 
>>>>>>>>>    Best,
>>>>>>>>>    Alex
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>> 
>>>>>>> 
>>>>>> 
>>>>> 
>>>> 
>>> 
>>
