On 05.04.2017 15:16, Ola Liljedahl wrote:
> On 05/04/2017, 12:36, "Dmitry Eremin-Solenikov"
> <dmitry.ereminsoleni...@linaro.org> wrote:
> 
>> On 05.04.2017 02:31, Ola Liljedahl wrote:
>>> On 05/04/2017, 01:25, "Dmitry Eremin-Solenikov"
>>> <dmitry.ereminsoleni...@linaro.org> wrote:
>>>> On 04.04.2017 23:52, Ola Liljedahl wrote:
>>>>> Sending from my ARM email account, I hope Outlook does not mess up the
>>>>> format.
>>>>>
>>>>>
>>>>>
>>>>> On 04/04/2017, 22:21, "Dmitry Eremin-Solenikov"
>>>>> <dmitry.ereminsoleni...@linaro.org> wrote:
>>>>>
>>>>>> On 04.04.2017 21:48, Brian Brooks wrote:
>>>>>>> Signed-off-by: Ola Liljedahl <ola.liljed...@arm.com>
>>>>>>> Reviewed-by: Brian Brooks <brian.bro...@arm.com>
>>>>>>> Reviewed-by: Honnappa Nagarahalli <honnappa.nagaraha...@arm.com>
>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> +/******************************************************************************
>>>>>>> + * bitset abstract data type
>>>>>>> + *****************************************************************************/
>>>>>>> +/* This could be a struct of scalars to support larger bit sets */
>>>>>>> +
>>>>>>> +#if ATOM_BITSET_SIZE <= 32
>>>>>>
>>>>>> Maybe I missed, where did you set this macro?
>>>>> In odp_config_internal.h
>>>>> It is a build time configuration.
>>>>>
>>>>>>
>>>>>> Also, why do you need several versions of bitset? Can you stick to one
>>>>>> size that fits all?
>>>>> Some 32-bit archs (ARMv7a, x86) will only support 64-bit atomics (AFAIK).
>>>>> Only x86-64 and ARMv8a support 128-bit atomics (and compiler support for
>>>>> 128-bit atomics for ARMv8a is a bit lacking...).
>>>>> Other architectures might only support 32-bit atomic operations.
>>>>
>>>> What will be the major outcome of settling on the 64-bit atomics?
>>> The size of the bitset determines the maximum number of threads, the
>>> maximum number of scheduler groups and the maximum number of reorder
>>> contexts (per thread).
>>
>> Then even 128 can become too small in the forthcoming future. As far as
>> I understand, most of the interesting things happen around
>> bitsetting/clearing. Maybe we can redefine bitset as a struct or array
>> of atomics? Then it would be expandable without significant software
>> issues, wouldn't it?
>>
>> I'm trying to get away from a situation where we have overcomplicated
>> low-level code that causes different issues on other platforms (like
>> supporting this number of threads on ARM and that number of threads on
>> x86/PPC/MIPS/etc).
> I think the current implementation is simple and efficient. I also think it
> is sufficiently capable, e.g. supports up to 128 threads/scheduler groups
> etc.

With 96 cores on existing boards, 128 seems like quite a tight limit.

> on 64-bit ARM and x86, up to 64 on 32-bit ARM/x86 and 64-bit MIPS. I don't
> think we should make a more complicated generic implementation until the
> need has surfaced. It is easy to over-speculate in what will be required in
> the future and implement stuff that is never used.

It is already overcomplicated. It is a nice scientific solution, and it
might be high performance, but it is a bit too complicated for generic
code. I have the feeling that it could find its way into odp-cloud, but
for odp/linux-generic we need (IMO) simple code initially.
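
To make the "struct or array of atomics" idea quoted above a bit more
concrete, here is a rough sketch using C11 <stdatomic.h>. It is not taken
from the patch: all names (bitset_t, BITSET_BITS, bitset_set and so on) and
the choice of 32-bit words are assumptions made purely for illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names and sizes; none of these appear in the patch. */
#define BITSET_BITS  256u /* can grow without needing wider atomics */
#define WORD_BITS    32u  /* assumes 32-bit atomics are available */
#define BITSET_WORDS ((BITSET_BITS + WORD_BITS - 1u) / WORD_BITS)

typedef struct {
	_Atomic uint32_t w[BITSET_WORDS];
} bitset_t;

/* Clear all bits; call before the set is shared between threads. */
static inline void bitset_init(bitset_t *bs)
{
	for (unsigned i = 0; i < BITSET_WORDS; i++)
		atomic_init(&bs->w[i], 0);
}

/* Atomically set one bit; returns true if the bit was previously clear. */
static inline bool bitset_set(bitset_t *bs, unsigned bit)
{
	assert(bit < BITSET_BITS);
	uint32_t mask = UINT32_C(1) << (bit % WORD_BITS);
	uint32_t old = atomic_fetch_or_explicit(&bs->w[bit / WORD_BITS], mask,
						memory_order_acq_rel);
	return (old & mask) == 0;
}

/* Atomically clear one bit; returns true if the bit was previously set. */
static inline bool bitset_clr(bitset_t *bs, unsigned bit)
{
	assert(bit < BITSET_BITS);
	uint32_t mask = UINT32_C(1) << (bit % WORD_BITS);
	uint32_t old = atomic_fetch_and_explicit(&bs->w[bit / WORD_BITS], ~mask,
						 memory_order_acq_rel);
	return (old & mask) != 0;
}

/* Read one bit; only the word containing it is loaded atomically. */
static inline bool bitset_is_set(bitset_t *bs, unsigned bit)
{
	assert(bit < BITSET_BITS);
	uint32_t mask = UINT32_C(1) << (bit % WORD_BITS);
	return (atomic_load_explicit(&bs->w[bit / WORD_BITS],
				     memory_order_acquire) & mask) != 0;
}
```

The trade-off is that only single-bit set/clear/test operations stay
atomic. Anything that needs a consistent view of the whole set (for
example, finding the first set bit across words) no longer sees a single
atomic snapshot, which the scalar version does provide.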


-- 
With best wishes
Dmitry
