On 05/04/2017, 12:36, "Dmitry Eremin-Solenikov"
<[email protected]> wrote:

>On 05.04.2017 02:31, Ola Liljedahl wrote:
>> On 05/04/2017, 01:25, "Dmitry Eremin-Solenikov"
>> <[email protected]> wrote:
>>> On 04.04.2017 23:52, Ola Liljedahl wrote:
>>>> Sending from my ARM email account, I hope Outlook does not mess up the
>>>> format.
>>>>
>>>>
>>>>
>>>> On 04/04/2017, 22:21, "Dmitry Eremin-Solenikov"
>>>> <[email protected]> wrote:
>>>>
>>>>> On 04.04.2017 21:48, Brian Brooks wrote:
>>>>>> Signed-off-by: Ola Liljedahl <[email protected]>
>>>>>> Reviewed-by: Brian Brooks <[email protected]>
>>>>>> Reviewed-by: Honnappa Nagarahalli <[email protected]>
>>>>>
>>>>>>
>>>>>> +/******************************************************************************
>>>>>> + * bitset abstract data type
>>>>>> + *****************************************************************************/
>>>>>> +/* This could be a struct of scalars to support larger bit sets */
>>>>>> +
>>>>>> +#if ATOM_BITSET_SIZE <= 32
>>>>>
>>>>> Maybe I missed, where did you set this macro?
>>>> In odp_config_internal.h
>>>> It is a build time configuration.
>>>>
>>>>>
>>>>> Also, why do you need several versions of bitset? Can you stick to
>>>>> one size that fits all?
>>>> Some 32-bit archs (ARMv7a, x86) will only support atomics up to
>>>> 64 bits (AFAIK).
>>>> Only x86-64 and ARMv8a support 128-bit atomics (and compiler support
>>>> for 128-bit atomics on ARMv8a is a bit lacking...).
>>>> Other architectures might only support 32-bit atomic operations.
>>>
>>> What will be the major outcome of settling on the 64-bit atomics?
>> The size of the bitset determines the maximum number of threads, the
>> maximum number of scheduler groups and the maximum number of reorder
>> contexts (per thread).
>
>Then even 128 can become too small in the forthcoming future. As far as
>I understand, most of the interesting things happen around
>bitsetting/clearing. Maybe we can redefine bitset as a struct or array
>of atomics? Then it would be expandable without significant software
>issues, wouldn't it?
>
>I'm trying to get away from a situation where we have overcomplicated
>low-level code, which brings different issues on further platforms (like
>supporting this amount of threads on ARM and that amount of threads on
>x86/PPC/MIPS/etc).
I think the current implementation is simple and efficient. I also think it
is sufficiently capable, e.g. it supports up to 128 threads/scheduler groups
etc. on 64-bit ARM and x86, and up to 64 on 32-bit ARM/x86 and 64-bit MIPS.
I don't think we should make a more complicated generic implementation until
the need has surfaced. It is easy to over-speculate about what will be
required in the future and implement stuff that is never used.
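
For illustration, the build-time size selection discussed here could look roughly like the sketch below. It is not the code from the patch: it covers only the 32- and 64-bit variants, uses C11 <stdatomic.h> rather than GCC builtins, and every name apart from ATOM_BITSET_SIZE and bitset_andn (bitset_t, atom_bitset_t, bitset_mask, atom_bitset_set and so on) is an assumption made for the example.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed default for this sketch; per the thread, the real value is a
 * build-time setting in odp_config_internal.h. */
#ifndef ATOM_BITSET_SIZE
#define ATOM_BITSET_SIZE 64
#endif

/* Pick the scalar word that backs the bitset (hypothetical type names). */
#if ATOM_BITSET_SIZE <= 32
typedef uint32_t bitset_t;
#elif ATOM_BITSET_SIZE <= 64
typedef uint64_t bitset_t;
#else
#error "this sketch covers only the 32- and 64-bit variants"
#endif

typedef _Atomic(bitset_t) atom_bitset_t;

/* Mask with only the given bit set. */
static inline bitset_t bitset_mask(uint32_t bit)
{
	assert(bit < ATOM_BITSET_SIZE);
	return (bitset_t)1 << bit;
}

/* Non-atomic helper: bits of a that are not in b. */
static inline bitset_t bitset_andn(bitset_t a, bitset_t b)
{
	return a & ~b;
}

static inline bool bitset_is_set(bitset_t bs, uint32_t bit)
{
	return (bs & bitset_mask(bit)) != 0;
}

/* Atomically set one bit with a single read-modify-write. */
static inline void atom_bitset_set(atom_bitset_t *bs, uint32_t bit)
{
	atomic_fetch_or_explicit(bs, bitset_mask(bit), memory_order_acq_rel);
}

/* Atomically clear one bit with a single read-modify-write. */
static inline void atom_bitset_clr(atom_bitset_t *bs, uint32_t bit)
{
	atomic_fetch_and_explicit(bs, (bitset_t)~bitset_mask(bit),
				  memory_order_acq_rel);
}

static inline bitset_t atom_bitset_load(atom_bitset_t *bs)
{
	return atomic_load_explicit(bs, memory_order_acquire);
}
```

With a layout like this, the number of usable bits (and so the maximum number of threads or scheduler groups) is capped by the widest atomic word the target supports, which is the trade-off being discussed.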

>
>>>> I think the user should have control over this but if you think that
>>>> we should just select the max value that is supported by the
>>>> architecture in question and thus skip one build configuration, I am
>>>> open to this. We will still need separate versions for 32/64/128 bits
>>>> because there are slight differences in the syntax and implementation.
>>>> Such are the vagaries of the C standard (and GCC extensions).
>>>>
>>>>
>>>>> Any real reason for the following defines? Why do you need them?
>>>> The functions were added as they were needed, e.g. in
>>>> odp_schedule_scalable.c.
>>>> I don't think any of them is unused now, but I can double-check
>>>> that.
>>>
>>> Well, maybe I should rephrase my question: why do you think that it's
>>> better to have bitset_andn(a, b), rather than just a & ~b?
>> The atomic bitset is an abstract data type. The implementation does not
>> have to use a scalar word. Alternative implementation paths exist, e.g.
>> use a struct with multiple words and perform the requested operation one
>> word at a time (this is OK but perhaps not well documented).
>
>This makes sense, esp. if we add non-plain-integer bitsets.
One note on using a struct with multiple words: in some cases this will (or
might) require multiple atomic operations (one per word), and this will be
slower.
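
To make the cost concrete, here is a rough sketch of the struct-of-words alternative. Everything in it is hypothetical (wide_bitset_t, WIDE_BITSET_BITS and the function names do not come from the patch), and it assumes 64-bit atomics are available. Setting a single bit still takes one atomic operation, but an operation over the whole set needs one atomic operation per word, and the set as a whole is not updated atomically.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define WIDE_BITSET_BITS  256 /* hypothetical size for this sketch */
#define WIDE_BITSET_WORDS (WIDE_BITSET_BITS / 64)

/* Bitset as a struct of atomic words instead of one scalar. */
typedef struct {
	_Atomic(uint64_t) w[WIDE_BITSET_WORDS];
} wide_bitset_t;

/* A single-bit update touches one word, so one atomic operation. */
static inline void wide_bitset_set(wide_bitset_t *bs, uint32_t bit)
{
	assert(bit < WIDE_BITSET_BITS);
	atomic_fetch_or_explicit(&bs->w[bit / 64],
				 UINT64_C(1) << (bit % 64),
				 memory_order_acq_rel);
}

static inline bool wide_bitset_is_set(wide_bitset_t *bs, uint32_t bit)
{
	assert(bit < WIDE_BITSET_BITS);
	uint64_t word = atomic_load_explicit(&bs->w[bit / 64],
					     memory_order_acquire);
	return (word & (UINT64_C(1) << (bit % 64))) != 0;
}

/* Clearing a whole mask needs one atomic operation per word. Each word
 * is updated atomically, but a concurrent reader can observe a state
 * where only some of the words have been updated. */
static inline void wide_bitset_andn(wide_bitset_t *bs,
				    const uint64_t mask[WIDE_BITSET_WORDS])
{
	for (int i = 0; i < WIDE_BITSET_WORDS; i++)
		atomic_fetch_and_explicit(&bs->w[i], ~mask[i],
					  memory_order_acq_rel);
}
```

Whether the extra atomic operations matter depends on how often whole-set operations, as opposed to single-bit updates, sit on the hot path.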

>
>
>-- 
>With best wishes
>Dmitry
