On 05/04/2017, 12:36, "Dmitry Eremin-Solenikov" <[email protected]> wrote:
>On 05.04.2017 02:31, Ola Liljedahl wrote:
>> On 05/04/2017, 01:25, "Dmitry Eremin-Solenikov"
>> <[email protected]> wrote:
>>> On 04.04.2017 23:52, Ola Liljedahl wrote:
>>>> Sending from my ARM email account, I hope Outlook does not mess up the
>>>> format.
>>>>
>>>> On 04/04/2017, 22:21, "Dmitry Eremin-Solenikov"
>>>> <[email protected]> wrote:
>>>>
>>>>> On 04.04.2017 21:48, Brian Brooks wrote:
>>>>>> Signed-off-by: Ola Liljedahl <[email protected]>
>>>>>> Reviewed-by: Brian Brooks <[email protected]>
>>>>>> Reviewed-by: Honnappa Nagarahalli <[email protected]>
>>>>>
>>>>>> +/******************************************************************************
>>>>>> + * bitset abstract data type
>>>>>> + *****************************************************************************/
>>>>>> +/* This could be a struct of scalars to support larger bit sets */
>>>>>> +
>>>>>> +#if ATOM_BITSET_SIZE <= 32
>>>>>
>>>>> Maybe I missed, where did you set this macro?
>>>> In odp_config_internal.h
>>>> It is a build time configuration.
>>>>
>>>>> Also, why do you need several versions of bitset? Can you stick to one
>>>>> size that fits all?
>>>> Some 32-bit archs (ARMv7a, x86) will only support 64-bit atomics
>>>> (AFAIK).
>>>> Only x86-64 and ARMv8a supports 128-bit atomics (and compiler support
>>>> for 128-bit atomics for ARMv8a is a bit lacking...).
>>>> Other architectures might only support 32-bit atomic operations.
>>>
>>> What will be the major outcome of settling on the 64-bit atomics?
>> The size of the bitset determines the maximum number of threads, the
>> maximum number of scheduler groups and the maximum number of reorder
>> contexts (per thread).
>
>Then even 128 can become too small in the forthcoming future. As far as
>I understand, most of the interesting things happen around
>bitsetting/clearing.
>Maybe we can redefine bitset as a struct or array
>of atomics? Then it would be expandable without significant software
>issues, wouldn't it?
>
>I'm trying to get away of situation where we have overcomplicated low
>level code, which brings different issues on further platforms (like
>supporting this amount of threads on ARM and that amount of threads on
>x86/PPC/MIPS/etc).
I think the current implementation is simple and efficient. I also think it
is sufficiently capable: it supports up to 128 threads/scheduler groups etc.
on 64-bit ARM and x86, and up to 64 on 32-bit ARM/x86 and 64-bit MIPS. I don't
think we should make a more complicated generic implementation until the need
has surfaced. It is easy to over-speculate about what will be required in the
future and implement stuff that is never used.

>
>>>> I think the user should have control over this but if you think that
>>>> we should just select the max value that is supported by the
>>>> architecture in question and thus skip one build configuration, I am
>>>> open to this. We will still need separate versions for 32/64/128 bits
>>>> because there are slight differences in the syntax and implementation.
>>>> Such are the vagaries of the C standard (and GCC extensions).
>>>>
>>>>> Any real reason for the following defines? Why do you need them?
>>>> The functions were added as they were needed, e.g. in
>>>> odp_schedule_scalable.c.
>>>> I dont think there is anyone which is not used anymore but can
>>>> double-check that.
>>>
>>> Well. I maybe should rephrase my question: why do you think that it's
>>> better to have bitset_andn(a, b), rather than just a &~b ?
>> The atomic bitset is an abstract data type. The implementation does not
>> have to use a scalar word. Alternative implementation paths exist, e.g.
>> use a struct with multiple words and perform the requested operation one
>> word at a time (this is OK but perhaps not well documented).
>
>This makes sense, esp.
>if we add non-plain-integer bitsets.
One caveat with a struct of multiple words: an operation may in some cases
require multiple atomic operations (one per word), and that will be slower.
>
>
>--
>With best wishes
>Dmitry
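
For reference, the "struct with multiple words" alternative discussed above
could look roughly like the sketch below. This is only an illustration and is
not code from the patch: it assumes C11 <stdatomic.h>, and every name in it
(mw_bitset_t, MW_BITSET_BITS, mw_bitset_set and so on) is made up for this
example. It shows that single-bit set/clear still needs only one atomic
operation, while a whole-set operation such as andn needs one atomic
operation per word.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* All names below are hypothetical; they are not taken from the patch. */
#define MW_BITSET_BITS  256u
#define MW_WORD_BITS    64u
#define MW_BITSET_WORDS (MW_BITSET_BITS / MW_WORD_BITS)

typedef struct {
	_Atomic uint64_t w[MW_BITSET_WORDS];
} mw_bitset_t;

/* Initialise all words to zero (not itself an atomic operation). */
static inline void mw_bitset_init(mw_bitset_t *bs)
{
	for (unsigned i = 0; i < MW_BITSET_WORDS; i++)
		atomic_init(&bs->w[i], 0);
}

/* Set one bit with a single atomic RMW on the word that holds it.
 * Returns the previous value of the bit. */
static inline bool mw_bitset_set(mw_bitset_t *bs, unsigned bit)
{
	assert(bit < MW_BITSET_BITS);
	uint64_t mask = UINT64_C(1) << (bit % MW_WORD_BITS);
	uint64_t old = atomic_fetch_or_explicit(&bs->w[bit / MW_WORD_BITS],
						mask, memory_order_acq_rel);
	return (old & mask) != 0;
}

/* Clear one bit; returns the previous value of the bit. */
static inline bool mw_bitset_clr(mw_bitset_t *bs, unsigned bit)
{
	assert(bit < MW_BITSET_BITS);
	uint64_t mask = UINT64_C(1) << (bit % MW_WORD_BITS);
	uint64_t old = atomic_fetch_and_explicit(&bs->w[bit / MW_WORD_BITS],
						 ~mask, memory_order_acq_rel);
	return (old & mask) != 0;
}

/* Test one bit with a single atomic load. */
static inline bool mw_bitset_is_set(mw_bitset_t *bs, unsigned bit)
{
	assert(bit < MW_BITSET_BITS);
	uint64_t mask = UINT64_C(1) << (bit % MW_WORD_BITS);
	uint64_t cur = atomic_load_explicit(&bs->w[bit / MW_WORD_BITS],
					    memory_order_acquire);
	return (cur & mask) != 0;
}

/* bs &= ~mask, performed one word at a time. Each word is updated
 * atomically, but this costs MW_BITSET_WORDS atomic operations and the
 * set as a whole is not updated in one atomic step. */
static inline void mw_bitset_andn(mw_bitset_t *bs,
				  const uint64_t mask[MW_BITSET_WORDS])
{
	for (unsigned i = 0; i < MW_BITSET_WORDS; i++)
		atomic_fetch_and_explicit(&bs->w[i], ~mask[i],
					  memory_order_acq_rel);
}
```

Hiding the representation behind functions like these is what lets a caller
write bitset_andn(a, b) without caring whether the set is one scalar word or
several; the memory orderings chosen here are an assumption and would need to
match whatever the real code requires.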
