On 05.04.2017 15:16, Ola Liljedahl wrote:
> On 05/04/2017, 12:36, "Dmitry Eremin-Solenikov"
> <dmitry.ereminsoleni...@linaro.org> wrote:
>
>> On 05.04.2017 02:31, Ola Liljedahl wrote:
>>> On 05/04/2017, 01:25, "Dmitry Eremin-Solenikov"
>>> <dmitry.ereminsoleni...@linaro.org> wrote:
>>>> On 04.04.2017 23:52, Ola Liljedahl wrote:
>>>>> Sending from my ARM email account, I hope Outlook does not mess up the
>>>>> format.
>>>>>
>>>>> On 04/04/2017, 22:21, "Dmitry Eremin-Solenikov"
>>>>> <dmitry.ereminsoleni...@linaro.org> wrote:
>>>>>
>>>>>> On 04.04.2017 21:48, Brian Brooks wrote:
>>>>>>> Signed-off-by: Ola Liljedahl <ola.liljed...@arm.com>
>>>>>>> Reviewed-by: Brian Brooks <brian.bro...@arm.com>
>>>>>>> Reviewed-by: Honnappa Nagarahalli <honnappa.nagaraha...@arm.com>
>>>>>>
>>>>>>> +/******************************************************************************
>>>>>>> + * bitset abstract data type
>>>>>>> + *****************************************************************************/
>>>>>>> +/* This could be a struct of scalars to support larger bit sets */
>>>>>>> +
>>>>>>> +#if ATOM_BITSET_SIZE <= 32
>>>>>>
>>>>>> Maybe I missed, where did you set this macro?
>>>>> In odp_config_internal.h
>>>>> It is a build time configuration.
>>>>>
>>>>>> Also, why do you need several versions of bitset? Can you stick to
>>>>>> one size that fits all?
>>>>> Some 32-bit archs (ARMv7a, x86) will only support 64-bit atomics
>>>>> (AFAIK).
>>>>> Only x86-64 and ARMv8a supports 128-bit atomics (and compiler support
>>>>> for 128-bit atomics for ARMv8a is a bit lacking...).
>>>>> Other architectures might only support 32-bit atomic operations.
>>>>
>>>> What will be the major outcome of settling on the 64-bit atomics?
>>> The size of the bitset determines the maximum number of threads, the
>>> maximum number of scheduler groups and the maximum number of reorder
>>> contexts (per thread).
>>
>> Then even 128 can become too small in the forthcoming future. As far as
>> I understand, most of the interesting things happen around
>> bitsetting/clearing. Maybe we can redefine bitset as a struct or array
>> of atomics? Then it would be expandable without significant software
>> issues, wouldn't it?
>>
>> I'm trying to get away of situation where we have overcomplicated low
>> level code, which brings different issues on further platforms (like
>> supporting this amount of threads on ARM and that amount of threads on
>> x86/PPC/MIPS/etc).
> I think the current implementation is simple and efficient. I also think it
> is sufficiently capable, e.g. supports up to 128 threads/scheduler groups
> etc.

With 96 cores on existing boards, 128 already looks like a tight limit.

> on 64-bit ARM and x86, up to 64 on 32-bit ARM/x86 and 64-bit MIPS. I don't
> think we should make a more complicated generic implementation until the
> need has surfaced. It is easy to over-speculate in what will be required in
> the future and implement stuff that is never used.

It is already overcomplicated. It is a nice scientific solution, and it
might be high performance, but it is a bit too complicated for generic
code. I have the feeling that it could find a place in odp-cloud, but for
odp/linux-generic we (IMO) initially need simple code.

-- 
With best wishes
Dmitry
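
To make the "struct or array of atomics" idea from the thread concrete, here is
a minimal C11 sketch. It is only an illustration under stated assumptions: all
names (bitset_t, BITSET_BITS, bitset_set, ...) are hypothetical and do not come
from the patch, and it assumes a compiler that provides <stdatomic.h> with
lock-free 32-bit atomics.

```c
#include <assert.h>
#include <limits.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names; nothing here is taken from the patch under review. */
#define BITSET_BITS  256u                          /* capacity, chosen at build time */
#define WORD_BITS    (sizeof(uint32_t) * CHAR_BIT) /* bits per atomic word */
#define BITSET_WORDS ((BITSET_BITS + WORD_BITS - 1) / WORD_BITS)

/* Bit set stored as an array of 32-bit atomic words. */
typedef struct {
	_Atomic uint32_t word[BITSET_WORDS];
} bitset_t;

/* Clear every bit. Not thread-safe; call before sharing the set. */
static inline void bitset_init(bitset_t *bs)
{
	for (size_t i = 0; i < BITSET_WORDS; i++)
		atomic_init(&bs->word[i], 0);
}

/* Atomically set one bit. Returns true if the bit was previously clear. */
static inline bool bitset_set(bitset_t *bs, unsigned bit)
{
	assert(bit < BITSET_BITS);
	uint32_t mask = UINT32_C(1) << (bit % WORD_BITS);
	uint32_t old = atomic_fetch_or_explicit(&bs->word[bit / WORD_BITS],
						mask, memory_order_acq_rel);
	return (old & mask) == 0;
}

/* Atomically clear one bit. Returns true if the bit was previously set. */
static inline bool bitset_clr(bitset_t *bs, unsigned bit)
{
	assert(bit < BITSET_BITS);
	uint32_t mask = UINT32_C(1) << (bit % WORD_BITS);
	uint32_t old = atomic_fetch_and_explicit(&bs->word[bit / WORD_BITS],
						 ~mask, memory_order_acq_rel);
	return (old & mask) != 0;
}

/* Read one bit. */
static inline bool bitset_is_set(bitset_t *bs, unsigned bit)
{
	assert(bit < BITSET_BITS);
	uint32_t mask = UINT32_C(1) << (bit % WORD_BITS);
	uint32_t cur = atomic_load_explicit(&bs->word[bit / WORD_BITS],
					    memory_order_acquire);
	return (cur & mask) != 0;
}
```

The trade-off in this sketch is that each single-bit operation is atomic, but
reading or updating the whole set touches several words and is therefore not
one atomic operation. Code that relies on an atomic snapshot of the complete
bitset would need extra care with such a layout.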