On 05.04.2017 16:33, Ola Liljedahl wrote:
>
> On 05/04/2017, 15:22, "Dmitry Eremin-Solenikov"
> <dmitry.ereminsoleni...@linaro.org> wrote:
> 
>> On 05.04.2017 15:16, Ola Liljedahl wrote:
>>> On 05/04/2017, 12:36, "Dmitry Eremin-Solenikov"
>>> <dmitry.ereminsoleni...@linaro.org> wrote:
>>>
>>>> On 05.04.2017 02:31, Ola Liljedahl wrote:
>>>>> On 05/04/2017, 01:25, "Dmitry Eremin-Solenikov"
>>>>> <dmitry.ereminsoleni...@linaro.org> wrote:
>>>>>> On 04.04.2017 23:52, Ola Liljedahl wrote:
>>>>>>> Sending from my ARM email account, I hope Outlook does not mess up the
>>>>>>> format.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On 04/04/2017, 22:21, "Dmitry Eremin-Solenikov"
>>>>>>> <dmitry.ereminsoleni...@linaro.org> wrote:
>>>>>>>
>>>>>>>> On 04.04.2017 21:48, Brian Brooks wrote:
>>>>>>>>> Signed-off-by: Ola Liljedahl <ola.liljed...@arm.com>
>>>>>>>>> Reviewed-by: Brian Brooks <brian.bro...@arm.com>
>>>>>>>>> Reviewed-by: Honnappa Nagarahalli <honnappa.nagaraha...@arm.com>
>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> +/******************************************************************************
>>>>>>>>> + * bitset abstract data type
>>>>>>>>> + *****************************************************************************/
>>>>>>>>> +/* This could be a struct of scalars to support larger bit sets */
>>>>>>>>> +
>>>>>>>>> +#if ATOM_BITSET_SIZE <= 32
>>>>>>>>
>>>>>>>> Maybe I missed, where did you set this macro?
>>>>>>> In odp_config_internal.h
>>>>>>> It is a build time configuration.
>>>>>>>
>>>>>>>>
>>>>>>>> Also, why do you need several versions of bitset? Can you stick to one
>>>>>>>> size that fits all?
>>>>>>> Some 32-bit archs (ARMv7a, x86) will only support 64-bit atomics
>>>>>>> (AFAIK). Only x86-64 and ARMv8a support 128-bit atomics (and compiler
>>>>>>> support for 128-bit atomics for ARMv8a is a bit lacking...).
>>>>>>> Other architectures might only support 32-bit atomic operations.
>>>>>>
>>>>>> What will be the major outcome of settling on the 64-bit atomics?
>>>>> The size of the bitset determines the maximum number of threads, the
>>>>> maximum number of scheduler groups and the maximum number of reorder
>>>>> contexts (per thread).
>>>>
>>>> Then even 128 can become too small in the near future. As far as I
>>>> understand, most of the interesting things happen around setting and
>>>> clearing bits. Maybe we can redefine bitset as a struct or array of
>>>> atomics? Then it would be expandable without significant software
>>>> issues, wouldn't it?
>>>>
>>>> I'm trying to get away from a situation where we have overcomplicated
>>>> low-level code, which brings different issues on other platforms (like
>>>> supporting this number of threads on ARM and that number of threads on
>>>> x86/PPC/MIPS/etc).
>>> I think the current implementation is simple and efficient. I also think
>>> it is sufficiently capable, e.g. supports up to 128 threads/scheduler
>>> groups etc.
>>
>> With 96 cores on existing boards, 128 seems quite like a close limit.
> The limit imposed by bitset_t is the number of threads (CPUs) in one ODP
> application. It is not a platform or system limit.
> 
> How likely is it that all of those 96 cores will be executing the same ODP
> application?

That depends on the exact customer's view.

> I doubt anyone wants to have an ODP app spanning more than one socket,
> considering the inter-socket latency on current multi-socket capable SoCs.

Just two sockets. Sorry, I am starting to sound like an advertisement. I'll
try to stop that. But '128 threads' really sounds like '640k ought to be
enough for everybody'. Let's work toward a scalable, generic solution.

>>> on 64-bit ARM and x86, up to 64 on 32-bit ARM/x86 and 64-bit MIPS. I don't
>>> think we should make a more complicated generic implementation until the
>>> need has surfaced. It is easy to over-speculate about what will be required
>>> in the future and implement stuff that is never used.
>>
>> It is already overcomplicated.
> What do you think is overcomplicated? I think the code is very simple.
> Only one or two functions have more than one line of C code in them.
> 
>> It is a nice scientific solution,
> "scientific"?

Yep. I used the same word when we discussed ipfrag reassembly. You have
nice, fast and ideal solutions, but they are hard for other people to
understand and maintain. Thus I'm asking for an understandable generic C
solution, which can then be further optimized by your code.
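
For illustration only, a generic bitset built as an array of atomic words
might look roughly like the sketch below. All names (gbitset_t,
GBITSET_BITS, gbitset_set and so on) and the capacity of 256 bits are
hypothetical; none of them come from the patch under review. The sketch
assumes C11 <stdatomic.h> and 64-bit atomic words, which may not be
lock-free on every target.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names and sizes; not taken from the patch under review. */
#define GBITSET_BITS      256u  /* assumed capacity, any multiple of 64 */
#define GBITSET_WORD_BITS 64u
#define GBITSET_WORDS     (GBITSET_BITS / GBITSET_WORD_BITS)

typedef struct {
	_Atomic uint64_t w[GBITSET_WORDS];
} gbitset_t;

static inline void gbitset_init(gbitset_t *bs)
{
	for (unsigned i = 0; i < GBITSET_WORDS; i++)
		atomic_init(&bs->w[i], 0);
}

/* Atomically set one bit; returns the previous state of that bit. */
static inline bool gbitset_set(gbitset_t *bs, unsigned bit)
{
	uint64_t mask = UINT64_C(1) << (bit % GBITSET_WORD_BITS);
	uint64_t old = atomic_fetch_or_explicit(&bs->w[bit / GBITSET_WORD_BITS],
						mask, memory_order_acq_rel);
	return (old & mask) != 0;
}

/* Atomically clear one bit; returns the previous state of that bit. */
static inline bool gbitset_clr(gbitset_t *bs, unsigned bit)
{
	uint64_t mask = UINT64_C(1) << (bit % GBITSET_WORD_BITS);
	uint64_t old = atomic_fetch_and_explicit(&bs->w[bit / GBITSET_WORD_BITS],
						 ~mask, memory_order_acq_rel);
	return (old & mask) != 0;
}

static inline bool gbitset_test(gbitset_t *bs, unsigned bit)
{
	uint64_t v = atomic_load_explicit(&bs->w[bit / GBITSET_WORD_BITS],
					  memory_order_acquire);
	return ((v >> (bit % GBITSET_WORD_BITS)) & 1u) != 0;
}

/* Index of the lowest set bit, or -1 if none was seen.
 * Each word is read atomically, but the scan as a whole is not a
 * single atomic snapshot of the entire set. */
static inline int gbitset_ffs(gbitset_t *bs)
{
	for (unsigned i = 0; i < GBITSET_WORDS; i++) {
		uint64_t v = atomic_load_explicit(&bs->w[i],
						  memory_order_acquire);
		for (unsigned b = 0; v != 0; b++, v >>= 1)
			if (v & 1u)
				return (int)(i * GBITSET_WORD_BITS + b);
	}
	return -1;
}
```

Single-bit set and clear stay atomic regardless of the total size, so the
capacity is just a build-time constant. The cost is that any operation
spanning several words (find-first-set, copy, compare) no longer sees one
consistent snapshot, which is presumably where a single-word,
platform-specific version would still be the faster option.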


-- 
With best wishes
Dmitry
