On 04/05/17 06:57, Honnappa Nagarahalli wrote:
> This can go into master/api-next as an independent patch. Agree?
agree. If we accept implementation where events can be 'delayed' than it
looks like we missed some api to sync queues.
But I do not see why we need this patch. On the same cpu test queue 1
event and after that dequeue 1 event:
for (i = 0; i < QUEUE_ROUNDS; i++) {
ev = odp_buffer_to_event(buf);
if (odp_queue_enq(queue, ev)) {
LOG_ERR(" [%i] Queue enqueue failed.\n", thr);
odp_buffer_free(buf);
return -1;
}
ev = odp_queue_deq(queue);
buf = odp_buffer_from_event(ev);
if (!odp_buffer_is_valid(buf)) {
LOG_ERR(" [%i] Queue empty.\n", thr);
return -1;
}
}
Where this exactly event can be delayed?
If other threads do the same - then all do enqueue 1 event first and
then dequeue one event. I can understand problem with queueing on one
cpu and dequeuing on other cpu. But on the same cpu is has to always
work. Isn't it?
Maxim.
>
> On 4 April 2017 at 21:22, Brian Brooks <[email protected]> wrote:
>> On 04/04 17:26:12, Bill Fischofer wrote:
>>> On Tue, Apr 4, 2017 at 3:37 PM, Brian Brooks <[email protected]> wrote:
>>>> On 04/04 21:59:15, Maxim Uvarov wrote:
>>>>> On 04/04/17 21:47, Brian Brooks wrote:
>>>>>> Signed-off-by: Ola Liljedahl <[email protected]>
>>>>>> Reviewed-by: Brian Brooks <[email protected]>
>>>>>> Reviewed-by: Honnappa Nagarahalli <[email protected]>
>>>>>> Reviewed-by: Kevin Wang <[email protected]>
>>>>>> ---
>>>>>> test/common_plat/performance/odp_scheduling.c | 12 ++++++++++--
>>>>>> 1 file changed, 10 insertions(+), 2 deletions(-)
>>>>>>
>>>>>> diff --git a/test/common_plat/performance/odp_scheduling.c
>>>>>> b/test/common_plat/performance/odp_scheduling.c
>>>>>> index c74a0713..38e76257 100644
>>>>>> --- a/test/common_plat/performance/odp_scheduling.c
>>>>>> +++ b/test/common_plat/performance/odp_scheduling.c
>>>>>> @@ -273,7 +273,7 @@ static int test_plain_queue(int thr, test_globals_t
>>>>>> *globals)
>>>>>> test_message_t *t_msg;
>>>>>> odp_queue_t queue;
>>>>>> uint64_t c1, c2, cycles;
>>>>>> - int i;
>>>>>> + int i, j;
>>>>>>
>>>>>> /* Alloc test message */
>>>>>> buf = odp_buffer_alloc(globals->pool);
>>>>>> @@ -307,7 +307,15 @@ static int test_plain_queue(int thr, test_globals_t
>>>>>> *globals)
>>>>>> return -1;
>>>>>> }
>>>>>>
>>>>>> - ev = odp_queue_deq(queue);
>>>>>> + /* When enqueue and dequeue are decoupled (e.g. not using a
>>>>>> + * common lock), an enqueued event may not be immediately
>>>>>> + * visible to dequeue. So we just try again for a while. */
>>>>>> + for (j = 0; j < 100; j++) {
>>>>>
>>>>> where 100 number comes from?
>>>>
>>>> It is the retry count. Perhaps it could be a bit lower, or a bit higher,
>>>> but
>>>> it works well.
>>>
>>> Actually, it's incorrect. What happens if all 100 retries fail? You'll
>>> call odp_buffer_from_event() for ODP_EVENT_INVALID, which is
>>> undefined.
>>
>> Incorrect? :) The point is that an event may not be immediately available
>> to dequeue after it has been enqueued. This is due to the way that a
>> concurrent
>> ring buffer behaves in a multi-threaded environment. The approach here is
>> just to retry the dequeue a couple times (100 times actually) before moving
>> on to the rest of code. Perhaps 100 times is too many times, but some amount
>> of retry is needed.
>>
>> If this is not desirable, then I think it would be more accurate to consider
>> odp_queue_enq() / odp_queue_deq() as async APIs -or- MT-unsafe (must be
>> called
>> from one thread at a time in order to ensure the behavior that an event is
>> immediately available for dequeue once it has been enqueued).
>>
>>>>
>>>>> Maxim.
>>>>>
>>>>>> + ev = odp_queue_deq(queue);
>>>>>> + if (ev != ODP_EVENT_INVALID)
>>>>>> + break;
>>>>>> + odp_cpu_pause();
>>>>>> + }
>>>>>>
>>>>>> buf = odp_buffer_from_event(ev);
>>>>>>
>>>>>>
>>>>>