Looks like a straightforward sequencing issue. We should set
vlib_mains[next_thread_index]->check_frame_queues = 1 after marking the frame
queue element valid / calling vlib_put_frame_queue_elt, not when switching to a
new destination queue.
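To make the ordering concrete, here is a minimal single-threaded sketch of the race. It is not the VPP code: toy_worker_t, toy_elt_t, toy_publish, toy_hint, toy_worker_poll and toy_run are hypothetical names, the queue is reduced to one slot, and the worker is assumed to clear its check flag before it scans the queue. It only replays the two orderings step by step.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical, heavily simplified stand-ins; NOT VPP's real structures. */
typedef struct {
  atomic_int valid;              /* 1 once the element is fully written */
  int payload;
} toy_elt_t;

typedef struct {
  atomic_int check_frame_queues; /* hint: "go look at your handoff queue" */
  toy_elt_t elt;                 /* one-slot queue, for brevity */
} toy_worker_t;

/* Producer side: fill the element, then mark it valid (release). */
static void toy_publish(toy_worker_t *w, int payload) {
  w->elt.payload = payload;
  atomic_store_explicit(&w->elt.valid, 1, memory_order_release);
}

/* Producer side: tell the worker its queue needs checking. */
static void toy_hint(toy_worker_t *w) {
  atomic_store(&w->check_frame_queues, 1);
}

/* Worker side (assumed behaviour): scan only when hinted, and clear the
   hint before scanning. Returns the number of elements consumed. */
static int toy_worker_poll(toy_worker_t *w) {
  if (!atomic_exchange(&w->check_frame_queues, 0))
    return 0;                    /* no hint, queue is not examined */
  if (!atomic_load_explicit(&w->elt.valid, memory_order_acquire))
    return 0;                    /* hinted, but nothing is valid yet */
  atomic_store(&w->elt.valid, 0);
  return 1;
}

/* Replays one fixed interleaving and returns how many elements the
   worker consumed in total. */
static int toy_run(bool hint_after_publish) {
  toy_worker_t w;
  int handled = 0;

  atomic_init(&w.check_frame_queues, 0);
  atomic_init(&w.elt.valid, 0);
  w.elt.payload = 0;

  if (hint_after_publish) {
    handled += toy_worker_poll(&w); /* worker runs early, no hint yet */
    toy_publish(&w, 42);            /* element becomes valid ... */
    toy_hint(&w);                   /* ... and only then is the hint set */
  } else {
    toy_hint(&w);                   /* hint set when switching queues */
    handled += toy_worker_poll(&w); /* worker clears hint, finds nothing */
    toy_publish(&w, 42);            /* element becomes valid, hint is gone */
  }

  handled += toy_worker_poll(&w);   /* a later pass of the worker loop */
  return handled;
}
```

In this model toy_run(false) returns 0: the element is valid but the worker never looks at it again, which matches the reported symptom. toy_run(true) returns 1, because the hint is always set after the element is visible.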
I'll push a patch for you to test.
HTH... Dave
From: vpp-dev@lists.fd.io On Behalf Of Ni, Hongjun
Sent: Friday, November 8, 2019 1:44 AM
To: vpp-dev@lists.fd.io; Dave Barach (dbarach)
Cc: Noonan, Gordon; Liu, Yu Y
Subject: [vpp-dev] (VPP-1734) Worker Thread stops reading from its Handoff Queue
Hi folks,
Could someone help take a look at this issue:
https://jira.fd.io/browse/VPP-1734
The issue is a side effect of changes introduced by
https://github.com/FDio/vpp/commit/80965f599aa90288c8c139e7e3a31726b89eb9a4#diff-f660570fc2dc57455f2c52b20880bfd8
in VPP 19.04.
The issue is difficult and unpredictable to reproduce, but when worker handoff
is enabled on VPP and sufficient load is applied,
then after some time (minutes to hours) one or more workers cease handling
handoff traffic and never recover.
Thanks,
Hongjun