Re: [ovs-discuss] ovs-vswitchd core at revalidator_sweep__

2024-03-13 Thread LIU Yulong via discuss
Hi Eelco,

Thank you.

Patch sent to the mailing list:
https://mail.openvswitch.org/pipermail/ovs-dev/2024-March/412474.html



On Wed, Mar 13, 2024 at 5:34 PM Eelco Chaudron  wrote:
>
>
>
> On 13 Mar 2024, at 10:19, LIU Yulong wrote:
>
> > Hi guys,
> >
> > Sent a pull request with the try_lock movement fix, based on the earlier
> > tests:
> > https://github.com/openvswitch/ovs/pull/421
> >
> > Does that make sense to you?
>
> I’m a bit behind on email, so I did not look at your messages yet. But for OVS
> we use an email-based workflow; see here:
> https://docs.openvswitch.org/en/latest/internals/contributing/submitting-patches/.
> If you use it, more people will see your patch and can review it.
>
> Cheers,
>
> Eelco
>
> >
> > On Tue, Mar 12, 2024 at 3:11 PM LIU Yulong  wrote:
> >>
> >> Updates:
> >>
> >> Ukey attributes we already have:
> >>
> >> long long int created OVS_GUARDED;/* Estimate of creation 
> >> time. */
> >> unsigned int state_thread OVS_GUARDED;/* Thread that transitions. 
> >> */
> >>
> >> Added more attributes [1] to the ukey:
> >>
> >> const char *state_before OVS_GUARDED;  /* locator state before
> >> (last) transition. */
> >> long long int modified; /* Time of last transition. */
> >> unsigned create_tid;/* Ukey created thread id. */
> >>
> >> [1] 
> >> https://github.com/gotostack/ovs/commit/8ddc4f512783e6b883b102b821e0f05916a9c255
> >>
> >> After that, a core file shows:
> >>
> >> 1) The pmd ctx-> now:
> >> p  ((struct dp_netdev_pmd_thread *) 0x7f804b733010)->ctx
> >> $10 = {now = 12529082556818, last_rxq = 0x55f009029720, emc_insert_min
> >> = 42949672, smc_enable_db = false}
> >>
> >> 2)ukey in the core code call stack
> >> p * (struct udpif_key *) 0x7f803c360710
> >> $11 = { created = 12529082056, modified = 12529082553, create_tid = 9}
> >>
> >> 3) Circular buffer same address for free action
> >> ukey_addr = 0x7f803c360710, timestamp = 12529082556703
> >>
> >> PMD ctx->now 12529082556818 is near the ukey free time 12529082556703;
> >> the difference is about 115 us.
> >>
> >> Added more timestamps [2] to every ukey state to record the state
> >> transitions:
> >> long long int ukey_create_time;/* Time of ukey creation. */
> >> long long int ukey_visible_time; /* Time of ukey visible. */
> >> long long int ukey_operational_time; /* Time of ukey operational. */
> >> long long int ukey_evicting_time;/* Time of ukey evicting. */
> >> long long int ukey_evicted_time; /* Time of ukey evicted. */
> >> long long int ukey_deleted_time; /* Time of ukey deleted. */
> >> long long int ukey_destroy_time; /* Time of ukey destroy. */
> >> long long int ukey_replace_time; /* Time of ukey replace. */
> >>
> >> [2] 
> >> https://github.com/gotostack/ovs/commit/38a2b73af4442aa741930b3e4cff32ab7b559050
> >>
> >> And a core file shows:
> >>
> >>   ukey_create_time = 13217283578366,
> >>   ukey_visible_time = 13217283578366,
> >>   ukey_operational_time = 13217283583044,
> >>   ukey_evicting_time = 13217289145192,
> >>   ukey_evicted_time = 13217289145245,
> >>   ukey_deleted_time = 13217289154654,
> >>   ukey_destroy_time = 13217289156490,  This is set just before the
> >> ovs_mutex_destroy(&ukey->mutex);
> >>   ukey_replace_time = 13217289154654
> >>
> >> pmd->ctx:
> >> $4 = {
> >>   now = 13217289156482,
> >>   last_rxq = 0x55b34db74f50,
> >>   emc_insert_min = 42949672,
> >>   smc_enable_db = false
> >> }
> >>
> >> ukey_replace_time and ukey_deleted_time are the same.
> >>
> >> ukey_destroy_time  - pmd-ctx.now = 8 (13217289156490 - 13217289156482)
> >>
> >> Also added a sweep_now just before the line where the core mostly happens:
> >> https://github.com/gotostack/ovs/commit/38a2b73af4442aa741930b3e4cff32ab7b559050#diff-be6e2339300cb2a7efa8eca531a668a94ce9f06dd717ba73bb1b508fee27e887R3030
> >> sweep_now = time_usec();
> >> if (ovs_mutex_trylock(&ukey->mutex)) {
> >> continue;
> >> }
> >>
> >> ukey_destroy_time  - sweep_now = -78 (13217289156490 - 13217289156568)
> >>
> >> This means that ukey_destroy happens a bit earlier than the revalidator_sweep__ try_lock.
> >>
> >>
> >>
> >> Based on this information, I assume the umap and ukey iteration has a
> >> race condition between the PMD thread, the RCU thread and the
> >> revalidator thread. Based on the core/abort point in the call stack,
> >> I moved the umap lock to outside of the CMAP_FOR_EACH loop [3].
> >> [3] 
> >> https://github.com/gotostack/ovs/commit/2919a242be7d0ee079c278a8488188694f20f827
> >>
> >> No more cores have been seen in the revalidator_sweep__ procedure for 4 days
> >> now.
> >>
> >> But if I revert this lock movement, the core shows up again within a few hours.
> >>
> >> So, please take a look at this lock movement patch and see if it makes sense to
> >> you.
> >>
> >>
> >> Regards,
> >>
> >> LIU Yulong
> >>
> >>
> >> On Fri, Mar 1, 2024 at 6:06 PM LIU Yulong  wrote:
> >>>
> >>> Hi,
> >>>
> >>> Add some updates:
> 

Re: [ovs-discuss] ovs-vswitchd core at revalidator_sweep__

2024-03-13 Thread Eelco Chaudron via discuss


On 13 Mar 2024, at 10:19, LIU Yulong wrote:

> Hi guys,
>
> Sent a pull request with the try_lock movement fix, based on the earlier tests:
> https://github.com/openvswitch/ovs/pull/421
>
> Does that make sense to you?

I’m a bit behind on email, so I did not look at your messages yet. But for OVS
we use an email-based workflow; see here:
https://docs.openvswitch.org/en/latest/internals/contributing/submitting-patches/.
If you use it, more people will see your patch and can review it.

Cheers,

Eelco

>
> On Tue, Mar 12, 2024 at 3:11 PM LIU Yulong  wrote:
>>
>> Updates:
>>
>> Ukey attributes we already have:
>>
>> long long int created OVS_GUARDED;/* Estimate of creation time. 
>> */
>> unsigned int state_thread OVS_GUARDED;/* Thread that transitions. */
>>
>> Added more attributes [1] to the ukey:
>>
>> const char *state_before OVS_GUARDED;  /* locator state before
>> (last) transition. */
>> long long int modified; /* Time of last transition. */
>> unsigned create_tid;/* Ukey created thread id. */
>>
>> [1] 
>> https://github.com/gotostack/ovs/commit/8ddc4f512783e6b883b102b821e0f05916a9c255
>>
>> After that, a core file shows:
>>
>> 1) The pmd ctx-> now:
>> p  ((struct dp_netdev_pmd_thread *) 0x7f804b733010)->ctx
>> $10 = {now = 12529082556818, last_rxq = 0x55f009029720, emc_insert_min
>> = 42949672, smc_enable_db = false}
>>
>> 2)ukey in the core code call stack
>> p * (struct udpif_key *) 0x7f803c360710
>> $11 = { created = 12529082056, modified = 12529082553, create_tid = 9}
>>
>> 3) Circular buffer same address for free action
>> ukey_addr = 0x7f803c360710, timestamp = 12529082556703
>>
>> PMD ctx->now 12529082556818 is near the ukey free time 12529082556703;
>> the difference is about 115 us.
>>
>> Added more timestamps [2] to every ukey state to record the state
>> transitions:
>> long long int ukey_create_time;/* Time of ukey creation. */
>> long long int ukey_visible_time; /* Time of ukey visible. */
>> long long int ukey_operational_time; /* Time of ukey operational. */
>> long long int ukey_evicting_time;/* Time of ukey evicting. */
>> long long int ukey_evicted_time; /* Time of ukey evicted. */
>> long long int ukey_deleted_time; /* Time of ukey deleted. */
>> long long int ukey_destroy_time; /* Time of ukey destroy. */
>> long long int ukey_replace_time; /* Time of ukey replace. */
>>
>> [2] 
>> https://github.com/gotostack/ovs/commit/38a2b73af4442aa741930b3e4cff32ab7b559050
>>
>> And a core file shows:
>>
>>   ukey_create_time = 13217283578366,
>>   ukey_visible_time = 13217283578366,
>>   ukey_operational_time = 13217283583044,
>>   ukey_evicting_time = 13217289145192,
>>   ukey_evicted_time = 13217289145245,
>>   ukey_deleted_time = 13217289154654,
>>   ukey_destroy_time = 13217289156490,  This is set just before the
>> ovs_mutex_destroy(&ukey->mutex);
>>   ukey_replace_time = 13217289154654
>>
>> pmd->ctx:
>> $4 = {
>>   now = 13217289156482,
>>   last_rxq = 0x55b34db74f50,
>>   emc_insert_min = 42949672,
>>   smc_enable_db = false
>> }
>>
>> ukey_replace_time and ukey_deleted_time are the same.
>>
>> ukey_destroy_time  - pmd-ctx.now = 8 (13217289156490 - 13217289156482)
>>
>> Also added a sweep_now just before the line where the core mostly happens:
>> https://github.com/gotostack/ovs/commit/38a2b73af4442aa741930b3e4cff32ab7b559050#diff-be6e2339300cb2a7efa8eca531a668a94ce9f06dd717ba73bb1b508fee27e887R3030
>> sweep_now = time_usec();
>> if (ovs_mutex_trylock(&ukey->mutex)) {
>> continue;
>> }
>>
>> ukey_destroy_time  - sweep_now = -78 (13217289156490 - 13217289156568)
>>
>> This means that ukey_destroy happens a bit earlier than the revalidator_sweep__ try_lock.
>>
>>
>>
>> Based on this information, I assume the umap and ukey iteration has a
>> race condition between the PMD thread, the RCU thread and the
>> revalidator thread. Based on the core/abort point in the call stack,
>> I moved the umap lock to outside of the CMAP_FOR_EACH loop [3].
>> [3] 
>> https://github.com/gotostack/ovs/commit/2919a242be7d0ee079c278a8488188694f20f827
>>
>> No more cores have been seen in the revalidator_sweep__ procedure for 4 days
>> now.
>>
>> But if I revert this lock movement, the core shows up again within a few hours.
>>
>> So, please take a look at this lock movement patch and see if it makes sense to you.
>>
>>
>> Regards,
>>
>> LIU Yulong
>>
>>
>> On Fri, Mar 1, 2024 at 6:06 PM LIU Yulong  wrote:
>>>
>>> Hi,
>>>
>>> Add some updates:
>>>
>>> 1.
>>> We added a debug attribute `state_before ` to the ukey to record more
>>> life cycle details of  a ukey:
>>> state_where = 0x55576027b868 "ofproto/ofproto-dpif-upcall.c:",
>>> [1], it is UKEY_DELETED.
>>> state_before = 0x55576027b630 "ofproto/ofproto-dpif-upcall.c:",
>>> [2], it was UKEY_EVICTED.
>>>
>>> [1] 
>>> https://github.com/openvswitch/ovs/blob/v2.17.8/ofproto/ofproto-dpif-upcall.c#L1897
>>> [2] 
>>> 

Re: [ovs-discuss] ovs-vswitchd core at revalidator_sweep__

2024-03-13 Thread LIU Yulong via discuss
Hi guys,

Sent a pull request with the try_lock movement fix, based on the earlier tests:
https://github.com/openvswitch/ovs/pull/421

Does that make sense to you?

Thank you.

LIU Yulong

On Tue, Mar 12, 2024 at 3:11 PM LIU Yulong  wrote:
>
> Updates:
>
> Ukey attributes we already have:
>
> long long int created OVS_GUARDED;/* Estimate of creation time. */
> unsigned int state_thread OVS_GUARDED;/* Thread that transitions. */
>
> Added more attributes [1] to the ukey:
>
> const char *state_before OVS_GUARDED;  /* locator state before
> (last) transition. */
> long long int modified; /* Time of last transition. */
> unsigned create_tid;/* Ukey created thread id. */
>
> [1] 
> https://github.com/gotostack/ovs/commit/8ddc4f512783e6b883b102b821e0f05916a9c255
>
> After that, a core file shows:
>
> 1) The pmd ctx-> now:
> p  ((struct dp_netdev_pmd_thread *) 0x7f804b733010)->ctx
> $10 = {now = 12529082556818, last_rxq = 0x55f009029720, emc_insert_min
> = 42949672, smc_enable_db = false}
>
> 2)ukey in the core code call stack
> p * (struct udpif_key *) 0x7f803c360710
> $11 = { created = 12529082056, modified = 12529082553, create_tid = 9}
>
> 3) Circular buffer same address for free action
> ukey_addr = 0x7f803c360710, timestamp = 12529082556703
>
> PMD ctx->now 12529082556818 is near the ukey free time 12529082556703;
> the difference is about 115 us.
>
> Added more timestamps [2] to every ukey state to record the state
> transitions:
> long long int ukey_create_time;/* Time of ukey creation. */
> long long int ukey_visible_time; /* Time of ukey visible. */
> long long int ukey_operational_time; /* Time of ukey operational. */
> long long int ukey_evicting_time;/* Time of ukey evicting. */
> long long int ukey_evicted_time; /* Time of ukey evicted. */
> long long int ukey_deleted_time; /* Time of ukey deleted. */
> long long int ukey_destroy_time; /* Time of ukey destroy. */
> long long int ukey_replace_time; /* Time of ukey replace. */
>
> [2] 
> https://github.com/gotostack/ovs/commit/38a2b73af4442aa741930b3e4cff32ab7b559050
>
> And a core file shows:
>
>   ukey_create_time = 13217283578366,
>   ukey_visible_time = 13217283578366,
>   ukey_operational_time = 13217283583044,
>   ukey_evicting_time = 13217289145192,
>   ukey_evicted_time = 13217289145245,
>   ukey_deleted_time = 13217289154654,
>   ukey_destroy_time = 13217289156490,  This is set just before the
> ovs_mutex_destroy(&ukey->mutex);
>   ukey_replace_time = 13217289154654
>
> pmd->ctx:
> $4 = {
>   now = 13217289156482,
>   last_rxq = 0x55b34db74f50,
>   emc_insert_min = 42949672,
>   smc_enable_db = false
> }
>
> ukey_replace_time and ukey_deleted_time are the same.
>
> ukey_destroy_time  - pmd-ctx.now = 8 (13217289156490 - 13217289156482)
>
> Also added a sweep_now just before the line where the core mostly happens:
> https://github.com/gotostack/ovs/commit/38a2b73af4442aa741930b3e4cff32ab7b559050#diff-be6e2339300cb2a7efa8eca531a668a94ce9f06dd717ba73bb1b508fee27e887R3030
> sweep_now = time_usec();
> if (ovs_mutex_trylock(&ukey->mutex)) {
> continue;
> }
>
> ukey_destroy_time  - sweep_now = -78 (13217289156490 - 13217289156568)
>
> This means that ukey_destroy happens a bit earlier than the revalidator_sweep__ try_lock.
>
>
>
> Based on this information, I assume the umap and ukey iteration has a
> race condition between the PMD thread, the RCU thread and the
> revalidator thread. Based on the core/abort point in the call stack,
> I moved the umap lock to outside of the CMAP_FOR_EACH loop [3].
> [3] 
> https://github.com/gotostack/ovs/commit/2919a242be7d0ee079c278a8488188694f20f827
>
> No more cores have been seen in the revalidator_sweep__ procedure for 4 days
> now.
>
> But if I revert this lock movement, the core shows up again within a few hours.
>
> So, please take a look at this lock movement patch and see if it makes sense to you.
>
>
> Regards,
>
> LIU Yulong
>
>
> On Fri, Mar 1, 2024 at 6:06 PM LIU Yulong  wrote:
> >
> > Hi,
> >
> > Add some updates:
> >
> > 1.
> > We added a debug attribute `state_before ` to the ukey to record more
> > life cycle details of  a ukey:
> > state_where = 0x55576027b868 "ofproto/ofproto-dpif-upcall.c:",
> > [1], it is UKEY_DELETED.
> > state_before = 0x55576027b630 "ofproto/ofproto-dpif-upcall.c:",
> > [2], it was UKEY_EVICTED.
> >
> > [1] 
> > https://github.com/openvswitch/ovs/blob/v2.17.8/ofproto/ofproto-dpif-upcall.c#L1897
> > [2] 
> > https://github.com/openvswitch/ovs/blob/v2.17.8/ofproto/ofproto-dpif-upcall.c#L2470
> >
> > Still, all of the ukeys did the replace action.
> >
> > 2. The ukey circular buffer [1] does not work well: the buffer still
> > has {0} entries after a long run, and the number of records is far less than
> > `counter_upcall_ukey_free`.
> > [1] 
> > https://github.com/gotostack/ovs/commit/939d88c3c5fcdb446b01f2afa8f1e80c3929db46
> > And, can not add an `allocate` entry to this 

Re: [ovs-discuss] ovs-vswitchd core at revalidator_sweep__

2024-03-12 Thread LIU Yulong via discuss
Updates:

Ukey attributes we already have:

long long int created OVS_GUARDED;/* Estimate of creation time. */
unsigned int state_thread OVS_GUARDED;/* Thread that transitions. */

Added more attributes [1] to the ukey:

const char *state_before OVS_GUARDED;  /* locator state before
(last) transition. */
long long int modified; /* Time of last transition. */
unsigned create_tid;/* Ukey created thread id. */

[1] 
https://github.com/gotostack/ovs/commit/8ddc4f512783e6b883b102b821e0f05916a9c255

After that, a core file shows:

1) The pmd ctx-> now:
p  ((struct dp_netdev_pmd_thread *) 0x7f804b733010)->ctx
$10 = {now = 12529082556818, last_rxq = 0x55f009029720, emc_insert_min
= 42949672, smc_enable_db = false}

2)ukey in the core code call stack
p * (struct udpif_key *) 0x7f803c360710
$11 = { created = 12529082056, modified = 12529082553, create_tid = 9}

3) Circular buffer same address for free action
ukey_addr = 0x7f803c360710, timestamp = 12529082556703

PMD ctx->now 12529082556818 is near the ukey free time 12529082556703;
the difference is about 115 us.

Added more timestamps [2] to every ukey state to record the state
transitions:
long long int ukey_create_time;/* Time of ukey creation. */
long long int ukey_visible_time; /* Time of ukey visible. */
long long int ukey_operational_time; /* Time of ukey operational. */
long long int ukey_evicting_time;/* Time of ukey evicting. */
long long int ukey_evicted_time; /* Time of ukey evicted. */
long long int ukey_deleted_time; /* Time of ukey deleted. */
long long int ukey_destroy_time; /* Time of ukey destroy. */
long long int ukey_replace_time; /* Time of ukey replace. */

[2] 
https://github.com/gotostack/ovs/commit/38a2b73af4442aa741930b3e4cff32ab7b559050
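For reference, a minimal sketch of how such per-state timestamps can be stamped
from transition_ukey_at() (simplified; the field names are the ones listed above,
the helper name is illustrative of commit [2], and ukey_replace_time /
ukey_destroy_time are set separately in try_ukey_replace() and ukey_delete__()):

/* Called from transition_ukey_at() after ukey->state = dst; */
static void
ukey_stamp_state(struct udpif_key *ukey, enum ukey_state dst)
{
    long long int now = time_usec();

    switch (dst) {
    case UKEY_CREATED:     ukey->ukey_create_time = now;      break;
    case UKEY_VISIBLE:     ukey->ukey_visible_time = now;     break;
    case UKEY_OPERATIONAL: ukey->ukey_operational_time = now; break;
    case UKEY_EVICTING:    ukey->ukey_evicting_time = now;    break;
    case UKEY_EVICTED:     ukey->ukey_evicted_time = now;     break;
    case UKEY_DELETED:     ukey->ukey_deleted_time = now;     break;
    default: break;
    }
}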

And a core file shows:

  ukey_create_time = 13217283578366,
  ukey_visible_time = 13217283578366,
  ukey_operational_time = 13217283583044,
  ukey_evicting_time = 13217289145192,
  ukey_evicted_time = 13217289145245,
  ukey_deleted_time = 13217289154654,
  ukey_destroy_time = 13217289156490,  This is set just before the
ovs_mutex_destroy(&ukey->mutex);
  ukey_replace_time = 13217289154654

pmd->ctx:
$4 = {
  now = 13217289156482,
  last_rxq = 0x55b34db74f50,
  emc_insert_min = 42949672,
  smc_enable_db = false
}

ukey_replace_time and ukey_deleted_time are the same.

ukey_destroy_time  - pmd-ctx.now = 8 (13217289156490 - 13217289156482)

Also added a sweep_now just before the line where the core mostly happens:
https://github.com/gotostack/ovs/commit/38a2b73af4442aa741930b3e4cff32ab7b559050#diff-be6e2339300cb2a7efa8eca531a668a94ce9f06dd717ba73bb1b508fee27e887R3030
sweep_now = time_usec();
if (ovs_mutex_trylock(&ukey->mutex)) {
continue;
}

ukey_destroy_time  - sweep_now = -78 (13217289156490 - 13217289156568)

This means that ukey_destroy happens a bit earlier than the revalidator_sweep__ try_lock.



Based on this information, I assume the umap and ukey iteration has a
race condition between the PMD thread, the RCU thread and the
revalidator thread. Based on the core/abort point in the call stack,
I moved the umap lock to outside of the CMAP_FOR_EACH loop [3].
[3] 
https://github.com/gotostack/ovs/commit/2919a242be7d0ee079c278a8488188694f20f827
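Roughly, the change in revalidator_sweep__() looks like this (a simplified
sketch of commit [3], not the exact diff): umap->mutex is held for the whole
iteration instead of only around ukey_delete().

for (int i = slice; i < N_UMAPS; i += udpif->n_revalidators) {
    struct umap *umap = &udpif->ukeys[i];
    struct udpif_key *ukey;

    ovs_mutex_lock(&umap->mutex);           /* Moved out of the loop body. */
    CMAP_FOR_EACH (ukey, cmap_node, &umap->cmap) {
        if (ovs_mutex_trylock(&ukey->mutex)) {
            continue;
        }
        /* ... existing revalidation/eviction logic, with the per-call
         * ovs_mutex_lock(&umap->mutex) around ukey_delete() removed ... */
        ovs_mutex_unlock(&ukey->mutex);
    }
    ovs_mutex_unlock(&umap->mutex);
}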

No more cores have been seen in the revalidator_sweep__ procedure for 4 days now.

But if I revert this lock movement, the core shows up again within a few hours.

So, please take a look at this lock movement patch and see if it makes sense to you.


Regards,

LIU Yulong


On Fri, Mar 1, 2024 at 6:06 PM LIU Yulong  wrote:
>
> Hi,
>
> Add some updates:
>
> 1.
> We added a debug attribute `state_before ` to the ukey to record more
> life cycle details of  a ukey:
> state_where = 0x55576027b868 "ofproto/ofproto-dpif-upcall.c:",
> [1], it is UKEY_DELETED.
> state_before = 0x55576027b630 "ofproto/ofproto-dpif-upcall.c:",
> [2], it was UKEY_EVICTED.
>
> [1] 
> https://github.com/openvswitch/ovs/blob/v2.17.8/ofproto/ofproto-dpif-upcall.c#L1897
> [2] 
> https://github.com/openvswitch/ovs/blob/v2.17.8/ofproto/ofproto-dpif-upcall.c#L2470
>
> Still, all of the ukeys did the replace action.
>
> 2. The ukey circular buffer [1] does not work well: the buffer still
> has {0} entries after a long run, and the number of records is far less than
> `counter_upcall_ukey_free`.
> [1] 
> https://github.com/gotostack/ovs/commit/939d88c3c5fcdb446b01f2afa8f1e80c3929db46
> Also, I could not add an `allocate` entry to this buffer for the ukey
> xmalloc: the circular buffer
> mutex does not seem to work well, and it cored many times at
> `ovs_mutex_unlock(&ukey_free_buffer.mutex)`.
>
> 3. Ilya's patch [2] was applied, but I have not seen the abort log so far.
> [2] 
> https://github.com/igsilya/ovs/commit/8268347a159b5afa884f5b3008897878b5b520f5
>
> 4. Dumping all ukeys from the core file, we noticed that almost all
> UKEY_EVICTED ukeys changed state at `transition_ukey_at` in the
> revalidator thread.
> (The 

Re: [ovs-discuss] ovs-vswitchd core at revalidator_sweep__

2024-03-01 Thread LIU Yulong via discuss
Hi,

Add some updates:

1.
We added a debug attribute `state_before ` to the ukey to record more
life cycle details of  a ukey:
state_where = 0x55576027b868 "ofproto/ofproto-dpif-upcall.c:",
[1], it is UKEY_DELETED.
state_before = 0x55576027b630 "ofproto/ofproto-dpif-upcall.c:",
[2], it was UKEY_EVICTED.

[1] 
https://github.com/openvswitch/ovs/blob/v2.17.8/ofproto/ofproto-dpif-upcall.c#L1897
[2] 
https://github.com/openvswitch/ovs/blob/v2.17.8/ofproto/ofproto-dpif-upcall.c#L2470
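
For context, the debug change itself is small. Roughly (a sketch, not the exact
diff; `state_before` is the added field, the other assignments are the existing
logic in transition_ukey_at()):

/* In transition_ukey_at(), where the state is updated: */
ukey->state_before = ukey->state_where;   /* Remember the previous location. */
ukey->state = dst;
ukey->state_thread = ovsthread_id_self();
ukey->state_where = where;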

Still, all of the ukeys did the replace action.

2. The ukey circular buffer [1] does not work well: the buffer still
has {0} entries after a long run, and the number of records is far less than
`counter_upcall_ukey_free`.
[1] 
https://github.com/gotostack/ovs/commit/939d88c3c5fcdb446b01f2afa8f1e80c3929db46
Also, I could not add an `allocate` entry to this buffer for the ukey
xmalloc: the circular buffer
mutex does not seem to work well, and it cored many times at
`ovs_mutex_unlock(&ukey_free_buffer.mutex)`.

3. Ilya's patch [2] was applied, but I have not seen the abort log so far.
[2] 
https://github.com/igsilya/ovs/commit/8268347a159b5afa884f5b3008897878b5b520f5

4. Dumping all ukeys from the core file, we noticed that almost all
UKEY_EVICTED ukeys changed state at `transition_ukey_at` in the
revalidator thread.
(The `state_thread` attribute of the ukey.)
But the core backtrace shows the related ukey changed state in the PMD thread.
For instance:

 normal ukey and the revalidator thread:
(struct umap *) 0x55cce9556140:
  (struct udpif_key *) 0x7f3aad584a80:
state = UKEY_EVICTED
 state_thread = 5
  (struct udpif_key *) 0x7f3aac24ce20:
state = UKEY_EVICTED
 state_thread = 5
  (struct udpif_key *) 0x7f3aac6526e0:
state = UKEY_EVICTED
 state_thread = 5
  (struct udpif_key *) 0x7f3aad731970:
state = UKEY_EVICTED
 state_thread = 5
  (struct udpif_key *) 0x7f3aac91ce50:
state = UKEY_EVICTED
 state_thread = 5
  (struct udpif_key *) 0x7f3aadd69be0:
state = UKEY_EVICTED
 state_thread = 5
  (struct udpif_key *) 0x7f3aad759040:
state = UKEY_EVICTED
 state_thread = 5
  (struct udpif_key *) 0x7f3a8c0d6d50:
state = UKEY_EVICTED
 state_thread = 5
  (struct udpif_key *) 0x7f3a8c851300:
state = UKEY_EVICTED
 state_thread = 5

#8  0x55cce5d7005f in ovsthread_wrapper (aux_=) at
lib/ovs-thread.c:422
auxp = 
aux = {start = 0x55cce5c9c0d0 , arg =
0x55cce9595780, name = "revalidator\000\000\000\000"}
id = 5
subprogram_name = 0x7f3ad8c0 "\020 "
#9  0x7f3af2afee65 in start_thread (arg=0x7f3ae2986700) at
pthread_create.c:307
<<<

 core ukey and PMD thread

p * (struct udpif_key *) 0x7f3aac156e80
$2 = {cmap_node = {next = {p = 0x7f3aaec2b3a0}}, key = 0x7f3aac402810,
key_len = 0, mask = 0x0, mask_len = 172, ufid = {u32 = {256387,
2445823588, 3143939231, 3011838433}, u64 = {lo = 10504732324808489235,
  hi = 12935747573714826399}}, ufid_present = true, hash =
2623373230, pmd_id = 35, mutex = {lock = {__data = {__lock = 0,
__count = 0, __owner = 0, __nusers = 0, __kind = -1, __spins = 0,
__elision = 0, __list = {
  __prev = 0x0, __next = 0x0}}, __size = '\000' , "\377\377\377\377", '\000' , __align = 0},
where = 0x0}, stats = {n_packets = 3, n_bytes = 852, used = 871199854,
tcp_flags = 16},
  created = 871199014, dump_seq = 8822382946, reval_seq = 8822381178,
state = UKEY_DELETED, state_thread = 8

PMD thread ID:
#6  0x55cce5d7005f in ovsthread_wrapper (aux_=) at
lib/ovs-thread.c:422
auxp = 
aux = {start = 0x55cce5ce2460 , arg =
0x7f3ab2e6a010, name = "pmd-c35/id:\000:\177\000"}
id = 8
subprogram_name = 0x7f3aac0008c0 "p\v\"\255:\177"
#7  0x7f3af2afee65 in start_thread (arg=0x7f3ae0582700) at
pthread_create.c:307


The running threads are:

# ps -T -o spid,comm  $(pidof ovs-vswitchd)
  SPID COMMAND
100866 ovs-vswitchd
100867 eal-intr-thread
100868 rte_mp_handle
100872 ovs-vswitchd
100873 dpdk_watchdog1
100876 urcu2
100888 ct_clean7
100889 ipf_clean6
100890 hw_offload3
100891 handler4
100892 revalidator5 # 1 revalidator thread
100893 pmd-c03/id:9
100894 pmd-c35/id:8   # Mostly 1 PMD thread is working! Another is idle forever.
100925 vhost_reconn
100926 vhost-events

So, does this prove that two threads were trying to
manipulate the same ukey?

* The PMD thread replaced the old_ukey and transitioned its state.
* The RCU thread destroyed the ukey mutex and freed the ukey.
* The revalidator thread tried to lock the old_ukey mutex.

https://mail.openvswitch.org/pipermail/ovs-discuss/2024-February/052960.html
Any idea how to reproduce the race condition?

Thank you.

Regards,
LIU Yulong


On Tue, Feb 27, 2024 at 6:14 PM Eelco Chaudron  wrote:
>
>
>
> On 27 Feb 2024, at 9:49, LIU Yulong wrote:
>
> > Yes, that makes sense.
> >
> > Another question is how to distinguish the core at line of
> > ovs_mutex_trylock in revalidator_sweep__ is after the 

Re: [ovs-discuss] ovs-vswitchd core at revalidator_sweep__

2024-02-27 Thread Eelco Chaudron via discuss


On 27 Feb 2024, at 9:49, LIU Yulong wrote:

> Yes, that makes sense.
>
> Another question is how to tell whether the core at the ovs_mutex_trylock
> line in revalidator_sweep__ happened after the free(ukey),
> since the core trace has no timestamp.

This is hard to figure out without adding a time variable (and making sure it’s
not optimized out) in revalidator_sweep__().
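For example, something along these lines (illustrative only; volatile keeps the
compiler from dropping the store, so the value stays visible in the core file):

/* In revalidator_sweep__(), just before the ovs_mutex_trylock() call. */
volatile long long int sweep_now = time_usec();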

As you are using OVS-DPDK, you can read the values from the pmd->ctx.now 
context, which should be close.

> This line in the function 'ukey_create__' should be the only place
> where OVS allocates the memory for a ukey:
> https://github.com/openvswitch/ovs/blob/master/ofproto/ofproto-dpif-upcall.c#L1777
>
> Right?

Yes, this should be the only place.

>
> If that is true, I will update the buffer structure, and add a
> counter_upcall_ukey_allocate as well.
>
> On Tue, Feb 27, 2024 at 3:34 PM Eelco Chaudron  wrote:
>>
>>
>>
>> On 27 Feb 2024, at 4:44, LIU Yulong wrote:
>>
>>> @Eelco, as you suggested, added such circular buffer to my local OVS:
>>> https://github.com/gotostack/ovs/commit/939d88c3c5fcdb446b01f2afa8f1e80c3929db46
>>
>> I should also add allocate logging, or else you might not know if a buffer 
>> was allocated at the same address.
>> Maybe add a bool to the record structure to indicate if it’s an allocate or 
>> free.
>>
>> //Eelco
>>
>>> gdb shows such data structure:
>>> 2232 ukey_free_buffer.index = (ukey_free_buffer.index + 1) % (1024
>>> * 1024);  // Circular buffer
>>> (gdb) p ukey_free_buffer
>>> $1 = {
>>>   records = {{
>>>   ukey_addr = 0x7f8a0d871700,
>>>   timestamp = 1709003328
>>> }, {
>>>   ukey_addr = 0x7f8a0f969120,
>>>   timestamp = 1709003365
>>> }, {
>>>   ukey_addr = 0x7f8a0defe190,
>>>   timestamp = 1709003393
>>> }, {
>>>   ukey_addr = 0x7f8a0984aea0,
>>>   timestamp = 1709003452
>>> }...},
>>>   index = 3,
>>>   mutex = {
>>> lock = {
>>>   __data = {
>>> __lock = 1,
>>> __count = 0,
>>> __owner = 45210,
>>> __nusers = 1,
>>> __kind = 2,
>>> __spins = 0,
>>> __elision = 0,
>>> __list = {
>>>   __prev = 0x0,
>>>   __next = 0x0
>>> }
>>>   },
>>>   __size = 
>>> "\001\000\000\000\000\000\000\000\232\260\000\000\001\000\000\000\002",
>>> '\000' ,
>>>   __align = 1
>>> },
>>> where = 0x55c35a347d18 "ofproto/ofproto-dpif-upcall.c:2229"
>>>   }
>>> }
>>>
>>> and counter_upcall_ukey_free is:
>>> $2 = {name = 0x5622b448f612 "upcall_ukey_free", count = 0x5622b41047f0
>>> , total = 79785, last_total = 79785, min = {0,
>>> 0, 0, 0, 0, 55, 22681, 11703, 13877, 12750, 0, 18719}, hr = {79785,
>>> 0 }}
>>>
>>> Let's see how this goes.
>>>
>>> Thank you.
>>>
>>> On Tue, Feb 27, 2024 at 9:05 AM LIU Yulong  wrote:

 @Ilya, thank you, I will add that patch.

 @Eelco, thank you again, I will add a RL log to the free(ukey). Hope
 we can get something useful.


 On Mon, Feb 26, 2024 at 7:55 PM Ilya Maximets  wrote:
>
> On 2/26/24 11:20, Eelco Chaudron wrote:
>>
>>
>> On 26 Feb 2024, at 11:10, LIU Yulong wrote:
>>
>>> Hi Eelco,
>>>
>>> Thank you for the quick response.
>>>
>>> I did not add those logs, because in order to reproduce the issue, we
>>> have to send lots of packets to the host.
>>> So there are too many ukeys created/deleted to do logging.
>>
>> Maybe a circular buffer with all alloc/free (+ 1ukey address, and 
>> timestamp), 1 or 2 Mb of memory can hold a lot.
>>
>>> And can we ensure that this [1] is the only place for ovs to free the 
>>> ukey?
>>>
>>> [1] 
>>> https://github.com/openvswitch/ovs/blob/v2.17.8/ofproto/ofproto-dpif-upcall.c#L2084
>>
>> Yes, this should be the only place, and should always be done through an 
>> RCU delayed delete.
>>
>>> For last mail, can this issue be concurrent read-and-update/delete?
>>> The revalidator_sweep__ is trying to lock the ukey->mutex, while
>>> another thread is updating the ukey->mutex to NULL and free ukey.
>>
>> This should not happen as the delete should happen by the delayed RCU 
>> delete, and if the ukey is still in the cmap after the delayed delete 
>> (quiescent state) something is wrong.
>
>
> I agree with Eelco and I don't see any obvious issues with the current
> implementation.
>
> However, the usual suspect for RCU problems is entering a quiescent state
> while iterating an RCU-protected structure.  Though I'm not sure how that can
> happen in the revalidator, usually such issues are hiding somewhere way
> down the call stack.  I made a small patch that can help to be sure that
> this doesn't actually happen in your setup:
> https://github.com/igsilya/ovs/commit/8268347a159b5afa884f5b3008897878b5b520f5
>
> Could you try it?
>
> The change will log an error message and abort the 

Re: [ovs-discuss] ovs-vswitchd core at revalidator_sweep__

2024-02-27 Thread LIU Yulong via discuss
Yes, that makes sense.

Another question is how to tell whether the core at the ovs_mutex_trylock
line in revalidator_sweep__ happened after the free(ukey),
since the core trace has no timestamp.

This line in the function 'ukey_create__' should be the only place
where OVS allocates the memory for a ukey:
https://github.com/openvswitch/ovs/blob/master/ofproto/ofproto-dpif-upcall.c#L1777

Right?

If that is true, I will update the buffer structure, and add a
counter_upcall_ukey_allocate as well.

On Tue, Feb 27, 2024 at 3:34 PM Eelco Chaudron  wrote:
>
>
>
> On 27 Feb 2024, at 4:44, LIU Yulong wrote:
>
> > @Eelco, as you suggested, added such circular buffer to my local OVS:
> > https://github.com/gotostack/ovs/commit/939d88c3c5fcdb446b01f2afa8f1e80c3929db46
>
> I should also add allocate logging, or else you might not know if a buffer 
> was allocated at the same address.
> Maybe add a bool to the record structure to indicate if it’s an allocate or 
> free.
>
> //Eelco
>
> > gdb shows such data structure:
> > 2232 ukey_free_buffer.index = (ukey_free_buffer.index + 1) % (1024
> > * 1024);  // Circular buffer
> > (gdb) p ukey_free_buffer
> > $1 = {
> >   records = {{
> >   ukey_addr = 0x7f8a0d871700,
> >   timestamp = 1709003328
> > }, {
> >   ukey_addr = 0x7f8a0f969120,
> >   timestamp = 1709003365
> > }, {
> >   ukey_addr = 0x7f8a0defe190,
> >   timestamp = 1709003393
> > }, {
> >   ukey_addr = 0x7f8a0984aea0,
> >   timestamp = 1709003452
> > }...},
> >   index = 3,
> >   mutex = {
> > lock = {
> >   __data = {
> > __lock = 1,
> > __count = 0,
> > __owner = 45210,
> > __nusers = 1,
> > __kind = 2,
> > __spins = 0,
> > __elision = 0,
> > __list = {
> >   __prev = 0x0,
> >   __next = 0x0
> > }
> >   },
> >   __size = 
> > "\001\000\000\000\000\000\000\000\232\260\000\000\001\000\000\000\002",
> > '\000' ,
> >   __align = 1
> > },
> > where = 0x55c35a347d18 "ofproto/ofproto-dpif-upcall.c:2229"
> >   }
> > }
> >
> > and counter_upcall_ukey_free is:
> > $2 = {name = 0x5622b448f612 "upcall_ukey_free", count = 0x5622b41047f0
> > , total = 79785, last_total = 79785, min = {0,
> > 0, 0, 0, 0, 55, 22681, 11703, 13877, 12750, 0, 18719}, hr = {79785,
> > 0 }}
> >
> > Let's see how this goes.
> >
> > Thank you.
> >
> > On Tue, Feb 27, 2024 at 9:05 AM LIU Yulong  wrote:
> >>
> >> @Ilya, thank you, I will add that patch.
> >>
> >> @Eelco, thank you again, I will add a RL log to the free(ukey). Hope
> >> we can get something useful.
> >>
> >>
> >> On Mon, Feb 26, 2024 at 7:55 PM Ilya Maximets  wrote:
> >>>
> >>> On 2/26/24 11:20, Eelco Chaudron wrote:
> 
> 
>  On 26 Feb 2024, at 11:10, LIU Yulong wrote:
> 
> > Hi Eelco,
> >
> > Thank you for the quick response.
> >
> > I did not add those logs, because in order to reproduce the issue, we
> > have to send lots of packets to the host.
> > So there are too many ukeys created/deleted to do logging.
> 
>  Maybe a circular buffer with all alloc/free (+ 1ukey address, and 
>  timestamp), 1 or 2 Mb of memory can hold a lot.
> 
> > And can we ensure that this [1] is the only place for ovs to free the 
> > ukey?
> >
> > [1] 
> > https://github.com/openvswitch/ovs/blob/v2.17.8/ofproto/ofproto-dpif-upcall.c#L2084
> 
>  Yes, this should be the only place, and should always be done through an 
>  RCU delayed delete.
> 
> > For last mail, can this issue be concurrent read-and-update/delete?
> > The revalidator_sweep__ is trying to lock the ukey->mutex, while
> > another thread is updating the ukey->mutex to NULL and free ukey.
> 
>  This should not happen as the delete should happen by the delayed RCU 
>  delete, and if the ukey is still in the cmap after the delayed delete 
>  (quiescent state) something is wrong.
> >>>
> >>>
> >>> I agree with Eelco and I don't see any obvious issues with the current
> >>> implementation.
> >>>
> >>> However, the usual suspect for RCU problems is entering a quiescent state
> >>> while iterating an RCU-protected structure.  Though I'm not sure how that can
> >>> happen in the revalidator, usually such issues are hiding somewhere way
> >>> down the call stack.  I made a small patch that can help to be sure that
> >>> this doesn't actually happen in your setup:
> >>>  
> >>> https://github.com/igsilya/ovs/commit/8268347a159b5afa884f5b3008897878b5b520f5
> >>>
> >>> Could you try it?
> >>>
> >>> The change will log an error message and abort the process if we happen
> >>> to enter quiescent state while iterating over the hash map.  Core dump
> >>> will point to a problematic call.
> >>>
> >>> Best regards, Ilya Maximets.
> >>>
> 
> > LIU Yulong
> >
> >
> > On Mon, Feb 26, 2024 at 5:41 PM Eelco Chaudron  
> > wrote:
> >>
> >>
> >>
> >> On 26 Feb 

Re: [ovs-discuss] ovs-vswitchd core at revalidator_sweep__

2024-02-26 Thread Eelco Chaudron via discuss


On 27 Feb 2024, at 4:44, LIU Yulong wrote:

> @Eelco, as you suggested, added such circular buffer to my local OVS:
> https://github.com/gotostack/ovs/commit/939d88c3c5fcdb446b01f2afa8f1e80c3929db46

I should also add allocate logging, or else you might not know if a buffer was 
allocated at the same address.
Maybe add a bool to the record structure to indicate if it’s an allocate or 
free.
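Something like this, for example (the struct and field names are just an
illustration, extending the record used in the commit above):

#include <stdbool.h>

struct ukey_lifecycle_record {
    void *ukey_addr;          /* ukey address. */
    long long int timestamp;  /* When the event was recorded. */
    bool is_alloc;            /* true when logged from ukey_create__(),
                               * false when logged from ukey_delete__(). */
};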

//Eelco

> gdb shows such data structure:
> 2232 ukey_free_buffer.index = (ukey_free_buffer.index + 1) % (1024
> * 1024);  // Circular buffer
> (gdb) p ukey_free_buffer
> $1 = {
>   records = {{
>   ukey_addr = 0x7f8a0d871700,
>   timestamp = 1709003328
> }, {
>   ukey_addr = 0x7f8a0f969120,
>   timestamp = 1709003365
> }, {
>   ukey_addr = 0x7f8a0defe190,
>   timestamp = 1709003393
> }, {
>   ukey_addr = 0x7f8a0984aea0,
>   timestamp = 1709003452
> }...},
>   index = 3,
>   mutex = {
> lock = {
>   __data = {
> __lock = 1,
> __count = 0,
> __owner = 45210,
> __nusers = 1,
> __kind = 2,
> __spins = 0,
> __elision = 0,
> __list = {
>   __prev = 0x0,
>   __next = 0x0
> }
>   },
>   __size = 
> "\001\000\000\000\000\000\000\000\232\260\000\000\001\000\000\000\002",
> '\000' ,
>   __align = 1
> },
> where = 0x55c35a347d18 "ofproto/ofproto-dpif-upcall.c:2229"
>   }
> }
>
> and counter_upcall_ukey_free is:
> $2 = {name = 0x5622b448f612 "upcall_ukey_free", count = 0x5622b41047f0
> , total = 79785, last_total = 79785, min = {0,
> 0, 0, 0, 0, 55, 22681, 11703, 13877, 12750, 0, 18719}, hr = {79785,
> 0 }}
>
> Let's see how this goes.
>
> Thank you.
>
> On Tue, Feb 27, 2024 at 9:05 AM LIU Yulong  wrote:
>>
>> @Ilya, thank you, I will add that patch.
>>
>> @Eelco, thank you again, I will add a RL log to the free(ukey). Hope
>> we can get something useful.
>>
>>
>> On Mon, Feb 26, 2024 at 7:55 PM Ilya Maximets  wrote:
>>>
>>> On 2/26/24 11:20, Eelco Chaudron wrote:


 On 26 Feb 2024, at 11:10, LIU Yulong wrote:

> Hi Eelco,
>
> Thank you for the quick response.
>
> I did not add those logs, because in order to reproduce the issue, we
> have to send lots of packets to the host.
> So there are too many ukeys created/deleted to do logging.

 Maybe a circular buffer with all alloc/free (+ 1ukey address, and 
 timestamp), 1 or 2 Mb of memory can hold a lot.

> And can we ensure that this [1] is the only place for ovs to free the 
> ukey?
>
> [1] 
> https://github.com/openvswitch/ovs/blob/v2.17.8/ofproto/ofproto-dpif-upcall.c#L2084

 Yes, this should be the only place, and should always be done through an 
 RCU delayed delete.

> For last mail, can this issue be concurrent read-and-update/delete?
> The revalidator_sweep__ is trying to lock the ukey->mutex, while
> another thread is updating the ukey->mutex to NULL and free ukey.

 This should not happen as the delete should happen by the delayed RCU 
 delete, and if the ukey is still in the cmap after the delayed delete 
 (quiescent state) something is wrong.
>>>
>>>
>>> I agree with Eelco and I don't see any obvious issues with the current
>>> implementation.
>>>
>>> However, the usual suspect for RCU problems is entering a quiescent state
>>> while iterating an RCU-protected structure.  Though I'm not sure how that can
>>> happen in the revalidator, usually such issues are hiding somewhere way
>>> down the call stack.  I made a small patch that can help to be sure that
>>> this doesn't actually happen in your setup:
>>>  
>>> https://github.com/igsilya/ovs/commit/8268347a159b5afa884f5b3008897878b5b520f5
>>>
>>> Could you try it?
>>>
>>> The change will log an error message and abort the process if we happen
>>> to enter quiescent state while iterating over the hash map.  Core dump
>>> will point to a problematic call.
>>>
>>> Best regards, Ilya Maximets.
>>>

> LIU Yulong
>
>
> On Mon, Feb 26, 2024 at 5:41 PM Eelco Chaudron  
> wrote:
>>
>>
>>
>> On 26 Feb 2024, at 9:33, LIU Yulong wrote:
>>
>>> Hi,
>>>
>>> I have read the code by comparing the call stack of the core files
>>> carefully, and found
>>> a potential race condition. Please confirm whether the following 3 
>>> threads
>>> have a race condition. Just did some code trace, can such
>>> race condition happen?
>>>
>>> * PMD thread1 ===:
>>> -> pmd_thread_main
>>> -> dp_netdev_process_rxq_port
>>> -> dp_netdev_input
>>> -> dp_netdev_input__
>>> -> handle_packet_upcall
>>> -> dp_netdev_upcall
>>> -> upcall_cb
>>> -> ukey_install
>>> -> ukey_install__
>>> -> try_ukey_replace:
>>> ovs_mutex_lock(_ukey->mutex);
>>> <-- the CMAP_FOR_EACH loop in the 

Re: [ovs-discuss] ovs-vswitchd core at revalidator_sweep__

2024-02-26 Thread LIU Yulong via discuss
@Eelco, as you suggested, added such circular buffer to my local OVS:
https://github.com/gotostack/ovs/commit/939d88c3c5fcdb446b01f2afa8f1e80c3929db46
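Roughly, the buffer looks like this (a simplified sketch of the commit above;
the helper name is illustrative, the structure and field names match the gdb
dump below, and the timestamp source is illustrative):

#define UKEY_FREE_BUFFER_SIZE (1024 * 1024)

struct ukey_free_record {
    void *ukey_addr;          /* Address of the freed ukey. */
    long long int timestamp;  /* When the free was recorded. */
};

static struct {
    struct ukey_free_record records[UKEY_FREE_BUFFER_SIZE];
    int index;
    struct ovs_mutex mutex;
} ukey_free_buffer = { .mutex = OVS_MUTEX_INITIALIZER };

/* Called from ukey_delete__() just before free(ukey). */
static void
record_ukey_free(struct udpif_key *ukey)
{
    ovs_mutex_lock(&ukey_free_buffer.mutex);
    ukey_free_buffer.records[ukey_free_buffer.index].ukey_addr = ukey;
    ukey_free_buffer.records[ukey_free_buffer.index].timestamp = time_now();
    ukey_free_buffer.index =
        (ukey_free_buffer.index + 1) % UKEY_FREE_BUFFER_SIZE;  /* Circular buffer. */
    ovs_mutex_unlock(&ukey_free_buffer.mutex);
}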

gdb shows such data structure:
2232 ukey_free_buffer.index = (ukey_free_buffer.index + 1) % (1024
* 1024);  // Circular buffer
(gdb) p ukey_free_buffer
$1 = {
  records = {{
  ukey_addr = 0x7f8a0d871700,
  timestamp = 1709003328
}, {
  ukey_addr = 0x7f8a0f969120,
  timestamp = 1709003365
}, {
  ukey_addr = 0x7f8a0defe190,
  timestamp = 1709003393
}, {
  ukey_addr = 0x7f8a0984aea0,
  timestamp = 1709003452
}...},
  index = 3,
  mutex = {
lock = {
  __data = {
__lock = 1,
__count = 0,
__owner = 45210,
__nusers = 1,
__kind = 2,
__spins = 0,
__elision = 0,
__list = {
  __prev = 0x0,
  __next = 0x0
}
  },
  __size = 
"\001\000\000\000\000\000\000\000\232\260\000\000\001\000\000\000\002",
'\000' ,
  __align = 1
},
where = 0x55c35a347d18 "ofproto/ofproto-dpif-upcall.c:2229"
  }
}

and counter_upcall_ukey_free is:
$2 = {name = 0x5622b448f612 "upcall_ukey_free", count = 0x5622b41047f0
, total = 79785, last_total = 79785, min = {0,
0, 0, 0, 0, 55, 22681, 11703, 13877, 12750, 0, 18719}, hr = {79785,
0 }}

Let's see how this goes.

Thank you.

On Tue, Feb 27, 2024 at 9:05 AM LIU Yulong  wrote:
>
> @Ilya, thank you, I will add that patch.
>
> @Eelco, thank you again, I will add a RL log to the free(ukey). Hope
> we can get something useful.
>
>
> On Mon, Feb 26, 2024 at 7:55 PM Ilya Maximets  wrote:
> >
> > On 2/26/24 11:20, Eelco Chaudron wrote:
> > >
> > >
> > > On 26 Feb 2024, at 11:10, LIU Yulong wrote:
> > >
> > >> Hi Eelco,
> > >>
> > >> Thank you for the quick response.
> > >>
> > >> I did not add those logs, because in order to reproduce the issue, we
> > >> have to send lots of packets to the host.
> > >> So there are too many ukeys created/deleted to do logging.
> > >
> > > Maybe a circular buffer with all alloc/free (+ 1ukey address, and 
> > > timestamp), 1 or 2 Mb of memory can hold a lot.
> > >
> > >> And can we ensure that this [1] is the only place for ovs to free the 
> > >> ukey?
> > >>
> > >> [1] 
> > >> https://github.com/openvswitch/ovs/blob/v2.17.8/ofproto/ofproto-dpif-upcall.c#L2084
> > >
> > > Yes, this should be the only place, and should always be done through an 
> > > RCU delayed delete.
> > >
> > >> For last mail, can this issue be concurrent read-and-update/delete?
> > >> The revalidator_sweep__ is trying to lock the ukey->mutex, while
> > >> another thread is updating the ukey->mutex to NULL and free ukey.
> > >
> > > This should not happen as the delete should happen by the delayed RCU 
> > > delete, and if the ukey is still in the cmap after the delayed delete 
> > > (quiescent state) something is wrong.
> >
> >
> > I agree with Eelco and I don't see any obvious issues with the current
> > implementation.
> >
> > However, the usual suspect for RCU problems is entering a quiescent state
> > while iterating an RCU-protected structure.  Though I'm not sure how that can
> > happen in the revalidator, usually such issues are hiding somewhere way
> > down the call stack.  I made a small patch that can help to be sure that
> > this doesn't actually happen in your setup:
> >   
> > https://github.com/igsilya/ovs/commit/8268347a159b5afa884f5b3008897878b5b520f5
> >
> > Could you try it?
> >
> > The change will log an error message and abort the process if we happen
> > to enter quiescent state while iterating over the hash map.  Core dump
> > will point to a problematic call.
> >
> > Best regards, Ilya Maximets.
> >
> > >
> > >> LIU Yulong
> > >>
> > >>
> > >> On Mon, Feb 26, 2024 at 5:41 PM Eelco Chaudron  
> > >> wrote:
> > >>>
> > >>>
> > >>>
> > >>> On 26 Feb 2024, at 9:33, LIU Yulong wrote:
> > >>>
> >  Hi,
> > 
> >  I have read the code by comparing the call stack of the core files
> >  carefully, and found
> >  a potential race condition. Please confirm whether the following 3 
> >  threads
> >  have a race condition. Just did some code trace, can such
> >  race condition happen?
> > 
> >  * PMD thread1 ===:
> >  -> pmd_thread_main
> >  -> dp_netdev_process_rxq_port
> >  -> dp_netdev_input
> >  -> dp_netdev_input__
> >  -> handle_packet_upcall
> >  -> dp_netdev_upcall
> >  -> upcall_cb
> >  -> ukey_install
> >  -> ukey_install__
> >  -> try_ukey_replace:
> >  ovs_mutex_lock(&new_ukey->mutex);
> >  <-- the CMAP_FOR_EACH loop in the revalidator_sweep__ runs a
> >  bit earlier than the cmap_replace on the next line, so the old_ukey can be
> >  iterated. [1]
> >  cmap_replace(&umap->cmap, &old_ukey->cmap_node,
> >   &new_ukey->cmap_node, new_ukey->hash);
> >  ovsrcu_postpone(ukey_delete__, old_ukey);

Re: [ovs-discuss] ovs-vswitchd core at revalidator_sweep__

2024-02-26 Thread LIU Yulong via discuss
@Ilya, thank you, I will add that patch.

@Eelco, thank you again, I will add a RL log to the free(ukey). Hope
we can get something useful.


On Mon, Feb 26, 2024 at 7:55 PM Ilya Maximets  wrote:
>
> On 2/26/24 11:20, Eelco Chaudron wrote:
> >
> >
> > On 26 Feb 2024, at 11:10, LIU Yulong wrote:
> >
> >> Hi Eelco,
> >>
> >> Thank you for the quick response.
> >>
> >> I did not add those logs, because in order to reproduce the issue, we
> >> have to send lots of packets to the host.
> >> So there are too many ukeys created/deleted to do logging.
> >
> > Maybe a circular buffer with all alloc/free (+ 1ukey address, and 
> > timestamp), 1 or 2 Mb of memory can hold a lot.
> >
> >> And can we ensure that this [1] is the only place for ovs to free the ukey?
> >>
> >> [1] 
> >> https://github.com/openvswitch/ovs/blob/v2.17.8/ofproto/ofproto-dpif-upcall.c#L2084
> >
> > Yes, this should be the only place, and should always be done through an 
> > RCU delayed delete.
> >
> >> For last mail, can this issue be concurrent read-and-update/delete?
> >> The revalidator_sweep__ is trying to lock the ukey->mutex, while
> >> another thread is updating the ukey->mutex to NULL and free ukey.
> >
> > This should not happen as the delete should happen by the delayed RCU 
> > delete, and if the ukey is still in the cmap after the delayed delete 
> > (quiescent state) something is wrong.
>
>
> I agree with Eelco and I don't see any obvious issues with the current
> implementation.
>
> However, the usual suspect for RCU problems is entering a quiescent state
> while iterating an RCU-protected structure.  Though I'm not sure how that can
> happen in the revalidator, usually such issues are hiding somewhere way
> down the call stack.  I made a small patch that can help to be sure that
> this doesn't actually happen in your setup:
>   
> https://github.com/igsilya/ovs/commit/8268347a159b5afa884f5b3008897878b5b520f5
>
> Could you try it?
>
> The change will log an error message and abort the process if we happen
> to enter quiescent state while iterating over the hash map.  Core dump
> will point to a problematic call.
>
> Best regards, Ilya Maximets.
>
> >
> >> LIU Yulong
> >>
> >>
> >> On Mon, Feb 26, 2024 at 5:41 PM Eelco Chaudron  wrote:
> >>>
> >>>
> >>>
> >>> On 26 Feb 2024, at 9:33, LIU Yulong wrote:
> >>>
>  Hi,
> 
>  I have read the code by comparing the call stack of the core files
>  carefully, and found
>  a potential race condition. Please confirm whether the following 3 
>  threads
>  have a race condition. Just did some code trace, can such
>  race condition happen?
> 
>  * PMD thread1 ===:
>  -> pmd_thread_main
>  -> dp_netdev_process_rxq_port
>  -> dp_netdev_input
>  -> dp_netdev_input__
>  -> handle_packet_upcall
>  -> dp_netdev_upcall
>  -> upcall_cb
>  -> ukey_install
>  -> ukey_install__
>  -> try_ukey_replace:
>  ovs_mutex_lock(&new_ukey->mutex);
>  <-- the CMAP_FOR_EACH loop in the revalidator_sweep__ runs a
>  bit earlier than the cmap_replace on the next line, so the old_ukey can be
>  iterated. [1]
>  cmap_replace(&umap->cmap, &old_ukey->cmap_node,
>   &new_ukey->cmap_node, new_ukey->hash);
>  ovsrcu_postpone(ukey_delete__, old_ukey);
>  < delete the ukey asynchronously. [2]
>  transition_ukey(old_ukey, UKEY_DELETED);   <
>  transition the ukey state to UKEY_DELETED, most core files show that
>  the ukey last state change was at this line. [3]
>  transition_ukey(new_ukey, UKEY_VISIBLE);
> 
>  [1] 
>  https://github.com/openvswitch/ovs/blob/v2.17.8/ofproto/ofproto-dpif-upcall.c#L1892
>  [2] 
>  https://github.com/openvswitch/ovs/blob/v2.17.8/ofproto/ofproto-dpif-upcall.c#L1896
>  [3] 
>  https://github.com/openvswitch/ovs/blob/v2.17.8/ofproto/ofproto-dpif-upcall.c#L1897
> 
>  This function try_ukey_replace was called many times, because the
>  `counter_upcall_ukey_replace` is not zero.
>  For instance:
>  {
>    name = 0x55ba9755206b "upcall_ukey_replace",
>    count = 0x55ba971c7610 ,
>    total = 2287997,
>    last_total = 2287997,
>    min = {221, 247, 444, 278, 324, 570, 379, 464, 283, 280, 0, 427},
>    hr = {3300, 4378, 3557, 4554, 3748, 3710, 4340, 3559, 4296, 3759,
>  3522, 4136, 3660, 4428, 3802, 3652, 3880, 3375, 4806, 4221, 4158,
>  3816, 3750, 3846, 3761, 3653, 4293, 3816, 3723, 3691, 4033, 468, 4117,
>  3659, 4007, 3536,
>  3439, 4440, 3388, 4079, 3876, 3865, 4339, 3757, 3481, 4027, 3989,
>  3633, 3737, 3564, 3403, 3992, 3793, 4390, 4124, 4354, 4164, 4383,
>  4237, 3667}
>  }
> 
>  * RCU thread2 ===:
>  -> ovsrcu_postpone_thread
>  -> ovsrcu_call_postponed
>  -> ukey_delete__

Re: [ovs-discuss] ovs-vswitchd core at revalidator_sweep__

2024-02-26 Thread Ilya Maximets via discuss
On 2/26/24 11:20, Eelco Chaudron wrote:
> 
> 
> On 26 Feb 2024, at 11:10, LIU Yulong wrote:
> 
>> Hi Eelco,
>>
>> Thank you for the quick response.
>>
>> I did not add those logs, because in order to reproduce the issue, we
>> have to send lots of packets to the host.
>> So there are too many ukeys created/deleted to do logging.
> 
> Maybe a circular buffer with all alloc/free (+ 1ukey address, and timestamp), 
> 1 or 2 Mb of memory can hold a lot.
> 
>> And can we ensure that this [1] is the only place for ovs to free the ukey?
>>
>> [1] 
>> https://github.com/openvswitch/ovs/blob/v2.17.8/ofproto/ofproto-dpif-upcall.c#L2084
> 
> Yes, this should be the only place, and should always be done through an RCU 
> delayed delete.
> 
>> For last mail, can this issue be concurrent read-and-update/delete?
>> The revalidator_sweep__ is trying to lock the ukey->mutex, while
>> another thread is updating the ukey->mutex to NULL and free ukey.
> 
> This should not happen as the delete should happen by the delayed RCU delete, 
> and if the ukey is still in the cmap after the delayed delete (quiescent 
> state) something is wrong.


I agree with Eelco and I don't see any obvious issues with the current
implementation.

However, the usual suspect for RCU problems is entering a quiescent state
while iterating an RCU-protected structure.  Though I'm not sure how that can
happen in the revalidator, usually such issues are hiding somewhere way
down the call stack.  I made a small patch that can help to be sure that
this doesn't actually happen in your setup:
  https://github.com/igsilya/ovs/commit/8268347a159b5afa884f5b3008897878b5b520f5

Could you try it?

The change will log an error message and abort the process if we happen
to enter quiescent state while iterating over the hash map.  Core dump
will point to a problematic call.
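
In rough pseudo-C, the idea is something like this (a simplified sketch only;
the flag and helper names here are illustrative, the real change is in the
commit above):

/* Hypothetical per-thread flag set around the RCU-protected iteration,
 * e.g. in revalidator_sweep__(). */
static __thread bool quiescing_forbidden;

static void
forbid_quiescing(bool forbid)
{
    quiescing_forbidden = forbid;
}

/* Hypothetical check invoked from the quiescing path. */
static void
assert_not_quiescing(void)
{
    if (quiescing_forbidden) {
        /* Log and abort so the core dump shows the offending call stack. */
        ovs_abort(0, "quiescent state entered while iterating an "
                     "RCU-protected map");
    }
}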

Best regards, Ilya Maximets.

> 
>> LIU Yulong
>>
>>
>> On Mon, Feb 26, 2024 at 5:41 PM Eelco Chaudron  wrote:
>>>
>>>
>>>
>>> On 26 Feb 2024, at 9:33, LIU Yulong wrote:
>>>
 Hi,

 I have read the code by comparing the call stack of the core files
 carefully, and found
 a potential race condition. Please confirm whether the following 3 threads
 have a race condition. Just did some code trace, can such
 race condition happen?

 * PMD thread1 ===:
 -> pmd_thread_main
 -> dp_netdev_process_rxq_port
 -> dp_netdev_input
 -> dp_netdev_input__
 -> handle_packet_upcall
 -> dp_netdev_upcall
 -> upcall_cb
 -> ukey_install
 -> ukey_install__
 -> try_ukey_replace:
 ovs_mutex_lock(&new_ukey->mutex);
 <-- the CMAP_FOR_EACH loop in the revalidator_sweep__ runs a
 bit earlier than the cmap_replace on the next line, so the old_ukey can be
 iterated. [1]
 cmap_replace(&umap->cmap, &old_ukey->cmap_node,
  &new_ukey->cmap_node, new_ukey->hash);
 ovsrcu_postpone(ukey_delete__, old_ukey);
 < delete the ukey asynchronously. [2]
 transition_ukey(old_ukey, UKEY_DELETED);   <
 transition the ukey state to UKEY_DELETED, most core files show that
 the ukey last state change was at this line. [3]
 transition_ukey(new_ukey, UKEY_VISIBLE);

 [1] 
 https://github.com/openvswitch/ovs/blob/v2.17.8/ofproto/ofproto-dpif-upcall.c#L1892
 [2] 
 https://github.com/openvswitch/ovs/blob/v2.17.8/ofproto/ofproto-dpif-upcall.c#L1896
 [3] 
 https://github.com/openvswitch/ovs/blob/v2.17.8/ofproto/ofproto-dpif-upcall.c#L1897

 This function try_ukey_replace was called many times, because the
 `counter_upcall_ukey_replace` is not zero.
 For instance:
 {
   name = 0x55ba9755206b "upcall_ukey_replace",
   count = 0x55ba971c7610 ,
   total = 2287997,
   last_total = 2287997,
   min = {221, 247, 444, 278, 324, 570, 379, 464, 283, 280, 0, 427},
   hr = {3300, 4378, 3557, 4554, 3748, 3710, 4340, 3559, 4296, 3759,
 3522, 4136, 3660, 4428, 3802, 3652, 3880, 3375, 4806, 4221, 4158,
 3816, 3750, 3846, 3761, 3653, 4293, 3816, 3723, 3691, 4033, 468, 4117,
 3659, 4007, 3536,
 3439, 4440, 3388, 4079, 3876, 3865, 4339, 3757, 3481, 4027, 3989,
 3633, 3737, 3564, 3403, 3992, 3793, 4390, 4124, 4354, 4164, 4383,
 4237, 3667}
 }

 * RCU thread2 ===:
 -> ovsrcu_postpone_thread
 -> ovsrcu_call_postponed
 -> ukey_delete__< This
 function is not thread safe IMO, it has the mark
 OVS_NO_THREAD_SAFETY_ANALYSIS. [4]

 recirc_refs_unref(&ukey->recircs);
 xlate_cache_delete(ukey->xcache);
 ofpbuf_delete(ovsrcu_get(struct ofpbuf *, &ukey->actions));
 ovs_mutex_destroy(&ukey->mutex); <-- Just
 set ukey mutex to NULL. [5][6][7]
 free(ukey);

 [4] 
 

Re: [ovs-discuss] ovs-vswitchd core at revalidator_sweep__

2024-02-26 Thread Eelco Chaudron via discuss


On 26 Feb 2024, at 11:10, LIU Yulong wrote:

> Hi Eelco,
>
> Thank you for the quick response.
>
> I did not add those logs, because in order to reproduce the issue, we
> have to send lots of packets to the host.
> So there are too many ukeys created/deleted to do logging.

Maybe a circular buffer with all alloc/free (+ 1ukey address, and timestamp), 1 
or 2 Mb of memory can hold a lot.

> And can we ensure that this [1] is the only place for ovs to free the ukey?
>
> [1] 
> https://github.com/openvswitch/ovs/blob/v2.17.8/ofproto/ofproto-dpif-upcall.c#L2084

Yes, this should be the only place, and should always be done through an RCU 
delayed delete.

> For last mail, can this issue be concurrent read-and-update/delete?
> The revalidator_sweep__ is trying to lock the ukey->mutex, while
> another thread is updating the ukey->mutex to NULL and free ukey.

This should not happen as the delete should happen by the delayed RCU delete, 
and if the ukey is still in the cmap after the delayed delete (quiescent state) 
something is wrong.

> LIU Yulong
>
>
> On Mon, Feb 26, 2024 at 5:41 PM Eelco Chaudron  wrote:
>>
>>
>>
>> On 26 Feb 2024, at 9:33, LIU Yulong wrote:
>>
>>> Hi,
>>>
>>> I have read the code by comparing the call stack of the core files
>>> carefully, and found
>>> a potential race condition. Please confirm whether the following 3 threads
>>> have a race condition. Just did some code trace, can such
>>> race condition happen?
>>>
>>> * PMD thread1 ===:
>>> -> pmd_thread_main
>>> -> dp_netdev_process_rxq_port
>>> -> dp_netdev_input
>>> -> dp_netdev_input__
>>> -> handle_packet_upcall
>>> -> dp_netdev_upcall
>>> -> upcall_cb
>>> -> ukey_install
>>> -> ukey_install__
>>> -> try_ukey_replace:
>>> ovs_mutex_lock(&new_ukey->mutex);
>>> <-- the CMAP_FOR_EACH loop in the revalidator_sweep__ runs a
>>> bit earlier than the cmap_replace on the next line, so the old_ukey can be
>>> iterated. [1]
>>> cmap_replace(&umap->cmap, &old_ukey->cmap_node,
>>>  &new_ukey->cmap_node, new_ukey->hash);
>>> ovsrcu_postpone(ukey_delete__, old_ukey);
>>> < delete the ukey asynchronously. [2]
>>> transition_ukey(old_ukey, UKEY_DELETED);   <
>>> transition the ukey state to UKEY_DELETED, most core files show that
>>> the ukey last state change was at this line. [3]
>>> transition_ukey(new_ukey, UKEY_VISIBLE);
>>>
>>> [1] 
>>> https://github.com/openvswitch/ovs/blob/v2.17.8/ofproto/ofproto-dpif-upcall.c#L1892
>>> [2] 
>>> https://github.com/openvswitch/ovs/blob/v2.17.8/ofproto/ofproto-dpif-upcall.c#L1896
>>> [3] 
>>> https://github.com/openvswitch/ovs/blob/v2.17.8/ofproto/ofproto-dpif-upcall.c#L1897
>>>
>>> This function try_ukey_replace was called many times, because the
>>> `counter_upcall_ukey_replace` is not zero.
>>> For instance:
>>> {
>>>   name = 0x55ba9755206b "upcall_ukey_replace",
>>>   count = 0x55ba971c7610 ,
>>>   total = 2287997,
>>>   last_total = 2287997,
>>>   min = {221, 247, 444, 278, 324, 570, 379, 464, 283, 280, 0, 427},
>>>   hr = {3300, 4378, 3557, 4554, 3748, 3710, 4340, 3559, 4296, 3759,
>>> 3522, 4136, 3660, 4428, 3802, 3652, 3880, 3375, 4806, 4221, 4158,
>>> 3816, 3750, 3846, 3761, 3653, 4293, 3816, 3723, 3691, 4033, 468, 4117,
>>> 3659, 4007, 3536,
>>> 3439, 4440, 3388, 4079, 3876, 3865, 4339, 3757, 3481, 4027, 3989,
>>> 3633, 3737, 3564, 3403, 3992, 3793, 4390, 4124, 4354, 4164, 4383,
>>> 4237, 3667}
>>> }
>>>
>>> * RCU thread2 ===:
>>> -> ovsrcu_postpone_thread
>>> -> ovsrcu_call_postponed
>>> -> ukey_delete__< This
>>> function is not thread safe IMO, it has the mark
>>> OVS_NO_THREAD_SAFETY_ANALYSIS. [4]
>>>
>>> recirc_refs_unref(&ukey->recircs);
>>> xlate_cache_delete(ukey->xcache);
>>> ofpbuf_delete(ovsrcu_get(struct ofpbuf *, &ukey->actions));
>>> ovs_mutex_destroy(&ukey->mutex); <-- Just
>>> set ukey mutex to NULL. [5][6][7]
>>> free(ukey);
>>>
>>> [4] 
>>> https://github.com/openvswitch/ovs/blob/v2.17.8/ofproto/ofproto-dpif-upcall.c#L2074
>>> [5] 
>>> https://github.com/openvswitch/ovs/blob/v2.17.8/ofproto/ofproto-dpif-upcall.c#L2083
>>> [6] https://github.com/openvswitch/ovs/blob/v2.17.8/lib/ovs-thread.c#L131
>>> [7] https://github.com/openvswitch/ovs/blob/v2.17.8/lib/ovs-thread.c#L124
>>>
>>> * revalidator thread3 ===:
>>>
>>> -> udpif_revalidator
>>> -> revalidator_sweep
>>> -> revalidator_sweep__
>>>
>>> CMAP_FOR_EACH(ukey, cmap_node, >cmap) {
>>> enum ukey_state ukey_state;
>>>
>>> if (ovs_mutex_trylock(&ukey->mutex)) {  <--
>>> Core dump happens here, because of the NULL pointer. [8]
>>> continue;
>>> }
>>> [8] 
>>> https://github.com/openvswitch/ovs/blob/v2.17.8/ofproto/ofproto-dpif-upcall.c#L2900
>>>
>>> CMIIW, if this race condition can happen, IMO, it is mostly because
>>> the umap is not locked during the sweep CMAP_FOR_EACH loop.
>>> Or some RCU protection did not work properly.

Re: [ovs-discuss] ovs-vswitchd core at revalidator_sweep__

2024-02-26 Thread LIU Yulong via discuss
Hi Eelco,

Thank you for the quick response.

I did not add those logs, because in order to reproduce the issue, we
have to send lots of packets to the host.
So there are too many ukeys created/deleted to do logging.

And can we ensure that this [1] is the only place for ovs to free the ukey?

[1] 
https://github.com/openvswitch/ovs/blob/v2.17.8/ofproto/ofproto-dpif-upcall.c#L2084

For last mail, can this issue be concurrent read-and-update/delete?
The revalidator_sweep__ is trying to lock the ukey->mutex, while
another thread is updating the ukey->mutex to NULL and freeing the ukey.
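
For what it is worth, the abort string in the backtraces ("passed
uninitialized ovs_mutex") is consistent with that theory. Paraphrasing the
ovs-thread.c lines referenced as [6] and [7] below, ovs_mutex_destroy() clears
the mutex's 'where' bookkeeping field before destroying the underlying pthread
mutex, and ovs_mutex_trylock_at() aborts when it finds 'where' == NULL. A
simplified sketch of that logic (a paraphrase for readability, not the exact
OVS source):

/* Rough paraphrase of lib/ovs-thread.c (see [6]/[7] below); simplified. */
void
ovs_mutex_destroy(const struct ovs_mutex *mutex_)
{
    struct ovs_mutex *mutex = CONST_CAST(struct ovs_mutex *, mutex_);

    mutex->where = NULL;                   /* Mark the mutex "uninitialized". */
    xpthread_mutex_destroy(&mutex->lock);  /* Then destroy the pthread mutex. */
}

int
ovs_mutex_trylock_at(const struct ovs_mutex *mutex_, const char *where)
{
    struct ovs_mutex *mutex = CONST_CAST(struct ovs_mutex *, mutex_);

    if (!mutex->where) {
        /* This is the abort seen in the revalidator backtrace. */
        ovs_abort(0, "%s: %s() passed uninitialized ovs_mutex",
                  where, __func__);
    }
    return pthread_mutex_trylock(&mutex->lock);
}

So an abort with that message means the trylock read a mutex whose 'where' was
NULL, which is exactly what freed (and possibly reused) ukey memory would look
like after the deferred ukey_delete__ has already run.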

LIU Yulong


On Mon, Feb 26, 2024 at 5:41 PM Eelco Chaudron  wrote:
>
>
>
> On 26 Feb 2024, at 9:33, LIU Yulong wrote:
>
> > Hi,
> >
> > I have read the code by comparing the call stack of the core files
> > carefully, and found
> > a potential race condition. Please confirm whether the following 3 threads
> > have a race condition. Just did some code trace, can such
> > race condition happen?
> >
> > * PMD thread1 ===:
> > -> pmd_thread_main
> > -> dp_netdev_process_rxq_port
> > -> dp_netdev_input
> > -> dp_netdev_input__
> > -> handle_packet_upcall
> > -> dp_netdev_upcall
> > -> upcall_cb
> > -> ukey_install
> > -> ukey_install__
> > -> try_ukey_replace:
> > ovs_mutex_lock(&new_ukey->mutex);
> > <-- the CMAP_FOR_EACH loop in revalidator_sweep__ runs a
> > bit earlier than the cmap_replace on the next line, so the old_ukey can
> > still be iterated. [1]
> > cmap_replace(&umap->cmap, &old_ukey->cmap_node,
> >  &new_ukey->cmap_node, new_ukey->hash);
> > ovsrcu_postpone(ukey_delete__, old_ukey);
> > <-- delete the ukey asynchronously. [2]
> > transition_ukey(old_ukey, UKEY_DELETED);   <--
> > transition the ukey state to UKEY_DELETED; most core files show that
> > the ukey's last state change was at this line. [3]
> > transition_ukey(new_ukey, UKEY_VISIBLE);
> >
> > [1] 
> > https://github.com/openvswitch/ovs/blob/v2.17.8/ofproto/ofproto-dpif-upcall.c#L1892
> > [2] 
> > https://github.com/openvswitch/ovs/blob/v2.17.8/ofproto/ofproto-dpif-upcall.c#L1896
> > [3] 
> > https://github.com/openvswitch/ovs/blob/v2.17.8/ofproto/ofproto-dpif-upcall.c#L1897
> >
> > This function try_ukey_replace was called many times, because the
> > `counter_upcall_ukey_replace` is not zero.
> > For instance:
> > {
> >   name = 0x55ba9755206b "upcall_ukey_replace",
> >   count = 0x55ba971c7610 ,
> >   total = 2287997,
> >   last_total = 2287997,
> >   min = {221, 247, 444, 278, 324, 570, 379, 464, 283, 280, 0, 427},
> >   hr = {3300, 4378, 3557, 4554, 3748, 3710, 4340, 3559, 4296, 3759,
> > 3522, 4136, 3660, 4428, 3802, 3652, 3880, 3375, 4806, 4221, 4158,
> > 3816, 3750, 3846, 3761, 3653, 4293, 3816, 3723, 3691, 4033, 468, 4117,
> > 3659, 4007, 3536,
> > 3439, 4440, 3388, 4079, 3876, 3865, 4339, 3757, 3481, 4027, 3989,
> > 3633, 3737, 3564, 3403, 3992, 3793, 4390, 4124, 4354, 4164, 4383,
> > 4237, 3667}
> > }
> >
> > * RCU thread2 ===:
> > -> ovsrcu_postpone_thread
> > -> ovsrcu_call_postponed
> > -> ukey_delete__   <-- This
> > function is not thread safe IMO; it is marked
> > OVS_NO_THREAD_SAFETY_ANALYSIS. [4]
> >
> > recirc_refs_unref(&ukey->recircs);
> > xlate_cache_delete(ukey->xcache);
> > ofpbuf_delete(ovsrcu_get(struct ofpbuf *, &ukey->actions));
> > ovs_mutex_destroy(&ukey->mutex); <-- Just
> > set ukey mutex to NULL. [5][6][7]
> > free(ukey);
> >
> > [4] 
> > https://github.com/openvswitch/ovs/blob/v2.17.8/ofproto/ofproto-dpif-upcall.c#L2074
> > [5] 
> > https://github.com/openvswitch/ovs/blob/v2.17.8/ofproto/ofproto-dpif-upcall.c#L2083
> > [6] https://github.com/openvswitch/ovs/blob/v2.17.8/lib/ovs-thread.c#L131
> > [7] https://github.com/openvswitch/ovs/blob/v2.17.8/lib/ovs-thread.c#L124
> >
> > * revalidator thread3 ===:
> >
> > -> udpif_revalidator
> > -> revalidator_sweep
> > -> revalidator_sweep__
> >
> > CMAP_FOR_EACH(ukey, cmap_node, &umap->cmap) {
> > enum ukey_state ukey_state;
> >
> > if (ovs_mutex_trylock(&ukey->mutex)) {  <--
> > Core dump happens here, because of the NULL pointer. [8]
> > continue;
> > }
> > [8] 
> > https://github.com/openvswitch/ovs/blob/v2.17.8/ofproto/ofproto-dpif-upcall.c#L2900
> >
> > CMIIW, if this race condition can happen, IMO, it is mostly because
> > the umap is not locked during the sweep CMAP_FOR_EACH loop.
> > Or some RCU protection did not work properly.
>
> The code looks fine to me, we are taking the lock when we update the cmap, 
> which is a requirement, however iterating over the cmap as done above does 
> not require a lock.
>
> If the RCU protection would have been broken, we would have seen a lot more 
> errors. Did you have any luck adding the traces I suggested earlier, to see 
> if it’s a use after free?

Re: [ovs-discuss] ovs-vswitchd core at revalidator_sweep__

2024-02-26 Thread Eelco Chaudron via discuss


On 26 Feb 2024, at 9:33, LIU Yulong wrote:

> Hi,
>
> I have read the code by comparing the call stack of the core files
> carefully, and found
> a potential race condition. Please confirm whether the following 3 threads
> have a race condition. Just did some code trace, can such
> race condition happen?
>
> * PMD thread1 ===:
> -> pmd_thread_main
> -> dp_netdev_process_rxq_port
> -> dp_netdev_input
> -> dp_netdev_input__
> -> handle_packet_upcall
> -> dp_netdev_upcall
> -> upcall_cb
> -> ukey_install
> -> ukey_install__
> -> try_ukey_replace:
> ovs_mutex_lock(&new_ukey->mutex);
> <-- the CMAP_FOR_EACH loop in revalidator_sweep__ runs a
> bit earlier than the cmap_replace on the next line, so the old_ukey can
> still be iterated. [1]
> cmap_replace(&umap->cmap, &old_ukey->cmap_node,
>  &new_ukey->cmap_node, new_ukey->hash);
> ovsrcu_postpone(ukey_delete__, old_ukey);
> <-- delete the ukey asynchronously. [2]
> transition_ukey(old_ukey, UKEY_DELETED);   <--
> transition the ukey state to UKEY_DELETED; most core files show that
> the ukey's last state change was at this line. [3]
> transition_ukey(new_ukey, UKEY_VISIBLE);
>
> [1] 
> https://github.com/openvswitch/ovs/blob/v2.17.8/ofproto/ofproto-dpif-upcall.c#L1892
> [2] 
> https://github.com/openvswitch/ovs/blob/v2.17.8/ofproto/ofproto-dpif-upcall.c#L1896
> [3] 
> https://github.com/openvswitch/ovs/blob/v2.17.8/ofproto/ofproto-dpif-upcall.c#L1897
>
> This function try_ukey_replace was called many times, because the
> `counter_upcall_ukey_replace` is not zero.
> For instance:
> {
>   name = 0x55ba9755206b "upcall_ukey_replace",
>   count = 0x55ba971c7610 ,
>   total = 2287997,
>   last_total = 2287997,
>   min = {221, 247, 444, 278, 324, 570, 379, 464, 283, 280, 0, 427},
>   hr = {3300, 4378, 3557, 4554, 3748, 3710, 4340, 3559, 4296, 3759,
> 3522, 4136, 3660, 4428, 3802, 3652, 3880, 3375, 4806, 4221, 4158,
> 3816, 3750, 3846, 3761, 3653, 4293, 3816, 3723, 3691, 4033, 468, 4117,
> 3659, 4007, 3536,
> 3439, 4440, 3388, 4079, 3876, 3865, 4339, 3757, 3481, 4027, 3989,
> 3633, 3737, 3564, 3403, 3992, 3793, 4390, 4124, 4354, 4164, 4383,
> 4237, 3667}
> }
>
> * RCU thread2 ===:
> -> ovsrcu_postpone_thread
> -> ovsrcu_call_postponed
> -> ukey_delete__   <-- This
> function is not thread safe IMO; it is marked
> OVS_NO_THREAD_SAFETY_ANALYSIS. [4]
>
> recirc_refs_unref(&ukey->recircs);
> xlate_cache_delete(ukey->xcache);
> ofpbuf_delete(ovsrcu_get(struct ofpbuf *, &ukey->actions));
> ovs_mutex_destroy(&ukey->mutex); <-- Just
> set ukey mutex to NULL. [5][6][7]
> free(ukey);
>
> [4] 
> https://github.com/openvswitch/ovs/blob/v2.17.8/ofproto/ofproto-dpif-upcall.c#L2074
> [5] 
> https://github.com/openvswitch/ovs/blob/v2.17.8/ofproto/ofproto-dpif-upcall.c#L2083
> [6] https://github.com/openvswitch/ovs/blob/v2.17.8/lib/ovs-thread.c#L131
> [7] https://github.com/openvswitch/ovs/blob/v2.17.8/lib/ovs-thread.c#L124
>
> * revalidator thread3 ===:
>
> -> udpif_revalidator
> -> revalidator_sweep
> -> revalidator_sweep__
>
> CMAP_FOR_EACH(ukey, cmap_node, &umap->cmap) {
> enum ukey_state ukey_state;
>
> if (ovs_mutex_trylock(&ukey->mutex)) {  <--
> Core dump happens here, because of the NULL pointer. [8]
> continue;
> }
> [8] 
> https://github.com/openvswitch/ovs/blob/v2.17.8/ofproto/ofproto-dpif-upcall.c#L2900
>
> CMIIW, if this race condition can happen, IMO, it is mostly because
> the umap is not locked during the sweep CMAP_FOR_EACH loop.
> Or some RCU protection did not work properly.

The code looks fine to me, we are taking the lock when we update the cmap, 
which is a requirement, however iterating over the cmap as done above does not 
require a lock.

If the RCU protection would have been broken, we would have seen a lot more 
errors. Did you have any luck adding the traces I suggested earlier, to see if 
it’s a use after free?

Cheers,

Eelco

> On Wed, Feb 21, 2024 at 6:40 PM Eelco Chaudron  wrote:
>>
>>
>>
>> On 21 Feb 2024, at 4:26, LIU Yulong wrote:
>>
>>> Thank you very much for your reply.
>>>
>>> The problem is not easy to reproduce, we have to wait a random long time to 
>>> see
>>> if the issue happens again. It can be more than one day or longer.
>>> OVS 2.17 with dpdk 20.11 had run to core before, so it's hard to say
>>> if it is related to DPDK.
>>> I'm running the ovs without offload to see if the issue can happen in
>>> recent days.
>>>
>>> And again, TLDR, paste more thread call stacks.
>>> Most of the threads are in the state of sched_yield, nanosleep,
>>> epoll_wait and  poll.
>>
>> If this looks like a memory trash issue, it might be hard to figure out. 
>> Does the ukey show any kind of pattern, i.e. does the trashed data look like 
>> anything known?

Re: [ovs-discuss] ovs-vswitchd core at revalidator_sweep__

2024-02-26 Thread LIU Yulong via discuss
Hi,

I have read the code by comparing the call stack of the core files
carefully, and found
a potential race condition. Please confirm whether the following 3 threads
have a race condition. Just did some code trace, can such
race condition happen?

* PMD thread1 ===:
-> pmd_thread_main
-> dp_netdev_process_rxq_port
-> dp_netdev_input
-> dp_netdev_input__
-> handle_packet_upcall
-> dp_netdev_upcall
-> upcall_cb
-> ukey_install
-> ukey_install__
-> try_ukey_replace:
ovs_mutex_lock(&new_ukey->mutex);
<-- the CMAP_FOR_EACH loop in revalidator_sweep__ runs a
bit earlier than the cmap_replace on the next line, so the old_ukey can
still be iterated. [1]
cmap_replace(&umap->cmap, &old_ukey->cmap_node,
 &new_ukey->cmap_node, new_ukey->hash);
ovsrcu_postpone(ukey_delete__, old_ukey);
<-- delete the ukey asynchronously. [2]
transition_ukey(old_ukey, UKEY_DELETED);   <--
transition the ukey state to UKEY_DELETED; most core files show that
the ukey's last state change was at this line. [3]
transition_ukey(new_ukey, UKEY_VISIBLE);

[1] 
https://github.com/openvswitch/ovs/blob/v2.17.8/ofproto/ofproto-dpif-upcall.c#L1892
[2] 
https://github.com/openvswitch/ovs/blob/v2.17.8/ofproto/ofproto-dpif-upcall.c#L1896
[3] 
https://github.com/openvswitch/ovs/blob/v2.17.8/ofproto/ofproto-dpif-upcall.c#L1897

This function try_ukey_replace was called many times, because the
`counter_upcall_ukey_replace` is not zero.
For instance:
{
  name = 0x55ba9755206b "upcall_ukey_replace",
  count = 0x55ba971c7610 ,
  total = 2287997,
  last_total = 2287997,
  min = {221, 247, 444, 278, 324, 570, 379, 464, 283, 280, 0, 427},
  hr = {3300, 4378, 3557, 4554, 3748, 3710, 4340, 3559, 4296, 3759,
3522, 4136, 3660, 4428, 3802, 3652, 3880, 3375, 4806, 4221, 4158,
3816, 3750, 3846, 3761, 3653, 4293, 3816, 3723, 3691, 4033, 468, 4117,
3659, 4007, 3536,
3439, 4440, 3388, 4079, 3876, 3865, 4339, 3757, 3481, 4027, 3989,
3633, 3737, 3564, 3403, 3992, 3793, 4390, 4124, 4354, 4164, 4383,
4237, 3667}
}

* RCU thread2 ===:
-> ovsrcu_postpone_thread
-> ovsrcu_call_postponed
-> ukey_delete__   <-- This
function is not thread safe IMO; it is marked
OVS_NO_THREAD_SAFETY_ANALYSIS. [4]

recirc_refs_unref(&ukey->recircs);
xlate_cache_delete(ukey->xcache);
ofpbuf_delete(ovsrcu_get(struct ofpbuf *, &ukey->actions));
ovs_mutex_destroy(&ukey->mutex); <-- Just
set ukey mutex to NULL. [5][6][7]
free(ukey);

[4] 
https://github.com/openvswitch/ovs/blob/v2.17.8/ofproto/ofproto-dpif-upcall.c#L2074
[5] 
https://github.com/openvswitch/ovs/blob/v2.17.8/ofproto/ofproto-dpif-upcall.c#L2083
[6] https://github.com/openvswitch/ovs/blob/v2.17.8/lib/ovs-thread.c#L131
[7] https://github.com/openvswitch/ovs/blob/v2.17.8/lib/ovs-thread.c#L124

* revalidator thread3 ===:

-> udpif_revalidator
-> revalidator_sweep
-> revalidator_sweep__

CMAP_FOR_EACH(ukey, cmap_node, &umap->cmap) {
enum ukey_state ukey_state;

if (ovs_mutex_trylock(&ukey->mutex)) {  <--
Core dump happens here, because of the NULL pointer. [8]
continue;
}
[8] 
https://github.com/openvswitch/ovs/blob/v2.17.8/ofproto/ofproto-dpif-upcall.c#L2900

CMIIW, if this race condition can happen, IMO, it is mostly because
the umap is not locked during the sweep CMAP_FOR_EACH loop.
Or some RCU protection did not work properly.
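
To see the failure mode in isolation: if the interleaving above were possible,
it would reduce to one thread calling pthread_mutex_trylock() on a mutex that
another thread concurrently destroys and frees, which is undefined behavior.
Here is a tiny standalone model of that generic pattern -- plain pthreads, not
OVS code; the names are invented and the usleep only forces the interleaving:

/* destroy_vs_trylock.c: standalone model of the suspected window, NOT OVS
 * code.  Build: cc -pthread destroy_vs_trylock.c */
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

struct fake_ukey {
    pthread_mutex_t mutex;
    int state;
};

static struct fake_ukey *shared;   /* models a ukey still visible to the sweep */

static void *
sweeper(void *arg)                 /* models revalidator_sweep__ */
{
    struct fake_ukey *ukey = shared;   /* pointer obtained from the cmap walk */
    (void) arg;

    usleep(1000);                      /* ...then the thread is briefly delayed */

    /* If the other thread has already destroyed and freed the mutex, this is
     * undefined behavior: it may succeed, fail, crash, or abort. */
    if (!pthread_mutex_trylock(&ukey->mutex)) {
        printf("trylock succeeded, state=%d\n", ukey->state);
        pthread_mutex_unlock(&ukey->mutex);
    } else {
        printf("trylock did not succeed\n");
    }
    return NULL;
}

int
main(void)
{
    pthread_t t;

    shared = calloc(1, sizeof *shared);
    pthread_mutex_init(&shared->mutex, NULL);
    shared->state = 42;

    pthread_create(&t, NULL, sweeper, NULL);

    /* Models ukey_delete__ running "too early": destroy and free the object
     * while the sweeper still holds a pointer to it. */
    pthread_mutex_destroy(&shared->mutex);
    free(shared);

    pthread_join(t, NULL);
    return 0;
}

Whether such a program aborts, crashes, or appears to work depends on what the
freed memory contains at that moment, which would also fit the random,
hard-to-reproduce crashes described earlier in this thread.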


Regards,
LIU Yulong


On Wed, Feb 21, 2024 at 6:40 PM Eelco Chaudron  wrote:
>
>
>
> On 21 Feb 2024, at 4:26, LIU Yulong wrote:
>
> > Thank you very much for your reply.
> >
> > The problem is not easy to reproduce, we have to wait a random long time to 
> > see
> > if the issue happens again. It can be more than one day or longer.
> > OVS 2.17 with dpdk 20.11 had run to core before, so it's hard to say
> > if it is related to DPDK.
> > I'm running the ovs without offload to see if the issue can happen in
> > recent days.
> >
> > And again, TLDR, paste more thread call stacks.
> > Most of the threads are in the state of sched_yield, nanosleep,
> > epoll_wait and  poll.
>
> If this looks like a memory trash issue, it might be hard to figure out. Does 
> the ukey show any kind of pattern, i.e. does the trashed data look like 
> anything known?
> Maybe it’s a use after free, so you could add some debugging code 
> logging/recording all free and xmalloc of the ukey structure, to see that 
> when it crashes it was actually allocated?
>
> Hope this helps you getting started.
>
> //Eelco
>
> > The following threads are in working state. So hope this can have
> > clues for investigation.
> >
> > Thread 14 (Thread 0x7fd34002b700 (LWP 91928)):
> > #0  0x7fd344487b6d in recvmsg () at 
> > ../sysdeps/unix/syscall-template.S:81
> > #1  0x562773cb8d03 in mp_handle ()
> > #2  0x7fd344480e65 in start_thread 

Re: [ovs-discuss] ovs-vswitchd core at revalidator_sweep__

2024-02-21 Thread Eelco Chaudron via discuss


On 21 Feb 2024, at 4:26, LIU Yulong wrote:

> Thank you very much for your reply.
>
> The problem is not easy to reproduce, we have to wait a random long time to 
> see
> if the issue happens again. It can be more than one day or longer.
> OVS 2.17 with dpdk 20.11 had run to core before, so it's hard to say
> if it is related to DPDK.
> I'm running the ovs without offload to see if the issue can happen in
> recent days.
>
> And again, TLDR, paste more thread call stacks.
> Most of the threads are in the state of sched_yield, nanosleep,
> epoll_wait and  poll.

If this looks like a memory trash issue, it might be hard to figure out. Does 
the ukey show any kind of pattern, i.e. does the trashed data look like 
anything known?
Maybe it’s a use after free, so you could add some debugging code 
logging/recording all free and xmalloc of the ukey structure, to see that when 
it crashes it was actually allocated?
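
One low-overhead way to do that is a fixed-size ring buffer of the most recent
ukey alloc/free events that can be inspected from a core file afterwards. A
rough sketch of such instrumentation -- plain C, with invented names
(ukey_debug_record, UKEY_DEBUG_RING_SIZE); the calls would have to be added by
hand where the ukey is xmalloc'd and at the top of ukey_delete__():

/* Sketch of a lock-protected ring buffer recording the last N ukey
 * alloc/free events; names are invented for illustration. */
#include <pthread.h>
#include <stdint.h>
#include <time.h>

#define UKEY_DEBUG_RING_SIZE 4096

struct ukey_debug_entry {
    const void *ukey;       /* Address of the ukey. */
    uint64_t timestamp_ns;  /* CLOCK_MONOTONIC at the event. */
    uint8_t is_free;        /* 0 = allocated, 1 = freed. */
};

static struct ukey_debug_entry ukey_debug_ring[UKEY_DEBUG_RING_SIZE];
static unsigned int ukey_debug_next;
static pthread_mutex_t ukey_debug_lock = PTHREAD_MUTEX_INITIALIZER;

static uint64_t
monotonic_ns(void)
{
    struct timespec ts;

    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t) ts.tv_sec * 1000000000u + ts.tv_nsec;
}

static void
ukey_debug_record(const void *ukey, int is_free)
{
    struct ukey_debug_entry *e;

    pthread_mutex_lock(&ukey_debug_lock);
    e = &ukey_debug_ring[ukey_debug_next++ % UKEY_DEBUG_RING_SIZE];
    e->ukey = ukey;
    e->timestamp_ns = monotonic_ns();
    e->is_free = is_free;
    pthread_mutex_unlock(&ukey_debug_lock);
}

/* Call ukey_debug_record(ukey, 0) where the ukey is allocated and
 * ukey_debug_record(ukey, 1) at the top of ukey_delete__().  After a crash,
 * searching ukey_debug_ring in the core file for the faulting ukey address
 * shows whether, and how long before the crash, it was already freed. */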

Hope this helps you getting started.

//Eelco

> The following threads are in working state. So hope this can have
> clues for investigation.
>
> Thread 14 (Thread 0x7fd34002b700 (LWP 91928)):
> #0  0x7fd344487b6d in recvmsg () at ../sysdeps/unix/syscall-template.S:81
> #1  0x562773cb8d03 in mp_handle ()
> #2  0x7fd344480e65 in start_thread (arg=0x7fd34002b700) at
> pthread_create.c:307
> #3  0x7fd34260988d in clone () at
> ../sysdeps/unix/sysv/linux/x86_64/clone.S:111
>
> Thread 13 (Thread 0x7fd3359d7700 (LWP 91929)):
> #0  0x7fd34448799d in accept () at ../sysdeps/unix/syscall-template.S:81
> #1  0x562773cd8f3c in socket_listener ()
> #2  0x7fd344480e65 in start_thread (arg=0x7fd3359d7700) at
> pthread_create.c:307
> #3  0x7fd34260988d in clone () at
> ../sysdeps/unix/sysv/linux/x86_64/clone.S:111
>
> Thread 6 (Thread 0x7fd304663700 (LWP 91965)):
> #0  0x7fd34448771d in read () at ../sysdeps/unix/syscall-template.S:81
> #1  0x7fd343b42bfb in _mlx5dv_devx_get_event () from /lib64/libmlx5.so.1
> #2  0x562773936d86 in mlx5_vdpa_event_handle ()
> #3  0x7fd344480e65 in start_thread (arg=0x7fd304663700) at
> pthread_create.c:307
> #4  0x7fd34260988d in clone () at
> ../sysdeps/unix/sysv/linux/x86_64/clone.S:111
>
> Thread 2 (Thread 0x7fd305730700 (LWP 91943)):
> #0  ccmap_find_slot_protected (count=,
> hash=hash@entry=1669671676, b=b@entry=0x7fd2f8012a80) at
> lib/ccmap.c:278
> #1  ccmap_inc_bucket_existing (b=b@entry=0x7fd2f8012a80,
> hash=hash@entry=1669671676, inc=inc@entry=1) at lib/ccmap.c:281
> #2  0x562773d4b015 in ccmap_try_inc
> (impl=impl@entry=0x7fd2f8012a40, hash=hash@entry=1669671676,
> inc=inc@entry=1) at lib/ccmap.c:464
> #3  0x562773d4b224 in ccmap_inc (ccmap=ccmap@entry=0x7fd2f802a7e8,
> hash=1669671676) at lib/ccmap.c:485
> #4  0x562773d4975a in classifier_replace (cls=,
> rule=rule@entry=0x7fd2fac70e28, version=,
> conjs=, n_conjs=)
> at lib/classifier.c:579
> #5  0x562773d49e99 in classifier_insert (cls=,
> rule=rule@entry=0x7fd2fac70e28, version=,
> conj=, n_conj=)
> at lib/classifier.c:694
> #6  0x562773d00fc8 in replace_rule_start
> (ofproto=ofproto@entry=0x5627778cc420, ofm=ofm@entry=0x7fd3057235f0,
> old_rule=, new_rule=new_rule@entry=0x7fd2fac70e20)
> at ofproto/ofproto.c:5645
> #7  0x562773d010e4 in add_flow_start (ofproto=0x5627778cc420,
> ofm=0x7fd3057235f0) at ofproto/ofproto.c:5256
> #8  0x562773d0122d in modify_flows_start__
> (ofproto=ofproto@entry=0x5627778cc420, ofm=ofm@entry=0x7fd3057235f0)
> at ofproto/ofproto.c:5824
> #9  0x562773d01eac in modify_flow_start_strict
> (ofm=0x7fd3057235f0, ofproto=0x5627778cc420) at ofproto/ofproto.c:5953
> #10 ofproto_flow_mod_start (ofproto=0x5627778cc420,
> ofm=ofm@entry=0x7fd3057235f0) at ofproto/ofproto.c:8112
> #11 0x562773d0225a in ofproto_flow_mod_learn_start
> (ofm=ofm@entry=0x7fd3057235f0) at ofproto/ofproto.c:5491
> #12 0x562773d040ad in ofproto_flow_mod_learn
> (ofm=ofm@entry=0x7fd3057235f0, keep_ref=,
> limit=, below_limitp=below_limitp@entry=0x7fd305723510)
> at ofproto/ofproto.c:5576
> #13 0x562773d2641e in xlate_learn_action
> (ctx=ctx@entry=0x7fd305729a60, learn=learn@entry=0x562777db4618) at
> ofproto/ofproto-dpif-xlate.c:5547
> #14 0x562773d2aafb in do_xlate_actions (ofpacts=,
> ofpacts_len=, ctx=0x7fd305729a60,
> is_last_action=, group_bucket_action=)
> at ofproto/ofproto-dpif-xlate.c:7232
> #15 0x562773d26c85 in xlate_recursively
> (actions_xlator=0x562773d29490 ,
> is_last_action=true, deepens=false, rule=0x562777db4470,
> ctx=0x7fd305729a60)
> at ofproto/ofproto-dpif-xlate.c:4383
> #16 xlate_table_action (ctx=0x7fd305729a60, in_port=,
> table_id=, may_packet_in=,
> honor_table_miss=, with_ct_orig=,
> is_last_action=true, xlator=0x562773d29490 ) at
> ofproto/ofproto-dpif-xlate.c:4512
> #17 0x562773d2ab8d in xlate_ofpact_resubmit
> (resubmit=0x56277781db28, resubmit=0x56277781db28,
> resubmit=0x56277781db28, is_last_action=true, ctx=0x7fd305729a60)
> at 

Re: [ovs-discuss] ovs-vswitchd core at revalidator_sweep__

2024-02-20 Thread LIU Yulong via discuss
Thank you very much for your reply.

The problem is not easy to reproduce, we have to wait a random long time to see
if the issue happens again. It can be more than one day or longer.
OVS 2.17 with dpdk 20.11 had run to core before, so it's hard to say
if it is related to DPDK.
I'm running the ovs without offload to see if the issue can happen in
recent days.

And again, TLDR, paste more thread call stacks.
Most of the threads are in the state of sched_yield, nanosleep,
epoll_wait and  poll.

The following threads are in working state. So hope this can have
clues for investigation.

Thread 14 (Thread 0x7fd34002b700 (LWP 91928)):
#0  0x7fd344487b6d in recvmsg () at ../sysdeps/unix/syscall-template.S:81
#1  0x562773cb8d03 in mp_handle ()
#2  0x7fd344480e65 in start_thread (arg=0x7fd34002b700) at
pthread_create.c:307
#3  0x7fd34260988d in clone () at
../sysdeps/unix/sysv/linux/x86_64/clone.S:111

Thread 13 (Thread 0x7fd3359d7700 (LWP 91929)):
#0  0x7fd34448799d in accept () at ../sysdeps/unix/syscall-template.S:81
#1  0x562773cd8f3c in socket_listener ()
#2  0x7fd344480e65 in start_thread (arg=0x7fd3359d7700) at
pthread_create.c:307
#3  0x7fd34260988d in clone () at
../sysdeps/unix/sysv/linux/x86_64/clone.S:111

Thread 6 (Thread 0x7fd304663700 (LWP 91965)):
#0  0x7fd34448771d in read () at ../sysdeps/unix/syscall-template.S:81
#1  0x7fd343b42bfb in _mlx5dv_devx_get_event () from /lib64/libmlx5.so.1
#2  0x562773936d86 in mlx5_vdpa_event_handle ()
#3  0x7fd344480e65 in start_thread (arg=0x7fd304663700) at
pthread_create.c:307
#4  0x7fd34260988d in clone () at
../sysdeps/unix/sysv/linux/x86_64/clone.S:111

Thread 2 (Thread 0x7fd305730700 (LWP 91943)):
#0  ccmap_find_slot_protected (count=,
hash=hash@entry=1669671676, b=b@entry=0x7fd2f8012a80) at
lib/ccmap.c:278
#1  ccmap_inc_bucket_existing (b=b@entry=0x7fd2f8012a80,
hash=hash@entry=1669671676, inc=inc@entry=1) at lib/ccmap.c:281
#2  0x562773d4b015 in ccmap_try_inc
(impl=impl@entry=0x7fd2f8012a40, hash=hash@entry=1669671676,
inc=inc@entry=1) at lib/ccmap.c:464
#3  0x562773d4b224 in ccmap_inc (ccmap=ccmap@entry=0x7fd2f802a7e8,
hash=1669671676) at lib/ccmap.c:485
#4  0x562773d4975a in classifier_replace (cls=,
rule=rule@entry=0x7fd2fac70e28, version=,
conjs=, n_conjs=)
at lib/classifier.c:579
#5  0x562773d49e99 in classifier_insert (cls=,
rule=rule@entry=0x7fd2fac70e28, version=,
conj=, n_conj=)
at lib/classifier.c:694
#6  0x562773d00fc8 in replace_rule_start
(ofproto=ofproto@entry=0x5627778cc420, ofm=ofm@entry=0x7fd3057235f0,
old_rule=, new_rule=new_rule@entry=0x7fd2fac70e20)
at ofproto/ofproto.c:5645
#7  0x562773d010e4 in add_flow_start (ofproto=0x5627778cc420,
ofm=0x7fd3057235f0) at ofproto/ofproto.c:5256
#8  0x562773d0122d in modify_flows_start__
(ofproto=ofproto@entry=0x5627778cc420, ofm=ofm@entry=0x7fd3057235f0)
at ofproto/ofproto.c:5824
#9  0x562773d01eac in modify_flow_start_strict
(ofm=0x7fd3057235f0, ofproto=0x5627778cc420) at ofproto/ofproto.c:5953
#10 ofproto_flow_mod_start (ofproto=0x5627778cc420,
ofm=ofm@entry=0x7fd3057235f0) at ofproto/ofproto.c:8112
#11 0x562773d0225a in ofproto_flow_mod_learn_start
(ofm=ofm@entry=0x7fd3057235f0) at ofproto/ofproto.c:5491
#12 0x562773d040ad in ofproto_flow_mod_learn
(ofm=ofm@entry=0x7fd3057235f0, keep_ref=,
limit=, below_limitp=below_limitp@entry=0x7fd305723510)
at ofproto/ofproto.c:5576
#13 0x562773d2641e in xlate_learn_action
(ctx=ctx@entry=0x7fd305729a60, learn=learn@entry=0x562777db4618) at
ofproto/ofproto-dpif-xlate.c:5547
#14 0x562773d2aafb in do_xlate_actions (ofpacts=,
ofpacts_len=, ctx=0x7fd305729a60,
is_last_action=, group_bucket_action=)
at ofproto/ofproto-dpif-xlate.c:7232
#15 0x562773d26c85 in xlate_recursively
(actions_xlator=0x562773d29490 ,
is_last_action=true, deepens=false, rule=0x562777db4470,
ctx=0x7fd305729a60)
at ofproto/ofproto-dpif-xlate.c:4383
#16 xlate_table_action (ctx=0x7fd305729a60, in_port=,
table_id=, may_packet_in=,
honor_table_miss=, with_ct_orig=,
is_last_action=true, xlator=0x562773d29490 ) at
ofproto/ofproto-dpif-xlate.c:4512
#17 0x562773d2ab8d in xlate_ofpact_resubmit
(resubmit=0x56277781db28, resubmit=0x56277781db28,
resubmit=0x56277781db28, is_last_action=true, ctx=0x7fd305729a60)
at ofproto/ofproto-dpif-xlate.c:4823
#18 do_xlate_actions (ofpacts=, ofpacts_len=, ctx=0x7fd305729a60, is_last_action=,
group_bucket_action=)
at ofproto/ofproto-dpif-xlate.c:7107
#19 0x562773d26c85 in xlate_recursively
(actions_xlator=0x562773d29490 ,
is_last_action=true, deepens=false, rule=0x562777ab9220,
ctx=0x7fd305729a60)
at ofproto/ofproto-dpif-xlate.c:4383
#20 xlate_table_action (ctx=0x7fd305729a60, in_port=,
table_id=, may_packet_in=,
honor_table_miss=, with_ct_orig=,
is_last_action=true, xlator=0x562773d29490 ) at
ofproto/ofproto-dpif-xlate.c:4512
#21 0x562773d2ab8d in xlate_ofpact_resubmit
(resubmit=0x562777b9f3c8, 

Re: [ovs-discuss] ovs-vswitchd core at revalidator_sweep__

2024-02-19 Thread Eelco Chaudron via discuss



On 19 Feb 2024, at 13:09, Ilya Maximets wrote:

> On 2/19/24 11:14, Eelco Chaudron wrote:
>>
>>
>> On 19 Feb 2024, at 10:34, LIU Yulong wrote:
>>
>>> Hi OVS experts,
>>>
>>> Our ovs-vswitchd runs to core at the ovs_mutex_trylock(&ukey->mutex) in the
>>> function revalidator_sweep__.
>>>
>>> I've sent the mail before but have no response.
>>> https://mail.openvswitch.org/pipermail/ovs-discuss/2023-August/052604.html
>>>
>>> So I'm trying to send this mail again. And I may apologize in advance 
>>> because
>>> I would like to post as much useful information as possible to help identify
>>> potential issues. So this mail will have a really long text.
>>>
>>> Compared to the mail 2023-August/052604.html, we upgrade the OVS to 2.17.8
>>> and DPDK to 22.11 to pray for good luck that maybe the community has 
>>> potential
>>> fixes for this issue. But unfortunately, the ovs-vswitchd still runs to 
>>> core.
>>
>> As you mentioned it looks like some memory corruption, which I have not seen 
>> before.
>>
>> Have you tried this without rte offload? This is the only feature I never 
>> used.
>> There is a 2.17.9 with DPDK 22.11.6 you could try.
>
> OVS 2.17 is not supposed to work with DPDK 22.11, it's supposed to work with 
> 21.11.
> See the compatibility table here:
>  https://docs.openvswitch.org/en/latest/faq/releases/
>
> Though it's hard to tell if DPDK version is anyhow related to the issue.

My mistake, I was supposed to type 21.11.6 :( But yes if they are using 22.11, 
that could also be the problem. I would suggest using the supported version and 
see if the problem goes away.

//Eelco

> Best regards, Ilya Maximets.



Re: [ovs-discuss] ovs-vswitchd core at revalidator_sweep__

2024-02-19 Thread Ilya Maximets via discuss
On 2/19/24 11:14, Eelco Chaudron wrote:
> 
> 
> On 19 Feb 2024, at 10:34, LIU Yulong wrote:
> 
>> Hi OVS experts,
>>
>> Our ovs-vswitchd runs to core at the ovs_mutex_trylock(&ukey->mutex) in the
>> function revalidator_sweep__.
>>
>> I've sent the mail before but have no response.
>> https://mail.openvswitch.org/pipermail/ovs-discuss/2023-August/052604.html
>>
>> So I'm trying to send this mail again. And I may apologize in advance because
>> I would like to post as much useful information as possible to help identify
>> potential issues. So this mail will have a really long text.
>>
>> Compared to the mail 2023-August/052604.html, we upgrade the OVS to 2.17.8
>> and DPDK to 22.11 to pray for good luck that maybe the community has 
>> potential
>> fixes for this issue. But unfortunately, the ovs-vswitchd still runs to core.
> 
> As you mentioned it looks like some memory corruption, which I have not seen 
> before.
> 
> Have you tried this without rte offload? This is the only feature I never 
> used.
> There is a 2.17.9 with DPDK 22.11.6 you could try.

OVS 2.17 is not supposed to work with DPDK 22.11, it's supposed to work with 
21.11.
See the compatibility table here:
  https://docs.openvswitch.org/en/latest/faq/releases/

Though it's hard to tell if DPDK version is anyhow related to the issue.

Best regards, Ilya Maximets.


Re: [ovs-discuss] ovs-vswitchd core at revalidator_sweep__

2024-02-19 Thread Eelco Chaudron via discuss



On 19 Feb 2024, at 10:34, LIU Yulong wrote:

> Hi OVS experts,
>
> Our ovs-vswitchd runs to core at the ovs_mutex_trylock(&ukey->mutex) in the
> function revalidator_sweep__.
>
> I've sent the mail before but have no response.
> https://mail.openvswitch.org/pipermail/ovs-discuss/2023-August/052604.html
>
> So I'm trying to send this mail again. And I may apologize in advance because
> I would like to post as much useful information as possible to help identify
> potential issues. So this mail will have a really long text.
>
> Compared to the mail 2023-August/052604.html, we upgrade the OVS to 2.17.8
> and DPDK to 22.11 to pray for good luck that maybe the community has potential
> fixes for this issue. But unfortunately, the ovs-vswitchd still runs to core.

As you mentioned it looks like some memory corruption, which I have not seen 
before.

Have you tried this without rte offload? This is the only feature I never used. 
There is a 2.17.9 with DPDK 22.11.6 you could try.

//Eelco

> Here are some local debug information:
>
> (gdb) bt
> #0  0x7f8751bbf337 in raise () from /lib64/libc.so.6
> #1  0x7f8751bc0a28 in abort () from /lib64/libc.so.6
> #2  0x55c52ed06c7e in ovs_abort_valist (err_no=,
> format=, args=args@entry=0x7f8744249370) at
> lib/util.c:499
> #3  0x55c52ed06d14 in ovs_abort (err_no=err_no@entry=0,
> format=format@entry=0x55c52f01b1e8 "%s: %s() passed uninitialized
> ovs_mutex") at lib/util.c:491
> #4  0x55c52ecd17e1 in ovs_mutex_trylock_at
> (l_=l_@entry=0x7f8718dcc098, where=where@entry=0x55c52eff5c60
> "ofproto/ofproto-dpif-upcall.c:3044") at lib/ovs-thread.c:106
> #5  0x55c52ebf25f9 in revalidator_sweep__
> (revalidator=revalidator@entry=0x55c533082c70,
> purge=purge@entry=false) at ofproto/ofproto-dpif-upcall.c:3044
> #6  0x55c52ebf640f in revalidator_sweep
> (revalidator=0x55c533082c70) at ofproto/ofproto-dpif-upcall.c:3102
> #7  udpif_revalidator (arg=0x55c533082c70) at 
> ofproto/ofproto-dpif-upcall.c:1101
> #8  0x55c52ecd239f in ovsthread_wrapper (aux_=) at
> lib/ovs-thread.c:422
> #9  0x7f8753d16e65 in start_thread () from /lib64/libpthread.so.0
> #10 0x7f8751c8788d in clone () from /lib64/libc.so.6
>
> bt output with pretty print
> (gdb) bt full
> #0  0x7f8751bbf337 in raise () from /lib64/libc.so.6
> No symbol table info available.
> #1  0x7f8751bc0a28 in abort () from /lib64/libc.so.6
> No symbol table info available.
> #2  0x55c52ed06c7e in ovs_abort_valist (err_no=,
> format=, args=args@entry=0x7f8744249370) at
> lib/util.c:499
> No locals.
> #3  0x55c52ed06d14 in ovs_abort (err_no=err_no@entry=0,
> format=format@entry=0x55c52f01b1e8 "%s: %s() passed uninitialized
> ovs_mutex") at lib/util.c:491
> args = {{
> gp_offset = 32,
> fp_offset = 48,
> overflow_arg_area = 0x7f8744249450,
> reg_save_area = 0x7f8744249390
>   }}
> #4  0x55c52ecd17e1 in ovs_mutex_trylock_at
> (l_=l_@entry=0x7f8718dcc098, where=where@entry=0x55c52eff5c60
> "ofproto/ofproto-dpif-upcall.c:3044") at lib/ovs-thread.c:106
> l = 0x7f8718dcc098
> error = 
> __func__ = "ovs_mutex_trylock_at"
> #5  0x55c52ebf25f9 in revalidator_sweep__
> (revalidator=revalidator@entry=0x55c533082c70,
> purge=purge@entry=false) at ofproto/ofproto-dpif-upcall.c:3044
> ukey_state = 
> cursor_52 = {
>   impl = 0x7f86f826c8c0,
>   bucket_idx = 2,
>   entry_idx = 3,
>   node = 0x7f86f8dc8020
> }
> odp_actions_stub = {140218217949376, 9951266880679575560,
> 55834640392, 10323741882, 0, 2, 140218217949592, 168, 140218217949416,
> 12885426177, 140218940560728, 281510948569217, 1125934266581005,
> 281509421383809,
>   1099511627785, 0, 140218226622112, 4, 819, 10323556447, 0,
> 2, 140218226622328, 148, 140218226622152, 12885753857,
> 140218940560856, 2984229694, 18446744069414584320,
> 18446462603027808255, 4294967295, 1407392063422464,
>   140218257906000, 3, 248, 10322875848, 0, 2, 140218257906216,
> 168, 140218257906040, 12884901889, 140218940560984, 0, 0, 0, 0, 0,
> 140218228630736, 5, 1088, 10292016731, 24, 2, 140218228630952, 168,
> 140218228630776,
>   12884901889, 140218940561112, 0, 0, 0, 0, 0,
> 140218255617104, 2, 120, 10124105851, 0, 2, 140218255617320, 148,
> 140218255617144, 12884901889, 140218940561240, 0, 0, 0, 0, 0,
> 140218243823504, 0, 0, 0, 0, 2,
>   140218243823720, 140, 140218243823544, 12884901889,
> 140218940561368, 0 }
> odp_actions = {
>   base = 0x7f8744249550,
>   data = 0x7f8744249550,
>   size = 0,
>   allocated = 1024,
>   header = 0x0,
>   msg = 0x0,
>   list_node = {
> prev = 0x,
> next = 0x
>   },
>   source = OFPBUF_STUB
> }
> ukey = 0x7f8718dcc050
> n_ops = 0
> ops = {
> ... a really long 

[ovs-discuss] ovs-vswitchd core at revalidator_sweep__

2024-02-19 Thread LIU Yulong via discuss
Hi OVS experts,

Our ovs-vswitchd runs to core at the ovs_mutex_trylock(&ukey->mutex) in the
function revalidator_sweep__.

I've sent the mail before but have no response.
https://mail.openvswitch.org/pipermail/ovs-discuss/2023-August/052604.html

So I'm trying to send this mail again. And I may apologize in advance because
I would like to post as much useful information as possible to help identify
potential issues. So this mail will have a really long text.

Compared to the mail 2023-August/052604.html, we upgrade the OVS to 2.17.8
and DPDK to 22.11 to pray for good luck that maybe the community has potential
fixes for this issue. But unfortunately, the ovs-vswitchd still runs to core.

Here are some local debug information:

(gdb) bt
#0  0x7f8751bbf337 in raise () from /lib64/libc.so.6
#1  0x7f8751bc0a28 in abort () from /lib64/libc.so.6
#2  0x55c52ed06c7e in ovs_abort_valist (err_no=,
format=, args=args@entry=0x7f8744249370) at
lib/util.c:499
#3  0x55c52ed06d14 in ovs_abort (err_no=err_no@entry=0,
format=format@entry=0x55c52f01b1e8 "%s: %s() passed uninitialized
ovs_mutex") at lib/util.c:491
#4  0x55c52ecd17e1 in ovs_mutex_trylock_at
(l_=l_@entry=0x7f8718dcc098, where=where@entry=0x55c52eff5c60
"ofproto/ofproto-dpif-upcall.c:3044") at lib/ovs-thread.c:106
#5  0x55c52ebf25f9 in revalidator_sweep__
(revalidator=revalidator@entry=0x55c533082c70,
purge=purge@entry=false) at ofproto/ofproto-dpif-upcall.c:3044
#6  0x55c52ebf640f in revalidator_sweep
(revalidator=0x55c533082c70) at ofproto/ofproto-dpif-upcall.c:3102
#7  udpif_revalidator (arg=0x55c533082c70) at ofproto/ofproto-dpif-upcall.c:1101
#8  0x55c52ecd239f in ovsthread_wrapper (aux_=) at
lib/ovs-thread.c:422
#9  0x7f8753d16e65 in start_thread () from /lib64/libpthread.so.0
#10 0x7f8751c8788d in clone () from /lib64/libc.so.6

bt output with pretty print
(gdb) bt full
#0  0x7f8751bbf337 in raise () from /lib64/libc.so.6
No symbol table info available.
#1  0x7f8751bc0a28 in abort () from /lib64/libc.so.6
No symbol table info available.
#2  0x55c52ed06c7e in ovs_abort_valist (err_no=,
format=, args=args@entry=0x7f8744249370) at
lib/util.c:499
No locals.
#3  0x55c52ed06d14 in ovs_abort (err_no=err_no@entry=0,
format=format@entry=0x55c52f01b1e8 "%s: %s() passed uninitialized
ovs_mutex") at lib/util.c:491
args = {{
gp_offset = 32,
fp_offset = 48,
overflow_arg_area = 0x7f8744249450,
reg_save_area = 0x7f8744249390
  }}
#4  0x55c52ecd17e1 in ovs_mutex_trylock_at
(l_=l_@entry=0x7f8718dcc098, where=where@entry=0x55c52eff5c60
"ofproto/ofproto-dpif-upcall.c:3044") at lib/ovs-thread.c:106
l = 0x7f8718dcc098
error = 
__func__ = "ovs_mutex_trylock_at"
#5  0x55c52ebf25f9 in revalidator_sweep__
(revalidator=revalidator@entry=0x55c533082c70,
purge=purge@entry=false) at ofproto/ofproto-dpif-upcall.c:3044
ukey_state = 
cursor_52 = {
  impl = 0x7f86f826c8c0,
  bucket_idx = 2,
  entry_idx = 3,
  node = 0x7f86f8dc8020
}
odp_actions_stub = {140218217949376, 9951266880679575560,
55834640392, 10323741882, 0, 2, 140218217949592, 168, 140218217949416,
12885426177, 140218940560728, 281510948569217, 1125934266581005,
281509421383809,
  1099511627785, 0, 140218226622112, 4, 819, 10323556447, 0,
2, 140218226622328, 148, 140218226622152, 12885753857,
140218940560856, 2984229694, 18446744069414584320,
18446462603027808255, 4294967295, 1407392063422464,
  140218257906000, 3, 248, 10322875848, 0, 2, 140218257906216,
168, 140218257906040, 12884901889, 140218940560984, 0, 0, 0, 0, 0,
140218228630736, 5, 1088, 10292016731, 24, 2, 140218228630952, 168,
140218228630776,
  12884901889, 140218940561112, 0, 0, 0, 0, 0,
140218255617104, 2, 120, 10124105851, 0, 2, 140218255617320, 148,
140218255617144, 12884901889, 140218940561240, 0, 0, 0, 0, 0,
140218243823504, 0, 0, 0, 0, 2,
  140218243823720, 140, 140218243823544, 12884901889,
140218940561368, 0 }
odp_actions = {
  base = 0x7f8744249550,
  data = 0x7f8744249550,
  size = 0,
  allocated = 1024,
  header = 0x0,
  msg = 0x0,
  list_node = {
prev = 0x,
next = 0x
  },
  source = OFPBUF_STUB
}
ukey = 0x7f8718dcc050
n_ops = 0
ops = {
... a really long list 
   {
ukey = 0x0,
stats = {

  n_packets = 0,
  n_bytes = 0,
  used = 0,
  tcp_flags = 0
},
dop = {
  type = 0,
  error = 0,
  {
flow_put = {
  flags = (DPIF_FP_CREATE | unknown: 905450480),
  key = 0x50178a28944953ad,
  key_len = 12884901889,
  mask = 0x0,