On 2/13/25 10:05 AM, [email protected] wrote:
> On Thu, 2025-02-13 at 09:51 +0100, Dumitru Ceara wrote:
>> On 2/13/25 9:39 AM, Felix Huettner wrote:
>>
>>>>>>
>>>>>> Thanks, I will add it. In the meantime I'm also fighting with
>>>>>> segfaults
>>>>>> that appeared when I was adding tests. Tests add "stuff" to
>>>>>> the OVN
>>>>>> in
>>>>>> big batches and It crashes fairly consistently (though not
>>>>>> always).
>>>>>> This didn't occur in my manual testing.
>>>>>> Sometimes the crashes are silent, but from time to time,
>>>>>> there are
>>>>>> transaction errors complaining about referencing non-existing
>>>>>> DPs
>>>>>> or
>>>>>> PBs (both for logical ports and tracked ports). I don't
>>>>>> presume
>>>>>> that
>>>>>> your patch will fix that, since it happens with pure LBs too.
>>>>>> I wonder if it's enough to have dependency on lr_stateful. Is
>>>>>> it
>>>>>> possible that we need to depend on something else? Like
>>>>>> "en_sync_to_sb_pb"?
>>>>>>
>>>>>
>>>>> Would it be possible to share such a test?  I can look into it
>>>>> tomorrow.
>>>>
>>>> Sure thing, It's in this branch
>>>> https://github.com/mkalcok/ovn/commit/8e1394dfbebc4f48255c77937015e07fbcbc5172
>>>> . It's based on your updated v5 branch with only one more commit
>>>> that
>>>> adds the test. It won't pass either way because the branch is
>>>> missing
>>>> advertisement of neighboring LBs. I have that change in another
>>>> branch,
>>>> but for the purpose of investigating the crash, this should be
>>>> good
>>>> enough. I can get it to crash about every other run. The tell-
>>>> tell sign
>>>> is when the test hangs (because it waits on --wait=sb in some
>>>> step).
>>>
>>> Hi Martin, Hi Dumitru,
>>>
>>> i think i found the issue.
>>> Fix is at
>>> https://github.com/felixhuettner/ovn/commit/5e1b6051ddb0857382951d6e3a383e657063f303
>>>
>>> However this is "just" a use-after-free and i don't think it would
>>> generate transaction errors.
>>>
>>
>> Yes, that's my bad, good catch Felix!
>>
>> Well, as Martin replied in his other email [0], the transaction
>> errors
>> are about rows with UUID "a5a5a5a5-a5a5-a5a5-a5a5-a5a5a5a5a5a5". 
>> That's
>> just "poison" inserted at free (as Ales pointed out too [1]).
>>
>> [0]
>> https://mail.openvswitch.org/pipermail/ovs-dev/2025-February/421022.html
>> [1]
>> https://mail.openvswitch.org/pipermail/ovs-dev/2025-February/421023.html
>>
>> Regards,
>> Dumitru
>>
> Hi Felix, Dumitru, Ales,
> 
> I think you nailed it. I applied the change from Felix and tests are
> passing consistently \o/
> I'll get back to adding tests and I'll post new version asap.
> 

Thanks, Martin!

Based on some internal early feedback I got from Ales I split the first
commit into two changes for better readability.  I pushed the resulting
commits here:

- northd: Expose the is_l3dgw_port() function to other modules.
  https://github.com/dceara/ovn/commit/c7097bd
- northd: Check and populate NAT entry fields early in the I-P engine.
  https://github.com/dceara/ovn/commit/086711b

It would be great if you could rebase your changes on top of these and
include them in v6.

Best regards,
Dumitru

_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to