Hi,

It can be aborted in either the established state or the half-open state,
because we enforce timeouts in our app layer.
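
To illustrate what "timeout in our app layer" means here, below is a minimal
sketch in plain C. It is not our real code and it does not call any VPP API:
all names (app_conn_t, app_conn_start, app_conn_alive, app_collect_expired)
and the timeout values are hypothetical. Each connection carries a deadline,
which is a setup timeout while connecting and a keepalive timeout once
established, and the control process periodically collects the expired ones.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical app-layer bookkeeping; none of this is a VPP API. */
#define APP_MAX_CONNS 4 /* the scenario in this thread uses 4 connections */

typedef enum {
  APP_CONN_IDLE = 0,
  APP_CONN_CONNECTING,  /* connect requested, still half-open */
  APP_CONN_ESTABLISHED  /* connected, kept alive by the app */
} app_conn_state_t;

typedef struct {
  app_conn_state_t state;
  double deadline; /* absolute time in seconds after which we abort */
} app_conn_t;

/* Start a connect attempt and arm the setup timeout. */
void app_conn_start(app_conn_t *c, double now, double setup_timeout)
{
  c->state = APP_CONN_CONNECTING;
  c->deadline = now + setup_timeout;
}

/* Connect succeeded or a keepalive arrived: re-arm the keepalive timeout. */
void app_conn_alive(app_conn_t *c, double now, double keepalive_timeout)
{
  assert(c->state != APP_CONN_IDLE);
  c->state = APP_CONN_ESTABLISHED;
  c->deadline = now + keepalive_timeout;
}

/* Write the indices of all non-idle connections whose deadline has passed
 * into expired[] and return how many there are. The caller would then
 * abort each one, whether it is half-open or established. */
size_t app_collect_expired(const app_conn_t *conns, size_t n, double now,
                           size_t *expired)
{
  size_t count = 0;
  for (size_t i = 0; i < n; i++)
    if (conns[i].state != APP_CONN_IDLE && now >= conns[i].deadline)
      expired[count++] = i;
  return count;
}
```

The point of the sketch is only that an expired entry may be in either
state, which is why the abort can hit a half-open as well as an established
session.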

Regarding your question,

- Yes, we added a builtin app that relies on C APIs; it mainly uses
vnet_connect/disconnect to connect or disconnect sessions.
- We call these APIs in a vpp ctrl process, which should be running on the
master thread; we never do session setup/teardown on a worker thread. (The
environment where this issue was found is configured with 1 master + 1
worker.)
- We started developing the app on 22.06, and I keep merging upstream
changes from the latest vpp by cherry-picking. The line numbers do not match
because I added some comments to the session layer code; otherwise it should
be equal to the master branch now.

From reading the code, I understand that the half-open is mainly cleaned up
from the bihash in session_stream_connect_notify. However, my app may close
a session while it is still in syn-sent state because of a session setup
timeout (on the scale of seconds). In that case, the session will be marked
as half_open_done and the half-open session will be freed shortly afterwards
in the ctrl thread (the 1st worker?).
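
To make the suspected failure mode concrete, here is a toy model in plain C.
It is only a sketch of my understanding, not VPP code: the table, the pool
and every toy_* name are hypothetical, and toy_lookup_is_valid is merely in
the spirit of the assertion in the log, not how VPP implements it. It shows
that if a connection is freed without its 5-tuple being removed from the
lookup table, and the pool then hands the same index to another connection,
a lookup on the old tuple resolves to a connection with a different tuple.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Toy model only: none of these names are VPP APIs. */
#define TOY_MAX_CONNS 8
#define TOY_INVALID UINT32_MAX

typedef struct {
  uint32_t lcl_ip, rmt_ip;
  uint16_t lcl_port, rmt_port;
} toy_tuple_t;

typedef struct { int in_use; toy_tuple_t tuple; } toy_conn_t;
typedef struct { int valid; toy_tuple_t key; uint32_t conn_index; } toy_entry_t;

static toy_conn_t toy_pool[TOY_MAX_CONNS];   /* stands in for the conn pool */
static toy_entry_t toy_table[TOY_MAX_CONNS]; /* stands in for the bihash */

int toy_tuple_eq(const toy_tuple_t *a, const toy_tuple_t *b)
{
  return a->lcl_ip == b->lcl_ip && a->rmt_ip == b->rmt_ip &&
         a->lcl_port == b->lcl_port && a->rmt_port == b->rmt_port;
}

/* Allocate the lowest free index, so freed indices are reused. */
uint32_t toy_conn_alloc(toy_tuple_t t)
{
  for (uint32_t i = 0; i < TOY_MAX_CONNS; i++)
    if (!toy_pool[i].in_use) {
      toy_pool[i].in_use = 1;
      toy_pool[i].tuple = t;
      return i;
    }
  return TOY_INVALID;
}

void toy_table_add(toy_tuple_t t, uint32_t index)
{
  for (uint32_t i = 0; i < TOY_MAX_CONNS; i++)
    if (!toy_table[i].valid) {
      toy_table[i].valid = 1;
      toy_table[i].key = t;
      toy_table[i].conn_index = index;
      return;
    }
}

void toy_table_del(toy_tuple_t t)
{
  for (uint32_t i = 0; i < TOY_MAX_CONNS; i++)
    if (toy_table[i].valid && toy_tuple_eq(&toy_table[i].key, &t))
      toy_table[i].valid = 0;
}

uint32_t toy_table_lookup(toy_tuple_t t)
{
  for (uint32_t i = 0; i < TOY_MAX_CONNS; i++)
    if (toy_table[i].valid && toy_tuple_eq(&toy_table[i].key, &t))
      return toy_table[i].conn_index;
  return TOY_INVALID;
}

/* Free a connection, optionally also removing its table entry. */
void toy_conn_free(uint32_t index, int del_from_table)
{
  if (del_from_table)
    toy_table_del(toy_pool[index].tuple);
  toy_pool[index].in_use = 0;
}

/* A hit is valid only if the connection it points to has the same tuple. */
int toy_lookup_is_valid(toy_tuple_t pkt)
{
  uint32_t i = toy_table_lookup(pkt);
  if (i == TOY_INVALID)
    return 1; /* no entry, nothing to validate */
  return toy_pool[i].in_use && toy_tuple_eq(&toy_pool[i].tuple, &pkt);
}

/* Abort connection A, let connection B reuse its index, then check whether
 * a packet for A's tuple still passes the validity check. */
int toy_abort_then_reuse(int del_from_table)
{
  toy_tuple_t a = { 0x0a000001, 0x0a000002, 10001, 443 };
  toy_tuple_t b = { 0x0a000001, 0x0a000002, 10002, 443 };
  memset(toy_pool, 0, sizeof(toy_pool));
  memset(toy_table, 0, sizeof(toy_table));

  uint32_t ia = toy_conn_alloc(a);
  toy_table_add(a, ia);
  toy_conn_free(ia, del_from_table);
  uint32_t ib = toy_conn_alloc(b); /* gets the index A just released */
  assert(ib == ia);
  toy_table_add(b, ib);
  return toy_lookup_is_valid(a);
}
```

In this model, toy_abort_then_reuse(1) passes the check because the table
entry is removed together with the connection, while toy_abort_then_reuse(0)
fails it because the stale entry now points at B.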

Should I also register a half-open callback, or is there some other reason
that leads to this failure?


Florin Coras <fcoras.li...@gmail.com> wrote on Mon, Mar 20, 2023 at 06:22:

> Hi,
>
> When you abort the connection, is it fully established or half-open?
> Half-opens are cleaned up by the owner thread after a timeout, but the
> 5-tuple should be assigned to the fully established session by that point.
> tcp_half_open_connection_cleanup does not clean up the bihash; instead,
> session_stream_connect_notify does, once tcp connect returns either success
> or failure.
>
> So a few questions:
> - is it accurate to assume you have a builtin vpp app and rely only on C
> apis to interact with host stack?
> - on what thread (main or first worker) do you call vnet_connect?
> - what api do you use to close the session?
> - what version of vpp is this because lines don’t match vpp latest?
>
> Regards,
> Florin
>
> > On Mar 19, 2023, at 2:08 AM, Zhang Dongya <fortitude.zh...@gmail.com>
> wrote:
> >
> > Hi list,
> >
> > recently in our application, we have repeatedly triggered the following
> abort, which interrupts our connectivity for a while:
> >
> > Mar 19 16:11:26 ubuntu vnet[2565933]: received signal SIGABRT, PC
> 0x7fefd3b2000b
> > Mar 19 16:11:26 ubuntu vnet[2565933]:
> /home/fortitude/glx/vpp/src/vnet/tcp/tcp_input.c:3004 (tcp46_input_inline)
> assertion `tcp_lookup_is_valid (tc0, b[0], tcp_buffer_hdr (b[0]))' fails
> >
> > Our scenario is quite simple: we make 4 parallel tcp connections
> (using 4 fixed source ports) to a remote vpp stack (fixed ip and port) and
> do some keepalive in our application layer. Since we only use the vpp
> tcp stack to make the middle box happy with the connection, we do not
> actually use the data transport of the tcp stack.
> >
> > However, since the network conditions are complex, we often have to
> abort the connection and reconnect.
> >
> > I keep merging upstream session and tcp fixes, but the issue is still not
> fixed. What I have found so far is that in some cases
> tcp_half_open_connection_cleanup may not delete the half-open session from
> the lookup table (bihash), and the session index is then reallocated to
> another connection.
> >
> > I hope the list can provide some hints on how to overcome this issue,
> thanks a lot.