Hi,

Inline.

> On Mar 19, 2023, at 6:47 PM, Zhang Dongya <fortitude.zh...@gmail.com> wrote:
>
> Hi,
>
> It can be aborted in either established state or half-open state, because I
> do the timeout in our app layer.

[fc] Okay! Is the issue present irrespective of the state of the session, or does it happen only after a disconnect in half-open state? More below.

>
> Regarding your questions,
>
> - Yes, we added a builtin app that relies on C apis, mainly
> vnet_connect/disconnect, to connect or disconnect sessions.

[fc] Understood

> - We call these apis in a vpp ctrl process which should be running on the
> master thread; we never do session setup/teardown on a worker thread. (The
> environment where this issue was found is configured with 1 master + 1
> worker.)

[fc] With vpp latest it’s possible to connect from the first worker. It’s an optimization meant to avoid 1) worker barrier on syns and 2) entering poll mode on main (consumes less cpu)

> - We started to develop the app using 22.06 and I keep merging upstream
> changes to latest vpp by cherry-picking. The reason for the line mismatch is
> that I added some comments to the session layer code; it should be equal to
> the master branch now.

[fc] Ack

>
> When reading the code I understand that we mainly want to clean up the half
> open from the bihash in session_stream_connect_notify. However, in syn-sent
> state the session might be closed by my app due to a session setup timeout
> (on the scale of seconds); in that case, the session will be marked as
> half_open_done and the half-open session will be freed shortly after in the
> ctrl thread (the 1st worker?).

[fc] Actually, this might be the issue. We did start to provide a half-open session handle to apps which, if closed, does clean up the session, but apparently it is missing the cleanup of the session lookup table. Could you try this patch [1]? It might need additional work.

Having said that, forcing a close/cleanup will not free the port synchronously. So, if you’re using fixed ports, you’ll have to wait for the half-open cleanup notification.

>
> Should I also register the half-open callback, or is there some other reason
> that leads to this failure?
>

[fc] Yes, see above.

Regards,
Florin

[1] https://gerrit.fd.io/r/c/vpp/+/38526

>
> On Mon, Mar 20, 2023 at 06:22, Florin Coras <fcoras.li...@gmail.com> wrote:
>> Hi,
>>
>> When you abort the connection, is it fully established or half-open?
>> Half-opens are cleaned up by the owner thread after a timeout, but the
>> 5-tuple should be assigned to the fully established session by that point.
>> tcp_half_open_connection_cleanup does not clean up the bihash; instead
>> session_stream_connect_notify does, once tcp connect returns either success
>> or failure.
>>
>> So a few questions:
>> - is it accurate to assume you have a builtin vpp app and rely only on C
>> apis to interact with host stack?
>> - on what thread (main or first worker) do you call vnet_connect?
>> - what api do you use to close the session?
>> - what version of vpp is this because lines don’t match vpp latest?
>>
>> Regards,
>> Florin
>>
>> > On Mar 19, 2023, at 2:08 AM, Zhang Dongya <fortitude.zh...@gmail.com>
>> > wrote:
>> >
>> > Hi list,
>> >
>> > Recently our application has been constantly triggering the following
>> > abort, which interrupts our connectivity for a while:
>> >
>> > Mar 19 16:11:26 ubuntu vnet[2565933]: received signal SIGABRT, PC 0x7fefd3b2000b
>> > Mar 19 16:11:26 ubuntu vnet[2565933]: /home/fortitude/glx/vpp/src/vnet/tcp/tcp_input.c:3004 (tcp46_input_inline) assertion `tcp_lookup_is_valid (tc0, b[0], tcp_buffer_hdr (b[0]))' fails
>> >
>> > Our scenario is quite simple: we make 4 parallel tcp connections (using
>> > 4 fixed source ports) to a remote vpp stack (fixed ip and port) and do
>> > some keepalive in our application layer. Since we only use the vpp tcp
>> > stack to keep the middle box happy with the connection, we do not
>> > actually use the data transport of the tcp stack.
>> >
>> > However, since the network conditions are complex, we often have to
>> > abort the connection and reconnect.
>> >
>> > I keep merging upstream session and tcp fixes, but the issue is still not
>> > fixed. What I have found so far is that in some cases
>> > tcp_half_open_connection_cleanup may not delete the half-open session
>> > from the lookup table (bihash), and the session index is then reallocated
>> > by another connection.
>> >
>> > Hope the list can provide some hints about how to overcome this issue,
>> > thanks a lot.