Hi,

Understood, and yes: connect will fail synchronously if the port is not available, so you should be able to retry it later.
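For example, something along these lines (a rough, untested sketch; the wrapper name is made up and the error formatting via format_session_error is just illustrative):

    /* Uses vnet/session/application_interface.h. A connect that fails
     * synchronously can simply be retried later, e.g., from a timer in
     * the app's ctrl process. */
    static int
    app_try_connect (vnet_connect_args_t *a)
    {
      int rv = vnet_connect (a);
      if (rv)
        /* 5-tuple/port still in use; back off and retry later */
        clib_warning ("connect failed: %U", format_session_error, rv);
      return rv;
    }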
Regards,
Florin

> On Mar 20, 2023, at 1:58 AM, Zhang Dongya <fortitude.zh...@gmail.com> wrote:
> 
> Hi,
> 
> It seems the issue occurs when disconnect is called, because our network can't guarantee that a tcp connection won't be reset even after the 3-way handshake completes (firewall issue :( ).
> 
> When we detect an app-layer timeout, we first disconnect (we record the session handle, and this session might be a half-open session). Does the vnet session layer guarantee that if we reconnect from the main thread while the half-open session has not yet been released (due to the asynchronous logic), the reconnect fails? If so, we can retry the connect later.
> 
> I prefer not to register the half-open callback because I think it makes the app complicated from a TCP programming perspective.
> 
> As for your patch, I think it should work: I can't delete the half-open session immediately because a worker is configured, so the half-open is only removed from the bihash when the SYN retransmit times out. I have merged the patch and will provide feedback later.
> 
> Florin Coras <fcoras.li...@gmail.com> wrote on Mon, Mar 20, 2023 at 13:09:
>> Hi,
>> 
>> Inline.
>> 
>>> On Mar 19, 2023, at 6:47 PM, Zhang Dongya <fortitude.zh...@gmail.com> wrote:
>>> 
>>> Hi,
>>> 
>>> It can be aborted in either the established or the half-open state, because the timeout is done in our app layer.
>> 
>> [fc] Okay! Is the issue present irrespective of the state of the session, or does it happen only after a disconnect in half-open state? More below.
>> 
>>> Regarding your questions,
>>> 
>>> - Yes, we added a builtin app that relies on the C APIs, mainly vnet_connect/vnet_disconnect, to connect and disconnect sessions.
>> 
>> [fc] Understood
>> 
>>> - We call these APIs in a vpp ctrl process, which should be running on the main thread; we never do session setup/teardown on a worker thread. (The environment where this issue was found is configured with 1 main + 1 worker.)
>> 
>> [fc] With vpp latest it's possible to connect from the first worker. It's an optimization meant to avoid 1) taking the worker barrier on SYNs and 2) entering poll mode on main (less cpu consumed).
>> 
>>> - We started developing the app on 22.06 and I keep merging upstream changes into our tree by cherry-picking. The reason for the line mismatch is that I added some comments to the session layer code; it should be equivalent to the master branch now.
>> 
>> [fc] Ack
>> 
>>> When reading the code I understand that the half-open is mainly cleaned up from the bihash in session_stream_connect_notify. However, in syn-sent state the session might be closed by my app due to a session setup timeout (on the scale of seconds); in that case the session is marked half_open_done and the half-open session is freed shortly afterwards on the ctrl thread (the first worker?).
>> 
>> [fc] Actually, this might be the issue. We did start to provide a half-open session handle to apps which, if closed, does clean up the session, but apparently it is missing the cleanup of the session lookup table. Could you try this patch [1]? It might need additional work.
>> 
>> Having said that, forcing a close/cleanup will not free the port synchronously. So, if you're using fixed ports, you'll have to wait for the half-open cleanup notification.
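>> For example (a rough, untested sketch; the callback field name is taken from vpp latest, so check your tree):
>> 
>>     /* Be notified once the half-open is actually freed, so a fixed
>>      * source port can be safely reused. */
>>     static void
>>     app_half_open_cleanup (session_t *s)
>>     {
>>       /* 5-tuple is free again; safe to schedule a reconnect */
>>     }
>> 
>>     static session_cb_vft_t app_cb_vft = {
>>       .half_open_cleanup_callback = app_half_open_cleanup,
>>       /* ... the rest of the app's callbacks ... */
>>     };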
>>> 
>>> Should I also register the half-open callback, or is there some other reason that leads to this failure?
>> 
>> [fc] Yes, see above.
>> 
>> Regards,
>> Florin
>> 
>> [1] https://gerrit.fd.io/r/c/vpp/+/38526
>> 
>>> Florin Coras <fcoras.li...@gmail.com> wrote on Mon, Mar 20, 2023 at 06:22:
>>>> Hi,
>>>> 
>>>> When you abort the connection, is it fully established or half-open? Half-opens are cleaned up by the owner thread after a timeout, but the 5-tuple should be assigned to the fully established session by that point. tcp_half_open_connection_cleanup does not clean up the bihash; session_stream_connect_notify does, once tcp connect returns either success or failure.
>>>> 
>>>> So a few questions:
>>>> - is it accurate to assume you have a builtin vpp app and rely only on C apis to interact with the host stack?
>>>> - on what thread (main or first worker) do you call vnet_connect?
>>>> - what api do you use to close the session?
>>>> - what version of vpp is this, because the lines don't match vpp latest?
>>>> 
>>>> Regards,
>>>> Florin
>>>> 
>>>> > On Mar 19, 2023, at 2:08 AM, Zhang Dongya <fortitude.zh...@gmail.com> wrote:
>>>> > 
>>>> > Hi list,
>>>> > 
>>>> > Recently our application constantly triggers the abort below, which interrupts our connectivity for a while:
>>>> > 
>>>> > Mar 19 16:11:26 ubuntu vnet[2565933]: received signal SIGABRT, PC 0x7fefd3b2000b
>>>> > Mar 19 16:11:26 ubuntu vnet[2565933]: /home/fortitude/glx/vpp/src/vnet/tcp/tcp_input.c:3004 (tcp46_input_inline) assertion `tcp_lookup_is_valid (tc0, b[0], tcp_buffer_hdr (b[0]))' fails
>>>> > 
>>>> > Our scenario is quite simple: we make 4 parallel tcp connections (using 4 fixed source ports) to a remote vpp stack (fixed ip and port) and do some keepalive in our application layer. Since we only use the vpp tcp stack to keep the middleboxes happy with the connection, we don't actually use tcp for data transport.
>>>> > 
>>>> > However, since the network conditions are complex, we constantly need to abort the connection and reconnect.
>>>> > 
>>>> > I keep merging upstream session and tcp fixes, but the issue is still not fixed. What I have found so far is that in some cases tcp_half_open_connection_cleanup may not delete the half-open session from the lookup table (bihash), and the session index then gets reallocated by another connection.
>>>> > 
>>>> > Hope the list can provide some hints about how to overcome this issue, thanks a lot.
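>>>> > 
>>>> > For reference, the connect logic is roughly as follows (a simplified sketch; the fixed source port is set via sep_ext.peer, and app_index, remote_ip, remote_port and src_port come from our config):
>>>> > 
>>>> >     vnet_connect_args_t _a, *a = &_a;
>>>> >     clib_memset (a, 0, sizeof (*a));
>>>> >     a->app_index = app_index;  /* our attached builtin app */
>>>> >     a->sep_ext.is_ip4 = 1;
>>>> >     a->sep_ext.transport_proto = TRANSPORT_PROTO_TCP;
>>>> >     a->sep_ext.ip = remote_ip; /* fixed remote ip (ip46_address_t) */
>>>> >     a->sep_ext.port = clib_host_to_net_u16 (remote_port);
>>>> >     /* fixed local source port, one of the 4 we use */
>>>> >     a->sep_ext.peer.port = clib_host_to_net_u16 (src_port);
>>>> >     if (vnet_connect (a))
>>>> >       clib_warning ("connect failed, will retry");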