Hi,

On Mon, Feb 26, 2024 at 10:10:51PM +0800, 上勾拳 wrote:
> Hi Roman,
>
> Thanks a bunch for your quick response, it really made a difference for me.
>
> I went through Patch 3, and it's pretty cool! As for Patch 4, which also
> addresses the reload routing issue, I would like to share some experience
> from using this solution in a production environment. Unfortunately, I
> encountered a significant challenge beyond the race condition between
> bind() and connect(). Specifically, this solution led to a notable
> performance degradation in the kernel's UDP packet lookup under high
> concurrency. With an excessive number of client sockets, the UDP hash
> table lookup degrades into a linked-list scan, because the kernel's UDP
> hash table is keyed only on the local (destination) IP and port, which
> all of the client sockets share.
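To make the degradation you describe concrete for other readers, here is a
toy model of the two lookup strategies. This is illustrative Python only --
the addresses, ports and bucket count are made up, and it is in no way
kernel code:

```python
from collections import defaultdict

# Toy model of the lookup degradation described above -- illustrative
# Python only (addresses, ports and bucket count are made up), not
# kernel code.

BUCKETS = 256

# 1000 per-connection "client" sockets: each is bound to the same local
# ip:port (the QUIC listen address) and connected to a different peer.
socks = [("192.0.2.1", 443, "198.51.100.%d" % (i % 200), 50000 + i)
         for i in range(1000)]

def bucketize(socks, key):
    table = defaultdict(list)
    for s in socks:
        table[hash(key(s)) % BUCKETS].append(s)
    return table

# Kernel-style hash keyed on local ip:port only: every socket shares the
# same key, so all of them collide into one bucket and lookup walks a list.
two_tuple = bucketize(socks, key=lambda s: s[:2])

# 4-tuple hash, as in your proprietary patch: connections spread out.
four_tuple = bucketize(socks, key=lambda s: s)

longest_2 = max(len(chain) for chain in two_tuple.values())
longest_4 = max(len(chain) for chain in four_tuple.values())
print(longest_2)  # 1000 -- one bucket holds every socket
print(longest_4)  # small -- close to 1000 / BUCKETS
```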
Thanks for the feedback. Apparently the current UDP stack is just not
made for large numbers of client sockets.

> To address this lookup performance issue, we implemented a proprietary
> kernel patch that introduces a 4-tuple hash table for UDP socket
> lookup.

Sounds great. I also wish there were a patch that eliminated the race
condition as well. That would be something like accept() for UDP, with an
atomic bind+connect, and it would not require extra privileges for bind().

> Although effective, it appears that the eBPF solution is more versatile
> and universal.

eBPF comes with its own issues as well: you now need privileges to load
the program.

> Big thanks again for your attention!
>
> Best Regards,
> Zhenzhong
>
> <nginx-devel-requ...@nginx.org> wrote on Mon, Feb 26, 2024 at 20:00:
> >
> > Message: 1
> > Date: Mon, 26 Feb 2024 15:49:30 +0400
> > From: Roman Arutyunyan <a...@nginx.com>
> > To: nginx-devel@nginx.org
> > Subject: Re: Inquiry Regarding Handling of QUIC Connections During
> >         Nginx Reload
> > Message-ID: <20240226114930.izp2quxwsp2usnvg@N00W24XTQX>
> > Content-Type: text/plain; charset=utf-8
> >
> > Hi,
> >
> > On Sun, Feb 25, 2024 at 03:53:23PM +0800, 上勾拳 wrote:
> > > Hello,
> > >
> > > I hope this email finds you well.
> > > I am writing to inquire about the status of an issue I have
> > > encountered regarding the handling of QUIC connections when Nginx
> > > is reloaded.
> > >
> > > Recently, I delved into the Nginx eBPF implementation, specifically
> > > examining how QUIC connection packets are handled during reloads.
> > > My focus was on ensuring that existing QUIC connection packets are
> > > correctly routed to the appropriate worker after a reload, and the
> > > Nginx eBPF program does this job perfectly.
> > >
> > > However, I observed that during a reload, new QUIC connections may
> > > be directed to an old worker. The underlying problem is that new
> > > QUIC connections fail to match the eBPF reuseport socket map. The
> > > kernel's default logic then routes UDP packets based on a hash of
> > > the UDP 4-tuple over the reuseport group socket array. Since the
> > > old worker's listen FDs persist in the reuseport group socket array
> > > (reuse->socks), the old worker may still be tasked with handling
> > > new QUIC connections.
> > >
> > > Given that the old worker should not process new requests, it does
> > > not respond to the new QUIC connection. Consequently, clients have
> > > to endure the handshake timeout and retry the connection,
> > > potentially encountering the old worker again, leading to an
> > > ineffective cycle.
> > >
> > > In my investigation, I came across a thread in the nginx-devel
> > > mailing list
> > > [https://www.mail-archive.com/nginx-devel@nginx.org/msg10627.html],
> > > where it was mentioned that there would be some work addressing
> > > this issue.
> > >
> > > Considering my limited experience with eBPF, I propose a potential
> > > solution: the eBPF program could maintain another eBPF map
> > > containing only the listen sockets of the new worker.
> > > When the eBPF program calls `bpf_sk_select_reuseport` and receives
> > > `-ENOENT`, it could use this new eBPF map, keyed by a hash of the
> > > UDP 4-tuple, to route the new QUIC connection to the new worker.
> > > This approach aims to circumvent the kernel logic that routes the
> > > packet to the shutting-down worker, since removing the old worker's
> > > listen socket from the reuseport group socket array seems
> > > unfeasible. I am not sure whether this solution is a good idea, and
> > > I also wonder whether there are other solutions. I would appreciate
> > > any insights or updates you could provide on this matter. Thank you
> > > for your time and consideration.
> >
> > It is true that QUIC in nginx does not handle reloads well. This is
> > a known issue and we are working on improving it. I made an effort a
> > while back to address QUIC reloads in nginx:
> >
> > https://mailman.nginx.org/pipermail/nginx-devel/2023-January/thread.html#16073
> >
> > Here patch #3 implements an eBPF-based solution and patch #4
> > implements a client sockets-based solution. The client sockets
> > require extra worker process privileges to bind to the listen port,
> > which is a serious problem for a typical nginx configuration. The
> > eBPF solution does not seem to have any problems, but we still need
> > more feedback to proceed with it. If you apply all 4 patches, make
> > sure you disable client sockets using
> > --without-quic_client_sockets. Otherwise just apply the first 3
> > patches.
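For what it's worth, the fallback proposed above can be sketched in
userspace. The following Python only simulates the routing decision -- the
map contents, key layout and worker names are invented for illustration; a
real implementation would be an SK_REUSEPORT eBPF program calling
bpf_sk_select_reuseport() against BPF_MAP_TYPE_REUSEPORT_SOCKARRAY maps:

```python
# Userspace simulation of the proposed fallback routing -- map contents,
# key layout and worker names are invented for illustration; this is not
# an actual SK_REUSEPORT program.

ENOENT = 2

# Existing behaviour: QUIC connection ID -> worker socket.
cid_map = {b"\x01" * 8: "old_worker"}

# The proposed extra map: only the *new* worker's listen sockets,
# indexed by a hash of the UDP 4-tuple.
new_worker_socks = ["new_worker"]

def select_by_cid(cid):
    """Stands in for bpf_sk_select_reuseport() on the CID-keyed map."""
    if cid in cid_map:
        return 0, cid_map[cid]
    return -ENOENT, None

def route(cid, four_tuple):
    err, sock = select_by_cid(cid)
    if err == 0:
        return sock  # established connection: stays on its worker
    # No CID match, i.e. a new connection.  Instead of letting the
    # kernel's default 4-tuple hash pick any socket in the reuseport
    # group (possibly an exiting worker's), pick from the map that
    # holds only the new worker's sockets.
    return new_worker_socks[hash(four_tuple) % len(new_worker_socks)]

print(route(b"\x01" * 8, ("198.51.100.7", 51000, "192.0.2.1", 443)))
# -> old_worker (existing connection keeps its worker)
print(route(b"\xff" * 8, ("198.51.100.8", 52000, "192.0.2.1", 443)))
# -> new_worker (new connection never hits the old worker)
```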
> >
> > Here's a relevant trac ticket:
> >
> > https://trac.nginx.org/nginx/ticket/2528
> >
> > --
> > Roman Arutyunyan

--
Roman Arutyunyan
_______________________________________________
nginx-devel mailing list
nginx-devel@nginx.org
https://mailman.nginx.org/mailman/listinfo/nginx-devel