Hi,

On Mon, Feb 26, 2024 at 10:10:51PM +0800, 上勾拳 wrote:
> Hi Roman,
> 
> Thanks a bunch for your quick response, it really made a difference for me.
> 
> I went through Patch 3, and it's pretty cool! As for Patch 4, which also
> addresses the reload routing issue, I would like to share my experience
> running this solution in a production environment. Unfortunately, I hit a
> problem more serious than the race condition between bind and connect:
> the solution caused a notable performance degradation in the kernel's UDP
> packet lookup under high concurrency. With so many client sockets, UDP
> hash table lookups degraded into linked-list traversal, because the UDP
> hash table is keyed on the destination IP and destination port.

Thanks for the feedback.  Apparently the current UDP stack is just not made for
client sockets.
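
To illustrate the effect described above, here is a rough user-space sketch,
not kernel code.  It assumes a simplified table with 1024 buckets and 5000
connected sockets that all share one local address; the addresses, the bucket
count and the longest_chain() helper are made up for the example.

```python
from collections import defaultdict

def longest_chain(sockets, key, buckets=1024):
    """Return the longest bucket chain when sockets are hashed by key(sock)."""
    chains = defaultdict(int)
    for sock in sockets:
        chains[hash(key(sock)) % buckets] += 1
    return max(chains.values())

# Each connected client socket: (local_ip, local_port, remote_ip, remote_port).
# All of them share the server's listen address (hypothetical values).
sockets = [("192.0.2.1", 443, "198.51.100.%d" % (i % 250 + 1), 10000 + i)
           for i in range(5000)]

# Keyed on local (destination) IP and port only: every socket lands in the
# same bucket, so a lookup walks one long chain.
by_local = longest_chain(sockets, key=lambda s: (s[0], s[1]))

# Keyed on the full 4-tuple: the sockets spread across the buckets.
by_4tuple = longest_chain(sockets, key=lambda s: s)

print(by_local, by_4tuple)
```

With the local-address key the longest chain equals the number of sockets,
while the 4-tuple key keeps chains short, which is presumably what a 4-tuple
hash table buys.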

> To address this
> lookup performance issue, we implemented a proprietary kernel patch that
> introduces a 4-tuple hash table for UDP socket lookup.

Sounds great.  Also I wish there was a patch that would eliminate the race
condition as well.  This would be something like accept() for UDP with
atomic bind+connect and would not require extra privileges for bind().
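
For reference, the two-step sequence in question looks roughly like the
sketch below.  This is only an illustration in Python on loopback with
ephemeral ports, so it needs no privileges; the open_client_socket() name is
made up and this is not the code from patch #4.  It assumes a platform with
SO_REUSEPORT.

```python
import socket

def open_client_socket(listen_addr, peer_addr):
    """Create a UDP socket bound to listen_addr and connected to peer_addr."""
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
    s.bind(listen_addr)      # step 1: the socket now shares the listen port
    # Race window: until connect() completes, the socket is unconnected and
    # may be handed datagrams that other peers send to listen_addr.
    s.connect(peer_addr)     # step 2: restrict the socket to a single peer
    return s

# Stand-in for the listen socket; port 0 picks an unprivileged ephemeral port.
listener = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
listener.bind(("127.0.0.1", 0))
listen_addr = listener.getsockname()

# Stand-in for a remote QUIC client.
client = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
client.bind(("127.0.0.1", 0))
peer_addr = client.getsockname()

conn = open_client_socket(listen_addr, peer_addr)
print(conn.getsockname(), conn.getpeername())
```

In a real deployment the bind() would target the listen port itself (443 for
example), which is where the extra privileges come in.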

> Although effective,
> it appears that the eBPF solution is more versatile and universal.

eBPF comes with its own issues as well: you now need privileges for it.

> Big thanks again for your attention!
> 
> Best Regards,
> Zhenzhong
> 
> <nginx-devel-requ...@nginx.org> wrote on Mon, Feb 26, 2024 at 20:00:
> 
> > Today's Topics:
> >
> >    1. Re: Inquiry Regarding Handling of QUIC Connections During
> >       Nginx Reload (Roman Arutyunyan)
> >
> >
> > ----------------------------------------------------------------------
> >
> > Message: 1
> > Date: Mon, 26 Feb 2024 15:49:30 +0400
> > From: Roman Arutyunyan <a...@nginx.com>
> > To: nginx-devel@nginx.org
> > Subject: Re: Inquiry Regarding Handling of QUIC Connections During
> >         Nginx Reload
> > Message-ID: <20240226114930.izp2quxwsp2usnvg@N00W24XTQX>
> > Content-Type: text/plain; charset=utf-8
> >
> > Hi,
> >
> > On Sun, Feb 25, 2024 at 03:53:23PM +0800, 上勾拳 wrote:
> > > Hello,
> > >
> > > I hope this email finds you well. I am writing to inquire about the status
> > > of an issue I have encountered regarding the handling of QUIC connections
> > > when Nginx is reloaded.
> > >
> > > Recently, I delved into the Nginx eBPF implementation, specifically
> > > examining how QUIC connection packets are handled, especially during Nginx
> > > reloads. My focus was on ensuring that packets of existing QUIC connections
> > > are correctly routed to the appropriate worker after a reload, and the
> > > Nginx eBPF program does this job perfectly.
> > >
> > > However, I observed that during a reload, new QUIC connections might be
> > > directed to an old worker. The underlying problem is that packets of new
> > > QUIC connections do not match any entry in the eBPF reuseport socket map.
> > > The kernel's default logic then selects a socket from the reuseport group
> > > socket array by hashing the UDP 4-tuple. Since the old worker's listen FDs
> > > persist in the reuseport group socket array (reuse->socks), the old worker
> > > may still be handed new QUIC connections.
> > >
> > > Since the old worker should not process new requests, it does not respond
> > > to the new QUIC connection. Consequently, clients have to wait out the
> > > handshake timeout and retry the connection, potentially hitting the old
> > > worker again, which leads to an ineffective cycle.
> > >
> > > In my investigation, I came across a thread on the nginx-devel mailing list
> > > [https://www.mail-archive.com/nginx-devel@nginx.org/msg10627.html], where
> > > it was mentioned that there would be some work addressing this issue.
> > >
> > > Considering my limited experience with eBPF, I propose a potential
> > > solution. The eBPF program could maintain another eBPF map containing only
> > > the listen sockets of the new worker. When the eBPF program calls
> > > `bpf_sk_select_reuseport` and receives `-ENOENT`, it could use this new
> > > eBPF map with the hash of the UDP 4-tuple to route the new QUIC connection
> > > to the new worker. This approach aims to circumvent the kernel logic that
> > > routes the packet to the shutting-down worker, since removing the old
> > > worker's listen socket from the reuseport group socket array seems
> > > infeasible. I am not sure whether this solution is a good idea, and I also
> > > wonder whether there are other solutions. I would appreciate any insights
> > > or updates you could provide on this matter. Thank you for your time and
> > > consideration.
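
The fallback proposed above can be sketched in user space as follows.  This
is a plain Python model of the selection logic, not an eBPF program: the
route_packet() function, the map layouts and the worker names are all
hypothetical, and conn_key stands for whatever key the existing program
extracts from the packet.

```python
import zlib

def route_packet(conn_map, new_worker_socks, conn_key, four_tuple):
    """Pick a socket for a QUIC packet (model of the proposed fallback)."""
    sock = conn_map.get(conn_key)
    if sock is not None:
        # Existing connection: stay with the worker that owns it.
        return sock
    # Miss: this is where the real helper would return -ENOENT.  Instead of
    # letting the kernel choose from the whole reuseport group, pick from a
    # second map holding only the new workers' listen sockets, indexed by a
    # hash of the UDP 4-tuple.
    index = zlib.crc32(repr(four_tuple).encode()) % len(new_worker_socks)
    return new_worker_socks[index]

conn_map = {b"\x01\x02": "old-worker-0"}          # established connections
new_worker_socks = ["new-worker-0", "new-worker-1"]
four_tuple = ("198.51.100.7", 40000, "192.0.2.1", 443)

print(route_packet(conn_map, new_worker_socks, b"\x01\x02", four_tuple))
print(route_packet(conn_map, new_worker_socks, b"\xff", four_tuple))
```

Packets of a known connection keep going to the worker that owns it, while
an unknown connection can only land on one of the new workers.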
> >
> > It is true QUIC in nginx does not handle reloads well.  This is a known issue
> > and we are working on improving it.  I made an effort a while back to address
> > QUIC reloads in nginx:
> >
> > https://mailman.nginx.org/pipermail/nginx-devel/2023-January/thread.html#16073
> >
> > Here patch #3 implements an eBPF-based solution and patch #4 implements a
> > client sockets-based solution.  The client sockets require extra worker
> > process privileges to bind to the listen port, which is a serious problem for
> > a typical nginx configuration.  The eBPF solution does not seem to have any
> > problems, but we still need more feedback to proceed with this.  If you apply
> > all 4 patches, make sure you disable client sockets using
> > --without-quic_client_sockets.  Otherwise just apply the first 3 patches.
> >
> > Here's a relevant trac ticket:
> >
> > https://trac.nginx.org/nginx/ticket/2528
> >
> > --
> > Roman Arutyunyan
> >
> > _______________________________________________
> > nginx-devel mailing list
> > nginx-devel@nginx.org
> > https://mailman.nginx.org/mailman/listinfo/nginx-devel

--
Roman Arutyunyan