I updated the README.md for the test project this morning to show a use-case that leaks connections without calling `getState(true)`.
https://github.com/artnaseef/opennms-poc-hs1384/blob/main/README.md

Art

On Monday, May 1, 2023 at 9:08:20 AM UTC-7 Arthur Naseef wrote:

> Correct - the client is not cleaning up those connections. Netstat shows
> the ESTABLISHED connections increasing over time.
>
> The POC test code shows this can happen even when no GRPC call is made, but
> instead the `channel.getState(true)` call is made by the client
> application. It can also show that an attempt to make a GRPC call can
> create a connection that is never closed. The README.md in that project has
> scenarios listed and instructions to reproduce the symptoms.
>
> Art
>
> On Sunday, April 30, 2023 at 2:12:28 AM UTC-7 Sanjay Pujare wrote:
>
>> Hmmm, so what you are saying is that the current logic assumes that an
>> "idle" connection is to be closed from the server side and only then the
>> client side will perform the corresponding clean up.
>>
>> Are you able to see (say with netstat) that connections are getting
>> leaked since the client never closes them? And on these connections there
>> are no outstanding RPCs?
>>
>> On Sat, Apr 29, 2023 at 1:23 AM Arthur Naseef <[email protected]> wrote:
>>
>>> The POC project I linked can be used to see that the client never
>>> initiates a close - at least for the Netty client code. So a faulty
>>> server/proxy/gateway can cause the client to leak connections
>>> indefinitely. Digging through the GRPC + Netty code, I did not find any
>>> path that closes the connection except when the socket close is seen by
>>> the client.
>>>
>>> Trying with the OkHttp implementation, it's better, but I still am
>>> running into a problem that the POC does not reproduce.
>>>
>>> Art
>>>
>>> On Friday, April 28, 2023 at 12:03:21 PM UTC-7 Yuri Golobokov wrote:
>>>
>>>> Yes, it should close. But I'm not sure if the client or the nginx
>>>> initiates the closing.
>>>>
>>>> On Fri, Apr 28, 2023 at 10:20 AM Arthur Naseef <[email protected]>
>>>> wrote:
>>>>
>>>>> Should the connection close after all calls are complete, or have
>>>>> failed to start?
>>>>>
>>>>> Art
>>>>>
>>>>> On Friday, April 28, 2023 at 9:06:55 AM UTC-7 Yuri Golobokov wrote:
>>>>>
>>>>>> GOAWAY just prevents new streams (calls) from being started on the
>>>>>> connection. If you have live streams on the connection it will stay
>>>>>> open until all calls are completed.
>>>>>>
>>>>>> On Fri, Apr 28, 2023 at 8:44 AM Arthur Naseef <[email protected]>
>>>>>> wrote:
>>>>>>
>>>>>>> Let me clarify one point - I was expecting the client to close the
>>>>>>> connection after the GOAWAY. Is there a reason to leave the
>>>>>>> connection open after that point?
>>>>>>>
>>>>>>> Art
>>>>>>>
>>>>>>> On Friday, April 28, 2023 at 8:42:58 AM UTC-7 Arthur Naseef wrote:
>>>>>>>
>>>>>>>> Thank you for the response. We are aware of the semantics, and
>>>>>>>> they work as advertised - the Channel goes into IDLE on the GOAWAY.
>>>>>>>> However, the CONNECTION itself lingers indefinitely. So every time
>>>>>>>> we get a GOAWAY from the server, we leak a connection - until that
>>>>>>>> connection is closed by the server itself. I was expecting the
>>>>>>>> connection to close after receiving the GOAWAY.
>>>>>>>>
>>>>>>>> We call getState with true as a means of ensuring the client does
>>>>>>>> its best to keep the connection to the server. The README.md in the
>>>>>>>> POC project explains why. In short - the server pushes messages to
>>>>>>>> the client (via GRPC stream), the server cannot initiate the
>>>>>>>> connection to the client, and the client does not know when the
>>>>>>>> server will send messages. So, the client does its best to keep the
>>>>>>>> connection to the server active at all times.
>>>>>>>>
>>>>>>>> Art
>>>>>>>>
>>>>>>>> On Friday, April 28, 2023 at 2:13:58 AM UTC-7 [email protected]
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Take a look at
>>>>>>>>> https://github.com/grpc/grpc/blob/master/doc/connectivity-semantics-and-api.md
>>>>>>>>> - it says "...channels that receive a GOAWAY when there are no
>>>>>>>>> active or pending RPCs should also switch to IDLE..."
>>>>>>>>>
>>>>>>>>> Also according to
>>>>>>>>> https://github.com/grpc/grpc-java/blob/master/api/src/main/java/io/grpc/ManagedChannel.java#L78
>>>>>>>>> if you call `getState` with `true` then "the channel will try to
>>>>>>>>> make a connection if it is currently IDLE". And that might explain
>>>>>>>>> why your `getState` call itself causes a new connection to be
>>>>>>>>> created. I haven't looked at your code in detail but do you need to
>>>>>>>>> call `getState` with `true`? Can you try with `false`?
>>>>>>>>>
>>>>>>>>> On Friday, April 28, 2023 at 3:13:37 AM UTC+5:30 Arthur Naseef
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> I am running into an issue with the GRPC Java client in which the
>>>>>>>>>> client leaks connections over time. Reading through the grpc-java
>>>>>>>>>> code, debugging, and instrumenting has led me to the following
>>>>>>>>>> question:
>>>>>>>>>>
>>>>>>>>>>    - Does the netty client code ever close the connection except
>>>>>>>>>>    when it sees the socket close initiated externally (i.e. by the
>>>>>>>>>>    O/S or the server)?
>>>>>>>>>>
>>>>>>>>>> Here is a small project that (1) contains a description of the
>>>>>>>>>> problem and some of the history related to it, and (2) can be used
>>>>>>>>>> to reproduce the connection leak.
>>>>>>>>>>
>>>>>>>>>> https://github.com/artnaseef/opennms-poc-hs1384
>>>>>>>>>>
>>>>>>>>>> In brief, we see the following:
>>>>>>>>>>
>>>>>>>>>>    - The NGINX ingress times out a request.
>>>>>>>>>>    - The NGINX ingress sends a GOAWAY packet to the client.
>>>>>>>>>>    - The client channel transitions to IDLE but does not close
>>>>>>>>>>    the connection.
>>>>>>>>>>    - The client creates a new connection for the channel, which
>>>>>>>>>>    transitions to CONNECTING and then READY.
>>>>>>>>>>    - The list of transports for the channel holds the leaked
>>>>>>>>>>    connections.
>>>>>>>>>>
>>>>>>>>>> Note that switching to the OkHttp implementation appears to
>>>>>>>>>> improve the results with the test tool, but our main application
>>>>>>>>>> still observes leaked connections when running with OkHttp.
>>>>>>>>>>
>>>>>>>>>> Any help is appreciated. I can certainly share more details as
>>>>>>>>>> needed.
>>>>>>>>>>
>>>>>>>>>> Art
>>>>>>>>>>
>>>>>>>>>> --
>>>>>>> You received this message because you are subscribed to the Google
>>>>>>> Groups "grpc.io" group.
>>>>>>> To unsubscribe from this group and stop receiving emails from it,
>>>>>>> send an email to [email protected].
>>>>>>> To view this discussion on the web visit
>>>>>>> https://groups.google.com/d/msgid/grpc-io/d9a8271c-ee25-4b76-a802-546c69e4cedbn%40googlegroups.com
>>>>>>> <https://groups.google.com/d/msgid/grpc-io/d9a8271c-ee25-4b76-a802-546c69e4cedbn%40googlegroups.com?utm_medium=email&utm_source=footer>.
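
The thread relies on netstat to confirm that ESTABLISHED connections accumulate over time. A minimal sketch of how that check might be automated is shown here, assuming Linux-style `netstat -tn` output with the columns Proto, Recv-Q, Send-Q, Local Address, Foreign Address, State. The class `EstablishedCounter` and its method are hypothetical helpers for illustration; they are not part of grpc-java or the POC project, and the sample lines in `main` are made up.

```java
import java.util.List;

// Hypothetical helper for illustration; not part of grpc-java or the POC project.
final class EstablishedCounter {

    // Counts ESTABLISHED TCP connections to the given remote port.
    // Assumes Linux-style `netstat -tn` columns:
    // Proto Recv-Q Send-Q Local-Address Foreign-Address State
    static int countEstablished(List<String> netstatLines, int remotePort) {
        String wantedPort = Integer.toString(remotePort);
        int count = 0;
        for (String line : netstatLines) {
            String[] fields = line.trim().split("\\s+");
            if (fields.length < 6 || !fields[0].startsWith("tcp")) {
                continue; // header line or non-TCP entry
            }
            String foreign = fields[4];
            // lastIndexOf also copes with IPv6 addresses such as ::1:443
            int colon = foreign.lastIndexOf(':');
            boolean portMatches = colon >= 0
                    && foreign.substring(colon + 1).equals(wantedPort);
            if (portMatches && fields[5].equals("ESTABLISHED")) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // Made-up sample lines, for illustration only.
        List<String> sample = List.of(
            "Proto Recv-Q Send-Q Local Address           Foreign Address         State",
            "tcp        0      0 10.0.0.5:51234          10.0.0.9:443            ESTABLISHED",
            "tcp        0      0 10.0.0.5:51240          10.0.0.9:443            ESTABLISHED",
            "tcp        0      0 10.0.0.5:51100          10.0.0.9:443            TIME_WAIT",
            "tcp        0      0 10.0.0.5:40022          10.0.0.7:22             ESTABLISHED");
        System.out.println(countEstablished(sample, 443)); // prints 2
    }
}
```

Feeding real `netstat -tn` output into `countEstablished` at intervals, with the server's port, would show whether the count keeps growing after each GOAWAY, which is the symptom described in the thread.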
