On Mon, Jul 1, 2019 at 12:42 PM 'Yunchi Luo' via golang-nuts <
golang-nuts@googlegroups.com> wrote:

> Hello, I'd like to solicit some help with a weird GC issue we are seeing.
>
> I'm trying to debug OOM on a service we are running in k8s. The service is
> just a CRUD server hitting a database (DynamoDB). Each replica serves about
> 300 qps of traffic. There are no memory leaks. On occasion (seemingly
> correlated to small latency spikes on the backend), the service would OOM.
> This is surprising because it has a circuit breaker that drops requests
> after 200 concurrent connections that has never tripped, and goroutine
> profiles confirm that there are nowhere near 200 active goroutines.
>

Just curious about the network connections.
Is there a chance that the network connections are not getting closed and
cleaned up for some reason?
It used to be common for thousands of sockets to hang around because a
user killed a slow tab or the browser and the full socket close never
completed. The solution was to allow reliable connections to time out and
finish closing, freeing up the memory. The application has closed the
socket, but the protocol has yet to receive the last packet that completes
the closing handshake. The shell equivalent would be zombie processes that
still need to return their exit status while no process waits on it.
Debugging can be interesting in the shell case because of the implied
waits done by ps.

How many connections does the kernel think there are, and what state are
they in?
Look both locally and on the DB machine.
The latency spikes can be a cause or a symptom.
Look at the connections being made to the CRUD server and make sure they
are being set up with timers short enough that they clean themselves up
quickly. Is the CRUD server at risk of a denial of service or a burst of
random curious probes from an nmap script? Even firewall drops, near or
far, can leave connections hanging in an incomplete state when an invalid
connection is detected and blocked and reliable network connections with
long timers are involved.
