Hello,

I've been struggling with something all day on a very busy server, and I'm 
at a loss as to what is going on.

I have a Go server listening on two ports, like so:

                tls_config := MustGetTlsConfiguration(*ssl_cert, *ssl_key, *ssl_ca)
                listener, _ := tls.Listen("tcp", ":"+*ssl_port, tls_config)
                ssl_srv := &http.Server{
                        ReadTimeout:  5 * time.Second,
                        WriteTimeout: 10 * time.Second,
                        Handler:      r_ssl,
                }
                go ssl_srv.Serve(listener)

and

                tls_config := MustGetTlsConfiguration(*ssl_listen_cert, *ssl_listen_key, *ssl_listen_ca)
                listener, _ := tls.Listen("tcp", ":"+*ssl_listen_port, tls_config)
                ssl_srv := &http.Server{
                        ReadTimeout:  5 * time.Second,
                        WriteTimeout: 10 * time.Second,
                        Handler:      l_ssl,
                }
                go ssl_srv.Serve(listener)

And here's that MustGetTlsConfiguration function:

func MustGetTlsConfiguration(certificateFile string, privateKeyFile string, caFile string) *tls.Config {
        config := &tls.Config{}
        mycert, certPool := MustLoadCertificates(certificateFile, privateKeyFile, caFile)
        config.Certificates = make([]tls.Certificate, 1)
        config.Certificates[0] = mycert

        config.RootCAs = certPool
        config.ClientCAs = certPool

        config.CipherSuites = []uint16{tls.TLS_RSA_WITH_AES_128_CBC_SHA,
                tls.TLS_RSA_WITH_AES_256_CBC_SHA,
                tls.TLS_ECDHE_ECDSA_WITH_AES_128_CBC_SHA,
                tls.TLS_ECDHE_ECDSA_WITH_AES_256_CBC_SHA,
                tls.TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA,
                tls.TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA,
                tls.TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256,
                tls.TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256}

        config.MinVersion = tls.VersionTLS12

        config.SessionTicketsDisabled = true
        return config
}


The handlers for requests that do not read the request body look 
something like:

func Get(w http.ResponseWriter, r *http.Request) {
        w.WriteHeader(http.StatusNotFound)
        w.Write([]byte("some reason"))
}

For handlers that do read the request body, it's more like:

func Get(w http.ResponseWriter, r *http.Request) {
        body, err := ioutil.ReadAll(r.Body)
        if err != nil {
                w.WriteHeader(http.StatusNotFound)
                return
        }
        defer r.Body.Close()
        _ = body // pruned: the real handler acts on the payload
        w.WriteHeader(http.StatusNotFound)
        w.Write([]byte("some reason"))
}


.. I've pruned and simplified this quite a bit, but I believe I've 
isolated the problem to somewhere in this area...

What essentially happens is that the server starts off fine.  Then, as 
requests ramp up, I suddenly see sockets start to pile up in lsof in the 
state "can't identify protocol".  If I monitor very closely I can 
occasionally spot one or two, but then that number jumps and climbs 
rapidly, after which connections start going to CLOSE_WAIT.  I end up 
with about 50 connections in CLOSE_WAIT and the process out of file 
handles because of all the "can't identify protocol" descriptors.  Once 
it's in this frozen state the server is unresponsive to any incoming 
http request.

I've tried increasing the number of file handles, but it exhibits the same 
behavior and just takes longer to reach the frozen state.


I never saw this before upgrading to Go 1.5.  I've tried 1.6 and 1.7 with 
the same result.

I've also tried adding "defer r.Body.Close()" as the first line of every 
handler, and I also tried preceding that with "defer 
ioutil.ReadAll(r.Body)".  Neither seemed to make any difference (not that 
I'd really expect it to after looking at the handler code).

I'll try to get a better stack trace; apparently syslog is not capturing 
everything from kill -ABRT.  But here's one thing I saw:

goroutine 42 [IO wait]:
net.runtime_pollWait(0x7f5ba075c0c8, 0x72, 0x4232c3)
        /usr/local/go/src/runtime/netpoll.go:160 +0x5e
net.(*pollDesc).wait(0xc4201eb480, 0x72, 0x41fb78, 0x9399e0)
        /usr/local/go/src/net/fd_poll_runtime.go:73 +0x5b
net.(*pollDesc).waitRead(0xc4201eb480, 0xdb27a0, 0xc4200600c8)
        /usr/local/go/src/net/fd_poll_runtime.go:78 +0x42
net.(*netFD).accept(0xc4201eb420, 0x0, 0xdb1320, 0xc42020de40)
        /usr/local/go/src/net/fd_unix.go:419 +0x2b8
net.(*TCPListener).accept(0xc420022708, 0xc42020a680, 0xc42002fd78, 0x4229d8)
        /usr/local/go/src/net/tcpsock_posix.go:132 +0x51
net.(*TCPListener).AcceptTCP(0xc420022708, 0x92a600, 0xda1ad0, 0xc4201eb460)
        /usr/local/go/src/net/tcpsock.go:209 +0x50
net/http.tcpKeepAliveListener.Accept(0xc420022708, 0xc4201fb140, 0x92a600, 0xda1ad0, 0x951b60)
        /usr/local/go/src/net/http/server.go:2608 +0x3d
net/http.(*Server).Serve(0xc420212000, 0xdb5020, 0xc420022708, 0x0, 0x0)
        /usr/local/go/src/net/http/server.go:2273 +0x249
net/http.(*Server).ListenAndServe(0xc420212000, 0x78, 0x0)
        /usr/local/go/src/net/http/server.go:2219 +0x123
net/http.ListenAndServe(0xc420209c9a, 0x5, 0xdb06a0, 0xc420084758, 0x5, 0x1d)
        /usr/local/go/src/net/http/server.go:2351 +0xef


Any help would be greatly appreciated.  I think there must be something 
I'm missing here, either in my handler code or in how I set up the http 
server, but I'm not sure where.  It feels like it's nestled in the 
lower-level network, http/2, or keepalive code.

Oh, I also tried disabling http/2 with GODEBUG=http2server=0, which did 
indeed show different code paths in net/http in the traces, so maybe 
http/2 isn't the issue.

Thanks!

- Greg

-- 
You received this message because you are subscribed to the Google Groups 
"golang-nuts" group.