As you can see in the below output from pprof, the golang TLS implementation seems to be in some kind of tight loop in crypto/internal/bigmod.addMulVVW2048 causing CPU starvation, wherein the net.HTTP server stops calling my request handler. Eventually the TLS handshakes fail, and the connections are dropped before my handler ever sees them. I guess I'll look for an even newer version of golang, but wanted to post this here to see if you have any feedback. It seems unlikely to me that go's TLS package would exhibit this type of regression.
$ go tool pprof /tmp/cpu.prof2 File: sgs-server-dev Type: cpu Time: Jan 10, 2024 at 10:10pm (UTC) Duration: 180.20s, Total samples = 892.25s (495.14%) Entering interactive mode (type "help" for commands, "o" for options) (pprof) top 10 Showing nodes accounting for 840.73s, 94.23% of 892.25s total Dropped 1278 nodes (cum <= 4.46s) Showing top 10 nodes out of 52 flat flat% sum% cum cum% 595.80s 66.78% 66.78% 595.80s 66.78% crypto/internal/bigmod.addMulVVW2048 83.10s 9.31% 76.09% 772.62s 86.59% crypto/internal/bigmod.(*Nat).montgomeryMul 53.66s 6.01% 82.10% 53.86s 6.04% crypto/internal/bigmod.(*Nat).assign (inline) 42.83s 4.80% 86.90% 42.93s 4.81% crypto/internal/bigmod.addMulVVW (inline) 23.23s 2.60% 89.51% 25.10s 2.81% crypto/internal/bigmod.(*Nat).shiftIn 14.93s 1.67% 91.18% 14.93s 1.67% crypto/internal/bigmod.(*Nat).sub (inline) 7.75s 0.87% 92.05% 7.75s 0.87% runtime.memmove 7.33s 0.82% 92.87% 7.33s 0.82% runtime.duffzero 7.12s 0.8% 93.67% 7.12s 0.8% runtime/internal/syscall.Syscall6 4.98s 0.56% 94.23% 4.98s 0.56% crypto/sha256.block I am building an executable on this box: $ lsb_release -a No LSB modules are available. Distributor ID: Ubuntu Description: Ubuntu 23.04 Release: 23.04 Codename: lunar $ uname -a Linux AASalad 6.2.0-36-generic #37-Ubuntu SMP PREEMPT_DYNAMIC Wed Oct 4 10:14:28 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux $ go version go version go1.21.4 linux/amd64 Using this command: GOOS=linux GOARCH=amd64 CGO_ENABLED=0 go build -o a.out -buildvcs=true -a -ldflags '-extldflags "-static"' This executable is then deployed to a Debian 11 host: # lsb_release -a No LSB modules are available. Distributor ID: Debian Description: Debian GNU/Linux 11 (bullseye) Release: 11 Codename: bullseye # uname -a Linux sgs-dandelion 5.10.0-26-amd64 #1 SMP Debian 5.10.197-1 (2023-09-29) x86_64 GNU/Linux This was done because without the -static build, the right glibc so's are not found on the target system. When running the executable under high incoming TLS connection load, I observe that the net.HTTP handler is eventually not called and lots of these errors are logged: http: TLS handshake error from ... with this stacktrace: log.(*Logger).output /usr/local/go/src/log/log.go:245 log.(*Logger).Printf /usr/local/go/src/log/log.go:268 net/http.(*Server).logf /usr/local/go/src/net/http/server.go:3212 net/http.(*conn).serve /usr/local/go/src/net/http/server.go:1900 This exe handles high load and lots of incoming TLS connections. I have exhaustively looked for a source of CPU starvation such as excessive numbers of goroutines started by my code, or tight loops, etc. and have not found evidence for that. Here is the output of pprof tree: (pprof) tree Showing nodes accounting for 853.85s, 95.70% of 892.25s total Dropped 1278 nodes (cum <= 4.46s) ----------------------------------------------------------+------------- flat flat% sum% cum cum% calls calls% + context ----------------------------------------------------------+------------- 595.80s 100% | crypto/internal/bigmod.(*Nat).montgomeryMul 595.80s 66.78% 66.78% 595.80s 66.78% | crypto/internal/bigmod.addMulVVW2048 ----------------------------------------------------------+------------- 718.77s 93.03% | crypto/internal/bigmod.(*Nat).Exp 47.75s 6.18% | crypto/internal/bigmod.(*Nat).ExpShort 4.76s 0.62% | crypto/internal/bigmod.(*Nat).montgomeryRepresentation 1.34s 0.17% | crypto/rsa.decrypt 83.10s 9.31% 76.09% 772.62s 86.59% | crypto/internal/bigmod.(*Nat).montgomeryMul 595.80s 77.11% | crypto/internal/bigmod.addMulVVW2048 42.93s 5.56% | crypto/internal/bigmod.addMulVVW (inline) 36.25s 4.69% | crypto/internal/bigmod.(*Nat).maybeSubtractModulus 6.19s 0.8% | crypto/internal/bigmod.(*Nat).reset (inline) 4.19s 0.54% | runtime.duffzero 3.54s 0.46% | runtime.memmove -- You received this message because you are subscribed to the Google Groups "golang-nuts" group. To unsubscribe from this group and stop receiving emails from it, send an email to golang-nuts+unsubscr...@googlegroups.com. To view this discussion on the web visit https://groups.google.com/d/msgid/golang-nuts/206435a5-4434-4866-ba67-e79d384d7ce2n%40googlegroups.com.