As you can see in the below output from pprof, the golang TLS 
implementation seems to be in some kind of tight loop in 
crypto/internal/bigmod.addMulVVW2048 causing CPU starvation, wherein the 
net.HTTP server stops calling my request handler. Eventually the TLS 
handshakes fail, and the connections are dropped before my handler ever 
sees them. I guess I'll look for an even newer version of golang, but 
wanted to post this here to see if you have any feedback. It seems unlikely 
to me that go's TLS package would exhibit this type of regression.

$ go tool pprof /tmp/cpu.prof2
File: sgs-server-dev
Type: cpu
Time: Jan 10, 2024 at 10:10pm (UTC)
Duration: 180.20s, Total samples = 892.25s (495.14%)
Entering interactive mode (type "help" for commands, "o" for options)
(pprof) top 10
Showing nodes accounting for 840.73s, 94.23% of 892.25s total
Dropped 1278 nodes (cum <= 4.46s)
Showing top 10 nodes out of 52
      flat  flat%   sum%        cum   cum%
   595.80s 66.78% 66.78%    595.80s 66.78% 
 crypto/internal/bigmod.addMulVVW2048
    83.10s  9.31% 76.09%    772.62s 86.59% 
 crypto/internal/bigmod.(*Nat).montgomeryMul
    53.66s  6.01% 82.10%     53.86s  6.04% 
 crypto/internal/bigmod.(*Nat).assign (inline)
    42.83s  4.80% 86.90%     42.93s  4.81% 
 crypto/internal/bigmod.addMulVVW (inline)
    23.23s  2.60% 89.51%     25.10s  2.81% 
 crypto/internal/bigmod.(*Nat).shiftIn
    14.93s  1.67% 91.18%     14.93s  1.67% 
 crypto/internal/bigmod.(*Nat).sub (inline)
     7.75s  0.87% 92.05%      7.75s  0.87%  runtime.memmove
     7.33s  0.82% 92.87%      7.33s  0.82%  runtime.duffzero
     7.12s   0.8% 93.67%      7.12s   0.8% 
 runtime/internal/syscall.Syscall6
     4.98s  0.56% 94.23%      4.98s  0.56%  crypto/sha256.block

I am building an executable on this box:

$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 23.04
Release: 23.04
Codename: lunar

$ uname -a
Linux AASalad 6.2.0-36-generic #37-Ubuntu SMP PREEMPT_DYNAMIC Wed Oct  4 
10:14:28 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux

$ go version
go version go1.21.4 linux/amd64

Using this command:

GOOS=linux GOARCH=amd64 CGO_ENABLED=0 go build -o a.out -buildvcs=true -a 
-ldflags '-extldflags "-static"'

This executable is then deployed to a Debian 11 host:

# lsb_release -a
No LSB modules are available.
Distributor ID: Debian
Description: Debian GNU/Linux 11 (bullseye)
Release: 11
Codename: bullseye

# uname -a
Linux sgs-dandelion 5.10.0-26-amd64 #1 SMP Debian 5.10.197-1 (2023-09-29) 
x86_64 GNU/Linux

This was done because without the -static build, the right glibc so's are 
not found on the target system. When running the executable under high 
incoming TLS connection load, I observe that the net.HTTP handler is 
eventually not called and lots of these errors are logged:

http: TLS handshake error from ...

with this stacktrace:
log.(*Logger).output
/usr/local/go/src/log/log.go:245
log.(*Logger).Printf
/usr/local/go/src/log/log.go:268
net/http.(*Server).logf
/usr/local/go/src/net/http/server.go:3212
net/http.(*conn).serve
/usr/local/go/src/net/http/server.go:1900

This exe handles high load and lots of incoming TLS connections. I have 
exhaustively looked for a source of CPU starvation such as excessive 
numbers of goroutines started by my code, or tight loops, etc. and have not 
found evidence for that.


Here is the output of pprof tree:

(pprof) tree
Showing nodes accounting for 853.85s, 95.70% of 892.25s total
Dropped 1278 nodes (cum <= 4.46s)
----------------------------------------------------------+-------------
      flat  flat%   sum%        cum   cum%   calls calls% + context
----------------------------------------------------------+-------------
                                           595.80s   100% |   
crypto/internal/bigmod.(*Nat).montgomeryMul
   595.80s 66.78% 66.78%    595.80s 66.78%                | 
crypto/internal/bigmod.addMulVVW2048
----------------------------------------------------------+-------------
                                           718.77s 93.03% |   
crypto/internal/bigmod.(*Nat).Exp
                                            47.75s  6.18% |   
crypto/internal/bigmod.(*Nat).ExpShort
                                             4.76s  0.62% |   
crypto/internal/bigmod.(*Nat).montgomeryRepresentation
                                             1.34s  0.17% |   
crypto/rsa.decrypt
    83.10s  9.31% 76.09%    772.62s 86.59%                | 
crypto/internal/bigmod.(*Nat).montgomeryMul
                                           595.80s 77.11% |   
crypto/internal/bigmod.addMulVVW2048
                                            42.93s  5.56% |   
crypto/internal/bigmod.addMulVVW (inline)
                                            36.25s  4.69% |   
crypto/internal/bigmod.(*Nat).maybeSubtractModulus
                                             6.19s   0.8% |   
crypto/internal/bigmod.(*Nat).reset (inline)
                                             4.19s  0.54% |   
runtime.duffzero
                                             3.54s  0.46% |   
runtime.memmove

-- 
You received this message because you are subscribed to the Google Groups 
"golang-nuts" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to golang-nuts+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/golang-nuts/206435a5-4434-4866-ba67-e79d384d7ce2n%40googlegroups.com.

Reply via email to