Just some additional notes for anyone who might stumble upon this same issue in the future.
I did a bunch more testing with this and confirmed that it was indeed net.ipv4.tcp_mem that needed to be tweaked. I set the r/wmem settings back to the defaults and ultimately set tcp_mem to "512000 512000 512000".

This was so hard to find because the kernel doesn't report anything when it enters a "memory pressure state." I would never have known to even check this, especially since most sites say not to touch this parameter! Plus, older kernels didn't work like this, so I'm not exactly sure what changed.

You can check current TCP memory usage like this:

    # cat /proc/net/sockstat
    sockets: used 5392
    TCP: inuse 7214 orphan 1927 tw 14811 alloc 7217 mem 317331

The mem value is in pages and directly corresponds to tcp_mem. Once the middle value of tcp_mem (pressure) is reached, the kernel enters "memory pressure mode." I can't find any documentation as to what this means exactly, only hints here and there. For example, if I set tcp_mem back to the defaults of '94401 125868 188802', mem quickly reaches the pressure value (125868) and the kernel starts to limit r/w buffer sizes (I think), so the mem value comes down pretty quickly to below 125K (which of course throttles downloads considerably). Once I set tcp_mem back up to 512000, the mem value increases back to over 300K almost instantly.

It appears that the best way to determine if tcp_mem needs to be adjusted is to check /proc/net/sockstat and see if the TCP mem value is anywhere close to tcp_mem's middle (pressure) value. If it is, try setting tcp_mem to a much higher value and see if the TCP mem value in /proc/net/sockstat increases drastically. If it does, that means tcp_mem needs to be set higher.

Since these VMs have 8GB of memory, I set tcp_mem to a more reasonable value of 512000 pages, or about 2GB (assuming 4KB pages). I like setting the three values to the same number because then the kernel won't silently enter memory pressure mode and throttle connections.
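For anyone who wants to automate the check described above, here is a minimal Python sketch. It is only an illustration, not something from my actual setup: the function names and the 0.9 "close to pressure" margin are assumptions, and the 4096-byte default page size should be verified on your system (for example with os.sysconf("SC_PAGE_SIZE")).

```python
import os


def tcp_mem_pages(sockstat_text):
    """Return the TCP 'mem' field (in pages) from /proc/net/sockstat text."""
    for line in sockstat_text.splitlines():
        if line.startswith("TCP:"):
            fields = line.split()[1:]  # ['inuse', '7214', 'orphan', ...]
            stats = dict(zip(fields[0::2], map(int, fields[1::2])))
            return stats["mem"]
    raise ValueError("no TCP line found in sockstat text")


def parse_tcp_mem(sysctl_text):
    """Split the tcp_mem sysctl value into (low, pressure, high) pages."""
    low, pressure, high = (int(v) for v in sysctl_text.split())
    return low, pressure, high


def near_pressure(mem_pages, pressure_pages, margin=0.9):
    """True if mem is within `margin` of the pressure value.

    The 0.9 margin is an arbitrary assumption; tune it to taste.
    """
    return mem_pages >= margin * pressure_pages


def pages_to_gib(pages, page_size=4096):
    """Convert a page count to GiB (page_size defaults to an assumed 4096)."""
    return pages * page_size / 2**30


def check_live():
    """Read the live values on a Linux host and print a short report."""
    with open("/proc/net/sockstat") as f:
        mem = tcp_mem_pages(f.read())
    with open("/proc/sys/net/ipv4/tcp_mem") as f:
        low, pressure, high = parse_tcp_mem(f.read())
    page_size = os.sysconf("SC_PAGE_SIZE")
    print(f"TCP mem: {mem} pages ({pages_to_gib(mem, page_size):.2f} GiB)")
    print(f"tcp_mem: low={low} pressure={pressure} high={high}")
    if near_pressure(mem, pressure):
        print("TCP mem is close to the pressure value; try raising tcp_mem")
```

Calling check_live() on the affected host prints the current usage next to the tcp_mem thresholds. Keep in mind that with the default tcp_mem the mem value hovers just under the pressure value, so a reading that sits near that number for long periods is the thing to look for.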
Once the tcp_mem max is reached, the kernel reports this:

    TCP: out of memory -- consider tuning tcp_mem

which I can then alert on. I left the other values at their defaults. Everything appears to be working as expected now.

--
Brendon Colby
Senior DevOps Engineer
Newgrounds.com

