On Tue, Feb 23, 2021 at 4:07 AM Dan Zheng <zdan0...@gmail.com> wrote:
> 1. Environment
>
> OS: CentOS Linux release 7.8.2003 (Core)
>
> JDK: java version "1.8.0_181"
> Java(TM) SE Runtime Environment (build 1.8.0_181-b13)
> Java HotSpot(TM) 64-Bit Server VM (build 25.181-b13, mixed mode)
>
> Tomcat: tomcat-embed-core-8.5.32 with spring-boot-2.0.4-RELEASE
>
> 2. Reproduce Steps
>
> 1. Request the large-data download API, then send 2000+ requests to
> another lightweight API.
> 2. The system goes into Full GC and the 2000+ requests are blocked; then
> close all of these requests.
> 3. After the system throws OutOfMemoryError, the memory is released, CPU
> and memory usage return to normal, and the system can still reach the
> MySQL database and run scheduled jobs.
> 4. Request any API: the response is always slow, and there are too many
> sockets in CLOSE_WAIT.
> [image: image.png]
>
> 3. Troubleshooting
>
> a) I checked the threads with jstack. The ClientPoller in NioEndpoint and
> the BlockPoller in NioBlockingSelector are both in epollWait:
>
> "http-nio-8043-ClientPoller-0" #83 daemon prio=5 os_prio=0 tid=0x00007faf58b50000 nid=0x4a82 runnable [0x00007faefc8d7000]
>    java.lang.Thread.State: RUNNABLE
>         at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
>         at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:269)
>         at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:93)
>         at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:86)
>         - locked <0x00000000e3c8b478> (a sun.nio.ch.Util$3)
>         - locked <0x00000000e3c8b468> (a java.util.Collections$UnmodifiableSet)
>         - locked <0x00000000e3c8b330> (a sun.nio.ch.EPollSelectorImpl)
>         at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:97)
>         at org.apache.tomcat.util.net.NioEndpoint$Poller.run(NioEndpoint.java:798)
>         at java.lang.Thread.run(Thread.java:748)
>
>    Locked ownable synchronizers:
>         - None
>
> ================================================================================
> "NioBlockingSelector.BlockPoller-1" #72 daemon prio=5 os_prio=0 tid=0x00007faf591fa800 nid=0x4a77 runnable [0x00007faefd3e2000]
>    java.lang.Thread.State: RUNNABLE
>         at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
>         at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:269)
>         at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:93)
>         at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:86)
>         - locked <0x00000000e3c8bb78> (a sun.nio.ch.Util$3)
>         - locked <0x00000000e3c8bb68> (a java.util.Collections$UnmodifiableSet)
>         - locked <0x00000000e3c8ba40> (a sun.nio.ch.EPollSelectorImpl)
>         at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:97)
>         at org.apache.tomcat.util.net.NioBlockingSelector$BlockPoller.run(NioBlockingSelector.java:298)
>
>    Locked ownable synchronizers:
>
> ================================================================================
>
> b) Check with Arthas (https://github.com/alibaba/arthas)
>
> I hooked the process to see what select(selectorTimeout) returns in the
> two pollers.
>
> keyCount is 0, but the read buffer has 191 bytes to read. Why does
> epollWait always return keyCount = 0?
>
> The expected behavior is that Tomcat reads the data from the buffer and
> then closes the socket successfully.
>
> c) Test with other configurations
>
> I changed the protocol to "Http11Nio2Protocol", and the CLOSE_WAIT sockets
> are recycled.
>
> I set java.nio.channels.spi.SelectorProvider to PollSelectorProvider, and
> the CLOSE_WAIT sockets are recycled too.

As you found out, this is a JVM bug and there are workarounds if you
experience it:
https://bz.apache.org/bugzilla/show_bug.cgi?id=63802

Rémy
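
For anyone trying the selector-provider workaround mentioned in (c), a small check like the one below may help confirm which provider the JVM actually picked up. This is only a sketch using the standard java.nio API; the class name SelectorProviderCheck is made up for illustration, and the fully qualified name sun.nio.ch.PollSelectorProvider in the usage line is an assumption about the JDK in use (the original message only says "PollSelectorProvider").

```java
import java.nio.channels.spi.SelectorProvider;

// Hypothetical helper, not part of Tomcat or Spring Boot.
class SelectorProviderCheck {
    // System property consulted by SelectorProvider.provider() on first use.
    static final String PROVIDER_PROPERTY = "java.nio.channels.spi.SelectorProvider";

    // Returns the class name of the selector provider the JVM is actually using.
    static String activeProviderName() {
        return SelectorProvider.provider().getClass().getName();
    }

    public static void main(String[] args) {
        // What was requested on the command line, if anything.
        System.out.println("requested: " + System.getProperty(PROVIDER_PROPERTY, "(default)"));
        // What the JVM resolved; the two should match when the override is applied.
        System.out.println("active:    " + activeProviderName());
    }
}
```

It could be run with the same flag that is passed to the application, for example:
java -Djava.nio.channels.spi.SelectorProvider=sun.nio.ch.PollSelectorProvider SelectorProviderCheck
The property has to be set before the first selector is opened, so passing it as a -D option at startup is the safer route.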