On Tue, Feb 23, 2021 at 4:07 AM Dan Zheng <zdan0...@gmail.com> wrote:

> 1. Environment
> OS: CentOS Linux release 7.8.2003 (Core)
>
> JDK: java version "1.8.0_181"
> Java(TM) SE Runtime Environment (build 1.8.0_181-b13)
> Java HotSpot(TM) 64-Bit Server VM (build 25.181-b13, mixed mode)
>
> Tomcat: tomcat-embed-core-8.5.32 with spring-boot-2.0.4-RELEASE
>
> 2. Reproduce Steps
>
> 1. request large data download api, then request 2000+ request to
> another lightweight api
> 2. the system now is Full GC, and the 2000+ request will blocked, then
> close all these request
> 3. after system throw OufOfMemoryError, the memory will be released, the
> cpu and memory  occupation is normal, the system is ok to visit mysql
> database and execute the schedule job
> 4. request any api, the response is always slow, and too may close_wait
> [image: image.png]
>
> 3. Problem Shooting
> a) I check the thread with jstack, ClientPoller in NioEndpoint,
> BlockPoller in NioBlockingSelector are both in
>
> "http-nio-8043-ClientPoller-0" #83 daemon prio=5 os_prio=0
> tid=0x00007faf58b50000 nid=0x4a82 runnable [0x00007faefc8d7000]
>    java.lang.Thread.State: RUNNABLE
>         at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
>         at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:269)
>         at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:93)
>         at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:86)
>         - locked <0x00000000e3c8b478> (a sun.nio.ch.Util$3)
>         - locked <0x00000000e3c8b468> (a
> java.util.Collections$UnmodifiableSet)
>         - locked <0x00000000e3c8b330> (a sun.nio.ch.EPollSelectorImpl)
>         at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:97)
>         at
> org.apache.tomcat.util.net.NioEndpoint$Poller.run(NioEndpoint.java:798)
>         at java.lang.Thread.run(Thread.java:748)
>
>    Locked ownable synchronizers:
>         - None
>
> ================================================================================
> "NioBlockingSelector.BlockPoller-1" #72 daemon prio=5 os_prio=0
> tid=0x00007faf591fa800 nid=0x4a77 runnable [0x00007faefd3e2000]
>    java.lang.Thread.State: RUNNABLE
>         at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
>         at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:269)
>         at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:93)
>         at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:86)
>         - locked <0x00000000e3c8bb78> (a sun.nio.ch.Util$3)
>         - locked <0x00000000e3c8bb68> (a
> java.util.Collections$UnmodifiableSet)
>         - locked <0x00000000e3c8ba40> (a sun.nio.ch.EPollSelectorImpl)
>         at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:97)
>         at
> org.apache.tomcat.util.net.NioBlockingSelector$BlockPoller.run(NioBlockingSelector.java:298)
>
>    Locked ownable synchronizers:
>
> ================================================================================
>
> b) arthas check https://github.com/alibaba/arthas
> i hook the process to see what are the two poller select(selectorTimeout)
> return,
>
> the keyCount = 0, but the read buffer have 191 byte to read, why epollWait
> always return keyCount = 0?
>
> the expected behavior is, tomcat can read the data from buffer and the
> close the socket successfully
>
> c) test with another method
> I change the protocol to "Http11Nio2Protocol", and the close_wait will be
> recycled
>
> I set java.nio.channels.spi.SelectorProvider
> with PollSelectorProvider, the close_wait will be recycled too
>

As you found out, this is a JVM bug and there are workarounds if you
experience it:
https://bz.apache.org/bugzilla/show_bug.cgi?id=63802

Rémy

Reply via email to