HI Mark,

following my previous reply, we have now confirmed that it's indeed 8.5.45
with APR 1.2.23 that's causing such high JVM CPU usage.
We used took out 2 out of 50 servers from the load balancer config,
reverted tomcat, and redeployed. With near to identical user traffic, the
two servers are responding normally without/without traffic with 8.5.41.
The JVM dump looks a lot better with 8.5.41.

We do think that the recent changes in APR and some other tomcat jar may
have caused compatibility issue on Windows server 2016 (64-bit) platform.
But unfortunately, we cannot pinpoint exactly what change may have caused
this (i.e. actual OS vs Security Updates). With this in mind, we are also
being wary to move to 8.5.47 as we don't know if the same issue will occur
again. Since 8.5.41 has been packaged with previously accepted application
installer, we are more comfortable rolling back.

I would appreciate if this can be looked into.

On Tue, 12 Nov 2019 at 11:27, M. Manna <manme...@gmail.com> wrote:

> Hey Mark (appreciate your response in US holiday time)
>
> On Tue, 12 Nov 2019 at 07:51, Mark Thomas <ma...@apache.org> wrote:
>
>> On November 12, 2019 12:54:53 AM UTC, "M. Manna" <manme...@gmail.com>
>> wrote:
>> >Just to give an update again:
>> >
>> >1) We reverted the APR to 1.2.21 - but observed no difference.
>> >2) We took 3 thread dumps over 1 min interval (without any user
>> >sessions) -
>> >All threads are tomcat's internal pool threads.
>> >
>> >When we checked the thread details (using fasthread.io) - we didn't see
>> >any
>> >of our application stack. Since there is no user traffic, this is
>> >coming
>> >from tomcat internally. At this stage, we cannot really figure out
>> >what's
>> >the root cause.
>> >
>> >Any help is appreciated.
>>
>> Migrated from what (full version info please)?
>>
>  from 8.5.41 to 8.5.45 (we migrate 3 times a year, last was in June)
>
>>
>> Operating system exact version?
>>
>  Microsoft Windows Server 2016 DataCentre (64-bit)
>
>>
>> JRE vendor and exact version?
>>
>  C:\jdk1.8.0\bin>java.exe -version
> java version "1.8.0_162"
> Java(TM) SE Runtime Environment (build 1.8.0_162-b12)
> Java HotSpot(TM) 64-Bit Server VM (build 25.162-b12, mixed mode)
>
>
>> Do you see the same behavior with the latest 8.5.x and latest Tomcat
>> Native?
>>
>   We are using APR 1.2.23 which I can also see in latest tomcat. Due to
> production due diligence we cannot roll to a different version that easily.
> Normally, we lag behind by 2 monthly releases of tomcat. We also reverted
> the APR to 1.2.21 (but no difference).
>
>>
>> What triggers this behaviour?
>>
>  That is quite strange. Due to US holidays, we had a low traffic on our
> servers, and nothing has crept in to suggest that it's application-driven.
> We took one tomcat instance out of 50 instances and removed all user
> sessions (i.e. no application activities or threads). Upon restart of
> tomcat, the CPU spike lingered past the initial servlet startup period. We
> monitored that over 1-2 hours but there was no difference.
>
>>
>> How often do you see this behaviour?
>>
> We took 2 sets of data
> 1) 3 Jstack dump based on 10 seconds interval.
> 2) 3 jstack dump based on 1 min interval.
>
> Both the above reveals that all background threads (http, pool etc.) were
> from tomcat. We didn't have any application threads lingered in those 3
> samples. So yes we see this almost all the time if we take samples.
> However, when we compared with pre-production instances (with Windows
> server R2 x64 bit), we don't see such abnormal spike. In fact, the
> application instance doesn't incur such a big CPU spike. Whilst composing
> this email, I am now thinking if the APR is indeed incompatible with
> WIndows Server R2 (or the presence of any Windows Updates) which blocks the
> native poll() call longer than usual.
>
> An example is that on Windows Server 2012 - APR poll() call takes about
> 30% CPU time - but with Windows Server 2016 it's almost always 95%.
>
>
>>
>> And anything else you think might be relevant.
>>
>
> We are using end-2-end encryption using APR (with Certificate and
> SSLConfig resource setup in server.xml). But it's survived past 3 tomcat
> upgrades without any issue.
> Except OS we don't have any obvious culprit identified at the moment.
>
> Thanks,
>
>>
>> Mark
>>
>> >
>> >Thanks,
>> >
>> >On Mon, 11 Nov 2019 at 20:57, M. Manna <manme...@gmail.com> wrote:
>> >
>> >> Hello All,
>> >>
>> >> Any thoughts regarding this? Slightly clueless at this point, so any
>> >> direction will be appreciated.
>> >>
>> >> We are seeing the poll taking all the CPU time. We are using
>> >> OperatingSystemMXBean.getProcessCpuLoad() and
>> >> OperatingSystemMXBean.getSystemCpuLoad() to get our metrics (then
>> >x100 to
>> >> get the pct).
>> >>
>> >> Thanks,
>> >>
>> >>
>> >> On Mon, 11 Nov 2019 at 17:46, M. Manna <manme...@gmail.com> wrote:
>> >>
>> >>> Hello,
>> >>>
>> >>> after migrating to 8.5.45, we are seeing a lot of cpu load by
>> >following
>> >>> JVM thread dump:
>> >>>
>> >>> "https-openssl-apr-0.0.0.0-8443-Poller" : 102 : RUNNABLE :
>> >>> cpu=172902703125000 : cpuLoad= 74.181015
>> >>>
>> >>> BlockedCount:8464 BlockedTime:0 LockName:null LockOwnerID:-1
>> >>> LockOwnerName:null
>> >>>
>> >>> WaitedCount:5397 WaitedTime:0 InNative:false IsSuspended:false at
>> >>> org.apache.tomcat.jni.Poll.poll(Poll.java:-2)
>> >>>
>> >>>             at
>> >>>
>> >org.apache.tomcat.util.net.AprEndpoint$Poller.run(AprEndpoint.java:1547)
>> >>>
>> >>>             at java.lang.Thread.run(Thread.java:748)
>> >>>
>> >>>
>> >>> These are coming after 2-3 successful jvm dump. Is this something
>> >>> familiar to anybody?
>> >>>
>> >>> Thanks,
>> >>>
>> >>
>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org
>> For additional commands, e-mail: users-h...@tomcat.apache.org
>>
>>

Reply via email to