Re: High CPU usage with Solr 7.7.0
Thanks all. I pushed changes last night; this should be fixed in 7.7.2, 8.1 and master.

Meanwhile, since this is a trivial change to one line, two ways to get by would be:
1> Just make the change yourself locally. Building Solr from scratch is actually not hard. The "ant package" target will get you the same thing you'd get from downloading the distribution.
2> Use Java 9 or greater.

Best,
Erick

> On Mar 25, 2019, at 1:58 AM, Lukas Weiss wrote:
>
> I forward this message. Thanks Adam.
>
> Hi,
> Apologies, I can't figure out how to reply to the Solr mailing list.
> I just ran across the same high CPU usage issue. I believe it's caused by
> this commit, which was introduced in Solr 7.7.0:
> https://github.com/apache/lucene-solr/commit/eb652b84edf441d8369f5188cdd5e3ae2b151434#diff-e54b251d166135a1afb7938cfe152bb5
> There is a bug in JDK versions <= 8 where using 0 threads in the
> ScheduledThreadPool causes high CPU usage:
> https://bugs.openjdk.java.net/browse/JDK-8129861
> Oddly, the latest version of
> solr/core/src/java/org/apache/solr/update/CommitTracker.java on master
> still uses 0 executors as the default. Presumably almost everyone is
> using JDK 9 or greater, which has the bug fixed, so they don't experience
> the bug.
> Feel free to relay this back to the mailing list.
> Thanks,
> Adam Guthrie
>
> From: "Lukas Weiss"
> To: solr-user@lucene.apache.org
> Date: 27.02.2019 11:13
> Subject: High CPU usage with Solr 7.7.0
>
> [...]
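For anyone who wants to see the underlying JDK behavior outside of Solr: the sketch below is illustrative only (the class and method names are made up; this is not Solr's CommitTracker code). On JDK <= 8, a ScheduledThreadPoolExecutor built with zero core threads can leave a worker busy-spinning (JDK-8129861); raising the core pool size to at least 1 is the same kind of one-line fix, and the pool still runs scheduled work normally:

```java
import java.util.concurrent.ScheduledThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class ZeroCorePoolDemo {

    // Schedules one delayed task and reports whether it ran.
    static boolean runDemo() throws Exception {
        // JDK-8129861: on JDK <= 8, a pool created with 0 core threads can
        // keep one worker spinning, pegging a CPU core even while idle.
        ScheduledThreadPoolExecutor executor = new ScheduledThreadPoolExecutor(0);

        // The one-line style of fix: guarantee at least one core thread.
        executor.setCorePoolSize(1);

        final boolean[] ran = {false};
        executor.schedule(() -> ran[0] = true, 50, TimeUnit.MILLISECONDS);

        // Delayed tasks still execute after shutdown() by default.
        executor.shutdown();
        executor.awaitTermination(5, TimeUnit.SECONDS);
        return ran[0];
    }

    public static void main(String[] args) throws Exception {
        System.out.println("scheduled task ran: " + runDemo());
    }
}
```

If you comment out the setCorePoolSize(1) line and keep the JVM alive on an affected JDK 8, you should be able to watch a java thread peg a core in top, matching the symptom reported in this thread.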
Re: High CPU usage with Solr 7.7.0
I forward this message. Thanks Adam.

[...]

From: "Lukas Weiss"
To: solr-user@lucene.apache.org
Date: 27.02.2019 11:13
Subject: High CPU usage with Solr 7.7.0

[...]
Re: Re: Re: High CPU usage with Solr 7.7.0
org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:683)
java.lang.Thread.run(Thread.java:748)
75.6621ms 20.ms

ShutdownMonitor (12)
java.net.PlainSocketImpl.socketAccept(Native Method)
java.net.AbstractPlainSocketImpl.accept(AbstractPlainSocketImpl.java:409)
java.net.ServerSocket.implAccept(ServerSocket.java:545)
java.net.ServerSocket.accept(ServerSocket.java:513)
org.eclipse.jetty.server.ShutdownMonitor$ShutdownMonitorRunnable.run(ShutdownMonitor.java:335)
java.lang.Thread.run(Thread.java:748)
0.3767ms 0.ms

Signal Dispatcher (5)
0.0362ms 0.ms

Finalizer (3)
java.lang.ref.ReferenceQueue$Lock@448b0df5
java.lang.Object.wait(Native Method)
java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:144)
java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:165)
java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:216)
8.2488ms 0.ms

Reference Handler (2)
java.lang.ref.Reference$Lock@19ced464
java.lang.Object.wait(Native Method)
java.lang.Object.wait(Object.java:502)
java.lang.ref.Reference.tryHandlePending(Reference.java:191)
java.lang.ref.Reference$ReferenceHandler.run(Reference.java:153)

From: "Tomás Fernández Löbbe"
To: solr-user@lucene.apache.org
Date: 27.02.2019 19:34
Subject: Re: Re: High CPU usage with Solr 7.7.0

Maybe a thread dump would be useful if you still have some instance running on 7.7.

On Wed, Feb 27, 2019 at 7:28 AM Lukas Weiss wrote:
> [...]
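Per-thread CPU accounting can also be collected in-process. A hypothetical helper (the class name is made up) using the standard java.lang.management API to list each thread's accumulated CPU time; on an affected instance, a busy-spinning pool thread stands out by CPU time that keeps growing while the stack shows it doing nothing useful:

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

public class HotThreads {

    // Prints each live thread's name and accumulated CPU time,
    // and returns the number of threads seen.
    public static int dump() {
        ThreadMXBean bean = ManagementFactory.getThreadMXBean();
        if (bean.isThreadCpuTimeSupported() && !bean.isThreadCpuTimeEnabled()) {
            bean.setThreadCpuTimeEnabled(true);
        }
        ThreadInfo[] infos = bean.dumpAllThreads(false, false);
        for (ThreadInfo info : infos) {
            // getThreadCpuTime returns -1 if the thread has already died.
            long nanos = bean.getThreadCpuTime(info.getThreadId());
            System.out.printf("%-40s %10.1f ms%n",
                    info.getThreadName(), nanos / 1_000_000.0);
        }
        return infos.length;
    }

    public static void main(String[] args) {
        dump();
    }
}
```

Against a live Solr you would more typically run jstack <pid> a few times and compare the dumps, as suggested in this thread; the in-process variant is just for illustration.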
Re: Re: High CPU usage with Solr 7.7.0
Maybe a thread dump would be useful if you still have some instance running on 7.7.

On Wed, Feb 27, 2019 at 7:28 AM Lukas Weiss wrote:
> I can confirm this. Downgrading to 7.6.0 solved the issue.
> Thanks for the hint.
>
> [...]
Re: Re: High CPU usage with Solr 7.7.0
I can confirm this. Downgrading to 7.6.0 solved the issue.
Thanks for the hint.

From: "Joe Obernberger"
To: solr-user@lucene.apache.org, "Lukas Weiss"
Date: 27.02.2019 15:59
Subject: Re: High CPU usage with Solr 7.7.0

[...]
Re: High CPU usage with Solr 7.7.0
Just to add to this. We upgraded to 7.7.0 and saw very large CPU usage
on multi-core boxes - sustained in the 1200% range. We then switched to
7.6.0 (no other configuration changes) and the problem went away.

We have a 40 node cluster and all 40 nodes had high CPU usage with 3
indexes stored on HDFS.

-Joe

On 2/27/2019 5:04 AM, Lukas Weiss wrote:

Hello,

we recently updated our Solr server from 6.6.5 to 7.7.0. Since then, we
have had problems with the server's CPU usage.
We have two Solr cores configured, but even if we clear all indexes and do
not start the index process, we see 100% CPU usage for both cores.

Here's what our top says:

root@solr:~ # top
top - 09:25:24 up 17:40, 1 user, load average: 2,28, 2,56, 2,68
Threads: 74 total, 3 running, 71 sleeping, 0 stopped, 0 zombie
%Cpu0 : 100,0 us, 0,0 sy, 0,0 ni, 0,0 id, 0,0 wa, 0,0 hi, 0,0 si, 0,0 st
%Cpu1 : 100,0 us, 0,0 sy, 0,0 ni, 0,0 id, 0,0 wa, 0,0 hi, 0,0 si, 0,0 st
%Cpu2 : 11,3 us, 1,0 sy, 0,0 ni, 86,7 id, 0,7 wa, 0,0 hi, 0,3 si, 0,0 st
%Cpu3 : 3,0 us, 3,0 sy, 0,0 ni, 93,7 id, 0,3 wa, 0,0 hi, 0,0 si, 0,0 st
KiB Mem : 8388608 total, 7859168 free, 496744 used, 32696 buff/cache
KiB Swap: 2097152 total, 2097152 free, 0 used. 7859168 avail Mem

  PID USER  PR NI    VIRT    RES   SHR S %CPU %MEM    TIME+ COMMAND  P
10209 solr  20  0 6138468 452520 25740 R 99,9  5,4 29:43.45 java
-server -Xms1024m -Xmx1024m -XX:NewRatio=3 -XX:SurvivorRatio=4
-XX:TargetSurvivorRatio=90 -XX:MaxTenuringThreshold=8
-XX:+UseConcMarkSweepGC -XX:ConcGCThreads=4 +  24
10214 solr  20  0 6138468 452520 25740 R 99,9  5,4 28:42.91 java
-server -Xms1024m -Xmx1024m -XX:NewRatio=3 -XX:SurvivorRatio=4
-XX:TargetSurvivorRatio=90 -XX:MaxTenuringThreshold=8
-XX:+UseConcMarkSweepGC -XX:ConcGCThreads=4 +  25

The Solr server is installed on a Debian Stretch 9.8 (64bit) Linux LXC
dedicated container.

Some more server info:

root@solr:~ # java -version
openjdk version "1.8.0_181"
OpenJDK Runtime Environment (build 1.8.0_181-8u181-b13-2~deb9u1-b13)
OpenJDK 64-Bit Server VM (build 25.181-b13, mixed mode)

root@solr:~ # free -m
      total  used  free  shared  buff/cache  available
Mem:   8192   484  7675     701          31        7675
Swap:  2048     0  2048

We also found something strange: if we do an strace of the main process, we
get lots of ongoing connection timeouts:

root@solr:~ # strace -F -p 4136
strace: Process 4136 attached with 48 threads
strace: [ Process PID=11089 runs in x32 mode. ]
[pid  4937] epoll_wait(139,
[pid  4936] restart_syscall(<... resuming interrupted futex ...>
[pid  4909] restart_syscall(<... resuming interrupted futex ...>
[pid  4618] epoll_wait(136,
[pid  4576] futex(0x7ff61ce66474, FUTEX_WAIT_PRIVATE, 1, NULL
[pid  4279] futex(0x7ff61ce62b34, FUTEX_WAIT_PRIVATE, 2203, NULL
[pid  4244] restart_syscall(<... resuming interrupted futex ...>
[pid  4227] futex(0x7ff56c71ae14, FUTEX_WAIT_PRIVATE, 2237, NULL
[pid  4243] restart_syscall(<... resuming interrupted futex ...>
[pid  4228] futex(0x7ff5608331a4, FUTEX_WAIT_PRIVATE, 2237, NULL
[pid  4208] futex(0x7ff61ce63e54, FUTEX_WAIT_PRIVATE, 5, NULL
[pid  4205] restart_syscall(<... resuming interrupted futex ...>
[pid  4204] restart_syscall(<... resuming interrupted futex ...>
[pid  4196] restart_syscall(<... resuming interrupted futex ...>
[pid  4195] restart_syscall(<... resuming interrupted futex ...>
[pid  4194] restart_syscall(<... resuming interrupted futex ...>
[pid  4193] restart_syscall(<... resuming interrupted futex ...>
[pid  4187] restart_syscall(<... resuming interrupted restart_syscall ...>
[pid  4180] restart_syscall(<... resuming interrupted futex ...>
[pid  4179] restart_syscall(<... resuming interrupted futex ...>
[pid  4177] restart_syscall(<... resuming interrupted futex ...>
[pid  4174] accept(133,
[pid  4173] restart_syscall(<... resuming interrupted futex ...>
[pid  4172] restart_syscall(<... resuming interrupted futex ...>
[pid  4171] restart_syscall(<... resuming interrupted restart_syscall ...>
[pid  4165] restart_syscall(<... resuming interrupted futex ...>
[pid  4164] futex(0x7ff61c1f5054, FUTEX_WAIT_PRIVATE, 3, NULL
[pid  4163] restart_syscall(<... resuming interrupted futex ...>
[pid  4162] restart_syscall(<... resuming interrupted futex ...>
[pid  4161] restart_syscall(<... resuming interrupted futex ...>
[pid  4160] futex(0x7ff623d52c20, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 0, NULL, 0x
[pid  4159] futex(0x7ff61c1e9d54, FUTEX_WAIT_PRIVATE, 7, NULL
[pid  4158] futex(0x7ff61c1b7f54, FUTEX_WAIT_PRIVATE, 15, NULL
[pid  4157] futex(0x7ff61c1b5554, FUTEX_WAIT_PRIVATE, 19, NULL
[pid  4156] restart_syscall(<... resuming interrupted futex ...>
[pid  4155] restart_syscall(<... resuming interrupted futex ...>
[pid  4153] futex(0x7ff61c06c754, FUTEX_WAIT_PRIVATE, 7, NULL
[pid  4152]