Hi, just a gentle ping on this ticket. I am adding a few clarifications inline below.
On 21 Oct 2023 at 00:10:34, Cristian Zamfir <[email protected]> wrote:

> Hello!
>
> I have been using the tika docker image pretty much out of the box so far and I am puzzled by an OOM issue that has been going on for a while now: despite quite conservative memory limits given to the JVM, for both heap and total max memory, containers still crash with OOM.
> These are the settings I am using inside containers capped at 6GB of memory, using tika-server with the watchdog config:
>
> <forkedJvmArgs>
>   <arg>-Xmx3g</arg>
>   <arg>-Dlog4j.configurationFile=log4j2.xml</arg>
>   <arg>-XX:+UseContainerSupport</arg>
>   <arg>-XX:+UnlockExperimentalVMOptions</arg>
>   <arg>-XX:MaxRAMPercentage=30</arg>
> </forkedJvmArgs>

My preliminary conclusion is that the JVM cannot always enforce these flags quickly enough: the cgroup limit is reached and the kernel OOM killer steps in before the forked process is terminated. Did anyone else experience this?

> With these settings, the JVM quite often deals well with terminating processes that hit the memory cap, and the watchdog restarts them:
>
> [pool-2-thread-1] 21:38:26,395 org.apache.tika.server.core.TikaServerWatchDog forked process exited with exit value 137
>
> However, from time to time the JVM does not seem able to deal with it, the OS kicks in and the container is killed with OOM. My only explanation so far is that the JVM is too slow to kill the forked process and the memory usage blows up quite quickly. You can see below how the total-vm values are close to 6GB at OOM time. This does not make sense IMO: the JVM should kill these processes well before reaching e.g. the 5613608kB value, and the forked process should not exceed 1.8GB if we take MaxRAMPercentage into account.
>
> Another puzzling fact is that the anon + file RSS do not really add up to the total-vm size, so I am guessing this is not actually due to the heap. Could this be caused by some native code?
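Clarification on the native memory question: to figure out whether off-heap allocations explain the gap, I am planning to log the JVM's own view of heap, non-heap and direct-buffer usage from inside the forked parser process, and to run it with -XX:NativeMemoryTracking=summary so I can compare against `jcmd <pid> VM.native_memory summary`. A rough sketch of the logging helper I have in mind (the class name and the way it gets invoked are mine, not anything that ships with Tika):

import java.lang.management.BufferPoolMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;

// Logs the JVM's own accounting of heap, non-heap and buffer-pool memory,
// to compare against the anon-rss / total-vm numbers the kernel reports.
public class MemoryBreakdown {

    public static void log() {
        MemoryMXBean mem = ManagementFactory.getMemoryMXBean();
        System.out.println("heap used     : " + mem.getHeapMemoryUsage().getUsed());
        System.out.println("heap committed: " + mem.getHeapMemoryUsage().getCommitted());
        System.out.println("non-heap used : " + mem.getNonHeapMemoryUsage().getUsed());

        // Direct and mapped byte buffers live outside the heap, so they are
        // not bounded by -Xmx / MaxRAMPercentage.
        for (BufferPoolMXBean pool :
                ManagementFactory.getPlatformMXBeans(BufferPoolMXBean.class)) {
            System.out.println("buffer pool '" + pool.getName() + "': " + pool.getMemoryUsed());
        }
    }

    public static void main(String[] args) {
        log();
    }
}

My assumption is that if the non-heap and buffer-pool numbers stay small while anon-rss keeps growing, the extra memory is coming from native allocations that neither -Xmx nor MaxRAMPercentage can cap.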
> dmesg -T | grep "Killed process"
>
> [Fri Oct 20 21:14:13 2023] Memory cgroup out of memory: Killed process 109549 (java) total-vm:5632740kB, anon-rss:1036696kB, file-rss:24668kB, shmem-rss:0kB, UID:35002 pgtables:2532kB oom_score_adj:-997
> [Fri Oct 20 21:14:27 2023] Memory cgroup out of memory: Killed process 109713 (java) total-vm:5613608kB, anon-rss:1029280kB, file-rss:24380kB, shmem-rss:0kB, UID:35002 pgtables:2456kB oom_score_adj:-997
> [Fri Oct 20 21:14:34 2023] Memory cgroup out of memory: Killed process 109839 (java) total-vm:5607392kB, anon-rss:976664kB, file-rss:24116kB, shmem-rss:0kB, UID:35002 pgtables:2336kB oom_score_adj:-997
> [Fri Oct 20 21:14:52 2023] Memory cgroup out of memory: Killed process 109970 (java) total-vm:5598332kB, anon-rss:954312kB, file-rss:24592kB, shmem-rss:0kB, UID:35002 pgtables:2272kB oom_score_adj:-997
> [Fri Oct 20 21:15:19 2023] Memory cgroup out of memory: Killed process 110089 (java) total-vm:5615776kB, anon-rss:946484kB, file-rss:24672kB, shmem-rss:0kB, UID:35002 pgtables:2280kB oom_score_adj:-997
> [Fri Oct 20 21:15:29 2023] Memory cgroup out of memory: Killed process 110269 (java) total-vm:5602004kB, anon-rss:948548kB, file-rss:24412kB, shmem-rss:0kB, UID:35002 pgtables:2280kB oom_score_adj:-997
> [Fri Oct 20 21:15:42 2023] Memory cgroup out of memory: Killed process 110367 (java) total-vm:5607104kB, anon-rss:942636kB, file-rss:24524kB, shmem-rss:0kB, UID:35002 pgtables:2284kB oom_score_adj:-997
> [Fri Oct 20 21:16:07 2023] Memory cgroup out of memory: Killed process 110464 (java) total-vm:5593792kB, anon-rss:940524kB, file-rss:24712kB, shmem-rss:0kB, UID:35002 pgtables:2216kB oom_score_adj:-997
> [Fri Oct 20 21:16:17 2023] Memory cgroup out of memory: Killed process 110684 (java) total-vm:5627620kB, anon-rss:910000kB, file-rss:24340kB, shmem-rss:0kB, UID:35002 pgtables:2224kB oom_score_adj:-997
> [Fri Oct 20 21:16:25 2023] Memory cgroup out of memory: Killed process 110798 (java) total-vm:5616588kB, anon-rss:889436kB, file-rss:24500kB, shmem-rss:0kB, UID:35002 pgtables:2216kB oom_score_adj:-997
> [Fri Oct 20 21:16:31 2023] Memory cgroup out of memory: Killed process 110939 (java) total-vm:5619708kB, anon-rss:839724kB, file-rss:23796kB, shmem-rss:0kB, UID:35002 pgtables:2100kB oom_score_adj:-997
> [Fri Oct 20 21:16:43 2023] Memory cgroup out of memory: Killed process 111042 (java) total-vm:5601976kB, anon-rss:807116kB, file-rss:24420kB, shmem-rss:0kB, UID:35002 pgtables:2000kB oom_score_adj:-997
> [Fri Oct 20 21:17:03 2023] Memory cgroup out of memory: Killed process 111165 (java) total-vm:5599008kB, anon-rss:792704kB, file-rss:24724kB, shmem-rss:0kB, UID:35002 pgtables:1944kB oom_score_adj:-997
> [Fri Oct 20 21:17:09 2023] Memory cgroup out of memory: Killed process 111317 (java) total-vm:5612224kB, anon-rss:767304kB, file-rss:24400kB, shmem-rss:0kB, UID:35002 pgtables:1984kB oom_score_adj:-997
> [Fri Oct 20 21:17:16 2023] Memory cgroup out of memory: Killed process 111427 (java) total-vm:5613572kB, anon-rss:739720kB, file-rss:24196kB, shmem-rss:0kB, UID:35002 pgtables:1892kB oom_score_adj:-997
> [Fri Oct 20 21:17:28 2023] Memory cgroup out of memory: Killed process 111525 (java) total-vm:5603008kB, anon-rss:737940kB, file-rss:24796kB, shmem-rss:0kB, UID:35002 pgtables:1860kB oom_score_adj:-997
> [Fri Oct 20 21:17:36 2023] Memory cgroup out of memory: Killed process 111620 (java) total-vm:5602048kB, anon-rss:728384kB, file-rss:24480kB, shmem-rss:0kB, UID:35002 pgtables:1828kB oom_score_adj:-997
> [Fri Oct 20 21:17:43 2023] Memory cgroup out of memory: Killed process 111711 (java) total-vm:5601984kB, anon-rss:710832kB, file-rss:24648kB, shmem-rss:0kB, UID:35002 pgtables:1804kB oom_score_adj:-997
> [Fri Oct 20 21:17:55 2023] Memory cgroup out of memory: Killed process 111776 (java) total-vm:5594816kB, anon-rss:709584kB, file-rss:24444kB, shmem-rss:0kB, UID:35002 pgtables:1824kB oom_score_adj:-997
>
> I guess my question is whether I am missing something that explains this, and whether I could configure tika-server to preempt this issue.
>
> Going forward, however, I realize that I need to set up the following three things, and I have a question for each:
>
> 1. concurrency control, to avoid overwhelming tika-server (it seems I can only control concurrency on the sender side, since tika-server does not provide a way to limit the number of concurrent requests). Is that correct?

AFAIU this is currently not possible except by moving to Tika pipes; I just wanted to check that that is accurate. For now I am planning to cap concurrency on the sender side (see the sketch after the quoted text below).

> 2. request isolation, so that a single file cannot bring down an entire instance -> is the only recommended solution to use tika pipes?

Is there a plan to implement isolation between requests in the standalone Tika server?

> 3. timeouts and memory limits per request, so that a single request cannot go haywire and use too much CPU and/or memory -> is there already a way to configure this that I may have missed?

I remember seeing a per-request timeout mentioned somewhere on the list, but I cannot find it now.

Thanks!
Cristi

> Thanks! I realize these are a lot of questions 🙂
>
> Cristi
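Re questions 1 and 3: since there does not seem to be a server-side knob for the number of concurrent requests, my current plan is to enforce both the concurrency cap and a per-request timeout on the sender side. Below is a minimal sketch of what I mean, assuming plain HTTP PUTs against the /tika endpoint on the default port; the permit count and timeout values are placeholders I made up:

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.file.Path;
import java.time.Duration;
import java.util.concurrent.Semaphore;

public class TikaSender {

    // Sender-side concurrency control: never more than 4 requests in flight.
    private static final Semaphore PERMITS = new Semaphore(4);

    private static final HttpClient CLIENT = HttpClient.newBuilder()
            .connectTimeout(Duration.ofSeconds(5))
            .build();

    public static String extract(Path file) throws Exception {
        PERMITS.acquire();
        try {
            HttpRequest request = HttpRequest.newBuilder(URI.create("http://localhost:9998/tika"))
                    .timeout(Duration.ofSeconds(120))   // per-request timeout, enforced by the caller
                    .header("Accept", "text/plain")
                    .PUT(HttpRequest.BodyPublishers.ofFile(file))
                    .build();
            return CLIENT.send(request, HttpResponse.BodyHandlers.ofString()).body();
        } finally {
            PERMITS.release();
        }
    }
}

I assume the client-side timeout only protects the caller and does not stop the forked parser from continuing to chew on the file, so the watchdog presumably still has to handle runaway parses on the server side; if there is an equivalent server-side setting I would prefer that.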
