[ https://issues.apache.org/jira/browse/JENA-2044?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Alexander Bigerl updated JENA-2044:
-----------------------------------
    Attachment: fuseki-wikidata-2020-11-11_(docker_8c_J16g_R256g_K256g_phased).log

> tdb2.tdbloader crashses loading wikidata
> ----------------------------------------
>
>                 Key: JENA-2044
>                 URL: https://issues.apache.org/jira/browse/JENA-2044
>             Project: Apache Jena
>          Issue Type: Bug
>          Components: Cmd line tools, Jena, TDB2
>    Affects Versions: Jena 3.14.0, Jena 3.17.0
>         Environment: {code:bash}
> $ java --version
> openjdk 11.0.9.1 2020-11-04
> OpenJDK Runtime Environment (build 11.0.9.1+1-post-Debian-1deb10u2)
> OpenJDK 64-Bit Server VM (build 11.0.9.1+1-post-Debian-1deb10u2, mixed mode, sharing)
> $ uname -r
> 4.19.0-14-amd64
> $ lsb_release -da
> No LSB modules are available.
> Distributor ID: Debian
> Description:    Debian GNU/Linux 10 (buster)
> Release:        10
> Codename:       buster
> {code}
>            Reporter: Alexander Bigerl
>            Priority: Major
>         Attachments: FusekiWikidataLoaderDockerfile, fuseki-wikidata-2020-11-11_(docker_8c_J16g_R256g_K256g_phased).log, hs_err_pid28709.log
>
>
> Apache Jena crashes when loading Wikidata truthy 2020-11-11 (it is no longer available from Wikidata, but a backup can be found here: [https://hobbitdata.informatik.uni-leipzig.de/wikidata-20201111-truthy-BETA.nt.bz2]).
> The command run was:
> {code:bash}
> cgmemtime /upb/users/d/dice-gr/profiles/unix/cs/triplestore-benchmark/triplestores/jena/apache-jena-3.17.0/bin/tdb2.tdbloader \
>   --loader=sequential \
>   --loc /upb/users/d/dice-gr/profiles/unix/cs/triplestore-benchmark/databases/fuseki/wikidata-2020-11-11/ \
>   /upb/users/d/dice-gr/profiles/unix/cs/triplestore-benchmark/datasets/wikidata-2020-11-11/wikidata-20201111-truthy-BETA.nt \
>   2>&1 | tee /upb/users/d/dice-gr/profiles/unix/cs/triplestore-benchmark/logs/load/fuseki-wikidata-2020-11-11.log
> {code}
>
> The end of the logfile is:
> {code:bash}
> 06:56:52 INFO loader :: Add: 729,000,000 SPO->OSP (Batch: 237,642 / Avg: 300,050)
> 06:56:57 INFO loader :: Add: 730,000,000 SPO->OSP (Batch: 234,962 / Avg: 299,936)
> 06:56:57 INFO loader :: Elapsed: 2,433.85 seconds [2021/02/12 06:56:57 CET]
> 06:57:01 INFO loader :: Add: 731,000,000 SPO->OSP (Batch: 233,863 / Avg: 299,820)
> 06:57:05 INFO loader :: Add: 732,000,000 SPO->OSP (Batch: 269,978 / Avg: 299,775)
> 06:57:08 INFO loader :: Add: 733,000,000 SPO->OSP (Batch: 281,373 / Avg: 299,748)
> 06:57:12 INFO loader :: Add: 734,000,000 SPO->OSP (Batch: 285,143 / Avg: 299,727)
> 06:57:15 INFO loader :: Add: 735,000,000 SPO->OSP (Batch: 290,023 / Avg: 299,714)
> 06:57:19 INFO loader :: Add: 736,000,000 SPO->OSP (Batch: 290,951 / Avg: 299,701)
> #
> # There is insufficient memory for the Java Runtime Environment to continue.
> # Native memory allocation (malloc) failed to allocate 2097152 bytes for AllocateHeap
> # An error report file with more information is saved as:
> # /home/d/dice-gr/profiles/unix/cs/triplestore-benchmark/hs_err_pid28709.log
> Child user: 70961.063 s
> Child sys : 5817.787 s
> Child wall: 76243.666 s
> Child high-water RSS                    : 534037652 KiB
> Recursive and acc. high-water RSS+CACHE : 585081976 KiB
> {code}
> The machine has an AMD EPYC 7742 64-Core Processor, 1 TB RAM and 2 TB of free SSD storage on /home, so plenty of RAM should still have been available.
> At the time of the crash, the loc folder was 543 GB in size.
> I also tried with -loader=parallel and -loader=phased, with the same result.
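The cgmemtime summary at the end of the log reports memory in KiB, which is hard to compare against the machine's 1 TB of RAM at a glance. The following is a minimal sketch, assuming bash and awk are available; the helper name kib_to_gib is made up for illustration and is not part of cgmemtime or Jena. It only converts the two figures from the log into GiB.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of cgmemtime or Jena):
# convert a KiB count into GiB, where 1 GiB = 1048576 KiB.
kib_to_gib() {
  LC_ALL=C awk -v kib="$1" 'BEGIN { printf "%.1f\n", kib / 1048576 }'
}

rss_kib=534037652        # "Child high-water RSS" from the log
rss_cache_kib=585081976  # "Recursive and acc. high-water RSS+CACHE" from the log

echo "high-water RSS:       $(kib_to_gib "$rss_kib") GiB"
echo "high-water RSS+CACHE: $(kib_to_gib "$rss_cache_kib") GiB"
```

By this conversion the high-water RSS comes to roughly 509 GiB, about half of the installed RAM.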
> Edit 23/Feb/2021:
> Output of ulimit -a:
> {code:bash}
> core file size          (blocks, -c) 0
> data seg size           (kbytes, -d) unlimited
> scheduling priority             (-e) 0
> file size               (blocks, -f) unlimited
> pending signals                 (-i) 4110480
> max locked memory       (kbytes, -l) 65536
> max memory size         (kbytes, -m) unlimited
> open files                      (-n) 1024
> pipe size            (512 bytes, -p) 8
> POSIX message queues     (bytes, -q) 819200
> real-time priority              (-r) 0
> stack size              (kbytes, -s) 8192
> cpu time               (seconds, -t) unlimited
> max user processes              (-u) 4110480
> virtual memory          (kbytes, -v) unlimited
> file locks                      (-x) unlimited
> {code}
> Output from prlimit --pid=<PID of tdb2.tdbloader>:
> {code:bash}
> RESOURCE   DESCRIPTION                             SOFT      HARD UNITS
> AS         address space limit                unlimited unlimited bytes
> CORE       max core file size                         0 unlimited bytes
> CPU        CPU time                           unlimited unlimited seconds
> DATA       max data size                      unlimited unlimited bytes
> FSIZE      max file size                      unlimited unlimited bytes
> LOCKS      max number of file locks held      unlimited unlimited locks
> MEMLOCK    max locked-in-memory address space  67108864  67108864 bytes
> MSGQUEUE   max bytes in POSIX mqueues            819200    819200 bytes
> NICE       max nice prio allowed to raise             0         0
> NOFILE     max number of open files             1048576   1048576 files
> NPROC      max number of processes              4110480   4110480 processes
> RSS        max resident set size              unlimited unlimited bytes
> RTPRIO     max real-time priority                     0         0
> RTTIME     timeout for real-time tasks        unlimited unlimited microsecs
> SIGPENDING max number of pending signals        4110480   4110480 signals
> STACK      max stack size                       8388608 unlimited bytes
> {code}
> I tried several things. Nothing has worked so far. Here is what I tried:
> * Setting several JVM_ARGS to avoid resource over-consumption:
> {code:bash}
> -Xmx16g -XX:MaxDirectMemorySize=16g -Djdk.nio.maxCachedBufferSize=17179869184 -XX:ActiveProcessorCount=4
> {code}
> * Limiting :
> {code:bash}
> -Xmx16g -XX:MaxDirectMemorySize=16g -Djdk.nio.maxCachedBufferSize=17179869184 -XX:ActiveProcessorCount=4
> {code}
> * Limiting kernel shared memory to 16 GB (set in /etc/sysctl.conf, followed by a reboot):
> {code:bash}
> kernel.shmmax=17179869185 # 16 GB
> kernel.shmall=4194304 # *4096 pagesize = 16 GB
> {code}
> * Putting Jena into a Docker image and running it with the commands below. The idea was to limit the resources and have a default environment. I first tried with 16g of memory, but within a couple of hundred million triples the loading speed dropped below 10k triples/s. So I increased the resources, because I didn't have that much time on the server. The Dockerfile is provided as an attachment, as is the output. The output might be interesting here, because tdb2.tdbloader did not crash completely but simply logged an exception and has idled ever since.
> {code:bash}
> docker build -f FusekiWikidataLoaderDockerfile -t fuseki-wikidata_01 empty_folder/
> docker run --user $(id -u):$(id -g) --kernel-memory=256g --memory=256g \
>   --cpuset-cpus=0-7 --name fuseki_wikidata_8c_J16g_R256g_K256g_phased \
>   -v "$(pwd)"/databases:/databases -v "$(pwd)"/datasets:/datasets \
>   -v "$(pwd)"/logs:/logs fuseki-wikidata_01:latest
> {code}
> Side observation:
> * Fuseki's loading performance seems to be significantly influenced by Linux's cache. With 1 TB available, it had a read rate of ~2.5 GB/s (measured with htop). With only 16 GB, it dropped to at most 250 MB/s.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
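The byte values quoted in the report for jdk.nio.maxCachedBufferSize, kernel.shmmax and kernel.shmall can be sanity-checked with plain shell arithmetic. This is a minimal sketch, assuming bash with 64-bit integers and the 4096-byte page size given in the sysctl comment (getconf PAGESIZE reports the actual value on a given machine). The variable names are illustrative only.

```shell
#!/usr/bin/env bash
# Compare the configured values from the report against exactly 16 GiB.
gib16=$((16 * 1024 * 1024 * 1024))

page_size=4096                # assumed page size, as in the sysctl comment
shmall_pages=4194304          # kernel.shmall, counted in pages
shmmax_bytes=17179869185      # kernel.shmmax, counted in bytes
nio_cache_bytes=17179869184   # -Djdk.nio.maxCachedBufferSize, in bytes

echo "16 GiB                    = $gib16 bytes"
echo "shmall * page size        = $((shmall_pages * page_size)) bytes"
echo "shmmax minus 16 GiB       = $((shmmax_bytes - gib16)) byte(s)"
echo "nio cache size minus 16 GiB = $((nio_cache_bytes - gib16)) byte(s)"
```

Under these assumptions, kernel.shmall and jdk.nio.maxCachedBufferSize correspond to exactly 16 GiB, while the kernel.shmmax value as written is one byte larger.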