[ https://issues.apache.org/jira/browse/JENA-2044?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Alexander Bigerl updated JENA-2044:
-----------------------------------
Description: 

Apache Jena crashes when loading Wikidata truthy 2020-11-11 (it is no longer available via Wikidata, but a backup can be found here: [https://hobbitdata.informatik.uni-leipzig.de/wikidata-20201111-truthy-BETA.nt.bz2]).

The command run was:
{code:bash}
cgmemtime /upb/users/d/dice-gr/profiles/unix/cs/triplestore-benchmark/triplestores/jena/apache-jena-3.17.0/bin/tdb2.tdbloader --loader=sequential --loc /upb/users/d/dice-gr/profiles/unix/cs/triplestore-benchmark/databases/fuseki/wikidata-2020-11-11/ /upb/users/d/dice-gr/profiles/unix/cs/triplestore-benchmark/datasets/wikidata-2020-11-11/wikidata-20201111-truthy-BETA.nt 2>&1 | tee /upb/users/d/dice-gr/profiles/unix/cs/triplestore-benchmark/logs/load/fuseki-wikidata-2020-11-11.log
{code}

The end of the logfile is:
{code:bash}
06:56:52 INFO  loader :: Add: 729,000,000 SPO->OSP (Batch: 237,642 / Avg: 300,050)
06:56:57 INFO  loader :: Add: 730,000,000 SPO->OSP (Batch: 234,962 / Avg: 299,936)
06:56:57 INFO  loader :: Elapsed: 2,433.85 seconds [2021/02/12 06:56:57 CET]
06:57:01 INFO  loader :: Add: 731,000,000 SPO->OSP (Batch: 233,863 / Avg: 299,820)
06:57:05 INFO  loader :: Add: 732,000,000 SPO->OSP (Batch: 269,978 / Avg: 299,775)
06:57:08 INFO  loader :: Add: 733,000,000 SPO->OSP (Batch: 281,373 / Avg: 299,748)
06:57:12 INFO  loader :: Add: 734,000,000 SPO->OSP (Batch: 285,143 / Avg: 299,727)
06:57:15 INFO  loader :: Add: 735,000,000 SPO->OSP (Batch: 290,023 / Avg: 299,714)
06:57:19 INFO  loader :: Add: 736,000,000 SPO->OSP (Batch: 290,951 / Avg: 299,701)
#
# There is insufficient memory for the Java Runtime Environment to continue.
# Native memory allocation (malloc) failed to allocate 2097152 bytes for AllocateHeap
# An error report file with more information is saved as:
# /home/d/dice-gr/profiles/unix/cs/triplestore-benchmark/hs_err_pid28709.log
Child user:  70961.063 s
Child sys :   5817.787 s
Child wall:  76243.666 s
Child high-water RSS                    :  534037652 KiB
Recursive and acc. high-water RSS+CACHE :  585081976 KiB
{code}

The machine has an AMD EPYC 7742 64-core processor, 1 TB of RAM, and 2 TB of free SSD storage on /home, so there should still have been plenty of RAM available. The --loc folder was 543 GB at the time of the crash.

I also tried with --loader=parallel and --loader=phased. Same result.
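For reference: the hs_err log reports a failed 2 MiB native allocation while cgmemtime reports a high-water RSS of 534,037,652 KiB (~509 GiB), far more than any plausible Java heap, so presumably most of the resident memory is the memory-mapped TDB2 index files rather than JVM-managed memory. A minimal sketch (not part of the original runs; the pgrep pattern used to find the loader's JVM is an assumption) for splitting the loader's resident memory into file-backed and anonymous parts while it runs:
{code:bash}
# Hypothetical helper, not from the original report: sum the per-mapping Rss and
# Anonymous counters of the loader process; the difference is file-backed memory,
# i.e. mostly the mmap'ed TDB2 database files.
LOADER_PID=$(pgrep -f tdb2.tdbloader | head -n 1)   # assumption: this matches the loader JVM
awk '/^Rss:/       { rss  += $2 }
     /^Anonymous:/ { anon += $2 }
     END { printf "rss=%d KiB  anonymous=%d KiB  file-backed=%d KiB\n", rss, anon, rss - anon }' \
    /proc/"$LOADER_PID"/smaps
{code}
If nearly all of the RSS turns out to be file-backed, the kernel should in principle be able to reclaim those pages under memory pressure, which would make the failed 2 MiB malloc even more surprising.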
edit 23/Feb/2021:

ulimit -a
{code:bash}
core file size          (blocks, -c) 0
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 4110480
max locked memory       (kbytes, -l) 65536
max memory size         (kbytes, -m) unlimited
open files                      (-n) 1024
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) 8192
cpu time               (seconds, -t) unlimited
max user processes              (-u) 4110480
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited
{code}

Output from prlimit --pid=<PID of tdb2.tdbloader>:
{code:bash}
RESOURCE   DESCRIPTION                             SOFT      HARD UNITS
AS         address space limit                unlimited unlimited bytes
CORE       max core file size                         0 unlimited bytes
CPU        CPU time                           unlimited unlimited seconds
DATA       max data size                      unlimited unlimited bytes
FSIZE      max file size                      unlimited unlimited bytes
LOCKS      max number of file locks held      unlimited unlimited locks
MEMLOCK    max locked-in-memory address space  67108864  67108864 bytes
MSGQUEUE   max bytes in POSIX mqueues            819200    819200 bytes
NICE       max nice prio allowed to raise             0         0
NOFILE     max number of open files             1048576   1048576 files
NPROC      max number of processes              4110480   4110480 processes
RSS        max resident set size              unlimited unlimited bytes
RTPRIO     max real-time priority                     0         0
RTTIME     timeout for real-time tasks        unlimited unlimited microsecs
SIGPENDING max number of pending signals        4110480   4110480 signals
STACK      max stack size                       8388608 unlimited bytes
{code}

I tried several things. Nothing has worked so far. Here is what I tried:
* Setting several JVM_ARGS to avoid resource over-consumption:
{code:bash}
-Xmx16g -XX:MaxDirectMemorySize=16g -Djdk.nio.maxCachedBufferSize=17179869184 -XX:ActiveProcessorCount=4
{code}
* Setting limits for Metaspace and CodeCache:
{code:bash}
-XX:MaxMetaspaceSize=4g -XX:ReservedCodeCacheSize=4g
{code}
* Limiting kernel shared memory to 16 GB (set in /etc/sysctl.conf + reboot):
{code:bash}
kernel.shmmax=17179869185 # 16 GB
kernel.shmall=4194304     # *4096 page size = 16 GB
{code}
* Putting Jena into a Dockerfile and running it with the commands below. The idea was to limit the resources and have a default environment. I first tried with 16 GB of memory, but within a couple of hundred million triples the loading speed dropped below 10k triples/s, so I increased the resources because I didn't have that much time on the server. The Dockerfile is provided as an attachment, as is the output. The output might be interesting here, because tdb2.tdbloader did not crash completely but simply logged an exception and has idled ever since.
{code:bash}
docker build -f FusekiWikidataLoaderDockerfile -t fuseki-wikidata_01 empty_folder/
docker run --user $(id -u):$(id -g) --kernel-memory=256g --memory=256g --cpuset-cpus=0-7 --name fuseki_wikidata_8c_J16g_R256g_K256g_phased -v "$(pwd)"/databases:/databases -v "$(pwd)"/datasets:/datasets -v "$(pwd)"/logs:/logs fuseki-wikidata_01:latest
{code}

Error:
{code:bash}
05:27:31 INFO  loader :: Add: 3,156,000,000 Index (Batch: 57,074 / Avg: 159,787)
05:27:35 ERROR Indexer :: Interrupted
org.apache.jena.dboe.base.file.FileException: BlockMgrMapped.segmentAllocate: Segment = 16313 : Offset = 136843362304
        at org.apache.jena.dboe.base.file.BlockAccessMapped.allocSegment(BlockAccessMapped.java:218) ~[jena-dboe-base-3.17.0.jar:3.17.0]
        at org.apache.jena.dboe.base.file.BlockAccessMapped.getByteBuffer(BlockAccessMapped.java:150) ~[jena-dboe-base-3.17.0.jar:3.17.0]
        at org.apache.jena.dboe.base.file.BlockAccessMapped.allocate(BlockAccessMapped.java:80) ~[jena-dboe-base-3.17.0.jar:3.17.0]
        at org.apache.jena.dboe.base.block.BlockMgrFileAccess.allocate(BlockMgrFileAccess.java:51) ~[jena-dboe-base-3.17.0.jar:3.17.0]
        at org.apache.jena.dboe.base.block.BlockMgrBase.allocate(BlockMgrBase.java:39) ~[jena-dboe-base-3.17.0.jar:3.17.0]
        at org.apache.jena.dboe.base.block.BlockMgrWrapper.allocate(BlockMgrWrapper.java:41) ~[jena-dboe-base-3.17.0.jar:3.17.0]
        at org.apache.jena.dboe.base.block.BlockMgrFreeChain.allocate(BlockMgrFreeChain.java:44) ~[jena-dboe-base-3.17.0.jar:3.17.0]
        at org.apache.jena.dboe.base.page.PageBlockMgr.create(PageBlockMgr.java:53) ~[jena-dboe-base-3.17.0.jar:3.17.0]
        at org.apache.jena.dboe.trans.bplustree.BPTreeRecordsMgr.create(BPTreeRecordsMgr.java:77) ~[jena-dboe-trans-data-3.17.0.jar:3.17.0]
        at org.apache.jena.dboe.trans.bplustree.BPTreeRecords.create(BPTreeRecords.java:236) ~[jena-dboe-trans-data-3.17.0.jar:3.17.0]
        at org.apache.jena.dboe.trans.bplustree.BPTreeRecords.insertNewPage(BPTreeRecords.java:230) ~[jena-dboe-trans-data-3.17.0.jar:3.17.0]
        at org.apache.jena.dboe.trans.bplustree.BPTreeRecords.split(BPTreeRecords.java:206) ~[jena-dboe-trans-data-3.17.0.jar:3.17.0]
        at org.apache.jena.dboe.trans.bplustree.BPTreeNode.split(BPTreeNode.java:568) ~[jena-dboe-trans-data-3.17.0.jar:3.17.0]
        at org.apache.jena.dboe.trans.bplustree.BPTreeNode.internalInsert(BPTreeNode.java:509) ~[jena-dboe-trans-data-3.17.0.jar:3.17.0]
        at org.apache.jena.dboe.trans.bplustree.BPTreeNode.internalInsert(BPTreeNode.java:522) ~[jena-dboe-trans-data-3.17.0.jar:3.17.0]
        at org.apache.jena.dboe.trans.bplustree.BPTreeNode.internalInsert(BPTreeNode.java:522) ~[jena-dboe-trans-data-3.17.0.jar:3.17.0]
        at org.apache.jena.dboe.trans.bplustree.BPTreeNode.internalInsert(BPTreeNode.java:522) ~[jena-dboe-trans-data-3.17.0.jar:3.17.0]
        at org.apache.jena.dboe.trans.bplustree.BPTreeNode.insert(BPTreeNode.java:203) ~[jena-dboe-trans-data-3.17.0.jar:3.17.0]
        at org.apache.jena.dboe.trans.bplustree.BPlusTree.insertAndReturnOld(BPlusTree.java:278) ~[jena-dboe-trans-data-3.17.0.jar:3.17.0]
        at org.apache.jena.dboe.trans.bplustree.BPlusTree.insert(BPlusTree.java:271) ~[jena-dboe-trans-data-3.17.0.jar:3.17.0]
        at org.apache.jena.tdb2.store.tupletable.TupleIndexRecord.performAdd(TupleIndexRecord.java:94) ~[jena-tdb2-3.17.0.jar:3.17.0]
        at org.apache.jena.tdb2.store.tupletable.TupleIndexBase.add(TupleIndexBase.java:66) ~[jena-tdb2-3.17.0.jar:3.17.0]
        at org.apache.jena.tdb2.loader.main.Indexer.lambda$loadTuples$1(Indexer.java:134) ~[jena-tdb2-3.17.0.jar:3.17.0]
        at org.apache.jena.tdb2.loader.main.Indexer.stageIndex(Indexer.java:116) ~[jena-tdb2-3.17.0.jar:3.17.0]
        at org.apache.jena.tdb2.loader.main.Indexer.lambda$startBulk$0(Indexer.java:93) ~[jena-tdb2-3.17.0.jar:3.17.0]
        at java.lang.Thread.run(Thread.java:834) [?:?]
[53484.193s][warning][os,thread] Attempt to deallocate stack guard pages failed (0x00007f33244ff000-0x00007f3324503000).
05:27:36 ERROR Indexer :: Interrupted
org.apache.jena.dboe.base.file.FileException: BlockMgrMapped.segmentAllocate: Segment = 16459 : Offset = 138068099072
        at org.apache.jena.dboe.base.file.BlockAccessMapped.allocSegment(BlockAccessMapped.java:218) ~[jena-dboe-base-3.17.0.jar:3.17.0]
        at org.apache.jena.dboe.base.file.BlockAccessMapped.getByteBuffer(BlockAccessMapped.java:150) ~[jena-dboe-base-3.17.0.jar:3.17.0]
        at org.apache.jena.dboe.base.file.BlockAccessMapped.allocate(BlockAccessMapped.java:80) ~[jena-dboe-base-3.17.0.jar:3.17.0]
        at org.apache.jena.dboe.base.block.BlockMgrFileAccess.allocate(BlockMgrFileAccess.java:51) ~[jena-dboe-base-3.17.0.jar:3.17.0]
        at org.apache.jena.dboe.base.block.BlockMgrBase.allocate(BlockMgrBase.java:39) ~[jena-dboe-base-3.17.0.jar:3.17.0]
        at org.apache.jena.dboe.base.block.BlockMgrWrapper.allocate(BlockMgrWrapper.java:41) ~[jena-dboe-base-3.17.0.jar:3.17.0]
        at org.apache.jena.dboe.base.block.BlockMgrFreeChain.allocate(BlockMgrFreeChain.java:44) ~[jena-dboe-base-3.17.0.jar:3.17.0]
        at org.apache.jena.dboe.base.page.PageBlockMgr.create(PageBlockMgr.java:53) ~[jena-dboe-base-3.17.0.jar:3.17.0]
        at org.apache.jena.dboe.trans.bplustree.BPTreeRecordsMgr.create(BPTreeRecordsMgr.java:77) ~[jena-dboe-trans-data-3.17.0.jar:3.17.0]
        at org.apache.jena.dboe.trans.bplustree.BPTreeRecords.create(BPTreeRecords.java:236) ~[jena-dboe-trans-data-3.17.0.jar:3.17.0]
        at org.apache.jena.dboe.trans.bplustree.BPTreeRecords.insertNewPage(BPTreeRecords.java:230) ~[jena-dboe-trans-data-3.17.0.jar:3.17.0]
        at org.apache.jena.dboe.trans.bplustree.BPTreeRecords.split(BPTreeRecords.java:206) ~[jena-dboe-trans-data-3.17.0.jar:3.17.0]
        at org.apache.jena.dboe.trans.bplustree.BPTreeNode.split(BPTreeNode.java:568) ~[jena-dboe-trans-data-3.17.0.jar:3.17.0]
        at org.apache.jena.dboe.trans.bplustree.BPTreeNode.internalInsert(BPTreeNode.java:509) ~[jena-dboe-trans-data-3.17.0.jar:3.17.0]
        at org.apache.jena.dboe.trans.bplustree.BPTreeNode.internalInsert(BPTreeNode.java:522) ~[jena-dboe-trans-data-3.17.0.jar:3.17.0]
        at org.apache.jena.dboe.trans.bplustree.BPTreeNode.internalInsert(BPTreeNode.java:522) ~[jena-dboe-trans-data-3.17.0.jar:3.17.0]
        at org.apache.jena.dboe.trans.bplustree.BPTreeNode.internalInsert(BPTreeNode.java:522) ~[jena-dboe-trans-data-3.17.0.jar:3.17.0]
        at org.apache.jena.dboe.trans.bplustree.BPTreeNode.insert(BPTreeNode.java:203) ~[jena-dboe-trans-data-3.17.0.jar:3.17.0]
        at org.apache.jena.dboe.trans.bplustree.BPlusTree.insertAndReturnOld(BPlusTree.java:278) ~[jena-dboe-trans-data-3.17.0.jar:3.17.0]
        at org.apache.jena.dboe.trans.bplustree.BPlusTree.insert(BPlusTree.java:271) ~[jena-dboe-trans-data-3.17.0.jar:3.17.0]
        at org.apache.jena.tdb2.store.tupletable.TupleIndexRecord.performAdd(TupleIndexRecord.java:94) ~[jena-tdb2-3.17.0.jar:3.17.0]
        at org.apache.jena.tdb2.store.tupletable.TupleIndexBase.add(TupleIndexBase.java:66) ~[jena-tdb2-3.17.0.jar:3.17.0]
        at org.apache.jena.tdb2.loader.main.Indexer.lambda$loadTuples$1(Indexer.java:134) ~[jena-tdb2-3.17.0.jar:3.17.0]
        at org.apache.jena.tdb2.loader.main.Indexer.stageIndex(Indexer.java:116) ~[jena-tdb2-3.17.0.jar:3.17.0]
        at org.apache.jena.tdb2.loader.main.Indexer.lambda$startBulk$0(Indexer.java:93) ~[jena-tdb2-3.17.0.jar:3.17.0]
        at java.lang.Thread.run(Thread.java:834) [?:?]
[53485.632s][warning][os,thread] Attempt to deallocate stack guard pages failed (0x00007f33242fd000-0x00007f3324301000).
{code}
Side observation:
* Fuseki's loading performance seems to be significantly influenced by the Linux page cache. With 1 TB of RAM available it had a read rate of ~2.5 GB/s (measured with htop); with only 16 GB it dropped to at most 250 MB/s.
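A minimal sketch (not part of the original report) of how this read-rate difference can be watched from the outside: print the page-cache size and the loader's cumulative bytes read from storage once per second; htop's per-process I/O read rate is derived from the same read_bytes counter. The pgrep pattern is an assumption.
{code:bash}
# Hypothetical monitoring loop, not from the original report.
LOADER_PID=$(pgrep -f tdb2.tdbloader | head -n 1)   # assumption: this matches the loader JVM
while sleep 1; do
    cached=$(awk '/^Cached:/ {print $2}' /proc/meminfo)              # page cache, KiB
    reads=$(awk '/^read_bytes:/ {print $2}' /proc/"$LOADER_PID"/io)  # bytes fetched from storage
    echo "$(date +%T) page_cache=${cached}KiB loader_read_bytes=${reads}"
done
{code}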
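Regarding the BlockMgrMapped.segmentAllocate failures in the error log above: the reported offsets are exactly Segment × 8 MiB, which is consistent with the index files being memory-mapped in fixed 8 MiB segments (the fixed segment size is inferred from these numbers, not confirmed against the Jena source). In other words, the exception is thrown while trying to map yet another 8 MiB window roughly 137 GB into a single index file.
{code:bash}
# Check that Offset = Segment * 8 MiB for both reported failures.
echo $(( 16313 * 8 * 1024 * 1024 ))   # 136843362304 -> "Segment = 16313 : Offset = 136843362304"
echo $(( 16459 * 8 * 1024 * 1024 ))   # 138068099072 -> "Segment = 16459 : Offset = 138068099072"
{code}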
> tdb2.tdbloader crashes loading wikidata
> ----------------------------------------
>
>                 Key: JENA-2044
>                 URL: https://issues.apache.org/jira/browse/JENA-2044
>             Project: Apache Jena
>          Issue Type: Bug
>          Components: Cmd line tools, Jena, TDB2
>    Affects Versions: Jena 3.14.0, Jena 3.17.0
>         Environment: {code:bash}
> $ java --version
> openjdk 11.0.9.1 2020-11-04
> OpenJDK Runtime Environment (build 11.0.9.1+1-post-Debian-1deb10u2)
> OpenJDK 64-Bit Server VM (build 11.0.9.1+1-post-Debian-1deb10u2, mixed mode, sharing)
> $ uname -r
> 4.19.0-14-amd64
> $ lsb_release -da
> No LSB modules are available.
> Distributor ID: Debian
> Description:    Debian GNU/Linux 10 (buster)
> Release:        10
> Codename:       buster
> {code}
>            Reporter: Alexander Bigerl
>            Priority: Major
>         Attachments: FusekiWikidataLoaderDockerfile, fuseki-wikidata-2020-11-11_(docker_8c_J16g_R256g_K256g_phased).log, hs_err_pid28709.log
>
--
This message was sent by Atlassian Jira
(v8.3.4#803005)