I'm sorry I cannot offer any further information, as I do not really
understand Java in any meaningful way.
Just in case this provides some useful information:
The data corruption was limited to one named graph within the corrupted
dataset. The triples within that graph were present and could be
accessed when using the default union graph, but any query asking for
the graph name for these triples resulted in the same error, with this
added to the front:
BindingTDB ERROR get1(?add)
where the variable name was the one used in the SPARQL query for the
data within the graph.
If graph information was not asked for, the triples returned ok.
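The behaviour described above can be sketched against the SPARQL endpoint. This is a minimal illustration, not the actual queries used — the dataset name, endpoint, and variable names are hypothetical:

```shell
# Query over the union default graph: the triples come back fine
curl 'http://localhost:3030/ds/query' \
  --data-urlencode 'query=SELECT ?s ?o WHERE { ?s ?p ?o }'

# The same pattern wrapped in GRAPH, asking for the graph name,
# fails with the "BindingTDB ERROR get1(?g)" error
curl 'http://localhost:3030/ds/query' \
  --data-urlencode 'query=SELECT ?g ?s ?o WHERE { GRAPH ?g { ?s ?p ?o } }'
```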
And to repeat, this was not the dataset that was updated: that dataset
seems ok. I'm quite positive the corrupted dataset did not have any
activity going on at that moment.
One possibility that comes to mind: could the corruption have taken
place earlier? Is there something else that could explain the errors?
The database did start ok after I had freed some space on the disk; the
errors only manifested when I tried to compact the data, and the data
in the corrupted graph had not been used for a while.
Just as an FYI:
Since the actual graph entity was somehow corrupted, the data within it
could not be deleted or edited. The remedy was to export the data from
all named graphs, delete the dataset files, and import the data back;
luckily the data in the corrupted graph could easily be recreated.
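A sketch of that per-graph export/import remedy using the SPARQL Graph
Store Protocol (the endpoint, graph URIs, and file names are
hypothetical, and each recoverable graph would be handled in turn):

```shell
# Export one recoverable named graph as N-Triples
curl -o graph1.nt -H 'Accept: application/n-triples' \
  'http://localhost:3030/ds/data?graph=http://example.org/graph1'

# ...stop Fuseki, delete the dataset's files, recreate the dataset...

# Import the graph back into the fresh dataset
curl -X PUT -H 'Content-Type: application/n-triples' \
  --data-binary @graph1.nt \
  'http://localhost:3030/ds/data?graph=http://example.org/graph1'
```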
Best,
Harri
On 21.5.2021 14.14, Andy Seaborne wrote:
Hi,
The JVM crash with SIGBUS looks like:
https://bugs.openjdk.java.net/browse/JDK-8168628
and see the comment 22-05-2018
"This change has been backed out of JDK 11 as it break sparse files."
which refers to:
https://bugs.openjdk.java.net/browse/JDK-8191278
fixed in version 14. There looks to be a backport to Java 8 as well,
but that might be OpenJDK 8 only. Mikael is running java-8-oracle - not
clear whether that has the backport.
I can't connect that to why the Fuseki nodetable becomes broken,
because the transaction shouldn't have gone through. Even a partial
commit should be recovered (the journal is replayed on start-up if it
has entries).
Andy
On 21/05/2021 09:14, Harri Kiiskinen wrote:
I seem to be having similar problems to those M. Pesonen reported in
the other thread. Summary:
The Fuseki server encountered a "disk full" situation (see the log and
error report below) during an update, leading to a crash. After
restart, some parts of the database are corrupted: dump and compact
fail with NodeTableTRDF/Read exceptions (see the third error log
below), as do some queries, but not all.
The corruption has taken place in a different dataset from the one that
was being updated when the disk-full occurred.
Logs below,
best
Harri Kiiskinen
Fuseki log for "disk full" crash:
--------------------------------------------------------------------
fuseki-server[216149]: [2021-05-20 21:37:07] Fuseki INFO [182050] Update
fuseki-server[216149]: #
fuseki-server[216149]: # A fatal error has been detected by the Java Runtime Environment:
fuseki-server[216149]: #
fuseki-server[216149]: # SIGBUS (0x7) at pc=0x00007f2b608b7e15, pid=216149, tid=768713
...
--------------------------------------------------------------------
The error report /tmp/hs_err_pid216149.log:
----------------------------------------------------------------------
#
# A fatal error has been detected by the Java Runtime Environment:
#
# SIGBUS (0x7) at pc=0x00007f2b608b7e15, pid=216149, tid=768713
#
# JRE version: OpenJDK Runtime Environment (11.0.11+9) (build 11.0.11+9-Ubuntu-0ubuntu2.20.04)
# Java VM: OpenJDK 64-Bit Server VM (11.0.11+9-Ubuntu-0ubuntu2.20.04, mixed mode, sharing, tiered, compressed oops, g1 gc, linux-amd64)
# Problematic frame:
# v ~StubRoutines::jint_disjoint_arraycopy
#
# Core dump will be written. Default location: Core dumps may be processed with "/usr/share/apport/apport %p %s %c %d %P %E" (or dumping to //core.216149)
#
# If you would like to submit a bug report, please visit:
# https://bugs.launchpad.net/ubuntu/+source/openjdk-lts
#
--------------- S U M M A R Y ------------
Command Line: -Xmx2G org.apache.jena.fuseki.cmd.FusekiCmd --jetty-config=/etc/fuseki/fuseki-jetty-https.xml
Host: Intel(R) Xeon(R) Gold 6254 CPU @ 3.10GHz, 2 cores, 15G, Ubuntu 20.04.2 LTS
Time: Thu May 20 21:37:07 2021 EEST elapsed time: 1517007.667502 seconds (17d 13h 23m 27s)
...
Error message when running tdb2.tdbcompact
-----------------------------------------------------------------
org.apache.jena.tdb2.TDBException: NodeTableTRDF/Read
    at org.apache.jena.tdb2.store.nodetable.NodeTableTRDF.readNodeFromTable(NodeTableTRDF.java:87)
    at org.apache.jena.tdb2.store.nodetable.NodeTableNative._retrieveNodeByNodeId(NodeTableNative.java:103)
    at org.apache.jena.tdb2.store.nodetable.NodeTableNative.getNodeForNodeId(NodeTableNative.java:52)
    at org.apache.jena.tdb2.store.nodetable.NodeTableCache._retrieveNodeByNodeId(NodeTableCache.java:206)
    at org.apache.jena.tdb2.store.nodetable.NodeTableCache.getNodeForNodeId(NodeTableCache.java:131)
    ...
    at tdb2.tdbcompact.main(tdbcompact.java:28)
Caused by: org.apache.thrift.protocol.TProtocolException: Unrecognized type 0
    at org.apache.thrift.protocol.TProtocolUtil.skip(TProtocolUtil.java:144)
    at org.apache.thrift.protocol.TProtocolUtil.skip(TProtocolUtil.java:60)
    at org.apache.jena.riot.thrift.wire.RDF_Term.standardSchemeReadValue(RDF_Term.java:433)
    at org.apache.thrift.TUnion$TUnionStandardScheme.read(TUnion.java:224)
    at org.apache.thrift.TUnion$TUnionStandardScheme.read(TUnion.java:213)
    at org.apache.thrift.TUnion.read(TUnion.java:138)
    at org.apache.jena.tdb2.store.nodetable.NodeTableTRDF.readNodeFromTable(NodeTableTRDF.java:82)
    ... 27 more
-----------------------------------------------------------------------
--
Tutkijatohtori / post-doctoral researcher
Viral Culture in the Early Nineteenth-Century Europe (ViCE)
Movie Making Finland: Finnish fiction films as audiovisual big data,
1907–2017 (MoMaF)
Turun yliopisto / University of Turku