Hi,

We've been evaluating and using Jena for about 1.5 years now, but have
recently run into a perplexing issue. In a number of different scenarios
and ways of using Jena, we are getting exceptions like the one below:

org.apache.jena.tdb2.TDBException: NodeTableTRDF/Read
    at org.apache.jena.tdb2.store.nodetable.NodeTableTRDF.readNodeFromTable(NodeTableTRDF.java:87) ~[fuseki-server.jar:4.8.0]
    at org.apache.jena.tdb2.store.nodetable.NodeTableNative._retrieveNodeByNodeId(NodeTableNative.java:102) ~[fuseki-server.jar:4.8.0]
    at org.apache.jena.tdb2.store.nodetable.NodeTableNative.getNodeForNodeId(NodeTableNative.java:52) ~[fuseki-server.jar:4.8.0]
    at org.apache.jena.tdb2.store.nodetable.NodeTableCache._retrieveNodeByNodeId(NodeTableCache.java:208) ~[fuseki-server.jar:4.8.0]
    at org.apache.jena.tdb2.store.nodetable.NodeTableCache.getNodeForNodeId(NodeTableCache.java:133) ~[fuseki-server.jar:4.8.0]
    at org.apache.jena.tdb2.store.nodetable.NodeTableWrapper.getNodeForNodeId(NodeTableWrapper.java:52) ~[fuseki-server.jar:4.8.0]
    at org.apache.jena.tdb2.store.nodetable.NodeTableInline.getNodeForNodeId(NodeTableInline.java:65) ~[fuseki-server.jar:4.8.0]
    at org.apache.jena.tdb2.solver.BindingTDB.get1(BindingTDB.java:126) ~[fuseki-server.jar:4.8.0]
    at org.apache.jena.sparql.engine.binding.BindingBase.get(BindingBase.java:111) ~[fuseki-server.jar:4.8.0]
    at org.apache.jena.sparql.core.Var.lookup(Var.java:103) ~[fuseki-server.jar:4.8.0]
    at org.apache.jena.sparql.core.Var.lookup(Var.java:98) ~[fuseki-server.jar:4.8.0]
    at org.apache.jena.sparql.core.Substitute.substitute(Substitute.java:133) ~[fuseki-server.jar:4.8.0]
    at org.apache.jena.sparql.core.Substitute.substitute(Substitute.java:119) ~[fuseki-server.jar:4.8.0]
    at org.apache.jena.sparql.modify.TemplateLib.subst(TemplateLib.java:149) ~[fuseki-server.jar:4.8.0]
    at org.apache.jena.sparql.modify.TemplateLib$2.apply(TemplateLib.java:108) ~[fuseki-server.jar:4.8.0]
    at org.apache.jena.sparql.modify.TemplateLib$2.apply(TemplateLib.java:98) ~[fuseki-server.jar:4.8.0]
    at org.apache.jena.atlas.iterator.Iter$IterMap.next(Iter.java:417) ~[fuseki-server.jar:4.8.0]
    at org.apache.jena.ext.com.google.common.collect.Iterators$ConcatenatedIterator.hasNext(Iterators.java:1400) ~[fuseki-server.jar:4.8.0]
    at java.util.Iterator.forEachRemaining(Unknown Source) ~[?:?]
    at org.apache.jena.sparql.exec.QueryExecDataset.constructDataset(QueryExecDataset.java:228) ~[fuseki-server.jar:4.8.0]
    at org.apache.jena.sparql.exec.QueryExec.constructDataset(QueryExec.java:166) ~[fuseki-server.jar:4.8.0]
    at org.apache.jena.sparql.exec.QueryExecutionAdapter.execConstructDataset(QueryExecutionAdapter.java:172) ~[fuseki-server.jar:4.8.0]
    at org.apache.jena.fuseki.servlets.SPARQLQueryProcessor.executeQuery(SPARQLQueryProcessor.java:391) ~[fuseki-server.jar:4.8.0]
    at org.apache.jena.fuseki.servlets.SPARQLQueryProcessor.execute(SPARQLQueryProcessor.java:279) ~[fuseki-server.jar:4.8.0]
    at org.apache.jena.fuseki.servlets.SPARQLQueryProcessor.executeWithParameter(SPARQLQueryProcessor.java:224) ~[fuseki-server.jar:4.8.0]
    at org.apache.jena.fuseki.servlets.SPARQLQueryProcessor.execute(SPARQLQueryProcessor.java:199) ~[fuseki-server.jar:4.8.0]
    at org.apache.jena.fuseki.servlets.ActionService.executeLifecycle(ActionService.java:58) ~[fuseki-server.jar:4.8.0]
    at org.apache.jena.fuseki.servlets.SPARQLQueryProcessor.execGet(SPARQLQueryProcessor.java:80) ~[fuseki-server.jar:4.8.0]
    at org.apache.jena.fuseki.servlets.ActionProcessor.process(ActionProcessor.java:33) ~[fuseki-server.jar:4.8.0]
    at org.apache.jena.fuseki.servlets.ActionBase.process(ActionBase.java:54) ~[fuseki-server.jar:4.8.0]
    at org.apache.jena.fuseki.servlets.ActionExecLib.execActionSub(ActionExecLib.java:125) ~[fuseki-server.jar:4.8.0]
    at org.apache.jena.fuseki.servlets.ActionExecLib.execAction(ActionExecLib.java:99) ~[fuseki-server.jar:4.8.0]
    ...
    at org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:1080) [fuseki-server.jar:4.8.0]
    at java.lang.Thread.run(Unknown Source) [?:?]
Caused by: org.apache.thrift.protocol.TProtocolException: Unrecognized type 0
    at org.apache.thrift.protocol.TProtocolUtil.skip(TProtocolUtil.java:140) ~[fuseki-server.jar:4.8.0]
    at org.apache.thrift.protocol.TProtocolUtil.skip(TProtocolUtil.java:53) ~[fuseki-server.jar:4.8.0]
    at org.apache.jena.riot.thrift.wire.RDF_Term.standardSchemeReadValue(RDF_Term.java:432) ~[fuseki-server.jar:4.8.0]
    at org.apache.thrift.TUnion$TUnionStandardScheme.read(TUnion.java:238) ~[fuseki-server.jar:4.8.0]
    at org.apache.thrift.TUnion$TUnionStandardScheme.read(TUnion.java:227) ~[fuseki-server.jar:4.8.0]
    at org.apache.thrift.TUnion.read(TUnion.java:145) ~[fuseki-server.jar:4.8.0]
    at org.apache.jena.tdb2.store.nodetable.NodeTableTRDF.readNodeFromTable(NodeTableTRDF.java:82) ~[fuseki-server.jar:4.8.0]
    ... 92 more

The different scenarios where it has happened are:

  - LOADing data into a dataset
  - compacting a dataset
  - querying a dataset

In all those cases we run into trouble and get an exception that
mentions *org.apache.jena.tdb2.TDBException: NodeTableTRDF/Read* and
*org.apache.thrift.protocol.TProtocolException: Unrecognized type 0*.

What can cause this? It looks similar to this mailing list question,
https://www.mail-archive.com/users@jena.apache.org/msg20409.html,
which seems to point to data corruption that potentially isn't
recoverable.

The first time I encountered this issue was while doing a series of
sequential LOAD commands to prepare a large dataset for load testing. I
used files of around 50 MB (after starting off with bigger ones), and after
about 20 to 25 LOADs the error would occur (the completion time of each
LOAD also kept going up). In this scenario I was running locally (Jena
Fuseki running in Docker/Rancher), executing only the LOADs and not much
else, apart from an occasional SELECT (via the Fuseki UI) to check
performance while LOADing. Is there a way that this could cause data
corruption and the exception we're seeing?
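
To make the scenario easier to reproduce, here is a minimal sketch of this kind of sequential LOAD run, sent to a Fuseki update endpoint as standard SPARQL 1.1 Update requests. The endpoint URL, the dataset name "ds", the file URLs and the helper function names are all placeholders for illustration, not our actual setup; the file URLs would also have to be readable by the Fuseki server itself.

```python
import time
import urllib.parse
import urllib.request

# Assumed values: adjust to the real Fuseki dataset and data files.
UPDATE_ENDPOINT = "http://localhost:3030/ds/update"  # hypothetical dataset "ds"
FILE_URLS = [f"file:///data/part-{i:03d}.ttl" for i in range(1, 26)]  # hypothetical files


def build_load_update(file_url):
    """Return a SPARQL Update string that LOADs one file into the default graph."""
    return f"LOAD <{file_url}>"


def run_load(endpoint, file_url, timeout=600):
    """POST a single LOAD as a form-encoded SPARQL Update; return the HTTP status."""
    body = urllib.parse.urlencode({"update": build_load_update(file_url)}).encode("utf-8")
    request = urllib.request.Request(
        endpoint,
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status


def run_all(endpoint, file_urls):
    """Run the LOADs strictly one after another and print how long each one took."""
    for file_url in file_urls:
        started = time.monotonic()
        status = run_load(endpoint, file_url)
        elapsed = time.monotonic() - started
        print(f"{file_url}: HTTP {status} in {elapsed:.1f}s")
```

Calling run_all(UPDATE_ENDPOINT, FILE_URLS) against a running server issues the LOADs one at a time and prints the duration of each, which makes the growing completion time visible.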

regards,

Jan Eerdekens
