Hi,

We've been evaluating and using Jena for about 1.5 years now, but have recently been running into a perplexing issue. In a lot of different scenarios and ways of using Jena, we are getting exceptions like the one below:

org.apache.jena.tdb2.TDBException: NodeTableTRDF/Read
    at org.apache.jena.tdb2.store.nodetable.NodeTableTRDF.readNodeFromTable(NodeTableTRDF.java:87) ~[fuseki-server.jar:4.8.0]
    at org.apache.jena.tdb2.store.nodetable.NodeTableNative._retrieveNodeByNodeId(NodeTableNative.java:102) ~[fuseki-server.jar:4.8.0]
    at org.apache.jena.tdb2.store.nodetable.NodeTableNative.getNodeForNodeId(NodeTableNative.java:52) ~[fuseki-server.jar:4.8.0]
    at org.apache.jena.tdb2.store.nodetable.NodeTableCache._retrieveNodeByNodeId(NodeTableCache.java:208) ~[fuseki-server.jar:4.8.0]
    at org.apache.jena.tdb2.store.nodetable.NodeTableCache.getNodeForNodeId(NodeTableCache.java:133) ~[fuseki-server.jar:4.8.0]
    at org.apache.jena.tdb2.store.nodetable.NodeTableWrapper.getNodeForNodeId(NodeTableWrapper.java:52) ~[fuseki-server.jar:4.8.0]
    at org.apache.jena.tdb2.store.nodetable.NodeTableInline.getNodeForNodeId(NodeTableInline.java:65) ~[fuseki-server.jar:4.8.0]
    at org.apache.jena.tdb2.solver.BindingTDB.get1(BindingTDB.java:126) ~[fuseki-server.jar:4.8.0]
    at org.apache.jena.sparql.engine.binding.BindingBase.get(BindingBase.java:111) ~[fuseki-server.jar:4.8.0]
    at org.apache.jena.sparql.core.Var.lookup(Var.java:103) ~[fuseki-server.jar:4.8.0]
    at org.apache.jena.sparql.core.Var.lookup(Var.java:98) ~[fuseki-server.jar:4.8.0]
    at org.apache.jena.sparql.core.Substitute.substitute(Substitute.java:133) ~[fuseki-server.jar:4.8.0]
    at org.apache.jena.sparql.core.Substitute.substitute(Substitute.java:119) ~[fuseki-server.jar:4.8.0]
    at org.apache.jena.sparql.modify.TemplateLib.subst(TemplateLib.java:149) ~[fuseki-server.jar:4.8.0]
    at org.apache.jena.sparql.modify.TemplateLib$2.apply(TemplateLib.java:108) ~[fuseki-server.jar:4.8.0]
    at org.apache.jena.sparql.modify.TemplateLib$2.apply(TemplateLib.java:98) ~[fuseki-server.jar:4.8.0]
    at org.apache.jena.atlas.iterator.Iter$IterMap.next(Iter.java:417) ~[fuseki-server.jar:4.8.0]
    at org.apache.jena.ext.com.google.common.collect.Iterators$ConcatenatedIterator.hasNext(Iterators.java:1400) ~[fuseki-server.jar:4.8.0]
    at java.util.Iterator.forEachRemaining(Unknown Source) ~[?:?]
    at org.apache.jena.sparql.exec.QueryExecDataset.constructDataset(QueryExecDataset.java:228) ~[fuseki-server.jar:4.8.0]
    at org.apache.jena.sparql.exec.QueryExec.constructDataset(QueryExec.java:166) ~[fuseki-server.jar:4.8.0]
    at org.apache.jena.sparql.exec.QueryExecutionAdapter.execConstructDataset(QueryExecutionAdapter.java:172) ~[fuseki-server.jar:4.8.0]
    at org.apache.jena.fuseki.servlets.SPARQLQueryProcessor.executeQuery(SPARQLQueryProcessor.java:391) ~[fuseki-server.jar:4.8.0]
    at org.apache.jena.fuseki.servlets.SPARQLQueryProcessor.execute(SPARQLQueryProcessor.java:279) ~[fuseki-server.jar:4.8.0]
    at org.apache.jena.fuseki.servlets.SPARQLQueryProcessor.executeWithParameter(SPARQLQueryProcessor.java:224) ~[fuseki-server.jar:4.8.0]
    at org.apache.jena.fuseki.servlets.SPARQLQueryProcessor.execute(SPARQLQueryProcessor.java:199) ~[fuseki-server.jar:4.8.0]
    at org.apache.jena.fuseki.servlets.ActionService.executeLifecycle(ActionService.java:58) ~[fuseki-server.jar:4.8.0]
    at org.apache.jena.fuseki.servlets.SPARQLQueryProcessor.execGet(SPARQLQueryProcessor.java:80) ~[fuseki-server.jar:4.8.0]
    at org.apache.jena.fuseki.servlets.ActionProcessor.process(ActionProcessor.java:33) ~[fuseki-server.jar:4.8.0]
    at org.apache.jena.fuseki.servlets.ActionBase.process(ActionBase.java:54) ~[fuseki-server.jar:4.8.0]
    at org.apache.jena.fuseki.servlets.ActionExecLib.execActionSub(ActionExecLib.java:125) ~[fuseki-server.jar:4.8.0]
    at org.apache.jena.fuseki.servlets.ActionExecLib.execAction(ActionExecLib.java:99) ~[fuseki-server.jar:4.8.0]
    ...
    at org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:1080) [fuseki-server.jar:4.8.0]
    at java.lang.Thread.run(Unknown Source) [?:?]
Caused by: org.apache.thrift.protocol.TProtocolException: Unrecognized type 0
    at org.apache.thrift.protocol.TProtocolUtil.skip(TProtocolUtil.java:140) ~[fuseki-server.jar:4.8.0]
    at org.apache.thrift.protocol.TProtocolUtil.skip(TProtocolUtil.java:53) ~[fuseki-server.jar:4.8.0]
    at org.apache.jena.riot.thrift.wire.RDF_Term.standardSchemeReadValue(RDF_Term.java:432) ~[fuseki-server.jar:4.8.0]
    at org.apache.thrift.TUnion$TUnionStandardScheme.read(TUnion.java:238) ~[fuseki-server.jar:4.8.0]
    at org.apache.thrift.TUnion$TUnionStandardScheme.read(TUnion.java:227) ~[fuseki-server.jar:4.8.0]
    at org.apache.thrift.TUnion.read(TUnion.java:145) ~[fuseki-server.jar:4.8.0]
    at org.apache.jena.tdb2.store.nodetable.NodeTableTRDF.readNodeFromTable(NodeTableTRDF.java:82) ~[fuseki-server.jar:4.8.0]
    ... 92 more

The different scenarios where it has happened are:

- LOADing data into a dataset
- compacting a dataset
- querying a dataset

In all those cases we've run into trouble and got an exception that mentions *org.apache.jena.tdb2.TDBException: NodeTableTRDF/Read* and *org.apache.thrift.protocol.TProtocolException: Unrecognized type 0*.

What can cause this? It looks similar to this mailing list question, https://www.mail-archive.com/users@jena.apache.org/msg20409.html, which seems to mention data corruption that potentially isn't recoverable.

The first time I encountered this issue was while running a series of sequential LOAD commands to prepare a large dataset for load testing. I used files of around 50 MB (I started off with bigger ones), and after about 20 to 25 LOADs I would get this error. The completion time of each LOAD also kept going up. In this scenario I was running locally (Jena Fuseki running in Docker/Rancher) and only running the LOADs, with not much else except for a SELECT here and there (via the Fuseki UI) to check performance while LOADing.

Is there a way that this could cause data corruption and the exception we're seeing?

Regards,
Jan Eerdekens