On 29/07/12 09:13, Laurent Pellegrino wrote:
Hello all,
I have a transactional TDB instance that is running and frequently
receiving new quadruples to insert. I would like to periodically count
the number of named graphs and quadruples contained in this instance.
The simplest solution would certainly be to redeploy a new version that
maintains the number of quadruples received in an instance field, but I
want to avoid deploying a new version. My idea was therefore to use the
tdbstats command, but it seems that it cannot provide stats on all the
named graphs. Finally, I wrote a small application that queries all the
quadruples and iterates over them [1]. However, when I run it on a
dataset that is already in use by another instance of Jena TDB, I get
the following exception:
Exception in thread "main" java.lang.UnsupportedOperationException: Quad: subject cannot be null
at com.hp.hpl.jena.sparql.core.Quad.<init>(Quad.java:60)
at com.hp.hpl.jena.tdb.lib.TupleLib.quad(TupleLib.java:164)
at com.hp.hpl.jena.tdb.lib.TupleLib.quad(TupleLib.java:155)
at com.hp.hpl.jena.tdb.lib.TupleLib.access$100(TupleLib.java:45)
at com.hp.hpl.jena.tdb.lib.TupleLib$4.convert(TupleLib.java:89)
at com.hp.hpl.jena.tdb.lib.TupleLib$4.convert(TupleLib.java:85)
at org.openjena.atlas.iterator.Iter$4.next(Iter.java:301)
at org.bitbucket.lp.tdbstats.Main.main(Main.java:35)
I have read in the documentation that multiple applications, running
in multiple JVMs, using the same file databases is not supported. Is
this true even when the second JVM is read-only? Is the exception due
to concurrent access from different JVMs?
Yes - and more so with transactions.
Execute the following SPARQL query (in a separate thread, or over HTTP
via Fuseki):
SELECT
(Count(DISTINCT ?g) AS ?numGraphs)
(Count(*) AS ?numQuads)
WHERE
{ GRAPH ?g { ?s ?p ?o } }
This will do all the work in a single operation.
Fuseki, and SPARQL over HTTP, are effectively JDBC for SPARQL databases.
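As a rough illustration of the HTTP route, here is a minimal Java sketch that sends the count query to a Fuseki endpoint using the SPARQL Protocol (a GET request carrying the query text in the "query" parameter). The endpoint URL, the dataset name "ds", and the class and method names are assumptions made for illustration and do not come from the setup described above. It relies only on the JDK's java.net.http client (Java 11+), not on any Jena API.

```java
import java.io.IOException;
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

public class CountQuads {
    // Hypothetical endpoint: assumes a Fuseki server on localhost
    // that exposes the TDB dataset under the name "ds".
    static final String ENDPOINT = "http://localhost:3030/ds/query";

    // The count query from this thread, written on one line.
    static final String QUERY =
        "SELECT (COUNT(DISTINCT ?g) AS ?numGraphs) (COUNT(*) AS ?numQuads) "
      + "WHERE { GRAPH ?g { ?s ?p ?o } }";

    // SPARQL Protocol, query via GET: the URL-encoded query text
    // is passed in the "query" request parameter.
    static URI buildQueryUri(String endpoint, String query) {
        String encoded = URLEncoder.encode(query, StandardCharsets.UTF_8);
        return URI.create(endpoint + "?query=" + encoded);
    }

    public static void main(String[] args) throws InterruptedException {
        HttpRequest request = HttpRequest.newBuilder(buildQueryUri(ENDPOINT, QUERY))
            .header("Accept", "application/sparql-results+json")
            .GET()
            .build();
        try {
            HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
            // The body is a SPARQL JSON result set with one row
            // binding numGraphs and numQuads.
            System.out.println("HTTP " + response.statusCode());
            System.out.println(response.body());
        } catch (IOException e) {
            // Typically means no server is listening at the assumed endpoint.
            System.err.println("Request failed: " + e.getMessage());
        }
    }
}
```

Because the request goes through the server, the database files are only ever opened by the JVM that owns them, and the counting client never touches them directly.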
> Is the exception due to
> concurrent access from different JVMs?
It looks like an access occurred, or a transaction was fully flushed to
the database, while your code was running. You seem to be seeing an
internal inconsistency whereby the index says something exists but the
node table has no record of it yet. These use different caches, so they
can get out of step on disk, but they are correct when looked at from
within the controlling JVM.
Andy
[1] http://goo.gl/e7o1y
Kind Regards,
Laurent