On 13/04/18 20:43, Greg Albiston wrote:
Hi Andy,

I think I tried editing the BAT file for the same result but will have another 
go. I'm pretty certain that there are rdf:types already in the data.

I'm away next week but when I get back I'll put together a patch for the TDB 
API idea.
It would probably be best if there was a single "stats" method that works out 
itself if the Dataset is TDB1 or TDB2.
Less to maintain and easier to use.

Do the TDBFactory.isBackedByTDB and TDB2Factory.isBackedByTDB methods 
distinguish between a TDB1 and TDB2 Dataset?


The method descriptions only say they confirm if it is TDB backed rather than 
just the relevant version.

Is there a logical class the "stats" method should go on? Something like a 
TDBUtils? There is DatasetUtils but it is a slightly different concept.

Tricky because of dependencies.

jena-tdb2 intentionally does not depend on jena-tdb, so as to avoid accidental crossover.

If you want a universal one, the binding will need to be dynamic, using the system initialization mechanism to load handlers into a central repository, maybe in org.apache.jena.dboe.sys.

Finally, once the "stats.opt" file has been created is there a hook to have it 
considered by the Dataset/TDB? The Dataset will have to be opened to be examined. Can a 
refresh be forced once the Dataset is open?

Currently, both TDBs look for a file stats.opt in their storage areas and it is passed to the database object in the DatasetGraphTDB constructor (the ReorderTransformation). Note - it would be hard to safely change while running due to potential outstanding transactions.

Putting it in place on disk and resetting the database might be preferable.
(TDB1: StoreConnection.expel, TDB2: TDBInternal.expel)
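
A minimal sketch of the "put it in place on disk" step, using only the JDK. The class InstallStats, the method install and the directory argument are hypothetical names for illustration; nothing here is Jena API, and which directory counts as the storage area is left to the caller. The expel calls named above would go around this step and are only marked as comments.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.AtomicMoveNotSupportedException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

public class InstallStats {
    // Hypothetical helper: copy a generated stats file into a storage
    // directory under the name stats.opt.
    public static Path install(Path generated, Path storageDir) throws IOException {
        Path target = storageDir.resolve("stats.opt");
        // Copy to a temporary sibling first so that a partially written
        // stats.opt is never visible under the final name.
        Path tmp = Files.createTempFile(storageDir, "stats", ".tmp");
        Files.copy(generated, tmp, StandardCopyOption.REPLACE_EXISTING);
        try {
            return Files.move(tmp, target,
                    StandardCopyOption.ATOMIC_MOVE,
                    StandardCopyOption.REPLACE_EXISTING);
        } catch (AtomicMoveNotSupportedException e) {
            // Fall back to a plain replace where atomic moves are unavailable.
            return Files.move(tmp, target, StandardCopyOption.REPLACE_EXISTING);
        }
    }

    public static void main(String[] args) throws IOException {
        // Assumption: the database has been released first
        // (TDB1: StoreConnection.expel, TDB2: TDBInternal.expel).
        Path storageDir = Files.createTempDirectory("storage-demo");
        Path generated = Files.createTempFile("generated", ".opt");
        // Placeholder content only, not a real statistics file.
        Files.write(generated, "placeholder".getBytes(StandardCharsets.UTF_8));
        Path installed = install(generated, storageDir);
        System.out.println("installed " + installed);
        // The dataset would be reopened here so the new file is read.
    }
}
```

The file is only looked for when the database object is built, so the copy matters only once the database has been released and reopened.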

I'll look around to see what there is but any pointers would be really helpful.



-----Original Message-----
From: Andy Seaborne <a...@apache.org>
Sent: 13 April 2018 11:24
To: users@jena.apache.org
Subject: Re: TDB2 tdbstats Script Error


It's only the script that is missing.  If you copy one of the others and 
rename the command called, it will work (it's line 89 in the
tdb2.tdb* scripts - they are macro generated from common source).

I did find one problem - the 3.7.0 code assumes a node for rdf:type exists in the code 
table. That's fixed by JENA-1520 / PR#396. The workaround is to add and delete a triple 
with rdf:type in it if your data does not already include a use of rdf:type. Even 
"rdf:type rdf:type rdf:type" will do.

I'm not sure how much use of stats is enabled for query execution in TDB2.

On 12/04/18 11:28, Greg Albiston wrote:

I've created a new TDB2 dataset using Jena 3.7.0. I'm now trying to run the 
tdbstats script on the command line (in Windows using the tdbstats.bat script).
When running this I get an exception "Unable to check TDB lock owner as the lock 
file contains invalid data" (stack trace below signature).
This is a fresh TDB dataset with multiple named graphs, which I've deleted and 
re-created but still get the same error.

You are using TDB1 tdbstats on a TDB2 database?

They use different and incompatible locking strategies. TDB2 uses OS file 
locking which might (we have yet to see) be more applicable for VM environments 
with shared filesystems.

There's no tdb2.tdbstats script in the jena/bin directory despite references in 
the documentation that TDB1 and TDB2 are incompatible and there being other 
tdb2.* scripts.
Is the tdbstats script for TDB2 not implemented?


Also, is there a TDB API method to run tdbstats?

Not formally. If you have some code, could you contribute it?

The code is in package org.apache.jena.tdb2.solver.stats

The command (in jena-cmds) tdb2.tdbstats has the way to do it.

This is useful in my use case where a lot of TDB datasets are being created.
I've managed to piece together a method from the source code but would like to 
validate it against the released tdbstats script.
It may also be useful for others to have access to this kind of method from the 
TDB API as setting up the Jena scripts is quite convoluted on Windows.

Suggestions and contributions welcome! (esp. as many of the main contributors 
are not Windows users)

Scripts are created in the module "apache-jena".




org.apache.jena.tdb.base.file.FileException: Unable to check TDB lock owner as 
the lock file contains invalid data
          at org.apache.jena.tdb.StoreConnection.make(StoreConnection.java:231)
          at org.apache.jena.tdb.StoreConnection.make(StoreConnection.java:237)
          at org.apache.jena.tdb.sys.TDBMaker._create(TDBMaker.java:55)
          at org.apache.jena.tdb.TDBFactory.createDataset(TDBFactory.java:55)
          at tdb.cmdline.ModTDBDataset.createDataset(ModTDBDataset.java:103)
          at arq.cmdline.ModDataset.getDataset(ModDataset.java:36)
          at tdb.cmdline.CmdTDB.getDataset(CmdTDB.java:80)
          at tdb.cmdline.CmdTDB.getDatasetGraph(CmdTDB.java:71)
          at tdb.cmdline.CmdTDB.getDatasetGraphTDB(CmdTDB.java:75)
          at tdb.tdbstats.exec(tdbstats.java:98)
          at jena.cmd.CmdMain.mainMethod(CmdMain.java:93)
          at jena.cmd.CmdMain.mainRun(CmdMain.java:58)
          at jena.cmd.CmdMain.mainRun(CmdMain.java:45)
          at tdb.tdbstats.main(tdbstats.java:44)
Caused by: java.lang.NumberFormatException: For input string: "15304 "
          at java.lang.NumberFormatException.forInputString(Unknown Source)
          at java.lang.Integer.parseInt(Unknown Source)
          at java.lang.Integer.parseInt(Unknown Source)
          ... 20 more
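
The "Caused by" lines show Integer.parseInt rejecting the string "15304 " (note the trailing space). A minimal sketch of that Java behaviour follows; the class LockOwnerParse and its methods are hypothetical and are not Jena code. It only illustrates why the parse throws. It is not a fix for the reported error, which the thread attributes to running the TDB1 command on a TDB2 database.

```java
import java.util.OptionalInt;

public class LockOwnerParse {
    // Integer.parseInt does not skip whitespace, so a trailing space fails.
    public static boolean strictParseFails(String raw) {
        try {
            Integer.parseInt(raw);
            return false;
        } catch (NumberFormatException e) {
            return true;
        }
    }

    // Hypothetical tolerant variant: trim before parsing and return an
    // empty result when the text is still not a number.
    public static OptionalInt parsePid(String raw) {
        try {
            return OptionalInt.of(Integer.parseInt(raw.trim()));
        } catch (NumberFormatException e) {
            return OptionalInt.empty();
        }
    }

    public static void main(String[] args) {
        String raw = "15304 "; // the input string from the stack trace
        System.out.println("strict parse fails: " + strictParseFails(raw));
        System.out.println("trimmed parse: " + parsePid(raw));
    }
}
```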
