On 13/04/18 20:43, Greg Albiston wrote:
Hi Andy,
I think I tried editing the BAT file for the same result but will have another
go. I'm pretty certain that there are rdf:types already in the data.
I'm away next week but when I get back I'll put together a patch for the TDB
API idea.
It would probably be best if there was a single "stats" method that works out
itself if the Dataset is TDB1 or TDB2.
Less to maintain and easier to use.
Do the TDBFactory.isBackedByTDB and TDB2Factory.isBackedByTDB methods
distinguish between a TDB1 and TDB2 Dataset?
Yes.
The method descriptions only say they confirm if it is TDB backed rather than
just the relevant version.
Is there a logical class the "stats" method should go on? Something like a
TDBUtils? There is DatasetUtils but it is a slightly different concept.
Tricky because of dependencies.
jena-tdb2 intentionally does not depend on jena-tdb so as to avoid
accidental cross over.
If you want a universal one, the binding will need to be dynamic, using
the system initialization mechanism to load handlers into a central
repository, maybe in org.apache.jena.dboe.sys.
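As an illustration of the dynamic-binding idea, here is a minimal sketch using only the Java standard library. Every name in it (StatsHandler, StatsRegistry, the Fake*Dataset stand-ins, init) is hypothetical and not part of Jena; in a real version each storage module would presumably register its own handler during system initialization, so neither module needs a compile-time dependency on the other.

```java
import java.util.List;
import java.util.Optional;
import java.util.concurrent.CopyOnWriteArrayList;

// Sketch only: all types here are hypothetical stand-ins, not Jena classes.
final class StatsRegistryDemo {

    // Contract each storage module would implement and register.
    interface StatsHandler {
        boolean accepts(Object dataset);
        String generateStats(Object dataset);
    }

    // Central repository: callers ask it for stats without knowing the backend.
    static final class StatsRegistry {
        private static final List<StatsHandler> HANDLERS = new CopyOnWriteArrayList<>();

        private StatsRegistry() {}

        static void register(StatsHandler handler) {
            HANDLERS.add(handler);
        }

        // Returns the result of the first handler that accepts the dataset, if any.
        static Optional<String> stats(Object dataset) {
            for (StatsHandler h : HANDLERS) {
                if (h.accepts(dataset)) {
                    return Optional.of(h.generateStats(dataset));
                }
            }
            return Optional.empty();
        }
    }

    // Stand-ins for the two dataset implementations.
    static final class FakeTdb1Dataset {}
    static final class FakeTdb2Dataset {}

    // Builds a handler that accepts exactly one dataset type.
    static StatsHandler handlerFor(Class<?> type, String label) {
        return new StatsHandler() {
            @Override public boolean accepts(Object dataset) {
                return type.isInstance(dataset);
            }
            @Override public String generateStats(Object dataset) {
                return "stats computed by " + label + " handler";
            }
        };
    }

    // Plays the role of the initialization step that loads the handlers.
    static void init() {
        StatsRegistry.register(handlerFor(FakeTdb1Dataset.class, "TDB1"));
        StatsRegistry.register(handlerFor(FakeTdb2Dataset.class, "TDB2"));
    }

    public static void main(String[] args) {
        init();
        System.out.println(StatsRegistry.stats(new FakeTdb2Dataset()).orElse("no handler"));
        System.out.println(StatsRegistry.stats("plain string").orElse("no handler"));
    }
}
```

The single entry point (StatsRegistry.stats) is what a caller would use; which handler answers depends only on what was registered at start-up.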
Finally, once the "stats.opt" file has been created is there a hook to have it
considered by the Dataset/TDB? The Dataset will have to be opened to be examined. Can a
refresh be forced once the Dataset is open?
Currently, both TDBs look for a file stats.opt in their storage areas
and it is passed to the database object in the DatasetGraphTDB
constructor (the ReorderTransformation). Note - it would be hard to
safely change while running due to potential outstanding transactions.
Putting it in place on disk and resetting the database might be preferable.
(TDB1: StoreConnection.expel, TDB2: TDBInternal.expel)
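The "put it in place on disk" step could be sketched as below, using only java.nio. This is an assumption-laden illustration: StatsFileInstaller and its methods are hypothetical names, the storage directory is whatever directory the database actually reads stats.opt from (not worked out here), and the reset via the expel calls named above is deliberately left out because their signatures are not shown in this thread.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Sketch only: a hypothetical helper, not part of Jena.
final class StatsFileInstaller {

    // Writes the stats text to a temporary file in the storage directory,
    // then moves it to "stats.opt", replacing any existing file.
    // Writing to a temp file first avoids leaving a half-written stats.opt.
    static Path install(Path storageDir, String statsContent) {
        try {
            Path tmp = Files.createTempFile(storageDir, "stats", ".tmp");
            Files.write(tmp, statsContent.getBytes(StandardCharsets.UTF_8));
            Path target = storageDir.resolve("stats.opt");
            return Files.move(tmp, target, StandardCopyOption.REPLACE_EXISTING);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Demo helper: installs into a fresh temp directory and reads the file back.
    static String installAndReadBack(String statsContent) {
        try {
            Path dir = Files.createTempDirectory("stats-demo");
            Path installed = install(dir, statsContent);
            return new String(Files.readAllBytes(installed), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Placeholder content; a real file would hold the generated statistics.
        System.out.println(installAndReadBack("placeholder stats content"));
    }
}
```

After the file is in place, the database would still need to be reset (with no transactions outstanding) before the new file is picked up.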
I'll look around to see what there is but any pointers would be really helpful.
Thanks,
Greg
-----Original Message-----
From: Andy Seaborne <[email protected]>
Sent: 13 April 2018 11:24
To: [email protected]
Subject: Re: TDB2 tdbstats Script Error
Greg,
It's only the script that is missing. If you copy one of the others and
change the command it calls, it will work (it's line 89 in the
tdb2.tdb* scripts - they are macro generated from common source).
I did find one problem - the 3.7.0 code assumes a node for rdf:type exists in the code
table. That's fixed by JENA-1520 / PR#396. The workaround is to add and delete a triple
with rdf:type in it if your data does not already include a use of rdf:type. Even
"rdf:type rdf:type rdf:type" will do.
I'm not sure how much use of stats is enabled for query execution in TDB2.
On 12/04/18 11:28, Greg Albiston wrote:
Hello,
I've created a new TDB2 dataset using Jena 3.7.0. I'm now trying to run the
tdbstats script on the command line (in Windows using the tdbstats.bat script).
When running this I get an exception "Unable to check TDB lock owner as the lock
file contains invalid data" (stack trace below signature).
This is a fresh TDB dataset with multiple named graphs, which I've deleted and
re-created but still get the same error.
You are using TDB1 tdbstats on a TDB2 database?
They use different and incompatible locking strategies. TDB2 uses OS file
locking which might (we have yet to see) be more applicable for VM environments
with shared filesystems.
There's no tdb2.tdbstats script in the jena/bin directory despite references in
the documentation that TDB1 and TDB2 are incompatible and there being other
tdb2.* scripts.
Is the tdbstats script for TDB2 not implemented?
Above.
Also, is there a TDB API method to run tdbstats?
Not formally. If you have some code, could you contribute it?
The code is in package org.apache.jena.tdb2.solver.stats
The command (in jena-cmds) tdb2.tdbstats has the way to do it.
This is useful in my use case where a lot of TDB datasets are being created.
I've managed to piece together a method from the source code but would like to
validate it against the released tdbstats script.
It may also be useful for others to have access to this kind of method from the
TDB API as setting up the Jena scripts is quite convoluted on Windows.
Suggestions and contributions welcome! (esp as many of the main contributors
are not Windows users)
Scripts are created in the module "apache-jena".
Andy
Thanks,
Greg
org.apache.jena.tdb.base.file.FileException: Unable to check TDB lock owner as the lock file contains invalid data
at org.apache.jena.tdb.base.file.LocationLock.getOwner(LocationLock.java:111)
at org.apache.jena.tdb.base.file.LocationLock.canObtain(LocationLock.java:130)
at org.apache.jena.tdb.StoreConnection._makeAndCache(StoreConnection.java:259)
at org.apache.jena.tdb.StoreConnection.make(StoreConnection.java:231)
at org.apache.jena.tdb.StoreConnection.make(StoreConnection.java:237)
at org.apache.jena.tdb.transaction.DatasetGraphTransaction.<init>(DatasetGraphTransaction.java:73)
at org.apache.jena.tdb.sys.TDBMaker._create(TDBMaker.java:55)
at org.apache.jena.tdb.sys.TDBMaker.createDatasetGraphTransaction(TDBMaker.java:42)
at org.apache.jena.tdb.TDBFactory._createDatasetGraph(TDBFactory.java:89)
at org.apache.jena.tdb.TDBFactory.createDatasetGraph(TDBFactory.java:71)
at org.apache.jena.tdb.TDBFactory.createDataset(TDBFactory.java:55)
at tdb.cmdline.ModTDBDataset.createDataset(ModTDBDataset.java:103)
at arq.cmdline.ModDataset.getDataset(ModDataset.java:36)
at tdb.cmdline.CmdTDB.getDataset(CmdTDB.java:80)
at tdb.cmdline.CmdTDB.getDatasetGraph(CmdTDB.java:71)
at tdb.cmdline.CmdTDB.getDatasetGraphTDB(CmdTDB.java:75)
at tdb.tdbstats.exec(tdbstats.java:98)
at jena.cmd.CmdMain.mainMethod(CmdMain.java:93)
at jena.cmd.CmdMain.mainRun(CmdMain.java:58)
at jena.cmd.CmdMain.mainRun(CmdMain.java:45)
at tdb.tdbstats.main(tdbstats.java:44)
Caused by: java.lang.NumberFormatException: For input string: "15304 "
at java.lang.NumberFormatException.forInputString(Unknown Source)
at java.lang.Integer.parseInt(Unknown Source)
at java.lang.Integer.parseInt(Unknown Source)
at org.apache.jena.tdb.base.file.LocationLock.getOwner(LocationLock.java:106)
... 20 more
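
For readers puzzled by the "Caused by" line: the input string "15304 " ends with a space, and Integer.parseInt rejects surrounding whitespace. The sketch below reproduces just that behaviour with the standard library. LockOwnerParse and its methods are hypothetical names, not the Jena LocationLock code, and trimming is shown only to explain the failure; as noted in the reply above, the actual remedy is to use the TDB2 command on a TDB2 database.

```java
// Sketch only: reproduces the parse failure, not Jena's lock-file handling.
final class LockOwnerParse {

    // Strict parse: Integer.parseInt rejects any whitespace in the input.
    static int parseStrict(String content) {
        return Integer.parseInt(content);
    }

    // Tolerant parse: strips leading and trailing whitespace first.
    static int parseTolerant(String content) {
        return Integer.parseInt(content.trim());
    }

    public static void main(String[] args) {
        String content = "15304 "; // same input string as in the trace
        try {
            parseStrict(content);
            System.out.println("strict parse unexpectedly succeeded");
        } catch (NumberFormatException e) {
            System.out.println("strict parse failed: " + e.getMessage());
        }
        System.out.println("tolerant parse: " + parseTolerant(content));
    }
}
```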