Hi Andy,

Many thanks.

On 2021-09-07 9:11 pm, Andy Seaborne wrote:
Hi Holger,

I let TQ know this was happening when the JIRA was in progress.
Quite possibly. Too many things happen in parallel, sorry if I missed that one and didn't pay enough attention.

The errors you see mean that internal transaction state has been passed across transaction boundaries. DatasetGraphTransaction is the class whose objects carry across transactions in TDB1; DatasetGraphTDB has a lifetime of one transaction (and it is also the storage).

createPrefixMapping creates a cached prefix map, but the code calls "getDatasetGraphTDB", which is the transaction-specific object.

Try overriding getPrefixMapping() with that code, not createPrefixMapping(), as a quick solution to test the rest of your architecture.
Yes, after a quick test, this seems to work better. I need to do more testing tomorrow and check the performance impact of creating this repeatedly.
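
A minimal sketch of what that override might look like, reusing the names from the GraphXDB class quoted further below (assumption: the projection is simply rebuilt on every call so it binds to the current transaction's DatasetGraphTDB instead of a cached, transaction-scoped one):

     @Override
     public PrefixMapping getPrefixMapping() {
         // Rebuild the projection per call so it uses the storage of the
         // current transaction rather than a stale object from an old one.
         DatasetPrefixesTDB pm = getDatasetGraphTDB().getStoragePrefixes();
         GraphPrefixesProjection projection = new GraphPrefixesProjection(getGraphName().toString(), pm);
         return Prefixes.adapt(projection);
     }

If rebuilding per call turns out to be too costly, caching the result per transaction rather than per graph object would presumably be an option.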

TDB1 uses PrefixMapTDB1/PrefixMapProxy to become switchable (a feature that is native to TDB2).

Beware of TriG-based backups. They didn't separate prefixes per graph before and still don't (though with different behaviour). There isn't a standard format for the setup you describe.

Yes, we noticed the same issue. The workaround for us might be to add the prefix declarations as (temporary) triples into each graph, e.g. using the sh:prefix vocabulary. But we are also adding alternative ways of doing backups, especially through Git integration, where people use individual Turtle files. That's better anyway, as our Turtle writer preserves the order of triples.
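
A hypothetical sketch of that workaround (the graphUri variable and the exact shape of the triples are placeholders of our choosing, not something discussed in this thread): before a TriG backup, each graph's prefix map would be written into the graph itself as SHACL prefix declarations, so it can be restored on reload.

     // Hypothetical: persist this graph's prefixes as sh:declare/sh:prefix/sh:namespace triples.
     String SH = "http://www.w3.org/ns/shacl#";
     Resource graphResource = model.createResource(graphUri);
     model.getNsPrefixMap().forEach((prefix, ns) -> {
         Resource decl = model.createResource();
         graphResource.addProperty(model.createProperty(SH + "declare"), decl);
         decl.addProperty(model.createProperty(SH + "prefix"), prefix);
         decl.addProperty(model.createProperty(SH + "namespace"),
                 model.createTypedLiteral(ns, XSDDatatype.XSDanyURI));
     });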

Having said this, I honestly don't think the limitations of one serialization should be enough to motivate such a drastic change to how prefixes are managed in TDB. For example, it means that if someone changes the prefixes in one graph of the dataset, she also changes the prefixes for all the others, even for graphs that are not supposed to be writable for her. And what would happen if two graphs are loaded from Turtle files that each declare an "ex" prefix? All of this sounds very fragile and makes the use of such shared-prefix-mapping datasets rather limiting - the old design was working just fine.
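
A hypothetical illustration of that concern, under our reading of the shared prefix map (transaction handling omitted; the last-writer-wins behaviour is an assumption, not something verified in this thread):

     Model g1 = dataset.getNamedModel("urn:x:graph1");
     Model g2 = dataset.getNamedModel("urn:x:graph2");
     g1.setNsPrefix("ex", "http://example.org/one#");
     g2.setNsPrefix("ex", "http://example.org/two#");   // presumably overwrites the shared entry
     g1.getNsPrefixURI("ex");                           // would then return ".../two#" for g1 as well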

Anyway, as long as we can work around this...

Slightly related to this, I noticed that PrefixMappingAdapter.uriToPrefix is very inefficient, doing a reverse O(n) look-up.
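
A hypothetical sketch of the kind of caching we could do on our side to avoid repeated linear scans (plain PrefixMapping API, nothing from PrefixMappingAdapter itself):

     // Build the reverse (namespace -> prefix) map once, then look up in O(1).
     Map<String, String> nsToPrefix = new HashMap<>();
     prefixMapping.getNsPrefixMap().forEach((prefix, ns) -> nsToPrefix.put(ns, prefix));
     String prefix = nsToPrefix.get("http://example.org/");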


    Andy

Would it be possible to have browsable stack traces next time? Something to drop into the Eclipse stack viewer.

Will try, yes.

Thanks again,
Holger



On 07/09/2021 03:44, Holger Knublauch wrote:
Hi Andy,

with our current migration to Jena 4.1 we noticed this change

     https://issues.apache.org/jira/browse/JENA-2006

is affecting the way we use TDB (in the "shared TDB aka XDB" mode).

Previously, all named graphs stored their own prefixes, but now there seems to be only one prefix mapping for all named graphs, which breaks assumptions we have been making about the Graph API until now.

We need to find a way to restore the old behavior. I drilled into the code for a few hours but feel stuck due to my limited understanding of how these pieces work together. So far I have introduced this class below to override the createPrefixMapping method:

public class GraphXDB extends GraphTxnTDB {

     public GraphXDB(DatasetGraphTransaction dataset, Node graphName) {
         super(dataset, graphName);
     }

     @Override
     protected PrefixMapping createPrefixMapping() {
         DatasetPrefixesTDB pm = getDatasetGraphTDB().getStoragePrefixes();
         GraphPrefixesProjection projection = new GraphPrefixesProjection(getGraphName().toString(), pm);
         return Prefixes.adapt(projection);
     }
}

and this looks OK in read mode but fails on writes with errors such as:

ERROR o.a.j.t.t.BlockMgrJournal [qtp975404820-67] Not active: 20
ERROR o.a.j.t.t.BlockMgrJournal [qtp975404820-67] **** Not active: 20
ERROR o.a.j.t.t.BlockMgrJournal [qtp975404820-67] Not active: 20
ERROR o.a.j.t.t.BlockMgrJournal [qtp975404820-67] **** Not active: 20
ERROR o.a.j.t.t.BlockMgrJournal [qtp975404820-67] Not active: 20
ERROR o.a.j.t.t.BlockMgrJournal [qtp975404820-67] **** Not active: 20

Stack:

Thread [qtp975404820-101] (Suspended (breakpoint at line 304 in BlockMgrJournal))
     owns: Transaction  (id=366)
     BlockMgrJournal.checkActive() line: 304
     BlockMgrJournal.commitPrepare(Transaction) line: 91
     Transaction.lambda$prepare$0(TransactionLifecycle) line: 289
     74738525.accept(Object) line: not available
     ArrayList<E>.forEach(Consumer<? super E>) line: 1541
     Transaction.forAllComponents(Consumer<TransactionLifecycle>) line: 283
     Transaction.prepare() line: 289
     Transaction.writerPrepareCommit() line: 165
     Transaction.commit() line: 120
     DatasetGraphTxn.commit() line: 61
     DatasetGraphTransaction.commit() line: 216
     DatasetGraphTxnTracking(DatasetGraphWrapper).commit() line: 276
     DatasetGraphTxnTracking.commit() line: 41
     TxnX.exec(T, TxnType, Runnable) line: 164
     TxnX.executeWrite(T, Runnable) line: 204
     PrefixMappingTxn.removeNsPrefix(String) line: 41
     ModelCom.removeNsPrefix(String) line: 972

Would you have any hints on how to proceed and whether this is on the right track? In the worst case, I guess we could switch to a completely different storage mechanism for those per-graph prefixes and bypass TDB.
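
For completeness, a rough sketch of that fallback (purely hypothetical, not something we have built): keep one prefix mapping per graph name outside TDB and persist it by other means.

     // Hypothetical side store for per-graph prefixes, bypassing TDB entirely.
     Map<Node, PrefixMapping> perGraphPrefixes = new ConcurrentHashMap<>();
     PrefixMapping pm = perGraphPrefixes.computeIfAbsent(graphName, n -> new PrefixMappingImpl());
     pm.setNsPrefix("ex", "http://example.org/");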

Thank you
Holger

