Hi Andy

On 10/07/2017 12:29, Andy Seaborne wrote:
On 07/07/17 20:29, Daan Reid wrote:
Hi all,

After upgrading to Jena 3.2 recently, we encountered the issue referenced here:
https://issues.apache.org/jira/browse/JENA-1370

Daan - what did you upgrade from?

(You also mention TDB there)

We were previously on Jena 2.12.1. One of the things we're working on is an event source layer on top of a tdb data store (https://github.com/drugis/jena-es).

In short: adding a triple to a graph Delta, where the triple's double-valued object is semantically the same as but lexically different from that of a triple already in the graph, causes what is in my opinion incorrect behaviour. The triple is placed in the deletions list and not in the additions list, in essence removing it from the graph even though it should still be in there.

So for example if I take a Delta with graph base containing `:s :p -1.700000e+00 .` and then clear it and add the triple `:s :p -1.7E0 .`, the resulting Delta will have a set of deletions containing the original triple, and an empty list of additions.

As far as I can tell this is because Delta uses .contains() to check its additions and deletions, and for the volatile GraphMem we use, the literals are checked for lexical instead of semantic equality, and this causes inconsistencies.
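To make the distinction between the two kinds of equality concrete, here is a minimal plain-Java sketch. It does not use Jena at all: the class `LiteralEqualityDemo` and its methods are hypothetical names for illustration, and `Double.parseDouble` only approximates how an xsd:double lexical form maps to a value.

```java
class LiteralEqualityDemo {

    /** Term (lexical) equality: the two strings must match character for character. */
    static boolean sameTerm(String lexA, String lexB) {
        return lexA.equals(lexB);
    }

    /** Value equality: both lexical forms denote the same double. */
    static boolean sameValue(String lexA, String lexB) {
        return Double.parseDouble(lexA) == Double.parseDouble(lexB);
    }

    public static void main(String[] args) {
        String original = "-1.700000e+00"; // lexical form already in the graph
        String incoming = "-1.7E0";        // lexical form being added
        System.out.println("same term:  " + sameTerm(original, incoming));
        System.out.println("same value: " + sameValue(original, incoming));
    }
}
```

For the two literals from the example, the term comparison fails while the value comparison succeeds, which is the gap the Delta falls into.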

I would appreciate any and all help with this.

Yes - it looks like that is the problem. It is compounded by the fact that it uses GraphMem internally for additions and deletions, which impacts Delta.find(), so even if the input graph is a term-equality graph then find() is still doing some value matching.

I would have thought that the behaviour should be term-based throughout but what use cases are there for using Delta?

A workaround is to use a term-based graph

e.g. DatasetGraphFactory.createTxnMem().getDefaultGraph();

which needs taking a copy of Delta and replacing the additions and deletions graphs as well as the base graph.

Long term, Delta could add a value-to-term wrapper to work on term equality only.

Our use case is to use the Delta to calculate the differences between graphs for events so we can replay additions and deletions without storing all the unchanged triples for each graph change event.

Since we've got control over the database and the message converter, we have since made a workaround by canonicalising the DB with some queries, and from now on will use the RDFParser's forced canonical system (thanks for pointing that out!) on incoming messages to prevent this issue from affecting us in future.
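As a rough illustration of why canonicalising helps: once every double literal is rewritten to a single canonical lexical form, term equality and value equality coincide. The sketch below is plain Java and is not the RDFParser implementation; `CanonicalDoubleDemo` and `canonical` are hypothetical names, the output follows the XSD canonical form for doubles (one non-zero digit before the point, `E` exponent), and parsing with `Double.parseDouble` is an assumption that accepts slightly different input than XSD does.

```java
import java.math.BigDecimal;

class CanonicalDoubleDemo {

    /** Rewrites a double lexical form into an XSD-style canonical form, e.g. -1.7E0. */
    static String canonical(String lexical) {
        double d = Double.parseDouble(lexical);
        if (Double.isNaN(d)) return "NaN";
        if (Double.isInfinite(d)) return d > 0 ? "INF" : "-INF";
        if (d == 0.0) return (1.0 / d < 0) ? "-0.0E0" : "0.0E0";

        // Shortest decimal representation of d, with trailing zeros removed.
        BigDecimal bd = new BigDecimal(Double.toString(d)).stripTrailingZeros();
        String digits = bd.unscaledValue().abs().toString();
        // Exponent such that the mantissa has exactly one digit before the point.
        int exponent = digits.length() - 1 - bd.scale();
        String fraction = digits.length() > 1 ? digits.substring(1) : "0";
        String sign = bd.signum() < 0 ? "-" : "";
        return sign + digits.charAt(0) + "." + fraction + "E" + exponent;
    }

    public static void main(String[] args) {
        System.out.println(canonical("-1.700000e+00"));
        System.out.println(canonical("-1.7E0"));
    }
}
```

Both lexical forms from the example map to the same string, so a purely lexical comparison no longer distinguishes them.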

A Delta wrapper that works by value instead of lexical equality sounds like it could be a great general solution, because the current situation's inconsistency is a bit of a gotcha.

For reference to those interested, the inconsistency is located in the fact that `GraphTripleStoreBase.contains()` checks by value, but `GraphTripleStoreBase.delete()` defers to `NodeToTriplesMapMem.remove()`, which compares lexically. So in `Delta.addTriple()`, the base graph does contain the (semantically equal) triple and hence it is not added to the additions, but `deletions.delete()` checks by lexical value and hence does not delete it.
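The interaction can be reproduced with a toy model in plain Java. This is a simplified sketch and not Jena's code: `ToyStore` and `ToyDelta` are hypothetical stand-ins, triples are reduced to the lexical form of their object, and the only behaviour carried over is the one described above (lookup by value, removal by lexical form).

```java
import java.util.ArrayList;
import java.util.List;

class DeltaMismatchDemo {

    /** Toy store: contains() matches by value, delete() matches by lexical form. */
    static final class ToyStore {
        private final List<String> items = new ArrayList<>();

        void add(String lexical) {
            if (!items.contains(lexical)) items.add(lexical);
        }

        boolean contains(String lexical) {
            double v = Double.parseDouble(lexical);
            for (String s : items) {
                if (Double.parseDouble(s) == v) return true; // value match
            }
            return false;
        }

        void delete(String lexical) {
            items.remove(lexical); // exact string match only
        }

        List<String> contents() {
            return new ArrayList<>(items);
        }
    }

    /** Toy delta: records additions and deletions relative to an unchanged base. */
    static final class ToyDelta {
        final ToyStore base = new ToyStore();
        final ToyStore additions = new ToyStore();
        final ToyStore deletions = new ToyStore();

        void add(String lexical) {
            if (!base.contains(lexical)) additions.add(lexical);
            deletions.delete(lexical);
        }

        void delete(String lexical) {
            additions.delete(lexical);
            if (base.contains(lexical)) deletions.add(lexical);
        }
    }

    /** Base holds -1.700000e+00; it is deleted, then -1.7E0 is added. */
    static ToyDelta runScenario() {
        ToyDelta delta = new ToyDelta();
        delta.base.add("-1.700000e+00");
        delta.delete("-1.700000e+00"); // the "clear" step
        delta.add("-1.7E0");           // same value, different lexical form
        return delta;
    }

    public static void main(String[] args) {
        ToyDelta delta = runScenario();
        System.out.println("additions: " + delta.additions.contents());
        System.out.println("deletions: " + delta.deletions.contents());
    }
}
```

In this model the additions stay empty and the deletions still hold the original lexical form, matching the behaviour reported in JENA-1370.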

Regards, and thanks for the help,

Daan Reid
Drugis project -- https://drugis.org
