Hi Andy
On 10/07/2017 12:29, Andy Seaborne wrote:
On 07/07/17 20:29, Daan Reid wrote:
Hi all,
After upgrading to Jena 3.2 recently, we encountered the issue
referenced here:
https://issues.apache.org/jira/browse/JENA-1370
Daan - what did you upgrade from?
(You also mention TDB there)
We were previously on Jena 2.12.1. One of the things we're working on is
an event source layer on top of a tdb data store
(https://github.com/drugis/jena-es).
In short, adding a triple with a double-valued object to a graph Delta,
where that triple is semantically the same as but lexically different
from a triple already in the graph, causes what is in my opinion
incorrect behaviour: the existing triple ends up in the deletions list
and nothing is placed in the additions list, in essence removing the
triple from the graph even though it should still be in there.
So for example if I take a Delta with graph base containing
`:s :p -1.700000e+00 .`
and then clear it and add the triple `:s :p -1.7E0 .`, the resulting
Delta will have a set of deletions containing the original triple, and
an empty list of additions.
As far as I can tell this is because Delta uses .contains() to check
its additions and deletions, and for the volatile GraphMem we use, the
literals are checked for lexical instead of semantic equality, and
this causes inconsistencies.
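
To make the distinction between lexical and semantic equality concrete, here is a minimal sketch in plain Java. It does not use Jena: `DoubleLiteral` is a hypothetical stand-in that holds only the lexical form of a double literal, and the two comparison methods are illustrative names, not Jena API.

```java
// Toy model only: a literal reduced to its lexical form.
// DoubleLiteral, sameTermAs and sameValueAs are hypothetical names, not Jena API.
final class DoubleLiteral {
    final String lexicalForm;

    DoubleLiteral(String lexicalForm) {
        this.lexicalForm = lexicalForm;
    }

    // Term (lexical) equality: the strings must match character for character.
    boolean sameTermAs(DoubleLiteral other) {
        return lexicalForm.equals(other.lexicalForm);
    }

    // Value (semantic) equality: both lexical forms must parse to the same double.
    boolean sameValueAs(DoubleLiteral other) {
        double a = Double.parseDouble(lexicalForm);
        double b = Double.parseDouble(other.lexicalForm);
        return Double.compare(a, b) == 0;
    }

    public static void main(String[] args) {
        DoubleLiteral original = new DoubleLiteral("-1.700000e+00");
        DoubleLiteral rewritten = new DoubleLiteral("-1.7E0");
        System.out.println("same term:  " + original.sameTermAs(rewritten));
        System.out.println("same value: " + original.sameValueAs(rewritten));
    }
}
```

The two literals from the example differ as terms but agree as values, which is the mismatch that matters whenever one code path compares by term and another by value.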
I would appreciate any and all help with this.
Yes - it looks like that is the problem. It is compounded by the fact
that Delta uses GraphMem internally for additions and deletions, which
impacts Delta.find(), so even if the input graph is a term-equality
graph, find() is still doing some value matching.
I would have thought that the behaviour should be term-based throughout
but what use cases are there for using Delta?
A workaround is to use term-based graphs,
e.g. DatasetGraphFactory.createTxnMem().getDefaultGraph();
this needs taking a copy of Delta and replacing the additions and
deletions graphs as well as the base graph.
Long term, Delta could add a value-to-term wrapper to work on term
equality only.
Our use case is to use the Delta to calculate the differences between
graphs for events so we can replay additions and deletions without
storing all the unchanged triples for each graph change event.
Since we've got control over the database and the message converter, we
have since made a workaround by canonicalising the DB with some queries,
and from now on will use the RDFParser's forced canonical system (thanks
for pointing that out!) on incoming messages to prevent this issue from
affecting us in future.
A Delta wrapper that works by value instead of lexical equality sounds
like it could be a great general solution, because the current
situation's inconsistency is a bit of a gotcha.
For reference to those interested, the inconsistency lies in the
fact that `GraphTripleStoreBase.contains()` checks by value, but
`GraphTripleStoreBase.delete()` defers to `NodeToTriplesMapMem.remove()`,
which compares lexically. So in `Delta.addTriple()`, the base graph
does contain the (semantically equal) triple and hence it is not added
to the additions, but `deletions.delete()` checks by lexical value
and hence does not delete it.
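
The interaction described above can be reproduced with a small toy model in plain Java. None of this is Jena code: `ToyStore` and `ToyDelta` are hypothetical classes whose add and delete logic is modelled on the description in this thread, with each triple reduced to the lexical form of its double object (subject and predicate are taken as fixed).

```java
import java.util.ArrayList;
import java.util.List;

// Toy stand-in for an in-memory graph. Each entry is the lexical form of the
// double object of a triple ":s :p ?o". Hypothetical class, not Jena's GraphMem.
final class ToyStore {
    final List<String> objects = new ArrayList<>();

    void add(String lexicalForm) {
        if (!objects.contains(lexicalForm)) {
            objects.add(lexicalForm);
        }
    }

    // Value-based lookup, as described for contains() above.
    boolean contains(String lexicalForm) {
        double wanted = Double.parseDouble(lexicalForm);
        for (String o : objects) {
            if (Double.compare(Double.parseDouble(o), wanted) == 0) {
                return true;
            }
        }
        return false;
    }

    // Lexical removal, as described for delete()/remove() above.
    void delete(String lexicalForm) {
        objects.remove(lexicalForm);
    }
}

// Toy Delta: a base store plus recorded additions and deletions.
final class ToyDelta {
    final ToyStore base = new ToyStore();
    final ToyStore additions = new ToyStore();
    final ToyStore deletions = new ToyStore();

    void add(String lexicalForm) {
        if (!base.contains(lexicalForm)) {
            additions.add(lexicalForm); // skipped when the value is already in base
        }
        deletions.delete(lexicalForm);  // lexical match only, so the old form stays
    }

    void delete(String lexicalForm) {
        additions.delete(lexicalForm);
        if (base.contains(lexicalForm)) {
            deletions.add(lexicalForm);
        }
    }
}

final class ToyDeltaDemo {
    static ToyDelta run() {
        ToyDelta delta = new ToyDelta();
        delta.base.add("-1.700000e+00");
        delta.delete("-1.700000e+00"); // "clear" the graph
        delta.add("-1.7E0");           // same value, different lexical form
        return delta;
    }

    public static void main(String[] args) {
        ToyDelta delta = run();
        System.out.println("additions = " + delta.additions.objects);
        System.out.println("deletions = " + delta.deletions.objects);
    }
}
```

In this model the run ends with the original lexical form still recorded as deleted and nothing recorded as added, matching the behaviour reported at the start of the thread. Making `contains()` and `delete()` agree on one notion of equality, whichever it is, removes the discrepancy in the toy.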
Regards, and thanks for the help,
Daan Reid
Drugis project -- https://drugis.org