On 26/09/15 12:07, A. Soroka wrote:
Ooh! Those numbers are awful.
Early days. The general purpose dataset has no features. And, of
course, a concurrent read is completely blocked - that's a major issue
for some usages.
Good access performance, with updates that do not block queries, in a
very reliable implementation is a valuable thing to have. And if it can
be described as a "complete temporal database", so much the better.
Marketing.
The storage implementation is now a self-contained thing to look at.
... seems there is no shortage of options ... google quickly got me:
http://stackoverflow.com/questions/8575723/whats-a-good-persistent-collections-framework-for-use-in-java
and there are more. Various data structures I have not heard of before.
Per your point 2, PCollections does create a new tree per
add/remove. And its bulk operations are just loops
over the single-element operations, so trying to accumulate data and
use a single operation will create the same number of trees.
Unfortunately, PCollections does not have something like Clojure’s
transient operations [*], where under carefully-controlled conditions
a normally persistent structure can be mutated in place for celerity
of operation. I have no commitment to PCollections, and I can switch
and see what happens with Clojure and transiency. But I should first
go back over the code with a fine-toothed comb and make sure that
there isn’t a plain old mistake of some kind.
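To make the cost concrete, here is a minimal sketch of path copying in a persistent tree. It is a hand-rolled stand-in, not PCollections code: the names PathCopySketch, plus, plusAll and versions are all hypothetical, and the tree is an unbalanced binary search tree chosen only for brevity. It shows that a bulk add written as a loop over the single-element add yields one tree version per element.

```java
import java.util.List;

public class PathCopySketch {
    /** Immutable tree node; an update never changes an existing node. */
    static final class Node {
        final int key;
        final Node left, right;
        Node(int key, Node left, Node right) {
            this.key = key;
            this.left = left;
            this.right = right;
        }
    }

    /** Counts single-element updates; each one yields its own tree version. */
    static int versions = 0;

    /** Single-element add: copies the path from the root to the insertion point. */
    static Node plus(Node root, int key) {
        versions++;
        return insert(root, key);
    }

    private static Node insert(Node n, int key) {
        if (n == null) return new Node(key, null, null);
        if (key < n.key) return new Node(n.key, insert(n.left, key), n.right);
        if (key > n.key) return new Node(n.key, n.left, insert(n.right, key));
        return n; // key already present: reuse the existing subtree
    }

    /** Bulk add written as a loop over plus(), so it saves no tree creation. */
    static Node plusAll(Node root, List<Integer> keys) {
        Node current = root;
        for (int key : keys) current = plus(current, key);
        return current;
    }

    static int size(Node n) {
        return n == null ? 0 : 1 + size(n.left) + size(n.right);
    }

    public static void main(String[] args) {
        Node full = plusAll(null, List.of(5, 3, 8, 1));
        System.out.println("elements=" + size(full) + " versions=" + versions);
    }
}
```

Every earlier version stays valid and unchanged, which is the property that lets readers proceed without locking, at the price of the allocations per update.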
As far as the indexes go, I’m not quite sure what you mean by
“triples+quads”. Do you mean a single map from graph name to three
triple-covering indexes? Something like Map<Node, TripleIndex>, with
TripleIndex having within it three covering indexes for triples in
the way that the current HexIndex has within it six covering indexes for
quads?
That's one way. I meant using the supporting framework in
DatasetGraphTriplesQuads, i.e. switching the base class:
DatasetGraphQuads => DatasetGraphTriplesQuads
The default graph is handled separately from named graphs.
TDB uses this - there is a triple table (default: 3 indexes) and a quads
table (default: 6 indexes)
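A rough sketch of that split, in plain Java with strings standing in for nodes. This is not Jena or TDB code: the class TriplesQuadsSketch, its Index type and the add method are hypothetical, and the index orderings are only illustrative. The point is that a statement in the default graph is written to 3 indexes, while a statement in a named graph is written to 6.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class TriplesQuadsSketch {
    /** One covering index: stores each statement with its terms in a fixed order. */
    static final class Index {
        final String order; // e.g. "SPO" or "GSPO"
        final Set<List<String>> entries = new HashSet<>();

        Index(String order) { this.order = order; }

        void add(Map<Character, String> terms) {
            List<String> key = new ArrayList<>();
            for (char slot : order.toCharArray()) key.add(terms.get(slot));
            entries.add(key);
        }
    }

    final List<Index> tripleTable = new ArrayList<>(); // default graph
    final List<Index> quadTable = new ArrayList<>();   // named graphs

    TriplesQuadsSketch() {
        for (String o : List.of("SPO", "POS", "OSP")) tripleTable.add(new Index(o));
        for (String o : List.of("GSPO", "GPOS", "GOSP", "SPOG", "POSG", "OSPG")) {
            quadTable.add(new Index(o));
        }
    }

    /**
     * Adds one statement; graph == null means the default graph.
     * Returns the number of indexes that had to be written.
     */
    int add(String graph, String s, String p, String o) {
        List<Index> table;
        Map<Character, String> terms;
        if (graph == null) {
            table = tripleTable;
            terms = Map.of('S', s, 'P', p, 'O', o);
        } else {
            table = quadTable;
            terms = Map.of('G', graph, 'S', s, 'P', p, 'O', o);
        }
        for (Index index : table) index.add(terms);
        return table.size();
    }

    public static void main(String[] args) {
        TriplesQuadsSketch ds = new TriplesQuadsSketch();
        System.out.println("default graph writes: " + ds.add(null, "s", "p", "o"));
        System.out.println("named graph writes:   " + ds.add("g", "s", "p", "o"));
    }
}
```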
Andy
---
A. Soroka
The University of Virginia Library
[*] http://clojure.org/transients
On Sep 26, 2015, at 6:42 AM, Andy Seaborne <[email protected]>
wrote:
Some thoughts:
1/ If it were a triples+quads design (TripleTable, QuadTable), not
just quads, there would be 3 indexes, not 6, for triples, so 2x
faster.
2/ As autocommit and txn forms are nearly the same, I guess that
every add(Quad) is causing a new pcollections tree for each index.
I don't know pcollections, but is it possible to use it so that an
independent tree is created only at begin(W)? I.e. copy-to-root
does not happen again for nodes already touched since begin(W).
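One way to picture point 2 is the sketch below. It is not pcollections code, and whether pcollections can be used this way is exactly the open question; the class TxnOwnedTreeSketch, the owner token and the allocations counter are all hypothetical. Each node remembers which write transaction allocated it. A node owned by the current transaction is mutated in place; any other node is copied first, so the committed tree that readers see is never changed.

```java
public class TxnOwnedTreeSketch {
    static final class Node {
        final int key;
        Node left, right;
        final Object owner; // token of the write transaction that allocated this node

        Node(int key, Node left, Node right, Object owner) {
            this.key = key;
            this.left = left;
            this.right = right;
            this.owner = owner;
        }
    }

    /** Counts node allocations, to compare against copying the path on every add. */
    static int allocations = 0;

    /** Adds key under transaction txn and returns the (possibly new) subtree root. */
    static Node add(Node n, int key, Object txn) {
        if (n == null) {
            allocations++;
            return new Node(key, null, null, txn);
        }
        if (key == n.key) return n;
        Node mine = n;
        if (n.owner != txn) {
            // First touch since begin(W): copy so the committed version stays intact.
            allocations++;
            mine = new Node(n.key, n.left, n.right, txn);
        }
        // From here on the node belongs to txn and can be updated in place.
        if (key < mine.key) mine.left = add(mine.left, key, txn);
        else mine.right = add(mine.right, key, txn);
        return mine;
    }

    static boolean contains(Node n, int key) {
        while (n != null) {
            if (key == n.key) return true;
            n = key < n.key ? n.left : n.right;
        }
        return false;
    }

    public static void main(String[] args) {
        Object txnA = new Object(); // stands in for begin(W)
        Node committed = null;
        for (int k : new int[] {5, 3, 8}) committed = add(committed, k, txnA);

        Object txnB = new Object(); // a second begin(W)
        Node working = committed;
        for (int k : new int[] {1, 2, 4}) working = add(working, k, txnB);

        System.out.println("committed sees 1? " + contains(committed, 1));
        System.out.println("working sees 1?   " + contains(working, 1));
        System.out.println("allocations=" + allocations);
    }
}
```

The token comparison is the same idea Clojure's transients use to decide whether a node may be edited in place. A real implementation would also need to make sure a token is never reused after commit.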
Andy