Ooh! Those numbers are awful. Per your point 2, it does create a new tree per add/remove. And PCollections’ bulk operations are just loops over the single-element operations, so trying to accumulate data and use a single operation will create the same number of trees. Unfortunately, PCollections does not have something like Clojure’s transient operations [*], where under carefully-controlled conditions a normally persistent structure can be mutated in place for celerity of operation. I have no commitment to PCollections, and I can switch and see what happens with Clojure and transiency. But I should first go back over the code with a fine-toothed comb and make sure that there isn’t a plain old mistake of some kind.
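To make the cost concrete, here is a minimal sketch of path copying in a persistent tree. It is a hand-rolled, unbalanced binary search tree written only for illustration. It is not PCollections code, and every name in it (PathCopyDemo, add, addAll, versions, nodesAllocated) is hypothetical. It shows that each single-element add yields a new root, and that a bulk add written as a loop over single adds yields just as many roots.

```java
import java.util.List;

// Illustration only: a tiny persistent (immutable) binary search tree.
// This is NOT how PCollections is implemented; it only demonstrates path copying.
final class PathCopyDemo {

    // Immutable tree node; an "update" never modifies an existing node.
    static final class Node {
        final int key;
        final Node left;
        final Node right;

        Node(int key, Node left, Node right) {
            this.key = key;
            this.left = left;
            this.right = right;
        }
    }

    static int nodesAllocated = 0; // nodes created so far
    static int versions = 0;       // distinct new roots (tree versions) created so far

    static void reset() {
        nodesAllocated = 0;
        versions = 0;
    }

    private static Node newNode(int key, Node left, Node right) {
        nodesAllocated++;
        return new Node(key, left, right);
    }

    // Copies only the nodes on the path from the root to the insertion point;
    // every other subtree is shared with the previous version.
    private static Node insert(Node node, int key) {
        if (node == null) {
            return newNode(key, null, null);
        }
        if (key < node.key) {
            Node l = insert(node.left, key);
            return l == node.left ? node : newNode(node.key, l, node.right);
        }
        if (key > node.key) {
            Node r = insert(node.right, key);
            return r == node.right ? node : newNode(node.key, node.left, r);
        }
        return node; // key already present: reuse the existing tree unchanged
    }

    // Single-element add: returns a new root unless the key was already present.
    static Node add(Node root, int key) {
        Node result = insert(root, key);
        if (result != root) {
            versions++;
        }
        return result;
    }

    // Bulk add written as a loop over add, so it creates one version per new key.
    static Node addAll(Node root, List<Integer> keys) {
        for (int key : keys) {
            root = add(root, key);
        }
        return root;
    }

    static boolean contains(Node node, int key) {
        while (node != null) {
            if (key == node.key) {
                return true;
            }
            node = key < node.key ? node.left : node.right;
        }
        return false;
    }
}
```

A transient-style design would instead allow nodes created since the start of a write transaction to be mutated in place, so that only the first touch of each path pays for a copy.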
As far as the indexes go, I’m not quite sure what you mean by “triples+quads”. Do you mean a single map from graph name to three triple-covering indexes? Something like Map<Node, TripleIndex>, with TripleIndex having within it three covering indexes for triples in the way that current HexIndex has within it six covering indexes for quads?

---
A. Soroka
The University of Virginia Library

[*] http://clojure.org/transients

> On Sep 26, 2015, at 6:42 AM, Andy Seaborne <[email protected]> wrote:
>
> Some thoughts:
>
> 1/ If it were a triples+quads design (TripleTable, QuadTable) , not just
> quads, there would be 3 indexes not 6 for triples so 2x faster.
>
> 2/ As autocommit and txn forms are nearly the same, I guess that every
> add(Quad) is causing a new pcollections tree for each index.
>
> I don't know pcollections but is it possible to use it so a independent tree
> is created only at begin(W). i.e. copy-to-root does not happen on stuff
> updated already touched after begin(W).
>
> Andy
