Re: Timing tests for jena-624

A. Soroka Sat, 26 Sep 2015 04:08:21 -0700

Ooh! Those numbers are awful. Per your point 2, it does create a new tree per 
add/remove. And PCollections’ bulk operations are just loops over the 
single-element operations, so trying to accumulate data and use a single 
operation will create the same number of trees. Unfortunately, PCollections 
does not have something like Clojure’s transient operations [*], where under 
carefully-controlled conditions a normally persistent structure can be mutated 
in place for celerity of operation. I have no commitment to PCollections, and I 
can switch and see what happens with Clojure and transiency. But I should first 
go back over the code with a fine-toothed comb and make sure that there isn’t a 
plain old mistake of some kind.
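
To illustrate the cost in question, here is a minimal sketch of path copying 
in a persistent tree. It is a stand-in written for this message, not 
PCollections code: PathCopyDemo, Node, insert and insertAll are all 
hypothetical names, and the tree is an unbalanced BST chosen only for brevity. 
It shows why a "bulk" operation that loops over single inserts allocates a new 
root (and a new path) per element.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Illustrative only: a tiny persistent (immutable) binary search tree. */
final class PathCopyDemo {
    /** Counts node allocations so the cost of path copying is visible. */
    static final AtomicLong ALLOCATIONS = new AtomicLong();

    static final class Node {
        final int key;
        final Node left, right;

        Node(int key, Node left, Node right) {
            this.key = key;
            this.left = left;
            this.right = right;
            ALLOCATIONS.incrementAndGet();
        }
    }

    /** Returns a new root; every node on the path to the insertion point is copied. */
    static Node insert(Node root, int key) {
        if (root == null) return new Node(key, null, null);
        if (key < root.key) return new Node(root.key, insert(root.left, key), root.right);
        if (key > root.key) return new Node(root.key, root.left, insert(root.right, key));
        return root; // key already present: share the existing tree unchanged
    }

    /** A "bulk" insert written as a loop over single inserts: one new root per element. */
    static Node insertAll(Node root, int[] keys) {
        for (int k : keys) root = insert(root, k);
        return root;
    }

    /** Plain lookup; works on any version of the tree. */
    static boolean contains(Node root, int key) {
        while (root != null) {
            if (key == root.key) return true;
            root = key < root.key ? root.left : root.right;
        }
        return false;
    }
}
```

Old versions stay valid and share untouched subtrees with new ones, which is 
the persistence guarantee. A transient-style API would instead let a single 
writer mutate nodes in place until the batch is finished.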

As far as the indexes, I’m not quite sure what you mean by “triples+quads”. Do 
you mean a single map from graph name to three triple-covering indexes? 
Something like Map<Node, TripleIndex>, with TripleIndex having within it three 
covering indexes for triples in the way that current HexIndex has within it six 
covering indexes for quads?
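
To make the question concrete, here is a rough sketch of the shape I am 
asking about. It is only one possible reading of "triples+quads": the class 
names TripleQuadSketch and TripleIndex are hypothetical, String stands in for 
Node, and ordinary mutable HashMaps are used instead of persistent 
structures to keep it short.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Illustrative only: a map from graph name to three covering triple indexes. */
final class TripleQuadSketch {

    /** Three covering indexes over triples (SPO, POS, OSP), each a nested map. */
    static final class TripleIndex {
        private final Map<String, Map<String, Set<String>>> spo = new HashMap<>();
        private final Map<String, Map<String, Set<String>>> pos = new HashMap<>();
        private final Map<String, Map<String, Set<String>>> osp = new HashMap<>();

        private static boolean put(Map<String, Map<String, Set<String>>> idx,
                                   String a, String b, String c) {
            return idx.computeIfAbsent(a, k -> new HashMap<>())
                      .computeIfAbsent(b, k -> new HashSet<>())
                      .add(c);
        }

        private static Set<String> get(Map<String, Map<String, Set<String>>> idx,
                                       String a, String b) {
            Map<String, Set<String>> inner = idx.getOrDefault(a, Collections.emptyMap());
            return inner.getOrDefault(b, Collections.emptySet());
        }

        /** Adds the triple to all three indexes; returns false if it was already present. */
        boolean add(String s, String p, String o) {
            boolean added = put(spo, s, p, o);
            put(pos, p, o, s);
            put(osp, o, s, p);
            return added;
        }

        Set<String> objects(String s, String p)    { return get(spo, s, p); }
        Set<String> subjects(String p, String o)   { return get(pos, p, o); }
        Set<String> predicates(String o, String s) { return get(osp, o, s); }
    }

    /** Graph name -> the triple indexes for that graph. */
    private final Map<String, TripleIndex> graphs = new HashMap<>();

    /** Adds a quad by routing the triple to the index for graph g. */
    boolean add(String g, String s, String p, String o) {
        return graphs.computeIfAbsent(g, k -> new TripleIndex()).add(s, p, o);
    }

    /** Returns the index for graph g, or null if the graph has no triples. */
    TripleIndex graph(String g) {
        return graphs.get(g);
    }
}
```

In this shape each add touches three indexes rather than six, at the price of 
an extra lookup by graph name.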

---
A. Soroka
The University of Virginia Library

[*] http://clojure.org/transients

> On Sep 26, 2015, at 6:42 AM, Andy Seaborne <[email protected]> wrote:
> 
> Some thoughts:
> 
> 1/ If it were a triples+quads design (TripleTable, QuadTable), not just 
> quads, there would be 3 indexes not 6 for triples so 2x faster.
> 
> 2/ As autocommit and txn forms are nearly the same, I guess that every 
> add(Quad) is causing a new pcollections tree for each index.
> 
> I don't know pcollections but is it possible to use it so an independent tree 
> is created only at begin(W)? i.e. copy-to-root does not happen again for stuff 
> already touched since begin(W).
> 
>       Andy
