Hi all,

I have been looking into the PersistentVector implementation, and I came up 
with a simple improvement relevant for small vectors, of between 33 and 64 
elements. For these sizes I found an "assoc" loop to be between 29% and 39% 
faster and a "get" loop between 4% and 8% faster. The memory consumption is 
also significantly better. For smaller or larger sizes everything stays the 
same.

The change is simple to explain.  A PersistentVector keeps a 32-ary 
array-based trie and a "tail" array. The tail holds up to the last 32 
elements; when it fills up it is moved into the trie. With sizes up to 32 
(inclusive) only the tail is used. Currently, when the 33rd element is 
inserted, the tail is moved to the trie and a new 1 element tail is built. 
The thing is that in the current implementation, the tail (32 elements) is 
put as a first child below a root node, implying that the minimum trie in 
use has two levels (shift = 5 in the code). This means that any access to 
the first 32 elements means indexing 2 arrays and an assoc means cloning 
two 32 element arrays.

The observation leading to the improvement is that in this case (from 33 to 
64 elements in the vector) only 32 elements are being stored in the trie 
and a single level trie with only the root node (storing directly the 
former tail) is enough (with shift=0 in the code). The result is one less 
indirection in gets, one less array cloning in assoc and one less 32 
element array of memory consumption (which for these small vector sizes is 
significant in relative terms).

I have coded the improvement and rather than needing very special treatment 
it fitted well with current scheme (as in abstract it is not a special 
case), with some small changes which even simplified the code a bit. The 
code seems to be working (at least it passes the tests building clojure). 
Of course it is tricky anyway and needs careful checking. (And there might 
be things I am not aware of, like that I had to also change 
LazilyPersistentVector.)

I am not sure what is the proper procedure for submitting patches. Shall I 
send it someone, put it somewhere? (it only involves PersistentVector.java 
and LazilyPersistentVector.java.) I hope there is interest as I see some 
advantages and no disadvantage.

Regards,
Paulo


-- 
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en

Reply via email to