Ghadi,

Thanks for you posting - please see the reply I have just posted.

I'll give foldcat a look - sounds interesting.

With regards to my code for splicing maps/sets together efficiently, I 
spent some time looking at how that could be integrated with core.reducers 
and did some timings :

(def a (into [] (range 1000000)))
 
(do (time (reduce conj #{} a)) nil) ;; 9225 ms
(do (time (persistent! (reduce conj! (transient #{}) a))) nil) ;; 5981 ms
(do (time (into #{} a)) nil) ;; 6056 ms
 
(def n (/ (count a) (.availableProcessors (Runtime/getRuntime)))) ;; = 
625000
 
(do (time (r/fold (r/monoid into hash-set) conj a)) nil) ;; 9639 ms
(do (time (r/fold n (r/monoid into hash-set) conj a)) nil) ;; 6859 ms
(do (time (r/fold (r/monoid (fn [r l](clojure.lang.PersistentHashSet/splice 
r l)) hash-set) conj a)) nil) ;; 3654 ms
(do (time (r/fold n (r/monoid (fn [r 
l](clojure.lang.PersistentHashSet/splice r l)) hash-set) conj a)) nil) ;; 
3288 ms
 
(=  (into #{} a) (r/fold n (r/monoid (fn [r 
l](clojure.lang.PersistentHashSet/splice r l)) hash-set) conj a)) ;; = true

I am happy to note that using all cores I can reduce into a set faster than 
I can sequentially - 3288ms vs 5981ms - what you would hope for, but not 
always as easy to achieve as you would expect.

regards


Jules
 

On Friday, 21 February 2014 01:22:09 UTC, Ghadi Shayban wrote:
>
> Jules,
> For recombination of parallel reductions into a vector, have you looked at 
> foldcat<https://github.com/clojure/clojure/blob/master/src/clj/clojure/core/reducers.clj#L314-L318>?
>   
> It works really well, and it's one of those wonderful gems in clojure.core 
> waiting to be noticed.  It gives you back a structure that is counted, 
> seqable, and re-foldable - which covers probably 99% of use cases.  The 
> returned structure is not indexed, that is intentional.
>
> It's definitely conceivable to have the equivalent of a 'transient merge' 
> op for maps, and a design doc on clojure 
> wiki<http://dev.clojure.org/dashboard.action>is definitely warranted.
>
> I echo the recommendation for RRB-trees when you need efficient 
> combination that returns an indexed structure.  If you don't, foldcat = 
> delicious.  Jean is right about the many ways a superstructure can degrade.
>
> Ghadi
>
> On Thursday, February 20, 2014 8:09:26 PM UTC-5, Jean Niklas L'orange 
> wrote:
>>
>> Hi Jules,
>>
>> On Thursday, February 20, 2014 11:59:03 PM UTC+1, Jules wrote:
>>>
>>> Subvec provides a view on a subseq of a vector that behaves like a full 
>>> vector. Supervec provides a similar view that makes two vectors behave like 
>>> a single one
>>>
>>
>> This data structure ("supervec") is usually known as a rope, and your 
>> implementation degrades the *~O(1)* to worst case *O(k)* time for 
>> lookups, assoc, conj and pop, where *k* is the amount of concatenations 
>> performed. A good rope implementation can reduce that factor down to *O(log 
>> k)* through balancing, but it will still break the performance 
>> guarantees a persistent vector is supposed to have. If you try to use this 
>> in the reducers library, the amount of concatenations fork/join performs 
>> might actually show notable performance degradation for those operations. 
>> Further, passing it around to code which might perform more concatenations 
>> will lead to even worse performance, which may be hard to debug for people 
>> not knowing Clojure internals.
>>
>> However, it is a very nice solution if you know there will be limited 
>> concatenations, that the concatenations result in a balanced tree, and you 
>> don't pass this vector around to other people. :)
>>
>> For a small discussion on this vs. RRB-trees, see Section 1.1 in 
>> http://infoscience.epfl.ch/record/169879/files/RMTrees.pdf.
>>
>> Hopefully I'm not destroying your fun playing around with these 
>> things—that is not the intent at all. I'm just saying that these things are 
>> (sadly?) harder than it first looks like.
>>
>> -- JN
>>
>

-- 
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en
--- 
You received this message because you are subscribed to the Google Groups 
"Clojure" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to clojure+unsubscr...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Reply via email to