Hey Michael,

Since your eval solution essentially "cookie-cutters" out maps, each with
the same keys, as fast as it can, I was playing around with what would
happen if you used struct maps (create-struct / struct) instead, and I
cobbled together something that appears to run about twice as fast as the
eval approach:

(defn read-to-structs [rows]
  (let [headers (->>
                  rows
                  first
                  (take-while (complement #{""}))
                  (map keyword))
        s (apply create-struct headers)]
    (for [row (rest rows)]
      (apply struct s row))))
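
For anyone who hasn't used struct maps, here is a minimal sketch of what
create-struct and struct do. The keys :name and :age and the names person,
alice and bob are made up for illustration; they are not from the benchmark.
The basis (the set of keys) is built once, and each struct call only fills
in values positionally, which is presumably where the savings come from.

```clojure
;; Toy basis; the keys here are hypothetical, purely for illustration.
(def person (create-struct :name :age))

;; struct fills the basis keys in order with the supplied values.
(def alice (struct person "Alice" 30))

(:name alice)                        ; keyword lookup works as with any map
(= alice {:name "Alice" :age 30})    ; struct maps compare equal to plain maps

;; Missing trailing values default to nil; the key itself is still present.
(def bob (struct person "Bob"))
(contains? bob :age)
```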


Here are comparative timings:

(time-fn read-to-maps)
"Elapsed time: 4871.02175 msecs"
=> nil
(time-fn read-to-maps-partial)
"Elapsed time: 4814.730643 msecs"
=> nil
(time-fn read-to-maps-fn)
"Elapsed time: 4815.230087 msecs"
=> nil
(time-fn read-to-maps-eval)
"Elapsed time: 2466.048578 msecs"
=> nil
(time-fn read-to-structs)
"Elapsed time: 1273.462618 msecs"

I didn't test it too much, but it passed this:

(= (read-to-maps csv-fix) (read-to-structs csv-fix))
=> true
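
That check should hold whenever every data row has exactly as many cells as
there are non-empty headers. A rough sketch of where the two versions can
diverge on ragged rows follows. The read-to-maps shown here is an assumed
zipmap-per-row version (the one in the gist may differ), and even-rows,
short-rows, long-rows and long-result are made-up names and data.

```clojure
;; Assumed shape of read-to-maps (zipmap per row); the gist may differ.
(defn read-to-maps [rows]
  (let [headers (->> rows first (take-while (complement #{""})) (map keyword))]
    (for [row (rest rows)]
      (zipmap headers row))))

;; Same as the struct version above.
(defn read-to-structs [rows]
  (let [headers (->> rows first (take-while (complement #{""})) (map keyword))
        s (apply create-struct headers)]
    (for [row (rest rows)]
      (apply struct s row))))

;; Made-up data: every row has exactly one cell per header.
(def even-rows [["a" "b"] ["1" "2"] ["3" "4"]])
(= (read-to-maps even-rows) (read-to-structs even-rows))

;; A short row: zipmap drops the missing key, struct keeps it with nil,
;; so the two results are no longer equal.
(def short-rows [["a" "b"] ["1"]])
(first (read-to-maps short-rows))
(first (read-to-structs short-rows))

;; A long row (more cells than non-empty headers): zipmap ignores the
;; extra cell, while struct throws because it gets more values than keys.
(def long-rows [["a" "b" ""] ["1" "2" "3"]])
(first (read-to-maps long-rows))
(def long-result
  (try (doall (read-to-structs long-rows))
       (catch Exception _ :threw)))
```

If the real file can contain ragged rows, padding or truncating each row to
the header count before calling struct would be one way to keep the two
versions in agreement.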

On Friday, October 10, 2014 4:21:01 PM UTC-4, Michael Blume wrote:
>
> https://github.com/MichaelBlume/eval-speed
>
> eval-speed.core=> (time-fn read-to-maps)
> "Elapsed time: 5551.011069 msecs"
> nil
> eval-speed.core=> (time-fn read-to-maps-fn)
> "Elapsed time: 5587.256991 msecs"
> nil
> eval-speed.core=> (time-fn read-to-maps-partial)
> "Elapsed time: 5606.649172 msecs"
> nil
> eval-speed.core=> (time-fn read-to-maps-eval)
> "Elapsed time: 2627.521592 msecs"
> nil
>
> Ben, I'd still like to understand exactly what work the CPU is doing in 
> the uneval'd version that it's skipping in the eval'd version. It seems 
> like in the generated bytecode there's going to be *some* concept of 
> iterating through the row in either case, if only as part of the 
> destructuring process.
>
>
> On Friday, October 10, 2014 1:07:08 PM UTC-7, Ben wrote:
>>
>> I believe it's because the `mapper` function is just creating and 
>> returning a map literal. The "mapper" function in the evaled version is 
>> something like this:
>>
>> user> (def names '[n1 n2 n3 n4])
>> #'user/names
>> user> (def headers '[h1 h2 h3 h4])
>> #'user/headers
>> user> `(fn [[~@names]] ~(zipmap headers names))
>> (clojure.core/fn [[n1 n2 n3 n4]] {h4 n4, h3 n3, h2 n2, h1 n1})
>> ;; just a map literal, whose keys are already known.
>>
>> Whereas in the first version, zipmap has to be called, iterating over 
>> headers and names each time.
>>
>> On Fri, Oct 10, 2014 at 1:04 PM, Sean Corfield <se...@corfield.org> 
>> wrote:
>>
>>> It may be more to do with the difference between `for` and `map`. How do 
>>> these versions compare in your benchmark:
>>>
>>> (defn read-to-maps-partial [rows]
>>>   (let [headers (->>
>>>                   rows
>>>                   first
>>>                   (take-while (complement #{""}))
>>>                   (map keyword))]
>>>     (map (partial zipmap headers) (rest rows))))
>>>
>>> (defn read-to-maps-fn [rows]
>>>   (let [headers (->>
>>>                   rows
>>>                   first
>>>                   (take-while (complement #{""}))
>>>                   (map keyword))
>>>         mapper (fn [row] (zipmap headers row))]
>>>     (map mapper (rest rows))))
>>>
>>> Sean
>>>
>>> On Oct 10, 2014, at 11:42 AM, Michael Blume <blume...@gmail.com> wrote:
>>> > So I'm reading a bunch of rows from a huge csv file and marshalling
>>> > those rows into maps using the first row as keys. I wrote the function
>>> > two ways: https://gist.github.com/MichaelBlume/c67d22df0ff9c225d956 and
>>> > the version with eval is twice as fast and I'm kind of curious about
>>> > why. Presumably the eval'd function still implicitly contains a list of
>>> > keys, it's still implicitly treating each row as a seq and walking it,
>>> > so I'm wondering what the seq-destructuring and the map literal are
>>> > doing under the hood that's faster.
>>>
>>>
>>
>>
>> -- 
>> Ben Wolfson
>> "Human kind has used its intelligence to vary the flavour of drinks, 
>> which may be sweet, aromatic, fermented or spirit-based. ... Family and 
>> social life also offer numerous other occasions to consume drinks for 
>> pleasure." [Larousse, "Drink" entry]
>>
>>  

-- 
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en