Re: Performance of seq on empty collections

David Sletten Mon, 15 Nov 2010 13:52:38 -0800

On Nov 15, 2010, at 4:41 PM, Alan wrote:

> Yes, the API *does* suggest using seq to check for emptiness. (empty?
> x) is implemented as (not (seq x)). You certainly won't ever get
> improved performance by using empty? - at best you break even, most of
> the time you lose. For example:
>


The only way the API could suggest using 'seq' to check for emptiness is by not 
having a function called 'empty?'. It's irrelevant that 'empty?' is merely 
implemented as (not (seq coll)). The function is there for a reason. Are you 
suggesting that it's deprecated? If you would like to focus on a premature 
optimization of this sort, go right ahead. I will stick with the more 
meaningful function name. If I'm testing whether or not a sequence is empty, I 
will use 'empty?'.
> 
> 
> Of course performance isn't usually the main driver, so if you feel
> empty? really is more expressive in your case, go for it. But the OP
> seems to care about performance, and suggesting empty? is off the
> mark.
> 

Apparently you didn't read what I wrote. I didn't suggest that using 'empty?' 
would solve the OP's performance issue. I simply pointed out that he was 
testing two opposite things.
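
To make "two opposite things" concrete, here is a minimal sketch of how the three tests in this thread answer for the same argument. The helper name `describe` is made up for illustration and is not part of clojure.core; the `identical?` entry assumes, as the timings quoted below do, that the literal [] always refers to the same shared empty-vector object.

```clojure
;; Illustrative sketch only; `describe` is a hypothetical helper.
(defn describe
  "Returns what each emptiness-related test reports for coll."
  [coll]
  {:seq        (boolean (seq coll))   ; true only when coll has elements
   :empty?     (empty? coll)          ; true only when coll has no elements
   :identical? (identical? coll [])}) ; true only when coll IS the shared empty vector

(describe [])      ;=> {:seq false, :empty? true, :identical? true}
(describe [1 2 3]) ;=> {:seq true, :empty? false, :identical? false}
```

So (seq coll) answers "is it non-empty?", while (empty? coll) and (identical? coll []) both answer "is it empty?". Of those two, only 'empty?' is meant to work for any collection; the reference comparison depends on which object you happen to be holding.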

> On Nov 14, 11:42 am, David Sletten <da...@bosatsu.net> wrote:
>> On Nov 14, 2010, at 2:16 PM, Eric Kobrin wrote:
>> 
>>> In the API it is suggested to use `seq` to check if coll is empty.
>> 
>> Your timing results raise some interesting questions. However, the API 
>> doesn't suggest using 'seq' to check if a collection is empty. That's what 
>> 'empty?' is for. The documentation note suggests (for style purposes 
>> apparently) that you use 'seq' to test that the collection is not empty. So 
>> to be precise you are testing two different things below. For instance, 
>> (identical? coll []) is true when coll is an empty vector. (seq coll) is 
>> true when coll is not empty. The correct equivalent would be to test (empty? 
>> coll).
>> 
>> Of course, this doesn't change the results. I get similar timings with 
>> empty?:
>> user=> (let [iterations 100000000] (time (dotimes [_ iterations]
>> (identical? [] []))) (time (dotimes [_ iterations] (empty? []))))
>> "Elapsed time: 2.294 msecs"
>> "Elapsed time: 2191.256 msecs"
>> nil
>> user=> (let [iterations 100000000] (time (dotimes [_ iterations]
>> (identical? "" ""))) (time (dotimes [_ iterations] (empty? ""))))
>> "Elapsed time: 2.657 msecs"
>> "Elapsed time: 4654.622 msecs"
>> nil
>> user=> (let [iterations 100000000] (time (dotimes [_ iterations]
>> (identical? () ()))) (time (dotimes [_ iterations] (empty? ()))))
>> "Elapsed time: 2.608 msecs"
>> "Elapsed time: 2144.142 msecs"
>> nil
>> 
>> This isn't so surprising though, considering that 'identical?' is the 
>> simplest possible test you could try--do two references point to the same 
>> object in memory? It can't get any more efficient than that.
>> 
>> Have all good days,
>> David Sletten
>> 
>> 
>> 
>>> I was working on some code recently and found that my biggest performance
>>> bottleneck was calling `seq` to check for emptiness. The calls to
>>> `seq` were causing lots of object allocation and taking noticeable CPU
>>> time. I switched to using `identical?` to explicitly compare against
>>> the empty vector and was rewarded with a drastic reduction in
>>> execution time.
>> 
>>> Here are some hasty tests showing just how big the difference can be:
>> 
>>> user=> (let [iterations 100000000] (time (dotimes [_ iterations]
>>> (identical? [] []))) (time (dotimes [_ iterations] (seq []))))
>>> "Elapsed time: 3.512 msecs"
>>> "Elapsed time: 2512.366 msecs"
>>> nil
>>> user=> (let [iterations 100000000] (time (dotimes [_ iterations]
>>> (identical? "" ""))) (time (dotimes [_ iterations] (seq ""))))
>>> "Elapsed time: 3.898 msecs"
>>> "Elapsed time: 5607.865 msecs"
>>> nil
>>> user=> (let [iterations 100000000] (time (dotimes [_ iterations]
>>> (identical? () ()))) (time (dotimes [_ iterations] (seq ()))))
>>> "Elapsed time: 3.768 msecs"
>>> "Elapsed time: 2258.095 msecs"
>>> nil
>> 
>>> Has any thought been given to providing a faster `empty?` that is not
>>> based on seq?
>> 
>>> Thanks,
>>> Eric Kobrin
>> 
>>> --
>>> You received this message because you are subscribed to the Google
>>> Groups "Clojure" group.
>>> To post to this group, send email to clojure@googlegroups.com
>>> Note that posts from new members are moderated - please be patient with 
>>> your first post.
>>> To unsubscribe from this group, send email to
>>> clojure+unsubscr...@googlegroups.com
>>> For more options, visit this group at
>>> http://groups.google.com/group/clojure?hl=en
>> 
>> 
> 
