Re: Problem with calling count on large lazy sequence?

[EMAIL PROTECTED] Sun, 07 Dec 2008 12:41:17 -0800

On Dec 7, 5:20 am, Meikel Brandmeyer <[EMAIL PROTECTED]> wrote:
> Hi,
>
> Am 07.12.2008 um 09:55 schrieb [EMAIL PROTECTED]:
>
> > This solution worked for me and you can see my patch at the following
> > url.
>
> >http://paste.lisp.org/display/71744
>
> This is of course no solution to the problem, but if you don't want
> to start patching Clojure you can have a workaround for seqs:
> (reduce (fn [x _] (inc x)) 0 the-seq).
>
> > Keep in mind I just whipped this up to test Chouser's idea.  I'm
> > completely new to the Clojure code base so this is a naive fix at
> > best.  I noticed that quite a few classes seem to fall under the
> > IPersistentCollection umbrella so I'm sure the real patch is bigger
> > than this.  Also the array only needs to be of size 1, obviously.
>
> I'm not sure, that this is a general problem. Vectors, maps, sets
> and lists have all an O(1) count. So I presume the have this stored
> in some instance variable somewhere.
>
> For general seqs however, this is not true. Eg. a line seq doesn't
> know the number of lines in advance. A seq originating from a
> collection however, could provide also an O(1) count.
>
> Or am I mistaken?
>
> Sincerely
> Meikel
>
>  smime.p7s
> 5KViewDownload

You are correct, this is not a general problem.  Most things (ie non-
lazy things) have a member field that keeps their size count so that
the operation is always O(1).  In fact, this is typically a public
field that can be accessed rather than a method.  In lazy land this
obviously will not work because the only way to get the count is to
fully realize the sequence and the whole point of being lazy is to not
preemptively do things like tally the size.  Obviously you understand
all this, but I thought I'd reiterate it for my own purposes.

However, when we do decide to fully realize the sequence I think we
should be able to do it without Clojure implicitly keeping a reference
(ie a Java reference) to the head.  I would think I should be able to
call count, or map, reduce, or whatever without incurring the cost of
keeping the whole sequence in memory at one time.  It should be just
like a read-only Iterator.  Heck, I should be able to call count on a
centillion sized sequence if I want (except for the fact that the
result will be incorrect since count returns an int).

As for solutions, Chouser's first stab at it was to do the count of
ASeq right in the RT.count method.  The more I think about it the more
I like that solution over the one I posted before.  Why does the count
have to be a member function of the sequence?  However, it seems that
ASeq count is called from other areas besides RT.count, and I'm not
sure what those are.

I posted another patch (below) that uses the ASeq's Iterator.  I still
think this is a shallow fix but I think I like it better than what I
posted before.  My ignorance of the code base, and the fact that I'm
not Joshua Block impede my ability to come up with anything better
just yet.  I'm hoping some more people will read this thread and offer
their input.

http://paste.lisp.org/display/71773

-Ryan
--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google Groups 
"Clojure" group.
To post to this group, send email to clojure@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at 
http://groups.google.com/group/clojure?hl=en
-~----------~----~----~----~------~----~------~--~---
Re: Problem with calling count on large lazy sequence?

Reply via email to