Thanks a lot Michal!

On 2013-4-21, at 6:51 AM, Michał Marczyk <michal.marc...@gmail.com> wrote:

> On 20 April 2013 23:41, Tonino Jankov <tyaa...@gmail.com> wrote:
>> I mean, I think that in both cases the original sequence must, at one point
>> in time, be entirely realized in memory.
> 
> Well no, it doesn't.
> 
> The original sequence is lazy and chunked, so it looks like a chain of
> links holding 32 elements each. It so happens that here it is iterated
> over in a chunk-oblivious manner, so it's not terribly inaccurate to
> simply think about it as a singly linked list. Calculating the length
> of such a list involves walking along it while keeping a running
> counter; clearly storing a reference to the head of the list
> throughout the process is not necessary, and indeed Clojure doesn't do
> it.
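
A minimal sketch of the walk described above, for illustration only. count-walk is a hypothetical helper, not Clojure's actual countFrom; it just shows that a running counter plus a reference to the current position is all the walk needs.

```clojure
;; Hypothetical helper (not clojure.lang.RT.countFrom): count a sequence
;; by walking it while keeping a running counter.
(defn count-walk [coll]
  (loop [s (seq coll)   ; current position in the sequence
         n 0]           ; running counter
    (if s
      ;; recur rebinds s to the next cell, so the loop itself keeps no
      ;; reference to the cells it has already passed
      (recur (next s) (inc n))
      n)))

(count-walk (range 100)) ;; => 100
```

Whether the already-visited cells can actually be collected still depends on nothing else holding the head, which is the point of the rest of this message.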
> 
> Thus in the OOME-free case, the reference to the original sequence is
> thrown out almost immediately, followed by a reference to its "rest"
> part, followed by the reference to the "rest" of that etc. The
> throwing out happens inside drop-while at first and then inside the
> clojure.lang.RT.countFrom method.
> 
> The key detail here is the way in which all references to d held by
> methods in the chain leading up to the call to countFrom are cleared
> before control is handed off to countFrom. The trick involved is known
> as "locals clearing"; I've hinted at how it works in the SO answer,
> see also the methods relevant here -- clojure.lang.RT.count,
> clojure.lang.Util.ret1, clojure.lang.RT.countFrom.
> 
> A further clarification: t and d refer to two different lazy
> sequences, which are constructed by applying different transformations
> to a third sequence, which we have been referring to as "the original
> sequence". This is the huge sequence which doesn't fit in available
> memory. As it happens, while d is not the same object as the original
> sequence, it is a subsequence of the original sequence (from where the
> split-with predicate fails to the end), so it does share structure
> with it, so there is no "doubling".
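
To make the structure sharing concrete, here is a small sketch. It assumes a hand-rolled, unchunked lazy sequence (nums, original and twelfth are hypothetical names, not from the thread) so that each cell is a distinct cached object. With chunked sequences such as range, rest may allocate fresh wrapper objects, so the identity check below would not carry over directly.

```clojure
;; Hypothetical unchunked lazy sequence of integers starting at n.
(defn nums [n]
  (lazy-seq (cons n (nums (inc n)))))

(def original (nums 0))

;; d plays the role of the second half returned by split-with.
(def d (drop-while #(< % 12) original))

;; The cell reached by taking rest 12 times along the original sequence.
(def twelfth (seq (nth (iterate rest original) 12)))

(first d)                    ;; => 12
(identical? (seq d) twelfth) ;; => true: d starts at a cell of the original
```

Only the first 13 cells get realized here, so holding original in a var is harmless in this toy case.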
> 
> So, as mentioned previously, the key difference between the working
> and the non-working version is in when the reference to the original
> sequence hiding inside t gets cleared, as (count d) by itself doesn't
> require a live reference to either the original sequence or even d
> itself.
> 
> Cheers,
> Michał
> 
>> 
>> And if there is no doubling of it in the critical case, what is critical?
>> 
>> If in the unproblematic (count t) (count d) case the original sequence is
>> also, at one point, present in memory in its entirety, it means that
>> memory can handle the whole collection.
>> 
>> Maybe my questions sound a bit dubious, but anyway, I'm a bit sold out on
>> this lisp, so I want to get it right.
>> 
>> 
>> On 20 April 2013 23:33, Tonino Jankov <tyaa...@gmail.com> wrote:
>>> 
>>> Marko, you say "There is no doubling: t and d share the same underlying
>>> lazy sequence and will refer to the same objects. The trouble is only that
>>> you force the evaluation of (count d) while (count t) still waits to be
>>> evaluated, so t must definitely stay bound to the head of the shared
>>> sequence.".
>>> 
>>> But if there is no doubling, and a single lazy sequence is in memory in
>>> both cases, how can memory have a problem with one case and not with the
>>> other, if both t and d refer to the same (realized) object in memory?
>>> 
>>> In both cases, to spit out t or d, the program must have it at one point
>>> in its memory.
>>> 
>>> So what uses up the EXTRA, critical, OOME-causing memory in the
>>> (count d) (count t) case?
>>> 
>>> Or does it get garbage-collected the moment it is realized in the
>>> (count t) (count d) case?
>>> 
>>> Anyway, thanks for the exhaustive discussion, Marko & Michal
>>> 
>>> 
>>> 
>>> On 18 April 2013 00:01, Michał Marczyk <michal.marc...@gmail.com> wrote:
>>>> 
>>>> Note that the problem is not that t needs to hang around; it's that t
>>>> holds a lazy sequence which hangs around in unrealized state. That
>>>> lazy sequence internally holds a thunk -- a nullary function --
>>>> capable of producing the actual sequence elements on request. It is
>>>> this thunk that holds a reference to the underlying huge sequence.
>>>> Once t is realized, the actual sequence gets cached and the thunk
>>>> becomes eligible for GC (the field holding it is set to null). If it
>>>> then needs to stay around for some other purpose, that is no problem:
>>>> 
>>>> user=> (let [[t d] (split-with #(< % 12) (range 1e8))]
>>>>          [(count t) (count d) (count t)])
>>>> [12 99999988 12]
>>>> 
>>>> (Or I suppose you could return [(count d) (count t)], but call
>>>> (dorun t) before that.)
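
A small-scale sketch of the (dorun t) variant mentioned above. split-counts is a hypothetical helper name, and the input is deliberately tiny. I have not measured this on a sequence too large for the heap, so treat the memory behaviour as following from the explanation in this thread rather than from this snippet.

```clojure
;; Hypothetical helper: realize the short prefix t before counting the
;; long tail d, so that t no longer sits around in unrealized state.
(defn split-counts [pred coll]
  (let [[t d] (split-with pred coll)]
    (dorun t)                 ; walk t now; its elements get cached
    [(count d) (count t)]))

(split-counts #(< % 12) (range 100)) ;; => [88 12]

;; realized? reports whether a lazy sequence has been forced yet.
(let [t (take-while #(< % 12) (range 100))]
  [(realized? t) (do (dorun t) (realized? t))]) ;; => [false true]
```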
>>>> 
>>>> Also, just to be explicit about this, calling (let [x
>>>> (produce-huge-seq)] (count x)) is not a problem, because x gets
>>>> cleared prior to control being handed off to count.
>>>> 
>>>> I've also discussed the details of what's going on on SO, which is
>>>> where I first noticed this question:
>>>> 
>>>> http://stackoverflow.com/questions/15994316/clojure-head-retention
>>>> 
>>>> Cheers,
>>>> Michał
>>>> 
>>>> 
>>>> On 17 April 2013 22:53, Marko Topolnik <marko.topol...@gmail.com> wrote:
>>>>> On Monday, April 15, 2013 1:50:37 AM UTC+2, tyaakow wrote:
>>>>>> 
>>>>>> Thank you for your response, Marko.
>>>>>> I want to clarify one more thing:
>>>>>> 
>>>>>> (let [[t d] (split-with #(< % 12) (range 1e8))]
>>>>>>    [(count d) (count t)])
>>>>>> 
>>>>>> 
>>>>>> Does this mean that while (count d) is realizing the (range 1e8) seq,
>>>>>> it (also) becomes realized within t, and therefore doubles (range 1e8)
>>>>>> in memory, causing an OOME while (count d) is still not finished?
>>>>> 
>>>>> 
>>>>> There is no doubling: t and d share the same underlying lazy sequence and
>>>>> will refer to the same objects. The trouble is only that you force the
>>>>> evaluation of (count d) while (count t) still waits to be evaluated, so t
>>>>> must definitely stay bound to the head of the shared sequence.
>>>>> 
>>>>>> 
>>>>>> Also, you say "As count realizes one element after another, it doesn't
>>>>>> on its own retain a reference to the past elements."
>>>>>> 
>>>>>> Does this mean that, e.g. in the REPL, when I do some (count xyz) and it
>>>>>> realizes xyz, it will later need to be reevaluated (realized again) if I
>>>>>> require xyz within the REPL? (I presume that if I require xyz later
>>>>>> within a file, it won't be GC'd because of that, and Clojure will know
>>>>>> it shouldn't be GC'd.)
>>>>> 
>>>>> 
>>>>> Be careful to observe that I say "doesn't on its own retain a reference to
>>>>> the past elements". If you have xyz bound to the head of your sequence, it
>>>>> will force the entire sequence to stay in memory for as long as xyz is
>>>>> within scope (if it's a local) or indefinitely (if it's a global def'd var).
>>>>> Generally, a lazy sequence never gets un-realized once it has been
>>>>> realized; the only option is for it to disappear entirely (turn into
>>>>> garbage).
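
A quick way to see this at the REPL, as a sketch: calls and xs are made-up names, and the side-effecting counter exists only to show how many times elements get computed.

```clojure
;; Counts how many times the mapping function actually runs.
(def calls (atom 0))

;; xs is bound to the head, so realized elements stay cached and reachable.
(def xs (map (fn [x] (swap! calls inc) x) (range 10)))

(count xs) ;; => 10, realizes every element
(count xs) ;; => 10, walks the already-cached elements
@calls     ;; => 10, not 20: nothing was realized a second time
```

The flip side is the one described above: as long as xs stays def'd, none of those elements can be collected.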
>>>>> 
>>>>> -marko
>>>>> 
>>>>> --
>>>>> --
>>>>> You received this message because you are subscribed to the Google
>>>>> Groups "Clojure" group.
>>>>> To post to this group, send email to clojure@googlegroups.com
>>>>> Note that posts from new members are moderated - please be patient with
>>>>> your
>>>>> first post.
>>>>> To unsubscribe from this group, send email to
>>>>> clojure+unsubscr...@googlegroups.com
>>>>> For more options, visit this group at
>>>>> http://groups.google.com/group/clojure?hl=en
>>>>> ---
>>>>> You received this message because you are subscribed to the Google
>>>>> Groups
>>>>> "Clojure" group.
>>>>> To unsubscribe from this group and stop receiving emails from it, send
>>>>> an
>>>>> email to clojure+unsubscr...@googlegroups.com.
>>>>> For more options, visit https://groups.google.com/groups/opt_out.