Michal, do you know of any resources/links that introduce the internals of
locals clearing? Or the Clojure source files that implement this technique?

On 2013-4-23, at 3:55 PM, xumingmingv <xumingmi...@gmail.com> wrote:

> Thanks a lot Michal!
> 
> On 2013-4-21, at 6:51 AM, Michał Marczyk <michal.marc...@gmail.com> wrote:
> 
>> On 20 April 2013 23:41, Tonino Jankov <tyaa...@gmail.com> wrote:
>>> I mean, I think that in both cases the original sequence must, at one point
>>> in time, be entirely realized in memory.
>> 
>> Well no, it doesn't.
>> 
>> The original sequence is lazy and chunked, so it looks like a chain of
>> links holding 32 elements each. It so happens that here it is iterated
>> over in a chunk-oblivious manner, so it's not terribly inaccurate to
>> simply think about it as a singly linked list. Calculating the length
>> of such a list involves walking along it while keeping a running
>> counter; clearly storing a reference to the head of the list
>> throughout the process is not necessary, and indeed Clojure doesn't do
>> it.
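>> 
>> A minimal sketch of that walk, assuming a hypothetical helper named
>> walk-count (it is not part of clojure.core, and the real work is done in
>> Java by clojure.lang.RT.countFrom): each iteration rebinds s to the next
>> link, so nothing in the loop keeps pointing at the head.
>> 
>> ```clojure
>> ;; walk-count is a hypothetical illustration, not Clojure's implementation.
>> (defn walk-count [coll]
>>   (loop [s (seq coll)   ; current link, or nil at the end
>>          n 0]           ; running counter
>>     (if s
>>       (recur (next s) (inc n))
>>       n)))
>> 
>> (walk-count (range 1000))
>> ;; => 1000
>> ```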
>> 
>> Thus in the OOME-free case, the reference to the original sequence is
>> thrown out almost immediately, followed by a reference to its "rest"
>> part, followed by the reference to the "rest" of that etc. The
>> throwing out happens inside drop-while at first and then inside the
>> clojure.lang.RT.countFrom method.
>> 
>> The key detail here is the way in which all references to d held by
>> methods in the chain leading up to the call to countFrom are cleared
>> before control is handed off to countFrom. The trick involved is known
>> as "locals clearing"; I've hinted at how it works in the SO answer,
>> see also the methods relevant here -- clojure.lang.RT.count,
>> clojure.lang.Util.ret1, clojure.lang.RT.countFrom.
>> 
>> A further clarification: t and d refer to two different lazy
>> sequences, which are constructed by applying different transformations
>> to a third sequence, which we have been referring to as "the original
>> sequence". This is the huge sequence which doesn't fit in available
>> memory. As it happens, while d is not the same object as the original
>> sequence, it is a subsequence of the original sequence (from where the
>> split-with predicate fails to the end), so it does share structure
>> with it, so there is no "doubling".
>> 
>> So, as mentioned previously, the key difference between the working
>> and the non-working version is in when the reference to the original
>> sequence hiding inside t gets cleared, as (count d) by itself doesn't
>> require a live reference to either the original sequence or even d
>> itself.
>> 
>> Cheers,
>> Michał
>> 
>>> 
>>> And if there is no doubling of it in critical case, what is critical?
>>> 
>>> If in the (count t) (count d) case (the non-problematic one) the original
>>> sequence is also, at one point, present in memory in its entirety, it means
>>> that memory can handle the whole collection.
>>> 
>>> Maybe my questions sound a bit dubious, but anyway, I'm a bit sold out on
>>> this lisp, so I want to get it right.
>>> 
>>> 
>>> On 20 April 2013 23:33, Tonino Jankov <tyaa...@gmail.com> wrote:
>>>> 
>>>> Marko, you say "There is no doubling: t and d share the same underlying
>>>> lazy sequence and will refer to the same objects. The trouble is only that
>>>> you force the evaluation of (count d) while (count t) still waits to be
>>>> evaluated, so t must definitely stay bound to the head of the shared
>>>> sequence.".
>>>> 
>>>> But if there is no doubling, and a single lazy sequence is in memory in
>>>> both cases, how does memory have a problem with one case and not with the
>>>> other, if both t and d refer to the same (realized) object in memory?
>>>> 
>>>> In both cases, to spit out t or d, the program must have it at one point
>>>> in its memory.
>>>> 
>>>> So what uses the EXTRA, critical, OOME-causing memory in the
>>>> (count d) (count t) case?
>>>> 
>>>> Or does it get garbage collected the moment it gets realized in the
>>>> (count t) (count d) case?
>>>> 
>>>> Anyway, thanks for the exhaustive discussion, Marko & Michal
>>>> 
>>>> 
>>>> 
>>>> On 18 April 2013 00:01, Michał Marczyk <michal.marc...@gmail.com> wrote:
>>>>> 
>>>>> Note that the problem is not that t needs to hang around; it's that t
>>>>> holds a lazy sequence which hangs around in unrealized state. That
>>>>> lazy sequence internally holds a thunk -- a nullary function --
>>>>> capable of producing the actual sequence elements on request. It is
>>>>> this thunk that holds a reference to the underlying huge sequence.
>>>>> Once t is realized, the actual sequence gets cached and the thunk
>>>>> becomes eligible for GC (the field holding it is set to null). If it
>>>>> then needs to stay around for some other purpose, that is no problem:
>>>>> 
>>>>> user=> (let [[t d] (split-with #(< % 12) (range 1e8))]
>>>>>         [(count t) (count d) (count t)])
>>>>> [12 99999988 12]
>>>>> 
>>>>> (Or I suppose you could return [(count d) (count t)], but (dorun t)
>>>>> before that.)
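>>>>> 
>>>>> A sketch of that (dorun t) variant, using a much smaller range so it
>>>>> runs quickly; at this size it only shows the shape of the code, not the
>>>>> memory behaviour. The helper names counts-after-dorun and
>>>>> realized-before-after are hypothetical, made up for this illustration.
>>>>> 
>>>>> ```clojure
>>>>> ;; Hypothetical helper: realize t first, then count d before t.
>>>>> (defn counts-after-dorun [n]
>>>>>   (let [[t d] (split-with #(< % 12) (range n))]
>>>>>     (dorun t)               ; realizes t, so its thunk can be dropped
>>>>>     [(count d) (count t)]))
>>>>> 
>>>>> ;; Hypothetical helper: is t realized before and after dorun?
>>>>> (defn realized-before-after [n]
>>>>>   (let [[t _] (split-with #(< % 12) (range n))]
>>>>>     [(realized? t) (do (dorun t) (realized? t))]))
>>>>> 
>>>>> (counts-after-dorun 100000)     ;; => [99988 12]
>>>>> (realized-before-after 100000)  ;; => [false true]
>>>>> ```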
>>>>> 
>>>>> Also, just to be explicit about this, calling (let [x
>>>>> (produce-huge-seq)] (count x)) is not a problem, because x gets
>>>>> cleared prior to control being handed off to count.
>>>>> 
>>>>> I've also discussed the details of what's going on on SO, which is
>>>>> where I first noticed this question:
>>>>> 
>>>>> http://stackoverflow.com/questions/15994316/clojure-head-retention
>>>>> 
>>>>> Cheers,
>>>>> Michał
>>>>> 
>>>>> 
>>>>> On 17 April 2013 22:53, Marko Topolnik <marko.topol...@gmail.com> wrote:
>>>>>> On Monday, April 15, 2013 1:50:37 AM UTC+2, tyaakow wrote:
>>>>>>> 
>>>>>>> Thank you for your response, Marko.
>>>>>>> I want to clarify one more thing:
>>>>>>> 
>>>>>>> (let [[t d] (split-with #(< % 12) (range 1e8))]
>>>>>>>   [(count d) (count t)])
>>>>>>> 
>>>>>>> 
>>>>>>> does this mean that while (count d) is realizing (range 1e8) seq, it
>>>>>>> becomes (also) realized within t, therefore it doubles (range 1e8) in
>>>>>>> memory causing OOME while (count d) is still not finished?
>>>>>> 
>>>>>> 
>>>>>> There is no doubling: t and d share the same underlying lazy sequence
>>>>>> and will refer to the same objects. The trouble is only that you force
>>>>>> the evaluation of (count d) while (count t) still waits to be evaluated,
>>>>>> so t must definitely stay bound to the head of the shared sequence.
>>>>>> 
>>>>>>> 
>>>>>>> Also, you say "As count realizes one element after another, it doesn't
>>>>>>> on its own retain a reference to the past elements."
>>>>>>> 
>>>>>>> Does this mean that, e.g. in the REPL, when I do some (count xyz) and it
>>>>>>> realizes xyz, it will later need to be reevaluated (realized again) if I
>>>>>>> require xyz within the REPL? (I presume that if I require xyz later
>>>>>>> within a file, it won't be GC'd because of that, and Clojure will know
>>>>>>> it shouldn't be GC'd.)
>>>>>> 
>>>>>> 
>>>>>> Be careful to observe that I say "doesn't on its own retain a reference
>>>>>> to the past elements". If you have xyz bound to the head of your
>>>>>> sequence, it will force the entire sequence to stay in memory for as long
>>>>>> as xyz is within scope (if it's a local) or indefinitely (if it's a
>>>>>> global def'd var). Generally, a lazy sequence never gets un-realized once
>>>>>> it got realized---the only option is for it to disappear entirely (turn
>>>>>> into garbage).
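>>>>>> 
>>>>>> To make the "never gets un-realized" point concrete, here is a small
>>>>>> sketch; the name realizations and the mapping function are made up for
>>>>>> the illustration. The function bumps a counter each time an element is
>>>>>> actually computed, so a second count over the same head shows the cached
>>>>>> values being reused rather than recomputed.
>>>>>> 
>>>>>> ```clojure
>>>>>> ;; Hypothetical counter of how many elements were actually computed.
>>>>>> (def realizations (atom 0))
>>>>>> 
>>>>>> ;; A global var holding the head, so the realized seq stays reachable.
>>>>>> (def xyz (map (fn [x] (swap! realizations inc) x) (range 10)))
>>>>>> 
>>>>>> (count xyz)     ;; => 10, realizes every element
>>>>>> (count xyz)     ;; => 10, walks the already cached elements
>>>>>> @realizations   ;; => 10, not 20
>>>>>> ```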
>>>>>> 
>>>>>> -marko
>>>>>> 
>>>>>> --
>>>>>> --
>>>>>> You received this message because you are subscribed to the Google
>>>>>> Groups "Clojure" group.
>>>>>> To post to this group, send email to clojure@googlegroups.com
>>>>>> Note that posts from new members are moderated - please be patient with
>>>>>> your
>>>>>> first post.
>>>>>> To unsubscribe from this group, send email to
>>>>>> clojure+unsubscr...@googlegroups.com
>>>>>> For more options, visit this group at
>>>>>> http://groups.google.com/group/clojure?hl=en
>>>>>> ---
>>>>>> You received this message because you are subscribed to the Google
>>>>>> Groups
>>>>>> "Clojure" group.
>>>>>> To unsubscribe from this group and stop receiving emails from it, send
>>>>>> an
>>>>>> email to clojure+unsubscr...@googlegroups.com.
>>>>>> For more options, visit https://groups.google.com/groups/opt_out.
>>>>>> 
>>>>>> 
>>>>> 
>>>> 
>>> 
>> 
> 
