Michal, do you know of any resources or links that introduce the internals of locals clearing, or which Clojure source files implement this technique?
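
For anyone reading along, here is a minimal sketch of the two evaluation orders discussed in the quoted thread below. It is scaled down: the function names count-t-first and count-d-first and the size 100 are my own assumptions for illustration, chosen so that both orders finish on any heap. The thread itself uses (range 1e8), where only the first order is reported to work.

```clojure
;; Scaled-down sketch of the two orderings from the quoted thread.
;; The helper names and the small n are hypothetical, not from the thread.

(defn count-t-first
  "Counts t before d. Per the explanation quoted below, realizing t first
  lets the thunk inside t (which references the original sequence) become
  eligible for GC before d is walked."
  [n]
  (let [[t d] (split-with #(< % 12) (range n))]
    [(count t) (count d)]))

(defn count-d-first
  "Counts d while t is still unrealized. With a very large n this is the
  ordering the thread reports as ending in an OutOfMemoryError."
  [n]
  (let [[t d] (split-with #(< % 12) (range n))]
    [(count d) (count t)]))

(count-t-first 100) ;; => [12 88]
(count-d-first 100) ;; => [88 12]
```

With a small n both calls return the same counts in swapped positions; the difference described in the thread only shows up in memory use once n is large enough that the whole sequence cannot be held at once.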
On 2013-4-23, at 3:55 PM, xumingmingv <xumingmi...@gmail.com> wrote:

> Thanks a lot Michal!
>
> On 2013-4-21, at 6:51 AM, Michał Marczyk <michal.marc...@gmail.com> wrote:
>
>> On 20 April 2013 23:41, Tonino Jankov <tyaa...@gmail.com> wrote:
>>> I mean, I think that in both cases the original sequence at one point in
>>> time must be entirely realized in memory.
>>
>> Well no, it doesn't have to be.
>>
>> The original sequence is lazy and chunked, so it looks like a chain of
>> links holding 32 elements each. It so happens that here it is iterated
>> over in a chunk-oblivious manner, so it's not terribly inaccurate to
>> simply think about it as a singly linked list. Calculating the length
>> of such a list involves walking along it while keeping a running
>> counter; clearly storing a reference to the head of the list
>> throughout the process is not necessary, and indeed Clojure doesn't do
>> it.
>>
>> Thus in the OOME-free case, the reference to the original sequence is
>> thrown out almost immediately, followed by a reference to its "rest"
>> part, followed by the reference to the "rest" of that etc. The
>> throwing out happens inside drop-while at first and then inside the
>> clojure.lang.RT.countFrom method.
>>
>> The key detail here is the way in which all references to d held by
>> methods in the chain leading up to the call to countFrom are cleared
>> before control is handed off to countFrom. The trick involved is known
>> as "locals clearing"; I've hinted at how it works in the SO answer,
>> see also the methods relevant here -- clojure.lang.RT.count,
>> clojure.lang.Util.ret1, clojure.lang.RT.countFrom.
>>
>> A further clarification: t and d refer to two different lazy
>> sequences, which are constructed by applying different transformations
>> to a third sequence, which we have been referring to as "the original
>> sequence". This is the huge sequence which doesn't fit in available
>> memory.
>> As it happens, while d is not the same object as the original
>> sequence, it is a subsequence of the original sequence (from where the
>> split-with predicate fails to the end), so it does share structure
>> with it, so there is no "doubling".
>>
>> So, as mentioned previously, the key difference between the working
>> and the non-working version is in when the reference to the original
>> sequence hiding inside t gets cleared, as (count d) by itself doesn't
>> require a live reference to either the original sequence or even d
>> itself.
>>
>> Cheers,
>> Michał
>>
>>>
>>> And if there is no doubling of it in the critical case, what is critical?
>>>
>>> If in the (count t) (count d) case (the non-problematic one) the original
>>> sequence also is, at one point, present in memory in its entirety, it
>>> means that memory can handle the whole collection.
>>>
>>> Maybe my questions sound a bit dubious, but anyway, I'm a bit sold on
>>> this Lisp, so I want to get it right.
>>>
>>>
>>> On 20 April 2013 23:33, Tonino Jankov <tyaa...@gmail.com> wrote:
>>>>
>>>> Marko, you say "There is no doubling: t and d share the same underlying
>>>> lazy sequence and will refer to the same objects. The trouble is only that
>>>> you force the evaluation of (count d) while (count t) still waits to be
>>>> evaluated, so t must definitely stay bound to the head of the shared
>>>> sequence.".
>>>>
>>>> But if there is no doubling, and a single lazy sequence is in memory in
>>>> both cases, how does memory then have a problem with one case and not
>>>> with the other, if both t and d refer to the same (realized) object in
>>>> memory?
>>>>
>>>> In both cases, to spit out t or d, the program must have it at one point
>>>> in its memory.
>>>>
>>>> So what spends the EXTRA, critical, OOME memory in the (count d) (count t)
>>>> case?
>>>>
>>>> Or does it get garbage-collected the moment it gets realized in the
>>>> (count t) (count d) case?
>>>>
>>>> Anyway, thanks for the exhaustive discussion, Marko & Michal
>>>>
>>>>
>>>>
>>>> On 18 April 2013 00:01, Michał Marczyk <michal.marc...@gmail.com> wrote:
>>>>>
>>>>> Note that the problem is not that t needs to hang around; it's that t
>>>>> holds a lazy sequence which hangs around in unrealized state. That
>>>>> lazy sequence internally holds a thunk -- a nullary function --
>>>>> capable of producing the actual sequence elements on request. It is
>>>>> this thunk that holds a reference to the underlying huge sequence.
>>>>> Once t is realized, the actual sequence gets cached and the thunk
>>>>> becomes eligible for GC (the field holding it is set to null). If it
>>>>> then needs to stay around for some other purpose, that is no problem:
>>>>>
>>>>> user=> (let [[t d] (split-with #(< % 12) (range 1e8))]
>>>>>          [(count t) (count d) (count t)])
>>>>> [12 99999988 12]
>>>>>
>>>>> (Or I suppose you could return [(count d) (count t)], but (dorun t)
>>>>> before that.)
>>>>>
>>>>> Also, just to be explicit about this, calling (let [x
>>>>> (produce-huge-seq)] (count x)) is not a problem, because x gets
>>>>> cleared prior to control being handed off to count.
>>>>>
>>>>> I've also discussed the details of what's going on on SO, which is
>>>>> where I first noticed this question:
>>>>>
>>>>> http://stackoverflow.com/questions/15994316/clojure-head-retention
>>>>>
>>>>> Cheers,
>>>>> Michał
>>>>>
>>>>>
>>>>> On 17 April 2013 22:53, Marko Topolnik <marko.topol...@gmail.com> wrote:
>>>>>> On Monday, April 15, 2013 1:50:37 AM UTC+2, tyaakow wrote:
>>>>>>>
>>>>>>> Thank you for your response, Marko.
>>>>>>> I want to clarify one more thing:
>>>>>>>
>>>>>>> (let [[t d] (split-with #(< % 12) (range 1e8))]
>>>>>>>   [(count d) (count t)])
>>>>>>>
>>>>>>>
>>>>>>> Does this mean that while (count d) is realizing the (range 1e8) seq,
>>>>>>> it becomes (also) realized within t, and therefore it doubles
>>>>>>> (range 1e8) in memory, causing an OOME while (count d) is still not
>>>>>>> finished?
>>>>>>
>>>>>>
>>>>>> There is no doubling: t and d share the same underlying lazy sequence
>>>>>> and will refer to the same objects. The trouble is only that you force
>>>>>> the evaluation of (count d) while (count t) still waits to be
>>>>>> evaluated, so t must definitely stay bound to the head of the shared
>>>>>> sequence.
>>>>>>
>>>>>>>
>>>>>>> Also, you say "As count realizes one element after another, it doesn't
>>>>>>> on its own retain a reference to the past elements."
>>>>>>>
>>>>>>> Does this mean that, e.g. in the REPL, when I do some (count xyz) and
>>>>>>> it realizes xyz, it will later need to be reevaluated (realized again)
>>>>>>> if I require xyz within the REPL? (I presume that if I require xyz
>>>>>>> later within a file, it won't be GC'd because of that, and Clojure
>>>>>>> will know it shouldn't be GC'd.)
>>>>>>
>>>>>>
>>>>>> Be careful to observe that I say "doesn't on its own retain a reference
>>>>>> to the past elements". If you have xyz bound to the head of your
>>>>>> sequence, it will force the entire sequence to stay in memory for as
>>>>>> long as xyz is within scope (if it's a local) or indefinitely (if it's
>>>>>> a global def'd var).
>>>>>> Generally, a lazy sequence never gets un-realized once it got
>>>>>> realized---the only option is for it to disappear entirely (turn into
>>>>>> garbage).
>>>>>>
>>>>>> -marko
>>>>>>
>>>>>> --
>>>>>> --
>>>>>> You received this message because you are subscribed to the Google
>>>>>> Groups "Clojure" group.
>>>>>> To post to this group, send email to clojure@googlegroups.com
>>>>>> Note that posts from new members are moderated - please be patient with
>>>>>> your first post.
>>>>>> To unsubscribe from this group, send email to
>>>>>> clojure+unsubscr...@googlegroups.com
>>>>>> For more options, visit this group at
>>>>>> http://groups.google.com/group/clojure?hl=en
>>>>>> ---
>>>>>> You received this message because you are subscribed to the Google
>>>>>> Groups "Clojure" group.
>>>>>> To unsubscribe from this group and stop receiving emails from it, send
>>>>>> an email to clojure+unsubscr...@googlegroups.com.
>>>>>> For more options, visit https://groups.google.com/groups/opt_out.