On Thu, Oct 3, 2013 at 12:40 PM, Joachim Durchholz <[email protected]> wrote:
> Am 03.10.2013 17:04, schrieb Ondřej Čertík:
>>
>> On Thu, Oct 3, 2013 at 1:50 AM, Joachim Durchholz <[email protected]>
>> wrote:
>>>
>>> Am 03.10.2013 07:46, schrieb Ondřej Čertík:
>>
>> [...]
>>>>>
>>>>> Garbage management is a real issue in C++.
>>>>
>>>>
>>>> Currently I use reference counting. I don't see any issue --- can you
>>>> elaborate
>>>> on what you mean?
>>>
>>>
>>> See http://www.hpl.hp.com/personal/Hans_Boehm/gc/issues.html for issues
>>> with
>>> and around the BDW collector.
>>>
>>> There's also the seminal Zorn paper "Memory Allocation Costs in Large C
>>> and
>>> C++ Programs" [...]
>>
>>
>> I see, thanks for the pointers. They are a bit old, so maybe things
>> have improved since then.
>
>
> I think both mark&sweep and reference-counting approaches have improved
> since that paper, but the relative performance is still the same.
>
> RC does have its niches, but the problem is that as applications evolve,
> they tend to leave the RC-friendly niche and then you need to rewrite all
> the allocation code because the RC scheme is visible over all of the code.
> Hmm... okay, you could probably stub out whatever SmartPointer<> class
> you're using, dropping all the reference counting code, the variables, and
> the free calls, and see what the performance is. Your code might still be
> geared towards being RC-friendly instead of mark&sweep-friendly, but it
> would give at least a preliminary result.
>
> In fact it would be interesting to see how the performance figures have
> evolved since the papers were written.

Good ideas --- here is the RCP class that I created for CSymPy:

https://github.com/certik/csympy/blob/master/src/csympy_rcp.h#L59

$ ./expand2
Expanding: ((y + x + z + w)^15 + w)*(y + x + z + w)^15
1217ms
number of terms: 6272


So I deleted all reference counting and deleting of objects (patch
here https://gist.github.com/certik/6829650)
--- but I left the RCP class for now. Just to get an idea.

$ ./expand2
Expanding: ((y + x + z + w)^15 + w)*(y + x + z + w)^15
1382ms
number of terms: 6272


And things got slower... Haha, that was unexpected. Why would that be?

>> In any case, I feel reference counting is more robust/predictable, so
>> that's what I currently use in CSymPy,
>
>
> Heh. Actally, you're just exchanging one kind of unpredictability with
> another.
> With a mark&sweep collector, you don't know when a block will be
> deallocated.
> With an RC collector, you don't know how much time a pointer update will
> take, because the pointer may be the last reference to a large network of
> objects and the update might cause an avalanche of count updates and free()
> calls.
>
> The RC collector scene is aware of the problem, and they're trying to remedy
> this by deferring free() calls - but that means you don't know when a block
> will be deallocated, so we're back to the original behaviour, just less
> efficient (because reference count updates cost time and have horrible cache
> locality).
>
> You can make a mark&sweep collector concurrent, or integrate it into a
> malloc() function that does a bit of GC work on each call. With that
> approach, you can tune the GC to the minimal level of activity that prevents
> the heap from growing. You could make it a setting that can be tuned to
> whatever use case you have - switch GC off entirely by default when in the
> interactive loop, and letting it kick in if a command turns out to eat
> memory. Or have it trade speed vs. memory usage - a higher GC activity will
> be slower but take less memory.
>
> Also, RC collectors cannot handle cycles. There are various approaches to
> remedy this, but all that I have seen just graft a standard mark&sweep
> collector on top of the RC machinery - I doubt that that's going to be more
> efficient than having a mark&sweep collector right from the start.
>
> Outside of the C++ community, RC is generally discredited as a dead end. I
> don't know how much of that is based on fact vs. hearsay. Also, RC might be
> the best alternative for C++'s semantics anyway... the Zorn paper says
> otherwise.
> The tl;dr variant of the Zorn paper is: replace malloc() with calls into a
> garbage-collecting memory management, cancel free() out via the
> preprocessor, and the majority of programs speeds up, a few slow down.
>
> There's also consensus that C++'s concept of "memory deallocation is
> resource deinitialization" does not translate to a garbage-collected
> environment at all. Python uses context handlers, and I consider that
> approach really solid (Java recently acquired something similar).
> I guess you'd want to create a ContextHandler class for your C++ code since
> you can't rely on free() being called anymore.
> This aspect isn't thoroughly covered in the Zorn paper IIRC.

Based on my experience, one always has to try various approaches and
pick the fastest.
RC has been pretty robust in my experience, but I will try some GC approaches
as well. I feel there is at least 10% maybe more potential for speedup thanks to
better memory management.

Ondrej

-- 
You received this message because you are subscribed to the Google Groups 
"sympy" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To post to this group, send email to [email protected].
Visit this group at http://groups.google.com/group/sympy.
For more options, visit https://groups.google.com/groups/opt_out.

Reply via email to