On Sun, Jun 14, 2009 at 22:47, Norbert Nemec <[email protected]>wrote:

> I must admit that my statement is mostly theoretical, based on the
> profiling data, looking at the code and realizing how many steps there are
> that the compiler cannot possibly optimize away. Making a prediction about
> absolute gain is hard without trying.
>
> What did David do? Throw out reference counting completely? How much effort
> is that? I would like to try it out myself and do some actual measurements.


I removed the reference counting code entirely and replaced it by libgc.

That was not a huge effort. As I remember, the hardest part was auditing the
code for cases were timely destruction of heap objects was important. IIRC
there were a few such occurences around the Editor and UI code.

What disadvantages of garbage collection would be significant to TeXmacs?
> The main problem that I know of is that the memory requirements may be
> somewhat higher due to delayed deallocation. That, however, may easily be
> outweighed by the size-reduction of all objects by the counter itself.


>From memory, the transition caused approximately 2x increase in memory usage
and a 2x slowdown in large operations such as the compilation of the
complete documentation. The slowdown in interactive use was noticeable too.

Memory usage patterns that makes it tricky to support by a generic GC: it
allocates vast numbers of very small objects (mainly tree_rep and
string_rep), many of which are long lived. Presumably a generational GC
would perform well, but IIRC, libgc was not generational at the time.

It is entirely possible that better generic GC services are available today
that would provide better results.

How would const references help avoiding reference counting?


A large amount of the no-op reference counting done by texmacs occurs for
method parameters: when a counted pointer is passed into a method call, it
is incremented when entering the method, and decremented when unwinding the
method's stack frame. This is particularly bad in tree-traversal code when
all tree_rep objects in a tree are touched on recursive descent and then
again on unwinding, causing many cache misses.

Presumably, good speedups could be achieved by using uncounted pointers when
the compiler could prove that 1. the callees do not store the pointer in
data structures 2. the pointer cannot be returned up the call stack above
the caller that own the last counted reference to the object.

I failed to find a way to get the C++ compiler to prove point 2, that did
not require major changes to the texmacs code, so I gave up on this
approach. Some of the newer C++ 0x language features might make it possible
to solve this problem.

While we speak of memory management, I had suspected for a long time that
one cause of texmacs slowness is bad time-space locality for the omnipresent
tree-string structure. IIRC, the small-objects allocator of texmacs uses a
naive LIFO free-list approach to reusing memory. I suspect that it leads to
tree structures where neighboring nodes are scattered in memory.

One project that could potentially have large benefits in reducing memory
management overhead and cache misses would be to use contiguous buffers for
both tree and string blocks forming a single structure. The blocks
themselves would not be managed (no GC or refcounting), and the structure
would be periodically repacked (maybe using a copying gc approach) to
reclaim memory add inserted blocks at the ideal place in the structure.
_______________________________________________
Texmacs-dev mailing list
[email protected]
http://lists.gnu.org/mailman/listinfo/texmacs-dev

Reply via email to