There's an open problem here wrt to garbage collection. Felix gc system is basically designed as a 'semi-automatic' memory recovery system that's a bit smarter than C++ destructors and manual deletion, in that it allows one to cleanup memory without too much work keeping track of how to do so.
It is characterised by: (a) world stop for multi-processor (b) high memory overhead per block (currently 6 machine words) (c) can only be called inside the driver, that is, not in any functional code, because functions use the machine stack, and, the machine stack is not accessible to the collector (d) naive mark/sweep style exact algorithm (e) more or less ISO C++ compliant and this machine independent For managing Felix fibres the collector is probably very good: it works well with the large stack frames which tend to be produced by inlining. Fibres are stackless so (c) isn't a problem. In addition, one can easily use, for example, a wrapped C++ STL vector of integers without needing any GC support at all -- so the demands on the GC in programs can be as low as zero. Even when garbage is spewed, for many short scripts, memory is not exhausted and there's no need to worry about collection: almost all the regression and tutorial tests fall into this category. But for small objects in functional code, particular variants with small arguments (eg integer), or lists of small objects, it isn't likely to deliver good performance .. and the fact it can't be called in heavily functional code means a significant garbage tip is required to hold the garbage until it can be collected. So, I'm looking for some solutions. The Boehm collector is somewhat expensive, but current versions support both incremental collection and extremely low memory overhead. Unfortunately it doesn't work so well with C++, is plagued with portability problems, and depends on various invariants which prevent, for example digital trees being used. A dedicated high performance concurrent or even parallel collector may be possible, but it isn't clear Felix can have enough control to allow it to operate without throwing out C/C++ object model compatibility constraints. Instead, my thoughts are a type based hybrid system. Currently every object has a header containing an RTTI pointer, two links securing it in a double linked list of all allocated blocks, an element count (even if it isn't an array), and some flags. However why not change the scheme for small objects somehow? The idea is basically that all blocks are aligned on 2^N word boundary, where N=1 on almost all machines, and malloc typically delivers N=4 (16 byte boundary). So we could have 16 schemas encoded in pointers at the cost of masking these bits to zero before dereferencing them. So we might consider for example a dedicated 'small object' system where all the objects of a given type were stored in an array with a bitmap usage technique, giving only a one bit per object overhead. Note that those N bits are extremely valuable: they can also be used to encode variants with small number of cases -- which is most of them -- halving storage overhead for arguments. Note also: it is tricky to do all of this because the code generated is NOT allowed to depend on such low level details: the generated C++ has to be portable. Therefore those details have to represented by C macros, inline functions, and other C++ level techniques -- the system library doesn't have to be machine independent (indeed it exists to allow the generated C++ and more generally Felix code to be so). Another idea is to ALLOW gc during functional code by using a conservative technique to analyse the stack. At least in a single threaded application, each stack word could be checked to see if it pointed at a heap block: the set of such blocks is definite and known. This requires stack hackery finding the stack, which is quite problematic, particularly in the presence of threads .. but it can probably be done on most processors. Unlike the Boehm collector, we only need to examine the stack from the driver entry to Felix code up to the stack top, we don't actually need the 'real' base of the stack. There are of course some caveats .. eg AMD64 on Linux has a 'red zone' which can be used despite being past the stack pointer. Clearly the doubly linked list thing would be not good here, we'd need some kind of digital encoding (trie) such as a Judy array to be able to determine if a machine word could be a heap pointer fast enough to make this approach practical. -- John Skaller <skaller at users dot sf dot net> Felix, successor to C++: http://felix.sf.net ------------------------------------------------------------------------- Using Tomcat but need to do more? Need to support web services, security? Get stuff done quickly with pre-integrated technology to make your job easier. Download IBM WebSphere Application Server v.1.0.1 based on Apache Geronimo http://sel.as-us.falkag.net/sel?cmd=lnk&kid=120709&bid=263057&dat=121642 _______________________________________________ Felix-language mailing list Felix-language@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/felix-language