Re: [Neko] call/cc

skaller Tue, 10 Jan 2006 15:23:35 -0800

On Tue, 2006-01-10 at 21:07 +0100, Nicolas Cannasse wrote:

> I found something interesting about this subject :
> 
> http://luajit.luaforge.net/coco.html
> 
> The only problem I see is that this will allocate a C stack per 
> coroutine, hence it works nicely for iterators but might be too costly 
> for microthreads.


AMD64 partly alleviates this but there are 4 distinct problems
with stack switching:

1. It isn't portable, and is almost certainly totally and utterly
unacceptable in C++, OCaml, or any language other than brain-dead
C code. It isn't clear it works even in C: see e.g. Xavier's 
older LinuxThreads code, which finds the current thread by checking 
the stack pointer, or Hans Boehm's GC (currently used by Neko), which 
also depends on knowing something about the stack.
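
To make the stack-pointer example concrete, here is a sketch of the
general trick, not LinuxThreads' actual code. It assumes, purely for
illustration, that every thread stack is 2 MiB in size and 2 MiB-aligned,
so masking any stack address yields the base of the owning thread's
stack. STACK_SIZE, stack_base_of and same_thread are made-up names.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption for this sketch: stacks are 2 MiB and 2 MiB-aligned. */
#define STACK_SIZE ((uintptr_t)2 << 20)

/* Base address of the aligned stack region that contains addr. */
uintptr_t stack_base_of(uintptr_t addr)
{
    return addr & ~(STACK_SIZE - 1);
}

/* Two stack addresses belong to the same "thread" under this scheme
   exactly when they fall inside the same aligned region. */
int same_thread(uintptr_t sp_a, uintptr_t sp_b)
{
    return stack_base_of(sp_a) == stack_base_of(sp_b);
}
```

Code running on a separately allocated coroutine stack has a stack
pointer outside the thread's aligned region, so a lookup of this kind
attributes it to some other thread, or to none.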

2. Unix has a stupid linear addressing model, and therefore
stacks have finite fixed-in-advance sizes. Note that VM means
this isn't memory you're using but address space .. provided
you're using mmap to allocate the stacks (or equivalent).

If you use malloc you're wasting (virtual) memory (malloc is not allowed
to use lazy memory allocation).

This problem totally kills all stack swapping solutions for 
user space threading on 32-bit machines. 64-bit machines have a
lot more address space.. at the moment anyhow :)
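
A minimal sketch of reserving a coroutine stack with mmap, assuming a
Unix-like system that provides MAP_ANONYMOUS (and MAP_NORESERVE where
available). The helper names stack_reserve, stack_release and max_stacks
are hypothetical. max_stacks only does the address-space arithmetic; the
3 GiB user address space used in the note below is an assumed, typical
figure for a 32-bit process, not a measured one.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/* Reserve size bytes of address space for one stack. Pages are backed
   by physical memory only once they are touched. NULL on failure. */
void *stack_reserve(size_t size)
{
    int flags = MAP_PRIVATE | MAP_ANONYMOUS;
#ifdef MAP_NORESERVE
    flags |= MAP_NORESERVE;   /* do not reserve swap for untouched pages */
#endif
    assert(size > 0);
    void *p = mmap(NULL, size, PROT_READ | PROT_WRITE, flags, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}

/* Give the address range back; returns 0 on success. */
int stack_release(void *base, size_t size)
{
    return munmap(base, size);
}

/* Upper bound on the number of fixed-size stacks that fit in a given
   amount of address space (ignores everything else mapped there). */
unsigned long long max_stacks(unsigned long long addr_space,
                              unsigned long long stack_size)
{
    return addr_space / stack_size;
}
```

With 1 MiB stacks and 3 GiB of usable address space, max_stacks gives
at most 3072 stacks, which is the 32-bit limit described above.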

3. Performance. This depends on implementation technique and processor.
setjmp/longjmp is fairly slow, since it has to swap all
user machine registers, however this isn't a problem for processors
with few registers, or register frames. Obviously a dedicated
assembler routine that just swaps stacks is best (but less
portable).
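
For reference, a sketch of what a stack switch looks like using the
ucontext calls (getcontext/makecontext/swapcontext) instead of
setjmp/longjmp or hand-written assembler. It assumes glibc on Linux;
these calls are obsolescent in POSIX and missing from some C libraries.
The names coro_start, coro_next, counter and CORO_STACK_SIZE are
hypothetical.

```c
#include <assert.h>
#include <ucontext.h>

#define CORO_STACK_SIZE (64 * 1024)

static ucontext_t main_ctx, coro_ctx;
static char coro_stack[CORO_STACK_SIZE];  /* the coroutine's own C stack */
static int coro_value;                    /* value passed back to the caller */

/* Coroutine body: yields 1, 2, 3, then reports -1 and finishes. */
static void counter(void)
{
    for (int i = 1; i <= 3; i++) {
        coro_value = i;
        swapcontext(&coro_ctx, &main_ctx);  /* yield to the caller */
    }
    coro_value = -1;
    /* Returning from here resumes uc_link, i.e. main_ctx. */
}

/* Set up the coroutine on its separate stack; 0 on success. */
int coro_start(void)
{
    if (getcontext(&coro_ctx) != 0)
        return -1;
    coro_ctx.uc_stack.ss_sp = coro_stack;
    coro_ctx.uc_stack.ss_size = sizeof coro_stack;
    coro_ctx.uc_link = &main_ctx;
    makecontext(&coro_ctx, counter, 0);
    return 0;
}

/* Switch to the coroutine until it yields or finishes. */
int coro_next(void)
{
    swapcontext(&main_ctx, &coro_ctx);
    return coro_value;
}
```

Each call to coro_next costs two full context switches. A dedicated
assembler routine can do less work per switch, which is the trade-off
described in point 3.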

4. It can't easily support thread chains/trees. The stacks
are all separate, so you can't link them together easily.
Mmap probably won't work, since the addresses on the stacks
are absolute.

In theory a basic concurrency primitive is:

parallel { .. } { .. }

which just executes code in parallel .. each block has
its own stack frame, but they share the parent frame
and its parents. The threads join afterwards. So you have
a tree structured stack. Obviously, this CANNOT be done
by stack swapping. Instead you have to swap Virtual 
Memory pages .. which means at least rounding the frames
up to page boundaries. (This is what I mean by
tree structured stack).

A stack is basically a list of frames .. it needs to
be implemented as such -- with heap allocated blocks
linked by pointers. Using the VM to do it is a cheat
solution for a fundamentally broken language .. and
swapping the maps requires a kernel call .. which isn't
a recommended way to obtain high performance :)
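
A minimal sketch of the "list of frames" idea: heap-allocated frames
linked by parent pointers, so that two child frames (the two blocks of
a parallel construct) can share one parent. This is an illustration
only. Each frame holds a single named int to keep it short, and
struct frame, frame_push, frame_lookup and frame_pop are made-up names.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct frame {
    struct frame *parent;  /* enclosing frame, NULL at the root */
    const char   *name;    /* the one variable this frame holds */
    int           value;
};

/* Allocate a new frame on the heap, linked to its parent. */
struct frame *frame_push(struct frame *parent, const char *name, int value)
{
    struct frame *f = malloc(sizeof *f);
    if (f == NULL)
        return NULL;
    f->parent = parent;
    f->name = name;
    f->value = value;
    return f;
}

/* Find the nearest binding of name, walking towards the root. */
int *frame_lookup(struct frame *f, const char *name)
{
    for (; f != NULL; f = f->parent)
        if (strcmp(f->name, name) == 0)
            return &f->value;
    return NULL;
}

/* Free a frame and return its parent. */
struct frame *frame_pop(struct frame *f)
{
    struct frame *parent = f->parent;
    free(f);
    return parent;
}
```

Pushing two frames with the same parent gives the tree shape: each
child sees its own variable, and both resolve the parent's variable to
the same storage, with no stack swapping or page remapping involved.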

-- 
John Skaller <skaller at users dot sf dot net>
Felix, successor to C++: http://felix.sf.net


--
Neko : One VM to run them all (http://nekovm.org)
