I'll do the easy but ugly top posting thing....

I understand your model, and as usual you have explained your model very
clearly.  But I think there are some assumptions in your model that need
to be addressed.

Your argument is basically that it's hard to allocate multiple stacks in
a single address space environment.  I agree with that.  But that's not
the problem.  The problem I think is detecting stack overflows in the
first place.

Your model appears to be based on a MC68000 architecture, with no memory
management unit.  One flat address space for everyone to play in.  And
perhaps as an implementation goal of Tcl, this is the model they choose
to program to.  I can certainly understand that.  It's simple, easy to
debug -- compelling.  And yes, much of the world of C/Unix was developed
for such a flat space processor.

But there is no real need for multi-threaded programs to only have a
single address space.  I think we all want the multiple threads to be
able to access any of the structures or code with another thread, but
that still doesn't require a single address space.

But again, the problem is not allocation of stacks, but detection of
stack overflow.

Yeah, Multics and the 80x86 had segments.  Multics was great, it had
paged segments.  So I would bet if one could write to a segmented
architecture, it would be natural to place each stack in its own
segment, and let the mmu detect stack overflows.  And yeah, in the early
smaller intels, we all got annoyed with having to load up various
registers with appropriate segment pointers to differentiate code and data.

Even in a flat architecture, in a machine with an mmu, all of this ill
memory access should be detectable.  As I mentioned on the DEC-10, and I
am pretty sure on VAX/VMS you could arbitrarily protect specific pages
against writes AND reads.  Any thread trying to read or write into a
page generates a fault.  I would think it would be a good idea to
separate each of those stack frames with a one or more protected pages.
   It might not solve the problem of one thread still corrupting
another thread's memory or stack, but assuming a page was larger than
several stack frames, it would certainly catch most typical stack
overflows.  Wouldn't it?  (Do mmus only care about interprocess access
and cannot catch intraprocess accesses?)

But again, maybe I'm missing something here.  When I was writing
compilers for some small functional languages, I knew, or rather the
compiler knew at every subroutine push, just how large the new stack
frame had to be.  It would seem to be pretty easy math, and not too
harsh of an efficiency hit to have the compiler/interpreter check
available stack size at each function entry.  And at runtime if knew
variables are consed onto the stack, again it should be simple during
initial allocation of the variable by an interpreter to check the stack
size.

So what am I missing?  I still don't understand why it isn't relatively
easy to know when the stack is exhausted.

Forget stack extension.  I will be happy with a known error returning to
the programmer: stack overflow.  We could catch that one.  We don't even
get that.  We get: <cracked windshield, brown screen, > game over dude,
segv.  That's unsafe!

I would bet that if one can detect stack overflows reliably, then one
can also implement an algorithm to safely extend the stack.  It might
have other performance impacts though.

To sum up, I still do not see any technical reason for not being able to
detect stack overflow other than management goals in the implementation
of Tcl and the one possibly technical goal of trading off a level of
safety vs. a level  of performance.

But hell, I know little about this, so please do clue me in.


Jerry

Rob Mayoff wrote:

+---------- On Feb 11, Jerry Asher said:

>Are you really saying you cannot imagine how in a multithreaded
>environment one can automatically and efficiently extend the stack?


Depends on what you mean by "extend the stack".

A single-threaded program has a heap and a stack in a single address
space.  We put the heap at the bottom and the stack at the top and let
them grow towards each other.  Obviously this allows each the maximum
room for growth; we don't have to put a fixed division point into the
address space.

If the stack grows down into a page that the heap wasn't already using,
the OS can detect that and simply map physical memory to that part of
the program's address space. This form of "extending the stack" is easy.

Now, in a program with N threads, we have a heap and N stacks in a
single address space. We must create N-1 divisions in the address space.
(We could get away with floor((N-1)/2) divisions if we let half the
stacks grow towards lower addresses and the other half grow towards
higher addresses, but we'd still need O(N) divisions and the code would
be messy.) The heap and one stack can share one part of the address
space, but each of the other stacks needs its own part of the address
space:

Heap-->    <--StackN ... <--Stack3  <--Stack2  <--StackN

When thread 2 is created, only the address space at the top of stack
2 needs to be mapped to physical memory. As stack 2 grows, the memory
allocation can be extended as in the single-threaded program. But once
stack 2 exhausts its allocated address space, there's no where else to
go. A C stack cannot be moved, so we can't slide stack 3 down to make
more room.

So the question is, how much address space should we allocate for each
stack? I claim that any automatic method of choosing the allocation is
going to fail for some class of programs. The programmer must ultimately
take responsibility for choosing how much address space to allocate to
each thread. In the case of AOLserver, the programmer may be the guy
writing AOLserver, or it may be the guy writing the .tcl page that uses
a lot of memory.

One might argue that AOLserver's default allocation is wrong. But's a
much different claim.

OK, one more possibility occurs to me: you could allocate stack frames
on the heap. This has an entirely different set of ramifications. If
this is what you had in mind as a way to "automatically and efficiently
extend the stack", great, but I'll be surprised if it's what you were
using on the DEC-10 in 1979.

Reply via email to