I'll do the easy but ugly top posting thing.... I understand your model, and as usual you have explained your model very clearly. But I think there are some assumptions in your model that need to be addressed.
Your argument is basically that it's hard to allocate multiple stacks in a single address space environment. I agree with that. But that's not the problem. The problem I think is detecting stack overflows in the first place. Your model appears to be based on a MC68000 architecture, with no memory management unit. One flat address space for everyone to play in. And perhaps as an implementation goal of Tcl, this is the model they choose to program to. I can certainly understand that. It's simple, easy to debug -- compelling. And yes, much of the world of C/Unix was developed for such a flat space processor. But there is no real need for multi-threaded programs to only have a single address space. I think we all want the multiple threads to be able to access any of the structures or code with another thread, but that still doesn't require a single address space. But again, the problem is not allocation of stacks, but detection of stack overflow. Yeah, Multics and the 80x86 had segments. Multics was great, it had paged segments. So I would bet if one could write to a segmented architecture, it would be natural to place each stack in its own segment, and let the mmu detect stack overflows. And yeah, in the early smaller intels, we all got annoyed with having to load up various registers with appropriate segment pointers to differentiate code and data. Even in a flat architecture, in a machine with an mmu, all of this ill memory access should be detectable. As I mentioned on the DEC-10, and I am pretty sure on VAX/VMS you could arbitrarily protect specific pages against writes AND reads. Any thread trying to read or write into a page generates a fault. I would think it would be a good idea to separate each of those stack frames with a one or more protected pages. It might not solve the problem of one thread still corrupting another thread's memory or stack, but assuming a page was larger than several stack frames, it would certainly catch most typical stack overflows. Wouldn't it? (Do mmus only care about interprocess access and cannot catch intraprocess accesses?) But again, maybe I'm missing something here. When I was writing compilers for some small functional languages, I knew, or rather the compiler knew at every subroutine push, just how large the new stack frame had to be. It would seem to be pretty easy math, and not too harsh of an efficiency hit to have the compiler/interpreter check available stack size at each function entry. And at runtime if knew variables are consed onto the stack, again it should be simple during initial allocation of the variable by an interpreter to check the stack size. So what am I missing? I still don't understand why it isn't relatively easy to know when the stack is exhausted. Forget stack extension. I will be happy with a known error returning to the programmer: stack overflow. We could catch that one. We don't even get that. We get: <cracked windshield, brown screen, > game over dude, segv. That's unsafe! I would bet that if one can detect stack overflows reliably, then one can also implement an algorithm to safely extend the stack. It might have other performance impacts though. To sum up, I still do not see any technical reason for not being able to detect stack overflow other than management goals in the implementation of Tcl and the one possibly technical goal of trading off a level of safety vs. a level of performance. But hell, I know little about this, so please do clue me in. Jerry Rob Mayoff wrote:
+---------- On Feb 11, Jerry Asher said: >Are you really saying you cannot imagine how in a multithreaded >environment one can automatically and efficiently extend the stack? Depends on what you mean by "extend the stack". A single-threaded program has a heap and a stack in a single address space. We put the heap at the bottom and the stack at the top and let them grow towards each other. Obviously this allows each the maximum room for growth; we don't have to put a fixed division point into the address space. If the stack grows down into a page that the heap wasn't already using, the OS can detect that and simply map physical memory to that part of the program's address space. This form of "extending the stack" is easy. Now, in a program with N threads, we have a heap and N stacks in a single address space. We must create N-1 divisions in the address space. (We could get away with floor((N-1)/2) divisions if we let half the stacks grow towards lower addresses and the other half grow towards higher addresses, but we'd still need O(N) divisions and the code would be messy.) The heap and one stack can share one part of the address space, but each of the other stacks needs its own part of the address space: Heap--> <--StackN ... <--Stack3 <--Stack2 <--StackN When thread 2 is created, only the address space at the top of stack 2 needs to be mapped to physical memory. As stack 2 grows, the memory allocation can be extended as in the single-threaded program. But once stack 2 exhausts its allocated address space, there's no where else to go. A C stack cannot be moved, so we can't slide stack 3 down to make more room. So the question is, how much address space should we allocate for each stack? I claim that any automatic method of choosing the allocation is going to fail for some class of programs. The programmer must ultimately take responsibility for choosing how much address space to allocate to each thread. In the case of AOLserver, the programmer may be the guy writing AOLserver, or it may be the guy writing the .tcl page that uses a lot of memory. One might argue that AOLserver's default allocation is wrong. But's a much different claim. OK, one more possibility occurs to me: you could allocate stack frames on the heap. This has an entirely different set of ramifications. If this is what you had in mind as a way to "automatically and efficiently extend the stack", great, but I'll be surprised if it's what you were using on the DEC-10 in 1979.
