I put the stack comments here
> In the end, though, I think it's also important to remember that the use
> case for automatically growable stacks doesn't arise that frequently.

I'm firmly in the "one big stack and tune it down" camp (it's simple,
known, and the status quo), with growable stacks added as a feature
(preferably by copying). The large number of threads is driven by the
actor model (and a few others), but I think that's a bit misguided. There
are other ways, like multiplexing actor/thread-pumped services onto fewer
system threads, instead of dealing with hundreds of threads.

> There are a whole bunch of ways to do it, and several places it can go.
> Which one is right can only be determined by measurement.
>
> The real issue is that in an optimizing implementation, you may end up
> with multiple stack maps for a single frame, because locations on the stack
> may be used for integers at one moment and object references the next. You
> already need some sort of lookup of the form (base PC, bound PC, stack map
> addr), but you'll get a lot more entries if the shape of the frame is
> changing.

Wouldn't it be better to burn the stack memory and reserve slots for each
potential use? Variable-size stacks don't mix well with Immix blocks and
method metadata. Also, the metadata map would need to be built after
inlining (or the inliner would need to be smart enough to update the
metadata), which is quite low level.

On Wed, Nov 20, 2013 at 1:36 AM, Jonathan S. Shapiro <[email protected]> wrote:

> Oh. I forgot a whole topic in my last: stack copying.
>
> The performance killer in a segmented stack isn't the check. It's the fact
> that many call sites (or equivalently, entry sites) now need to be prepared
> to shuffle their own frame onto a different call stack. That's both a lot
> of code and a lot of overhead. The only real way to avoid this is to
> relocate the stack.
>

I know these costs, and the cache costs of using a new memory region, but
what I don't get (and what may be misleading me) is how, in the paper
mentioned before, size analysis of the worst-case stack path significantly
reduced the cost. Maybe they could specify a stack segment that covered
the worst-case path and hence reduce these costs (e.g. 7K for some worker
threads may be statically determined). Their use case of hundreds of
threads is unusual, so in effect the analysis was used to reduce segments
rather than to reduce the cost of the check.

> One of the nice things about a precise stack map combined with a precise
> register map is that the stack can be copied. So the goal here shouldn't be
> stitching stack segments together. It should be *growing* the stack up to
> some thread-defined guard size (above which a stack overflow is really an
> error).
>
> Unfortunately, there may be C code on the stack. That can't be moved,
> because we don't know what it does. When a stack relocation occurs while C
> stack frames are present, what it needs to do is leave a return trampoline
> at the top of the new stack that returns to the C frame on the old stack.
> We then leave a marker at the *top* of the C frames to indicate that we
> should migrate the stuff *above* the C frames onto the larger stack frame
> when the C code has returned.
>
> Note that the call from C to managed code can arrange to always perform
> the code on the correct (newer and larger) stack.
>
> Also note that during the "copy on return from C" code, the new stack
> region is known to be completely empty, just as it was when we did the
> earlier stack relocation. Offhand, I provisionally believe that whenever a
> stack relocation occurs, it is always the case that frames are being
> migrated into an empty chunk.

Badly behaved C is a pain, but you are right: you can leave the old stack
there and handle it, at the cost of wasting memory.

Ben
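
To make the stack-copying idea quoted above a bit more concrete, here is a
minimal sketch in C. It is a toy model under stated assumptions, not how
BitC or any real runtime does it: the stack is an array of word-sized slots
that grows toward higher indices, each frame records its current PC, and a
table of (base PC, bound PC, mask) entries says which slots of a frame hold
addresses into the stack itself. All names here (StackMapEntry, Frame,
find_map, relocate_stack) are made up for illustration, and C frames and
return trampolines are left out entirely.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stack map entry: for PCs in [base_pc, bound_pc), bit i of
 * ptr_mask is set when slot i of the frame holds an address into the stack.
 * One function may need several entries if its frame shape changes. */
typedef struct {
    uintptr_t base_pc;
    uintptr_t bound_pc;
    uint32_t  ptr_mask;
} StackMapEntry;

/* Hypothetical frame descriptor for the toy stack. */
typedef struct {
    uintptr_t pc;      /* where this frame is currently executing */
    size_t    first;   /* index of the frame's first slot */
    size_t    nslots;  /* number of slots in the frame (at most 32 here) */
} Frame;

/* Find the map covering pc. Linear scan for brevity; a real table would
 * more likely be sorted by base_pc and binary searched. */
const StackMapEntry *find_map(const StackMapEntry *maps, size_t nmaps,
                              uintptr_t pc) {
    for (size_t i = 0; i < nmaps; i++)
        if (pc >= maps[i].base_pc && pc < maps[i].bound_pc)
            return &maps[i];
    return NULL;
}

/* Copy the used part of the old stack into a larger, empty block and
 * rebase every slot that the stack map marks as a pointer into the stack.
 * Returns the new block (caller frees it), or NULL on failure. */
uintptr_t *relocate_stack(const uintptr_t *old_stack, size_t used,
                          size_t new_cap,
                          const Frame *frames, size_t nframes,
                          const StackMapEntry *maps, size_t nmaps) {
    assert(old_stack != NULL);
    if (new_cap < used)
        return NULL;
    uintptr_t *new_stack = calloc(new_cap, sizeof *new_stack);
    if (new_stack == NULL)
        return NULL;
    memcpy(new_stack, old_stack, used * sizeof *new_stack);

    for (size_t f = 0; f < nframes; f++) {
        const StackMapEntry *m = find_map(maps, nmaps, frames[f].pc);
        if (m == NULL || frames[f].nslots > 32) {
            free(new_stack);   /* no precise map: the frame cannot be moved */
            return NULL;
        }
        for (size_t s = 0; s < frames[f].nslots; s++) {
            if (m->ptr_mask & (UINT32_C(1) << s)) {
                size_t idx = frames[f].first + s;
                /* Keep the same byte offset, relative to the new base. */
                new_stack[idx] = old_stack[idx] - (uintptr_t)old_stack
                               + (uintptr_t)new_stack;
            }
        }
    }
    return new_stack;
}
```

The point of the sketch is only that relocation needs to know, precisely
and per PC range, which slots to fix up. Slots the map does not mark are
copied bit for bit, which is why a frame without a precise map (such as a C
frame) has to stay where it is.
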
_______________________________________________
bitc-dev mailing list
[email protected]
http://www.coyotos.org/mailman/listinfo/bitc-dev
