Hi,

> Thank you very much for the review and the patch!  The patch "works"
> in the sense that I can now build.
Ok, good.

> I'll run with it for a while, put it in the next release candidate,
> and see how it affects XEmacs's footprint.  Probably it will go in the
> next release of 21.4, safety over efficiency.  But I would appreciate
> it if you could help us recover the mmap capabilities.

Yes, I'll give it a try when I get some time.

> Wolfram> N.B. the exact same problem should/could show up with
> Wolfram> earlier glibc releases, but maybe the allocation pattern
> Wolfram> was slightly different.
>
> I don't understand the problem, so I'm not sure what you're saying.
>
> First, your patch also affects "portable dumper" builds, which build
> and (mostly) run fine.  Is this intentional?  I.e., is this a generic
> problem with our allocator implementation, which "just happened" to
> manifest dramatically only in unexec builds on very recent glibcs?

That's quite probable.  You see, the problem is Lisp's tagged pointers.
Pointers to Lisp objects are "coloured" in their most significant bits
(I think 3 bits) with type information.  Therefore, when the malloc
implementation hands out chunks with one of those high bits already
set, you get a clash.

This can indeed happen with glibc's malloc on Linux, because it hands
out mmapped chunks (they start near 0x40000000 on ix86-linux), but
generally only for "large" allocations.  (You had M_MMAP_THRESHOLD set
to 64k, a reasonable choice IMHO; one Lisp vector allocation from the
"temacs -dump" run just exceeded that threshold.)

By setting M_MMAP_MAX to 0 I've disabled all use of mmap; glibc's
malloc then behaves more like a classic malloc.  For GNU Emacs, I've
added temporary switches of M_MMAP_MAX to 0 and back _only_ in the
Lisp object allocation paths (there were about half a dozen places);
I'll try to do the same for XEmacs when I find the time.
> Second, if it's a generic problem, is it possible that it would
> generate GCPRO-bug-like symptoms (i.e., weird crashes in "obviously
> correct" code because data that we know is correctly initialized
> mysteriously changes)?  For example, we fixed a couple of GCPRO bugs
> recently, but we're still seeing mysterious "illegal bytecode" crashes
> (especially in Gnus), although fewer of them :-).  We're pretty sure
> the bytecompiler isn't responsible for this, because we've checked the
> code in memory.

I'm not sure about this, but I suspect that "mysteriously changing"
memory cannot occur due to this.  The chance that a coloured pointer is
masked (by stripping off the top bits) into a valid memory region seems
quite small to me.  But it's certainly not impossible.

> Third, is there still a possible problem if we use
> --with-system-malloc?  I.e., we use the Doug Lea malloc from glibc,
> but do no mallopt tweaking.

Yes, the same problem exists there, AFAICS.  IMHO the autoconf test
should check whether mallopt(M_MMAP_MAX) is available, and deduce from
that that it's a variant of Doug Lea's malloc.

Regards, Wolfram.

