[Bob Ippolito] > ... > Your expectation is not correct for Darwin's memory allocation scheme. > It seems that Darwin creates allocations of immutable size. The only > way ANY part of an allocation will ever be used by ANYTHING else is if > free() is called with that allocation.
Ya, I understood that. My conclusion was that Darwin's realloc() implementation isn't production-quality. So it goes. > free() can be called either explicitly, or implicitly by calling realloc() > with > a size larger than the size of the allocation. In that case, it will create > a new > allocation of at least the requested size, copy the contents of the > original allocation into the new allocation (probably with > copy-on-write pages if it's large enough, so it might be cheap), and > free() the allocation. Really? Another near-universal "quality of implementation" expectation is that a growing realloc() will strive to extend in-place. Like realloc(malloc(1000000), 1000001). For example, the theoretical guarantee that one-at-a-time list.append() has amortized linear time doesn't depend on that, but pragmatically it's greatly helped by a reasonable growing realloc() implementation. > In the case where realloc() specifies a size that is not greater than the > allocation's size, it will simply return the given allocation and cause no > side- > effects whatsoever. > > Was this a good decision? Probably not! Sounds more like a bug (or two) to me than "a decision", but I don't know. > However, it is our (in the "I know you use Windows but I am not the only > one that uses Mac OS X sense) problem so long as Darwin is a supported > platform, because it is highly unlikely that Apple will backport any "fix" to > the allocator unless we can prove it has some security implications in > software shipped with their OS. ... Is there any known case where Python performs poorly on this OS, for this reason, other than the "pass giant numbers to recv() and then shrink the string because we didn't get anywhere near that many bytes" case? Claiming rampant performance problems should require evidence too <wink>. ... > Presumably this can happen at other places (including third party > extensions), so a better place to do this might be _PyString_Resize(). > list_resize() is another reasonable place to put this. I'm sure there > are other places that use realloc() too, and the majority of them do > this through obmalloc. So maybe instead of trying to track down all > the places where this can manifest, we should just "gunk up" Python and > patch PyObject_Realloc()? There is no "choke point" for allocations in Python -- some places call the system realloc() directly. Maybe the latter matter on Darwin too, but maybe they don't. The scope of this hack spreads if they do. I have no idea how often realloc() is called directly by 3rd-party extension modules. It's called directly a lot in Zope's C code, but AFAICT only to grow vectors, never to shrink them. ' > Since we are both pretty confident that other allocators aren't like Darwin, > this "gunk" can be #ifdef'ed to the __APPLE__ case. #ifdef's are a last resort: they almost never go away, so they complicate the code forever after, and typically stick around for years even after the platform problems they intended to address have been fixed. For obvious reasons, they're also an endless source of platform-specific bugs. Note that pymalloc already does a memcpy+free when in PyObject_Realloc(p, n) p was obtained from the system malloc or realloc but n is small enough to meet the "small object" threshold (pymalloc "takes over" small blocks that result from a PyObject_Realloc()). That's a reasonable strategy *because* n is always small in such cases. If you're going to extend this strategy to n of arbitrary size, then you may also create new performance problems for some apps on Darwin (copying n bytes can get arbitrarily expensive). > ... > I'm sure I'll find something, but what's important to me is that Python > works well on Mac OS X, so something should happen. I agree the socket-abuse case should be fiddled, and for more reasons than just Darwin's realloc() quirks. I don't know that there are actual problems on Darwin broader than that case (and I'm not challenging you to contrive one, I'm asking whether realloc() quirks are suspected in any other case that's known). Part of what you demonstrated when you said that pystone didn't slow down when you fiddled stuff is that pystone also didn't speed up. I also don't know that the memcpy+free wormaround is actually going to help more than it hurts overall. Yes, in the socket-abuse case, where the program routinely malloc()s strings millions of bytes larger than the socket can deliver, it would obviously help. That's not typically program behavior (however typical it may be of that specific app). More typical is shrinking a long list one element at a time, in which case about half the list remaining would get memcpy'd from time to time where such copies never get made today. IOW, there's no straightforward pure win here. _______________________________________________ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com