Evan Jones writes:
> My knowledge about garbage collection is weak, but I have read a little
> bit of Hans Boehm's work on garbage collection. [...] The biggest
> disadvantage mentioned is that simple pointer assignments end up
> becoming "increment ref count" operations as well...
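That refcount traffic is observable from pure Python. A small CPython-specific sketch (the variable names are mine; note that sys.getrefcount's reported counts include the temporary reference held by the call itself):

```python
import sys

x = object()
base = sys.getrefcount(x)   # x's binding + getrefcount's temporary reference

y = x                       # a plain assignment: CPython increments x's refcount
assert sys.getrefcount(x) == base + 1

del y                       # ...and the matching decrement when the name goes away
assert sys.getrefcount(x) == base
```

Every such assignment and deletion touches the object's header, which is exactly the overhead Boehm's tracing collectors avoid.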
Hans Boehm certainly has some excellent points. I believe a little searching through the Python-dev archives will reveal that attempts have been made in the past to use his GC tools with CPython, and that the results have been disappointing. That may be because other parts of CPython are optimized for reference counting, or it may just be because this stuff is so bloody difficult!

However, remember that moving away from reference counting is a change to the semantics of CPython. Right now, people can (and often do) assume that objects which don't participate in a reference cycle are collected as soon as they go out of scope. They write code that depends on this... idioms like:

>>> text_of_file = open(file_name, 'r').read()

Perhaps such idioms aren't good practice (they'd fail in Jython or in IronPython), but they ARE common. So we shouldn't stop using reference counting unless we can demonstrate that the alternative is clearly better. Of course, we'd also need to devise a way for extensions to cooperate (which is a problem Jython, at least, doesn't face). So it's NOT an obvious call, and so far numerous attempts to review other GC strategies have failed. I wouldn't be so quick to dismiss reference counting.

> My only argument for making Python capable of leveraging multiple
> processor environments is that multithreading seems to be where the big
> performance increases will be in the next few years. I am currently
> using Python for some relatively large simulations, so performance is
> important to me.

CPython CAN leverage such environments, and it IS used that way. However, this requires using multiple Python processes and inter-process communication of some sort (there are lots of choices; take your pick). The technique is more trouble for the programmer, but in my experience it is usually less likely to harbor subtle parallel-processing bugs.
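To make the multiple-processes technique concrete, here is a minimal sketch using the multiprocessing module (which postdates this post; it is one packaging of the processes-plus-IPC approach, and any IPC mechanism would serve):

```python
from multiprocessing import Pool

def square(n):
    # Runs in a separate worker process, each with its own interpreter
    # (and therefore its own GIL), so workers use separate CPUs.
    return n * n

if __name__ == '__main__':
    # Four worker processes; arguments and results travel over IPC (pipes).
    with Pool(processes=4) as pool:
        results = pool.map(square, range(10))
    print(results)   # [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
```

Because the workers share nothing, the usual thread-synchronization pitfalls simply don't arise; the cost is that all data crossing the boundary must be serialized.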
Sure, it'd be great if Python threads could make use of separate CPUs, but if the cost of that were that Python dictionaries performed as poorly as a Java Hashtable or a synchronized HashMap, then it wouldn't be worth it. There's a reason why Java moved away from Hashtable (the thread-safe data structure) to HashMap (not thread-safe).

Perhaps the REAL solution is just a really good IPC library that makes it easier to write programs that launch "threads" as separate processes and communicate with them. No change to the internals; just a new library to encourage people to use the technique that already works.

-- Michael Chermside

_______________________________________________
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com