> -----Original Message----- > From: Andrew Durdin [mailto:[EMAIL PROTECTED] > Sent: Friday, September 16, 2005 8:25 AM > To: [EMAIL PROTECTED] > Cc: python-dev@python.org > Subject: Re: [Python-Dev] C coding experiment > > On 9/1/05, Raymond Hettinger <[EMAIL PROTECTED]> wrote: > > The goal is to determine whether the setobject.c implementation would be > > improved by recoding the set_lookkey() function to optimize key > > insertion order using Brent's variation of Algorithm D (See Knuth vol. > > III, section 6.4, page 525). > > Brent's variation depends on the next probe position for a key being > derivable just from the key and its current position. The use of > perturbation in set_lookkey() prevents that, as we cannot say, given a > key at a certain position, where the next probe location for that key > would have been, without doing a complete lookup of that key > (obviously too expensive).
Right. For Brent's variation to work, I've had to replace perturbation with a secondary hash function with linear spacing. > It would be interesting to see whether Brent's variation or > perturbation give better expected overall performance -- the latter > may well prove better in most cases, as it should reduce the number of > probes needed for both successful and unsuccessful lookups, while > Brent's variation only speeds up successful lookups. Well, this sort > of question is what the experiment is meant to resolve, no? Right again. I'm also experimenting with a simpler approach -- whenever there are more than three probes, always swap the new key into the first position and then unconditionally re-insert the swapped-out key. Most of the time that gives an improvement and it doesn't require changing the perturbation logic -- it is equivalent to changing the key insertion order. The only implementation wrinkle is that the approach needs separate lookup and insert functions (only the latter does the swap). The simpler approach is cheap to implement as it doesn't slow-down the first three probes. OTOH, te benefits are smaller than with Brent's variation -- it only relieves the worst collisions. I'm still interested in it because it helps the two most important use cases for sets (uniquification and membership testing). Raymond _______________________________________________ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com