Dan Schult <dsch...@colgate.edu> added the comment:

Benchmarks: With cooked-up examples I do not see any speedup beyond 5-10%. Function-call time seems to swamp everything for small examples with fast hashes. I don't have a handy example with long hash times or long lookup times, which is what it would take to show a large performance boost with this patch.
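One way to make the extra hash visible without a timing benchmark is to count calls to `__hash__` directly. The sketch below is illustrative only: `CountingKey` and the two helper functions are hypothetical names, not part of the patch. The check-then-set idiom hashes a missing key twice; how many times `setdefault` hashes it depends on the interpreter build, so that count is only printed, not asserted.

```python
class CountingKey:
    """Hypothetical key type that records how often it is hashed."""

    def __init__(self, name):
        self.name = name
        self.hash_calls = 0

    def __hash__(self):
        self.hash_calls += 1
        return hash(self.name)

    def __eq__(self, other):
        return isinstance(other, CountingKey) and self.name == other.name


def hashes_for_check_then_set(key):
    """Insert a missing key with the 'in' test followed by assignment."""
    d = {}
    if key not in d:   # hashes the key once for the membership test
        d[key] = []    # hashes it again for the store
    return key.hash_calls


def hashes_for_setdefault(key):
    """Insert a missing key with a single setdefault call."""
    d = {}
    d.setdefault(key, [])
    return key.hash_calls


check_then_set = hashes_for_check_then_set(CountingKey("a"))
via_setdefault = hashes_for_setdefault(CountingKey("a"))
print("check-then-set hashes:", check_then_set)
print("setdefault hashes:", via_setdefault)
```

With an expensive `__hash__` (a long tuple or a user-defined class, say), each avoided call is where a larger speedup would have to come from.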
I also agree with Martin that there are many reasons not to use setdefault... But it is part of the API. We might as well make it worth using. (Which probably means changing the default value to a factory function that gets called when the key is not found. But that's a much bigger change...) I'm just suggesting that the code should not do extra work.

By the way, defaultdict is NOT like setdefault--it is like get(). Missing entries do not get set. Perhaps there should be a collections entry that does setdefault instead of get...

Next, the second hash gets executed "only the first time" for EACH key. So if, for example, you have a lot of entries that each get looked up only 2 or 3 times, the second hash makes a difference 1/2 to 1/3 of the time. And we don't need a second hash or lookup, so why do it?

I understand Raymond's concern about more code using the data structure directly. There are three basic routines that deal with the data structure: ma_lookup/lookdict, insertdict and resizedict. The comments for lookdict encourage you to use the "ep" entry to check whether it is empty and add the key/value pair if desired. But as currently implemented, insertdict calls lookdict, so they aren't atomistic in that sense. If atomism is a design goal (even if it isn't a word :) then insertdict would take "ep" as an input instead of doing a lookup. I'm not sure if atomism is part of the design in Python though...

From my perspective, creating an internal SetItem adds another function handling the data structure just as setdefault would--that's why I inlined it in setdefault. But it does keep the name similar, and thus it's easier to identify it as writing to the data structure. If this style is promoted here, then I think there ought to be an internal insertdict that doesn't do the lookup. Also, shouldn't the parent functions call these internal versions instead of duplicating code? Ack, that means more changes. Not difficult ones though...

What do you think?
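The "atomistic" split described above can be sketched with a toy table in Python. This is not CPython's dictobject.c: `ToyDict`, `_lookup` and `_insert_at` are hypothetical stand-ins for the roles of lookdict and a lookup-free insertdict, and linear probing is used only to keep the sketch short. The point is that `setdefault` hashes once, probes once, and hands the slot it found to the insert routine.

```python
class ToyDict:
    """Toy open-addressing table; an illustration, not CPython's dict."""

    _EMPTY = object()

    def __init__(self, size=8):
        # size must be a power of two; a slot is _EMPTY or [hash, key, value]
        self._slots = [self._EMPTY] * size
        self._used = 0

    def _lookup(self, key, h):
        """Return the index holding key, or the empty slot where it belongs."""
        mask = len(self._slots) - 1
        i = h & mask
        while True:
            entry = self._slots[i]
            if entry is self._EMPTY or (entry[0] == h and entry[1] == key):
                return i
            i = (i + 1) & mask  # linear probing

    def _insert_at(self, i, key, h, value):
        """Fill or overwrite slot i without doing a lookup of its own."""
        if self._slots[i] is self._EMPTY:
            self._used += 1
        self._slots[i] = [h, key, value]
        if self._used * 3 >= len(self._slots) * 2:
            self._resize()

    def _resize(self):
        """Double the table and reinsert entries using their stored hashes."""
        old = [e for e in self._slots if e is not self._EMPTY]
        self._slots = [self._EMPTY] * (len(self._slots) * 2)
        self._used = 0
        for h, key, value in old:
            self._insert_at(self._lookup(key, h), key, h, value)

    def setdefault(self, key, default=None):
        h = hash(key)             # the key is hashed exactly once
        i = self._lookup(key, h)  # and the table is probed exactly once
        entry = self._slots[i]
        if entry is self._EMPTY:
            self._insert_at(i, key, h, default)
            return default
        return entry[2]


table = ToyDict()
first = table.setdefault("spam", [])
first.append(1)
second = table.setdefault("spam", [])
print(second is first)
```

In this layout any public method that writes to the table could reuse `_lookup` followed by `_insert_at` rather than duplicating the probing code.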
Dan

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue5730>
_______________________________________