I have started practicing writing C extensions for CPython, and here are two ideas I have had, possibly useless.

If you keep adding elements to a CPython dict/set, it periodically rebuilds itself. So maybe dict.reserve(n) and set.reserve(n) methods could help, reserving enough (empty) memory for about n *distinct* keys that the programmer wants to add to the dict/set in the near future. I have seen that the C API of the dicts doesn't allow this, and I don't know if it can be implemented by modifying the dicts a bit.
Do you think this may be useful?

-------------------------------

Sometimes I need to remove some elements from dicts/sets. If I know a rule that tells which elements have to be removed, I can use code like this:

    import itertools

    def filterset(predicate, inset):
        return set(itertools.ifilter(predicate, inset))

    print filterset(lambda x: x & 1, set(range(10)))

    Output:
    set([1, 3, 9, 5, 7])

And a similar one to filter dicts, which works with a predicate(key, val).
Most of the time such functions are good enough, but sometimes the dicts are big, so to reduce the memory used I remove keys in place:

    def filterdict(pred, indict):
        todel = [k for k,v in indict.iteritems() if not pred(k,v)]
        for key in todel:
            del indict[key]

    d = dict.fromkeys(xrange(8), 0)
    print d
    filterdict(lambda k,v: k & 1, d)
    print d

    # Output:
    # {0: 0, 1: 0, 2: 0, 3: 0, 4: 0, 5: 0, 6: 0, 7: 0}
    # {1: 0, 3: 0, 5: 0, 7: 0}

But doing the same thing while iterating on the dict may be faster and use even less memory, because the todel list is not needed.
This iteration-and-deletion capability is probably not safe enough to be used inside Python programs, but a compiled C module with a function that works like that filterdict (and deletes while iterating) could be created, and using it from Python programs would be safe.

The C++ STL API of maps allows deleting items while iterating on the map (but probably it should be done only when necessary, because it may be a source of bugs):

    typedef map<int, string> isMap;
    typedef isMap::value_type isValType;
    typedef isMap::iterator isMapItor;

    int main() {
        isMap m;
        ...  // items added to m
        for (isMapItor itr = m.begin(); itr != m.end();) {
            if (itr->second == "somestring")
                m.erase(itr++);  // advance first, then erase the old position
            else
                ++itr;
        }
    }

The Python/C API, 7.4.1 Dictionary Objects, says that's not possible with CPython dicts:

> The dictionary p should not be mutated during iteration. It is safe (since
> Python 2.1) to modify the values of the keys as you iterate over the
> dictionary, but only so long as the set of keys does not change.

Do you think it may be useful to create such a C filterdict function that deletes while iterating? (To create such a function, it probably has to bypass the CPython dict API.)

Bye,
bearophile
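
P.S. A minimal sketch of why the delete-while-iterating version cannot simply be written in pure Python today. The names filterdict_naive and filterset_inplace are made up for this illustration, and the code is written so that it should also run on Python 3. CPython notices that the dict changed size and raises RuntimeError on the next iteration step, so at the Python level only a two-pass approach is available (shown here for sets, using set.difference_update):

```python
def filterdict_naive(pred, indict):
    # Hypothetical helper: tries to delete keys while iterating over the dict.
    for key in indict:
        if not pred(key, indict[key]):
            del indict[key]  # changes the dict size during iteration


def filterset_inplace(pred, inset):
    # Hypothetical helper: two-pass in-place filter for sets.
    # Pass 1 collects the rejected elements, pass 2 removes them.
    rejected = [x for x in inset if not pred(x)]
    inset.difference_update(rejected)


d = dict.fromkeys(range(8), 0)
try:
    filterdict_naive(lambda k, v: k & 1, d)
    naive_failed = False
except RuntimeError:
    # CPython detects the size change when the iterator is advanced again.
    naive_failed = True

s = set(range(10))
filterset_inplace(lambda x: x & 1, s)
```

Here the naive version stops with RuntimeError right after the first deletion, leaving the dict only partly filtered, while the two-pass set version leaves just the odd numbers in s. The temporary list of rejected elements is exactly the extra memory that a C-level function deleting during iteration would avoid.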