I have started doing practice creating C extensions for CPython, so
here are two ideas I have had, possibly useless.

If you keep adding elements to a CPython dict/set, it periodically
rebuilds itself. So maybe dict.reserve(n) and a set.reserve(n) methods
may help, reserving enough (empty) memory for about n *distinct* keys
the programmer wants to add to the dict/set in a short future. I have
seen that the the C API of the dicts doesn't allow this, and I don't
know if this can be implemented modifying the dicts a bit. Do you think
this may be useful?

-------------------------------

Sometime I need to remove some elements from dicts/sets. If I know a
rule that tells what elements have to be removed I can use code like
this:

import itertools

def filterset(predicate, inset):
  return set(itertools.ifilter(predicate, inset))

print filterset(lambda x: x & 1, set(range(10)))

Output:
set([1, 3, 9, 5, 7])

And a similar one to filter dicts that works with a predicate(key,
val).
Most of the times such functions are good enough, but sometimes the
dicts are big, so to reduce memory used I remove keys in place:

def filterdict(pred, indict):
  todel = [k for k,v in indict.iteritems() if not pred(k,v)]
  for key in todel:
    del indict[key]

d = dict.fromkeys(xrange(8), 0)
print d
filterdict(lambda k,v: k & 1, d)
print d

# Output:
# {0: 0, 1: 0, 2: 0, 3: 0, 4: 0, 5: 0, 6: 0, 7: 0}
# {1: 0, 3: 0, 5: 0, 7: 0}


But doing the same thing while iterating on the dict may be faster and
use even less memory.
This iteration&deletion capability is probably not safe enough to be
used inside Python programs, but a compiled C module with a function
that works like that filterdict (and deletes while iterating) may be
created, and its use is safe from Python programs.

The C++ STL API of maps allows to delete items while iterating on the
map (but probably it has to be done only when necessary, because it may
be a source for bugs):

typedef map<int, string> isMap;
typedef isMap::value_type isValType;
typedef isMap::iterator isMapItor;

void main() {
  isMap m;

  ... // items added to m

  for(isMapItor itr = m.begin(); itr != m.end();) {
    if(itr->second == "somestring")
      m.erase(itr++);
    else
      ++itr;
  }


The Python/C API, 7.4.1 Dictionary Objects, says that's not possible
with CPython dicts:

>The dictionary p should not be mutated during iteration. It is safe (since 
>Python 2.1) to modify the values of the keys as you iterate over the 
>dictionary, but only so long as the set of keys does not change.<

Do you think it may be useful to create to create such C filterdict
function that deletes while iterating? (To create such function it
probably has to bypass the CPython dict API).

Bye,
bearophile

-- 
http://mail.python.org/mailman/listinfo/python-list

Reply via email to