On Dec 11, 11:00 am, Antoine Pitrou <solip...@pitrou.net> wrote:
> I was going to suggest memcached but it probably serializes non-atomic
> types.

Atomic types as well. memcached communicates through sockets[3] (albeit possibly unix sockets, which are faster than TCP ones).
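To illustrate: with a stock memcached and the python-memcached client (just one client library as an example; the host/port below are assumptions), every set/get is a socket round trip and the value is flattened to bytes on the way - ints as text, anything structured via pickle. A minimal sketch, assuming memcached is listening on localhost:11211:

import memcache

# Every operation below is a socket round trip to the memcached server.
mc = memcache.Client(['127.0.0.1:11211'])

mc.set('counter', 1)         # int: converted to its text form, sent over the socket
mc.set('row', {'state': 3})  # dict: pickled, then sent over the socket

print mc.get('counter')      # deserialized copy, not shared memory
print mc.get('row')          # unpickled copy - mutating it does not affect the server

So even "atomic" values pay the serialization and socket cost; nothing is actually shared in-process.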
multiprocessing has shared memory schemes, but it does a lot of internal copying (it uses ctypes)... and they are particularly unhelpful when your shared data is highly structured, since you can't share objects, only primitive types (see the sketch at the end of this message).

I finished a patch that pushes reference counters into packed pools. It has lots of drawbacks, but it manages to solve this particular problem, provided the data is predominantly non-numeric (ie: lists and dicts, as mentioned before). Of the drawbacks, perhaps the biggest is a larger memory footprint - yep... I don't believe there's anything that can be done to change that. It can be optimized to make the overhead a little smaller, though.

This test code[1] consumes roughly 2G of RAM on an x86_64 with python 2.6.1. With the patch, it *should* use 2.3G of RAM (as reported by its output), so you can see the footprint overhead... but better page sharing makes it actually consume about 6 times less - roughly 400M... which is the size of the dataset. Ie: near-optimal data sharing.

This patch[2] has other optimizations intermingled - if there's interest in the patch without those (which are both unproven and nonportable) I could try to separate them. I will have to anyway, to upload it for inclusion into CPython (if I manage to fix the shortcomings, and if it gets approved).

The most important shortcomings of the refcount patch are:

1) Tripled memory overhead of reference counting. Before, it was a single
   Py_ssize_t per object. Now, it's two pointers plus the Py_ssize_t. This
   could perhaps be optimized (by getting rid of the arena pointer, for
   instance).

2) Increased code output for Py_INCREF/DECREF. It's small, but it adds up
   to a lot. Timings on test_decimal.py (a small numeric benchmark I use,
   which might not be representative at all) show a 10% performance loss in
   CPU time. Again, this might be optimized, with a lot of work and
   creativity.

3) Breaks binary compatibility, and in weird cases source compatibility,
   with extension modules. The PyObject layout is different, so
   statically-initialized variables need to stick to CPython's macros (I've
   seen cases where they don't), and code should use Py_REFCNT() to access
   the refcount, but many modules just do ob->ob_refcnt, which will break
   with the patch.

4) I'm also not really sure (haven't tested) what happens when CPython runs
   out of memory - I tried really hard not to segfault, and even to recover
   nicely, but you know how hard that is...

[1] test code below
[2] http://www.deeplayer.com/claudio/misc/Python-2.6.1-refcount.patch
[3] http://code.google.com/p/memcached/wiki/FAQ#How_does_it_compare_to_a_server_local_cache?_(PHP%27s_APC,_mm

import time
from multiprocessing import Pool

def usoMemoria():
    # "memory usage": sums the RSS (in KB) of this process and its direct children
    import os
    import subprocess
    pid = os.getpid()
    cmd = "ps -o vsz=,rss=,share= -p %s --ppid %s" % (pid, pid)
    p = subprocess.Popen(cmd.split(), stdout=subprocess.PIPE)
    info = p.stdout.readlines()
    s = sum( int(r) for v,r,s in map(str.split, map(str.strip, info)) )
    return s

def f(_):
    return sum(int(x) for d in huge_global_data for x in d if x != "state")  # my sophisticated formula goes here

if __name__ == '__main__':
    huge_global_data = []
    for i in xrange(500000):
        d = {}
        d[str(i)] = str(i*10)
        d[str(i+1)] = str(i)
        d["state"] = 3
        huge_global_data.append(d)

    p = Pool(7)
    res = list(p.map(f, xrange(20)))

    print "%.2fM" % (usoMemoria() / 1024.0)
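Regarding the shared-memory point above: this is roughly the kind of sharing multiprocessing gives you out of the box - plain ctypes values and flat arrays, with no equivalent for structured objects like the list of dicts in the test code. A rough sketch (the worker/variable names are just illustrative):

from multiprocessing import Process, Value, Array

def worker(total, samples):
    # The children operate on the same shared memory; no pickling on access.
    with total.get_lock():
        total.value += sum(samples[i] for i in xrange(len(samples)))

if __name__ == '__main__':
    total = Value('i', 0)            # one shared C int
    samples = Array('i', range(10))  # a flat shared array of C ints

    procs = [Process(target=worker, args=(total, samples)) for _ in xrange(2)]
    for p in procs: p.start()
    for p in procs: p.join()

    print total.value  # 90: both workers updated the same shared int

    # There is no Value/Array equivalent for something like huge_global_data
    # (a list of dicts): structured objects only reach the children by copying.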