On Wednesday 03 November 2010 23:59:38 Marko Budisic wrote:
> Dear all,
> 
> I am having some trouble with using pytables correctly, and I was
> hoping for some guidance. I would like to have one central pytables
> file, containing a VLArray that would be used by several "worker"
> processes. Each process should perform some computation, and append
> it as a new row to VLArray. Due to possible sizes of results, it
> would be difficult to pass results to the main thread for it to
> store into pytables file.
[clip]

What you are trying to achieve is tricky but, fortunately, possible.  
First, in order to avoid problems with internal caches, you need to 
lock, open, save and close the file in *each* worker, every time it 
writes.  You are not doing this currently.

Second, you need to respect the "lock, open, save and close" order if 
you want to ensure that everything goes well.  This example should 
illustrate the proper sequence:

#!/usr/bin/env python

from multiprocessing import Pool
import fcntl
import numpy
import tables
import os

def work(i):
    x = numpy.random.random((6,5000))
    group = '/group%d/group%d' % (i, i)
    dataset = 'dataset%d' % i
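    # lock: take an exclusive lock before PyTables touches the file
    fhandle = os.open('/tmp/output.h5', os.O_RDWR)
    fcntl.lockf(fhandle, fcntl.LOCK_EX)
    # open: only once the lock is held
    f = tables.openFile('/tmp/output.h5','a')
    # moving lockf here instead will cause crashes!
    # save
    f.createArray(group, dataset, x, createparents=True)
    # close: flush the HDF5 file, then close the locked descriptor
    f.close()
    os.close(fhandle)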

def main():
    # create an empty file up front so every worker can open and lock it
    tables.openFile('/tmp/output.h5','w').close()
    pool = Pool(processes=8)
    pool.map(work, range(5000), chunksize=1)

if __name__ == '__main__':
    main()

[please note the use of lockf over an opened filehandle]
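
If you want to reuse the "lock, open, save and close" sequence in several 
places, one option is to wrap the locking part in a small context manager.  
The following is only a sketch using the standard library: the names 
locked() and append_record() are made up for illustration, and a plain 
text file stands in for the HDF5 file.  With PyTables, the body of the 
"with locked(...)" block would hold the openFile / createArray / close 
calls from the example above.

```python
import fcntl
import os
from contextlib import contextmanager


@contextmanager
def locked(path):
    """Hold an exclusive advisory lock on an existing file (hypothetical helper)."""
    fd = os.open(path, os.O_RDWR)
    try:
        fcntl.lockf(fd, fcntl.LOCK_EX)  # blocks until the lock is granted
        yield
    finally:
        os.close(fd)  # closing the descriptor drops the lock


def append_record(path, text):
    """Lock, open, save, close: the whole cycle runs inside the lock."""
    with locked(path):
        with open(path, 'a') as f:  # open only after the lock is held
            f.write(text + '\n')    # save
        # the file has been flushed and closed before locked() exits
```

Each worker would then call append_record(path, data) once per result.  
Like the example above, this relies on fcntl and so assumes a POSIX 
system; the file must already exist before the first call.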

Third, you will need at least PyTables 2.2 in order for the above to work.

You can get more info on this in:

http://pytables.org/trac/ticket/185

Hope this helps,

-- 
Francesc Alted

_______________________________________________
Pytables-users mailing list
Pytables-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/pytables-users
