Francesc, thank you very much for your speedy and well-explained response!
I modified the mock-up script I sent originally according to your guidelines (lock, open, save and close for each worker) and it seems to be working fine. I hope to translate the solution to my real problem successfully as well.

Best,
Marko

On Nov 4, 2010, at 10:03 AM, Francesc Alted wrote:

> On Wednesday 03 November 2010 23:59:38 Marko Budisic wrote:
>> Dear all,
>>
>> I am having some trouble with using pytables correctly, and I was
>> hoping for some guidance. I would like to have one central pytables
>> file, containing a VLArray that would be used by several "worker"
>> processes. Each process should perform some computation, and append
>> it as a new row to VLArray. Due to possible sizes of results, it
>> would be difficult to pass results to the main thread for it to
>> store into pytables file.
> [clip]
>
> What you are trying to achieve is tricky, but fortunately, possible.
> First, in order to avoid problems with internal caches, you need to
> lock, open, save and close for *each* worker. You are not doing this
> currently.
>
> Then, you need to respect the "lock, open, save and close" order if you
> want to ensure that everything goes well. This example should
> illustrate the proper sequence:
>
> #!/usr/bin/env python
>
> from multiprocessing import Pool
> import fcntl
> import numpy
> import tables
> import os
>
> def work(i):
>     x = numpy.random.random((6,5000))
>     group = '/group%d/group%d' % (i, i)
>     dataset = 'dataset%d' % i
>     fhandle = os.open('/tmp/output.h5', os.O_RDWR)
>     fcntl.lockf(fhandle, fcntl.LOCK_EX)
>     f = tables.openFile('/tmp/output.h5','a')
>     # moving lockf here instead will cause crashes!
>     arr = f.createArray(group, dataset, x, createparents=True)
>     f.close()
>     os.close(fhandle)
>
> def main():
>     tables.openFile('/tmp/output.h5','w').close()
>     pool = Pool(processes=8)
>     pool.map(work, range(5000), chunksize=1)
>
> if __name__ == '__main__':
>     main()
>
> [please note the use of lockf over an opened filehandle]
>
> Third, you will need at least PyTables 2.2 in order for the above to
> work.
>
> You can get more info on this in:
>
> http://pytables.org/trac/ticket/185
>
> Hope this helps,
>
> --
> Francesc Alted
>
> _______________________________________________
> Pytables-users mailing list
> Pytables-users@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/pytables-users
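
For anyone who wants to try the "lock, open, save and close" order without PyTables installed, here is a minimal sketch of the same sequence using only the standard library. It assumes a POSIX system (fcntl is not available on Windows) and the "fork" start method. The names exclusive_lock, work and run are illustrative only and are not part of PyTables, and a plain text file stands in for the HDF5 file.

```python
import contextlib
import fcntl
import multiprocessing
import os
import tempfile


@contextlib.contextmanager
def exclusive_lock(path):
    """Hold an exclusive advisory lock on `path` while the block runs."""
    fd = os.open(path, os.O_RDWR)       # lockf needs an already opened descriptor
    try:
        fcntl.lockf(fd, fcntl.LOCK_EX)  # blocks until no other process holds the lock
        yield
    finally:
        os.close(fd)                    # closing the descriptor drops the lock if still held


def work(path, i):
    # lock -> open -> save -> close, repeated in full by every worker
    with exclusive_lock(path):
        with open(path, "a") as f:
            f.write("result from worker %d\n" % i)


def run(n_workers=8):
    """Start n_workers processes that each append one line; return the lines."""
    fd, path = tempfile.mkstemp(suffix=".txt")
    os.close(fd)
    ctx = multiprocessing.get_context("fork")  # assumption: POSIX fork is available
    procs = [ctx.Process(target=work, args=(path, i)) for i in range(n_workers)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    with open(path) as f:
        lines = f.read().splitlines()
    os.remove(path)
    return lines


if __name__ == "__main__":
    print(len(run()))
```

In the real case, the body of work() would open the HDF5 file with PyTables after the lock is taken and close it before the lock descriptor is closed, as in the quoted example.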