Hi Anthony,
Thank you very much for your answer (it works). I will try to remodel my
code around this trick, but I'm not sure it's possible because I use a
framework that needs arrays.
Can somebody explain what is going on? I was thinking that PyTables keeps
a weakref to the file for lazy loading, but I'm not sure.
In any case, the PyTables community is very helpful.
Thanks,
Mathieu
On 12/07/2013 00:44, Anthony Scopatz wrote:
Hi Mathieu,
I think you should try opening a new file handle per process. The
following works for me on v3.0:
import tables
import random
import multiprocessing

# Use multiprocessing to perform a simple computation (column average).
# Each task opens its own file handle instead of receiving a PyTables node.
def f(filename):
    h5file = tables.openFile(filename, mode='r')
    name = multiprocessing.current_process().name
    # randint is inclusive on both ends; X has 10 columns (0..9)
    column = random.randint(0, 9)
    print '%s use column %i' % (name, column)
    rtn = h5file.root.X[:, column].mean()
    h5file.close()
    return rtn

p = multiprocessing.Pool(2)
col_mean = p.map(f, ['test.hdf5', 'test.hdf5', 'test.hdf5'])
Be well
Anthony
On Thu, Jul 11, 2013 at 3:43 PM, Mathieu Dubois <duboismathieu_g...@yahoo.fr> wrote:
On 11/07/2013 21:56, Anthony Scopatz wrote:
On Thu, Jul 11, 2013 at 2:49 PM, Mathieu Dubois <duboismathieu_g...@yahoo.fr> wrote:
Hello,
I wanted to use PyTables in conjunction with multiprocessing for some
embarrassingly parallel tasks.
However, it seems that it is not possible. In the following (very
stupid) example, X is a CArray of size (100, 10) stored in the file
test.hdf5:
import random
import tables
import multiprocessing

# Reload the data
h5file = tables.openFile('test.hdf5', mode='r')
X = h5file.root.X
n_features = X.shape[1]

# Use multiprocessing to perform a simple computation (column average)
def f(X):
    name = multiprocessing.current_process().name
    # randint is inclusive on both ends, so stop at the last valid column
    column = random.randint(0, n_features - 1)
    print '%s use column %i' % (name, column)
    return X[:, column].mean()

p = multiprocessing.Pool(2)
# Passing the PyTables node X to the workers is what triggers the error below
col_mean = p.map(f, [X, X, X])
When executing it, I get the following error:
Exception in thread Thread-2:
Traceback (most recent call last):
  File "/usr/lib/python2.7/threading.py", line 551, in __bootstrap_inner
    self.run()
  File "/usr/lib/python2.7/threading.py", line 504, in run
    self.__target(*self.__args, **self.__kwargs)
  File "/usr/lib/python2.7/multiprocessing/pool.py", line 319, in _handle_tasks
    put(task)
PicklingError: Can't pickle <type 'weakref'>: attribute lookup __builtin__.weakref failed
I have googled for weakref and pickle but can't find a solution.
Any help?
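A minimal sketch of what the traceback above seems to be saying, using only the standard library and written for Python 3 (unlike the Python 2 code in this thread). multiprocessing pickles every argument before sending it to a worker, and weak references cannot be pickled. The Node and Parent classes and the can_pickle helper are hypothetical stand-ins, not PyTables classes; the assumption here, taken only from the error message, is that a PyTables node holds a weakref somewhere internally.

```python
import pickle
import weakref


class Parent(object):
    """Hypothetical owner object (stand-in for an open file)."""
    pass


class Node(object):
    """Hypothetical stand-in for an object that keeps a weak reference."""

    def __init__(self, parent):
        self.parent = weakref.ref(parent)


def can_pickle(obj):
    """Return True if pickle.dumps(obj) succeeds, False otherwise."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, TypeError, AttributeError):
        return False


owner = Parent()
node = Node(owner)

# A plain filename string can be sent to a worker process.
print(can_pickle('test.hdf5'))
# An object that holds a weakref cannot, so Pool.map fails on it.
print(can_pickle(node))
```

This is why passing a filename (a plain string) to the workers is a safer choice than passing an object tied to an open file.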
Hello Mathieu,
I have used multiprocessing and files opened in read mode many
times so I am not sure what is going on here.
Thanks for your answer. Maybe you can point me to a working example?
Could you provide the test.hdf5 file so that we could try to
reproduce this?
Here is the script that I have used to generate the data:
import tables
import numpy
# Create data & store it
n_features = 10
n_obs = 100
X = numpy.random.rand(n_obs, n_features)
h5file = tables.openFile('test.hdf5', mode='w')
Xatom = tables.Atom.from_dtype(X.dtype)
Xhdf5 = h5file.createCArray(h5file.root, 'X', Xatom, X.shape)
Xhdf5[:] = X
h5file.close()
I hope it's not a stupid mistake. I am using PyTables 2.3.1 on
Ubuntu 12.04 (libhdf5 is 1.8.4patch1).
By the way, I have noticed that by slicing a CArray, I get a numpy array
(I created the HDF5 file with numpy). Therefore, everything is copied to
memory. Is there a way to avoid that?
Only the slice that you ask for is brought into memory, and it is
returned as a non-view numpy array.
OK. I will have to be careful about that.
Be Well
Anthony
Mathieu
------------------------------------------------------------------------------
See everything from the browser to the database with AppDynamics
Get end-to-end visibility with application monitoring from
AppDynamics
Isolate bottlenecks and diagnose root cause in seconds.
Start your free trial of AppDynamics Pro today!
http://pubads.g.doubleclick.net/gampad/clk?id=48808831&iu=/4140/ostg.clktrk
_______________________________________________
Pytables-users mailing list
Pytables-users@lists.sourceforge.net
<mailto:Pytables-users@lists.sourceforge.net>
https://lists.sourceforge.net/lists/listinfo/pytables-users