On Mon, Oct 8, 2012 at 5:13 AM, Owen Mackwood
<owen.mackw...@bccn-berlin.de> wrote:

> Hi Anthony,
>
> There is a single multiprocessing.Pool which usually has 6-8 processes,
> each of which is used to run a single task, after which a new process is
> created for the next task (maxtasksperchild=1 for the Pool constructor).
> There is a master process that regularly opens an HDF5 file to read out
> information for the worker processes (data that gets copied into a
> dictionary and passed as args to the worker's target function). There are
> no problems with the master process; it never hangs.
>

Hello Owen,

Hmmm, are you actually copying the data (f.root.data[:]) or are you simply
passing a reference as an argument (f.root.data)?
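
For what it's worth, here is roughly the distinction I am asking about. The
file and node names below are just placeholders, not anything from your code:

    import tables

    f = tables.openFile('master_input.h5', mode='r')  # placeholder file name

    # Slicing reads the node into an in-memory NumPy array.  That copy is
    # independent of the file handle and is safe to put in a dict and pass
    # to a worker's target function.
    data_copy = f.root.data[:]

    # Without the slice you only have a reference to the on-disk node.  It
    # is tied to the file handle in *this* process and will not behave well
    # if it ends up in a forked worker.
    data_ref = f.root.data

    f.close()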


> The failure appears to be random, affecting less than 2% of my tasks (all
> tasks are highly similar and should call the same tables functions in the
> same order). This is running on Debian Squeeze, Python 2.7.3, PyTables
> 2.4.0. As for the particular function that hangs... tough to say, since I
> haven't yet been able to properly debug the issue. The interpreter hangs,
> which limits my ability to diagnose the source of the problem. I call a
> number of functions in the tables module from the worker process, including
> openFile, createVLArray, createCArray, createGroup, flush, and of course
> close.
>

So if you are opening a file in the master process and then
writing/creating/flushing from the workers, this may cause a problem.  The
multiprocessing module forks the original process, so you are relying on the
file handle inherited from the master to remain valid and not be accidentally
changed in the workers.  Can you try to open the files in the workers rather
than in the master?  I hope that this clears up the issue.

Basically, I am advocating a more conservative approach where all data that
is read or written in a worker comes from that worker itself, rather than
being generated by the master.  If you are *still* experiencing these
problems after that change, then we know we have a real problem.
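
Something along these lines is what I have in mind. The file names, the
worker's task, and its arguments are all made up here, so adapt freely:

    import multiprocessing
    import tables

    def worker(args):
        # Everything HDF5-related is opened and closed inside the worker;
        # nothing is inherited from the master across the fork.
        task_id, values = args  # plain Python objects passed by the master
        h5 = tables.openFile('result_%03d.h5' % task_id, mode='w')
        try:
            grp = h5.createGroup('/', 'results')
            arr = h5.createCArray(grp, 'data', tables.Float64Atom(),
                                  shape=(len(values),))
            arr[:] = values
            h5.flush()
        finally:
            h5.close()
        return task_id

    if __name__ == '__main__':
        pool = multiprocessing.Pool(processes=6, maxtasksperchild=1)
        tasks = [(i, [float(i)] * 10) for i in range(20)]
        done = pool.map(worker, tasks)
        pool.close()
        pool.join()
        print done

The key point is just that openFile() and close() both happen inside the
worker's target function.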

Also, if this doesn't fix it, it would be great if you could send us a small
sample module that reproduces the issue!

Be Well
Anthony


>
> I'll continue to try and find out more about when and how the hang occurs.
> I have to rebuild Python to allow the gdb pystack macro to work. If you
> have any suggestions for me, I'd love to hear them.
>
> Regards,
> Owen
>
>
> On 7 October 2012 00:28, Anthony Scopatz <scop...@gmail.com> wrote:
>
>> Hi Owen,
>>
>> How many pools do you have?  Is this a random runtime failure?  What kind
>> of system is this?  Is there some particular function in Python that
>> you are running?  (It seems to be openFile(), but I can't be sure...)  The
>> error is definitely happening down in the H5open() routine.  Now whether
>> this is HDF5's fault or ours, I am not yet sure.
>>
>> Be Well
>> Anthony
>>
>>
>
>
