Hi Owen,
Just to confirm the behavior, having run your sample on a couple of my
machines: the code appears to get all the way to the end and then stalls
right before it is about to exit, leaving a small number of processes
(here named python tables_test.py) in the OS. Is this correct?
These failures do not seem to happen when I set the process pool size to
be less than or equal to the number of processors (physical or
hyperthreaded) on the machine. I tested this on both a 32-processor
cluster and my dual-core laptop. Is this also the behavior you have seen?
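
To make that comparison easy to repeat, here is a minimal sketch of capping
the pool size at the processor count. It assumes the sample uses
multiprocessing.Pool; the names square, capped_pool_size and run are
hypothetical stand-ins and are not taken from tables_test.py. This only
helps test the observation above; it is not a fix for the hang.

```python
import multiprocessing as mp


def square(x):
    # Hypothetical stand-in for the real per-task work in the sample.
    return x * x


def capped_pool_size(requested):
    """Return a worker count no larger than the number of logical processors."""
    return max(1, min(requested, mp.cpu_count()))


def run(requested_workers, n_tasks):
    # Clamp the requested size, then run the tasks through the pool.
    size = capped_pool_size(requested_workers)
    pool = mp.Pool(processes=size)
    try:
        results = pool.map(square, range(n_tasks))
    finally:
        # Shut the pool down explicitly so no workers are left behind.
        pool.close()
        pool.join()
    return size, results


if __name__ == "__main__":
    size, results = run(requested_workers=64, n_tasks=20)
    print(size, len(results))
```

Running it once with the cap and once with a fixed, larger pool size should
show whether the leftover processes only appear in the oversubscribed case.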
Be Well
Anthony
On Tue, Oct 9, 2012 at 8:08 AM, Owen Mackwood
<owen.mackw...@bccn-berlin.de> wrote:
> Hi Anthony,
>
> I've created a reduced example which reproduces the error. I suppose the
> more processes you can run in parallel the more likely it is you'll see the
> hang. On a machine with 8 cores, I see 5-6 processes hang out of 2000.
>
> All of the hung tasks had a call stack that looked like this:
>
> #0 0x00007fc8ecfd01fc in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/libpthread.so.0
> #1 0x00007fc8ebd9d215 in H5TS_mutex_lock () from /usr/lib/libhdf5.so.6
> #2 0x00007fc8ebaacff0 in H5open () from /usr/lib/libhdf5.so.6
> #3 0x00007fc8e224c6a4 in __pyx_pf_6tables_13hdf5Extension_4File__g_new (__pyx_v_self=0x28b35a0, __pyx_args=<value optimized out>, __pyx_kwds=<value optimized out>) at tables/hdf5Extension.c:2820
> #4 0x00000000004abf62 in ext_do_call (f=0x271f4c0, throwflag=<value optimized out>) at Python/ceval.c:4331
> #5 PyEval_EvalFrameEx (f=0x271f4c0, throwflag=<value optimized out>) at Python/ceval.c:2705
> #6 0x00000000004ada51 in PyEval_EvalCodeEx (co=0x247aeb0, globals=<value optimized out>, locals=<value optimized out>, args=0x288cea0, argcount=0, kws=<value optimized out>, kwcount=0, defs=0x25ffd78, defcount=4, closure=0x0) at Python/ceval.c:3253
>
> I've attached the code to reproduce this. It probably isn't quite minimal,
> but it is reasonably simple (and stereotypical of the kind of operations I
> use). Let me know if you need anything else, or have questions about my
> code.
>
> Regards,
> Owen
>
>
>
> On 8 October 2012 17:37, Anthony Scopatz <scop...@gmail.com> wrote:
>
>> Hello Owen,
>>
>> So __getitem__() calls read() on the items it needs. Both should return
>> an in-memory copy of the data that is on disk.
>>
>> Frankly, I am not really sure what is going on, given what you have said.
>> A minimal example which reproduces the error would be really helpful.
>> From the error that you have provided, though, the only thing that I can
>> think of is that it is related to file opening on the worker processes.
>>
>> Be Well
>> Anthony
>>
>>
>
>
> ------------------------------------------------------------------------------
> Don't let slow site performance ruin your business. Deploy New Relic APM
> Deploy New Relic app performance management and know exactly
> what is happening inside your Ruby, Python, PHP, Java, and .NET app
> Try New Relic at no cost today and get our sweet Data Nerd shirt too!
> http://p.sf.net/sfu/newrelic-dev2dev
> _______________________________________________
> Pytables-users mailing list
> Pytables-users@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/pytables-users
>
>