Re: [Numpy-discussion] Reference count error detected bug appears with multithreading (OpenMP TBB)

2012-01-18 Thread Robert Kern
On Wed, Jan 18, 2012 at 14:59, Malcolm Reynolds
malcolm.reyno...@gmail.com wrote:
 Hi,

 I've built a system which allocates numpy arrays and processes them in
 C++ code (this is because I'm building a native code module using
 boost.python and it makes sense to use numpy data storage to then deal
 with outputs in python, without having to do any copying). Everything
 seems fine except when I parallelise the main loop (OpenMP and TBB
 give the same results), in which case I see a whole bunch of messages
 saying

 reference count error detected: an attempt was made to deallocate 12 (d)

 sometimes during the running of the program, sometimes all at the end
 (presumably when all the destructors in my program run).

 To clarify, the loop I am now running parallel takes read-only
 parameters (enforced by the C++ compiler using 'const') and as far as
 I can tell there are no race conditions with multiple threads writing
 to the same numpy arrays at once or anything obvious like that.

 I recompiled numpy (I'm using 1.6.1 from the official git repository)
 to print out some extra information with the reference count message,
 namely a pointer to the thing which is being erroneously deallocated.
 Surprisingly, it is always the same address for any run of the
 program, considering this is a message printed out hundreds of times.

 I've looked into this a little with GDB and as far as I can see the
 object which the message pertains to is an array descriptor, or at
 least that's what I conclude from backtraces similar to the following:

 Breakpoint 1, arraydescr_dealloc (self=0x1029083a0) at descriptor.c:1501
 1501            fprintf(stderr, "*** Reference count error detected: \n \
 (gdb) bt
 #0  arraydescr_dealloc (self=0x1029083a0) at descriptor.c:1501
 #1  0x000102897fc4 in array_dealloc (self=0x117e0e850) at arrayobject.c:271
 #2  0x000103e592d7 in boost::detail::sp_counted_impl_p<garf::multivariate_normal<double> const>::dispose (this=value temporarily unavailable, due to optimizations) at refcount.hpp:36
 #3  my code

I suspect there is some problem with the reference counting that you
are doing at the C++ level that is causing you to do too many
Py_DECREFs to the numpy objects, and this is being identified by the
arraydescr_dealloc() routine. (By the way, arraydescrs are the C-level
implementation of dtype objects.) Reading the comments just before
descriptor.c:1501 points out that this warning is being printed
because something is trying to deallocate the builtin np.dtype('d') ==
np.dtype('float64') dtype. This should never happen. The refcount for
these objects should always be > 0 because numpy itself holds
references to them.

I suspect that you are obtaining the numpy object (1 Py_INCREF) before
you split into multiple threads but releasing them in each thread
(multiple Py_DECREFs). This is probably being hidden from you by the
boost.python interface and/or the boost::detail::sp_counted_impl_p
smart(ish) pointer. Check the backtrace where your code starts to
verify if this looks to be the case.
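The failure mode described above can be reproduced without Python at all. The following is a minimal C++ sketch of the bookkeeping only: FakePyObject and fake_decref are invented stand-ins for PyObject / Py_DECREF, and a plain loop stands in for the worker threads. Wrapping the same raw pointer in a fresh shared_ptr per worker gives each worker its own control block, so the deleter fires once per worker even though only one reference was ever obtained:

```cpp
#include <cassert>
#include <memory>

// Invented stand-in for a Python object's reference count; a real
// PyObject uses ob_refcnt with Py_INCREF/Py_DECREF.
struct FakePyObject {
    long refcnt = 1;   // we were handed exactly one reference
};

inline void fake_decref(FakePyObject* o) { --o->refcnt; }

// Buggy pattern: each loop iteration (standing in for one worker
// thread) wraps the SAME raw pointer in a brand-new shared_ptr.
// Every shared_ptr gets its own control block, and every control
// block runs the deleter once, so one reference is "released"
// n_workers times.
long buggy_refcount_after(int n_workers) {
    FakePyObject obj;                        // refcnt == 1
    for (int i = 0; i < n_workers; ++i) {
        std::shared_ptr<FakePyObject> p(     // BUG: fresh control block
            &obj, [](FakePyObject* o) { fake_decref(o); });
    }   // p destroyed here -> deleter runs -> refcnt decremented again
    return obj.refcnt;                       // ends at 1 - n_workers
}
```

With four workers the simulated refcount ends at 1 - 4 = -3, which is exactly the kind of over-release that trips numpy's deallocation check.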

-- 
Robert Kern

I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth.
  -- Umberto Eco
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Reference count error detected bug appears with multithreading (OpenMP TBB)

2012-01-18 Thread Malcolm Reynolds

 I suspect that you are obtaining the numpy object (1 Py_INCREF) before
 you split into multiple threads but releasing them in each thread
 (multiple Py_DECREFs). This is probably being hidden from you by the
 boost.python interface and/or the boost::detail::sp_counted_impl_p
 smart(ish) pointer. Check the backtrace where your code starts to
 verify if this looks to be the case.

Thank you for your quick reply. This makes a lot of sense, but I'm
having trouble seeing where it could be happening: everything I pass
into each parallel computation strand is passed down as either
pointer-to-const or reference-to-const. The only things that need to
be modified (for example random number generator objects) are created
uniquely inside each iteration of the for loop, so it can't be that.

The information about which object has the reference count problem
helps, though; I will keep digging. I'm vaguely planning to track
every incref and decref so I can pin down which object has an
unbalanced count. To do that I want to know the address of the array
rather than the associated datatype descriptor - I assume the
(self=0x117e0e850) in this line is the address of the array I am
mishandling?

#1  0x000102897fc4 in array_dealloc (self=0x117e0e850) at arrayobject.c:271

Malcolm


___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Reference count error detected bug appears with multithreading (OpenMP TBB)

2012-01-18 Thread Robert Kern
On Wed, Jan 18, 2012 at 16:14, Malcolm Reynolds
malcolm.reyno...@gmail.com wrote:

 I suspect that you are obtaining the numpy object (1 Py_INCREF) before
 you split into multiple threads but releasing them in each thread
 (multiple Py_DECREFs). This is probably being hidden from you by the
 boost.python interface and/or the boost::detail::sp_counted_impl_p
 smart(ish) pointer. Check the backtrace where your code starts to
 verify if this looks to be the case.

 Thank you for your quick reply. This makes a lot of sense, but I'm
 having trouble seeing where it could be happening: everything I pass
 into each parallel computation strand is passed down as either
 pointer-to-const or reference-to-const. The only things that need to
 be modified (for example random number generator objects) are created
 uniquely inside each iteration of the for loop, so it can't be that.

My C++-fu is fairly weak, so I'm never really sure what the smart
pointers are doing when. If there are tracing features that you can
turn on, try that. Is this deallocation of the smart pointer to the
garf::multivariate_normal<double> const being done inside the loop
or outside back in the main thread? Where did it get created?
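For comparison, the pattern that would be safe (assuming the shared_ptr really is what performs the Py_DECREF) is to create it once, before the parallel region, and hand each worker a by-value copy: all copies then share a single control block, so the deleter runs exactly once. A sketch using a simulated refcount, where FakePyObject / fake_decref are invented stand-ins rather than Python API:

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Invented stand-in for PyObject::ob_refcnt.
struct FakePyObject {
    long refcnt = 1;   // one reference was obtained for us
};

inline void fake_decref(FakePyObject* o) { --o->refcnt; }

// Safe pattern: ONE shared_ptr (one control block) created before the
// parallel region; each worker receives a by-value COPY.  All copies
// share the control block, so the deleter fires exactly once, when
// the last copy goes away.
long correct_refcount_after(int n_workers) {
    FakePyObject obj;                        // refcnt == 1
    {
        std::shared_ptr<FakePyObject> owner(
            &obj, [](FakePyObject* o) { fake_decref(o); });
        std::vector<std::shared_ptr<FakePyObject>> per_worker;
        for (int i = 0; i < n_workers; ++i)
            per_worker.push_back(owner);     // copy, not re-wrap
    }   // owner and all copies destroyed: deleter runs once
    return obj.refcnt;                       // always 0, per worker count
}
```

However many workers there are, the single reference is released exactly once, so the count balances.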

 The information about which object has the reference count problem
 helps, though; I will keep digging. I'm vaguely planning to track
 every incref and decref so I can pin down which object has an
 unbalanced count. To do that I want to know the address of the array
 rather than the associated datatype descriptor - I assume the
 (self=0x117e0e850) in this line is the address of the array I am
 mishandling?

 #1  0x000102897fc4 in array_dealloc (self=0x117e0e850) at arrayobject.c:271

Yes.
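The tracking plan quoted above can be sketched as a small thread-safe ledger. RefLedger and its methods are hypothetical debugging helpers, not part of numpy, boost, or the Python C API: every wrapper-level incref/decref is logged against the object's address, and any address whose counts do not cancel identifies the mishandled object.

```cpp
#include <cassert>
#include <cstdio>
#include <mutex>
#include <unordered_map>

// Hypothetical debug ledger: log every wrapper-level INCREF/DECREF
// against the object's address so unbalanced objects stand out.
class RefLedger {
    std::unordered_map<const void*, long> counts_;
    std::mutex mu_;   // the parallel loop hits this from many threads
public:
    void incref(const void* obj) {
        std::lock_guard<std::mutex> lock(mu_);
        ++counts_[obj];
    }
    void decref(const void* obj) {
        std::lock_guard<std::mutex> lock(mu_);
        --counts_[obj];
    }
    // Net count for one address; 0 means increfs and decrefs balance.
    long net(const void* obj) {
        std::lock_guard<std::mutex> lock(mu_);
        auto it = counts_.find(obj);
        return it == counts_.end() ? 0 : it->second;
    }
    // Print every address whose increfs and decrefs don't cancel.
    void report() {
        std::lock_guard<std::mutex> lock(mu_);
        for (const auto& kv : counts_)
            if (kv.second != 0)
                std::fprintf(stderr, "unbalanced %p: net %+ld\n",
                             kv.first, kv.second);
    }
};
```

Calling report() at program exit would then name the array address (such as 0x117e0e850) whose count went negative.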

-- 
Robert Kern
