Hey, sorry for not replying to the mailing list; I thought I was doing that.
I moved the initialization outside of the threaded part and it is still not working, but I get a different error. I can no longer reproduce the FEniCS error; instead I just get a segmentation fault. The debugger error points to GenericBoundingBoxTree::build, but I don't know what happened there. The full debugger message is:

    0xb7ce29b1 in dolfin::GenericBoundingBoxTree::_build(std::vector<dolfin::Point, std::allocator<dolfin::Point> > const&, __gnu_cxx::__normal_iterator<unsigned int*, std::vector<unsigned int, std::allocator<unsigned int> > > const&, __gnu_cxx::__normal_iterator<unsigned int*, std::vector<unsigned int, std::allocator<unsigned int> > > const&, unsigned int) () from /usr/lib/i386-linux-gnu/libdolfin.so.1.5

Thanks again for the help

________________________________________
From: Garth N. Wells [[email protected]]
Sent: 25 August 2015 16:44
To: Alvaro Diez Gonzalez-Pardo; [email protected]
Subject: Re: [FEniCS] Multi-thread issue when using dolfin Function

Please reply to the mailing list.

On 25 August 2015 at 15:42, Alvaro Diez Gonzalez-Pardo <[email protected]> wrote:

No it is not, should it?

Yes, initialise the bounding box tree outside of a threaded region.

Garth

Alvaro

________________________________________
From: Garth N. Wells [[email protected]]
Sent: 25 August 2015 16:25
To: Alvaro Diez Gonzalez-Pardo
Subject: Re: [FEniCS] Multi-thread issue when using dolfin Function

On 25 August 2015 at 15:23, Alvaro Diez Gonzalez-Pardo <[email protected]> wrote:

Thanks for the advice.
Unfortunately the code is still not working after calling Mesh::bounding_box_tree().
The conflicting part of the code now looks like this:

    safeRead.lock();
    _detector->get_mesh()->bounding_box_tree();
    _detector->get_d_f_grad()->eval(wrap_e_field, wrap_x);
    _detector->get_w_f_grad()->eval(wrap_w_field, wrap_x);
    safeRead.unlock();

I don't know the details of how your threading works, but is

    _detector->get_mesh()->bounding_box_tree();

outside of a threaded region?

Garth

Thanks in advance for any help

Alvaro Diez

________________________________________
From: Garth N. Wells [[email protected]]
Sent: 25 August 2015 14:22
To: Alvaro Diez Gonzalez-Pardo
Cc: [email protected]
Subject: Re: [FEniCS] Multi-thread issue when using dolfin Function

It looks like the 'auto' build and caching of the bounding box tree is not thread-safe (we can add this to the reasons why I think the user should be tasked with managing the bounding box tree initialisation). Try calling Mesh::bounding_box_tree() before any calls to Function::eval.

Garth

On 25 August 2015 at 12:10, Alvaro Diez Gonzalez-Pardo <[email protected]> wrote:

Hello,

I am writing to report a problem I've been having when trying to use dolfin functions in parallel execution. Note that this does not refer to computation using dolfin libraries but rather to using the results in parallel. You'll find the error report below these lines and as an attachment as well.

Thanks in advance for the help.

Alvaro Diez

=======================================ERROR REPORT:=====================================================================

I'm trying to parallelize a program that uses dolfin libraries to solve Poisson and Laplace equations.
I am using the C++11 built-in multithreading support (std::thread, from the <thread> header). My class Carrier contains:

    class Carrier
    {
      private:
        [...]
        SMSDetector * _detector;
        [...]
    };

where SMSDetector is:

    class SMSDetector
    {
      private:
        [...]
        Function _w_f_grad; // function to store the weighting field (vectorial)
        [...]
      public:
        Function * get_w_f_grad();
    };

When the number of threads used for the simulation is greater than one (i.e. parallel execution), the program crashes with a segmentation fault. For each Carrier I create, the program needs to evaluate a method like:

    _detector->get_w_f_grad()->eval(wrap_w_field, wrap_x);

where the arguments for eval were defined as:

    Array<double> wrap_x(2, _x.data());
    Array<double> wrap_w_field(2, _w_field.data());

The error persists even when mutexes are used to avoid the race conditions produced when several Carriers try to access eval() at once.

The error message I get (only the first time I run it after rebooting the machine) is:

terminate called after throwing an instance of 'std::runtime_error'
  what():
*** -------------------------------------------------------------------------
*** DOLFIN encountered an error. If you are not able to resolve this issue
*** using the information listed below, you can ask for help at
***
***     [email protected]
***
*** Remember to include the error message listed below and, if possible,
*** include a *minimal* running example to reproduce the error.
***
*** -------------------------------------------------------------------------
*** Error:   Unable to compute collisions with bounding box tree.
*** Reason:  Bounding box tree has not been built. You need to call tree.build().
*** Where:   This error was encountered inside BoundingBoxTree.cpp.
*** Process: unknown
***
*** DOLFIN version: 1.5.0
*** Git changeset:  unknown
*** -------------------------------------------------------------------------
[pcssd30:03650] *** Process received signal ***
[pcssd30:03650] Signal: Aborted (6)
[pcssd30:03650] Signal code: (-6)
[pcssd30:03650] [ 0] [0xb7735410]
[pcssd30:03650] [ 1] [0xb7735428]
[pcssd30:03650] [ 2] /lib/i386-linux-gnu/libc.so.6(gsignal+0x47) [0xb57bf607]
[pcssd30:03650] [ 3] /lib/i386-linux-gnu/libc.so.6(abort+0x143) [0xb57c2a33]
[pcssd30:03650] [ 4] /usr/lib/i386-linux-gnu/libstdc++.so.6(_ZN9__gnu_cxx27__verbose_terminate_handlerEv+0x155) [0xb59cfd45]
[pcssd30:03650] [ 5] /usr/lib/i386-linux-gnu/libstdc++.so.6(+0x70a33) [0xb59cda33]
[pcssd30:03650] [ 6] /usr/lib/i386-linux-gnu/libstdc++.so.6(+0x70aad) [0xb59cdaad]
[pcssd30:03650] [ 7] /usr/lib/i386-linux-gnu/libstdc++.so.6(+0x9cc6d) [0xb59f9c6d]
[pcssd30:03650] [ 8] /lib/i386-linux-gnu/libpthread.so.0(+0x6f70) [0xb5570f70]
[pcssd30:03650] [ 9] /lib/i386-linux-gnu/libc.so.6(clone+0x5e) [0xb587cbee]
[pcssd30:03650] *** End of error message ***
Aborted (core dumped)

When using a debugger (gdb), the error message reads:

    "Program received signal SIGSEGV, Segmentation fault.
    0xb7d90bd4 in dolfin::MeshTopology::dim() const () from /usr/lib/i386-linux-gnu/libdolfin.so.1.5"

For more details on how the code works, the full program source code is available on GitHub: https://github.com/AlGepe/TRACS/
The problematic part seems to run from line 87 of Carrier.cpp to the end of that method, most likely lines 120 - 125 of that file.
_______________________________________________
fenics mailing list
[email protected]
http://fenicsproject.org/mailman/listinfo/fenics
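
The suggestion made in this thread (build the lazily cached search structure once, before any threaded region, and only evaluate from the worker threads afterwards) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not DOLFIN code: ToyMesh, tree(), eval() and eval_in_threads() are hypothetical stand-ins; in the code discussed above the corresponding calls would be Mesh::bounding_box_tree() and Function::eval. The thread also reports that a segmentation fault persisted after moving the initialization, so this shows the pattern only, not a confirmed fix.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical stand-in for a mesh that lazily builds and caches a search
// structure on first use. This is NOT the DOLFIN API.
class ToyMesh
{
public:
    explicit ToyMesh(std::size_t n) : _n(n) { assert(n > 0); }

    // Builds the cache on the first call. That first call is not thread-safe:
    // two threads arriving here together could both try to build.
    const std::vector<double>& tree() const
    {
        if (!_tree)
        {
            std::unique_ptr<std::vector<double>> t(new std::vector<double>(_n));
            for (std::size_t i = 0; i < _n; ++i)
                (*t)[i] = 0.5 * static_cast<double>(i);
            _tree = std::move(t);
        }
        return *_tree;
    }

    // Read-only lookup; only reads shared data once the cache exists.
    double eval(std::size_t i) const { return tree()[i % _n]; }

private:
    std::size_t _n;
    mutable std::unique_ptr<std::vector<double>> _tree;
};

// Build the cache on the calling thread, then evaluate from worker threads.
std::vector<double> eval_in_threads(const ToyMesh& mesh, std::size_t n_threads)
{
    mesh.tree(); // build once, before any worker thread exists

    std::vector<double> results(n_threads, 0.0);
    std::vector<std::thread> workers;
    for (std::size_t t = 0; t < n_threads; ++t)
    {
        // Each worker writes to its own slot, so no lock is needed here.
        workers.emplace_back([&mesh, &results, t] { results[t] = mesh.eval(t); });
    }
    for (std::thread& w : workers)
        w.join();
    return results;
}
```

Calling eval_in_threads(mesh, 4) on a ToyMesh of size 8 runs four workers that each perform one read-only lookup. The point of the design is that the only mutation of shared state (the cache build) happens while a single thread is running.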
