Hi again, here's another problem which just occurred to me while working with the latest developer version from GitHub.
I have a program which has to iteratively solve a number of independent equation systems. The actual number of systems may vary between 1 and 10. Because they are totally independent of each other, we assign each system a number of OpenMP threads, allowing them to be solved in parallel at the same time.

Now I tried to integrate ViennaCL into that scheme and used the multithreaded.cpp example as a guide. The main difference is that I use OpenMP threads instead of Boost ones. My approach is to first create as many OpenCL contexts as there are equation systems and split the existing OpenCL devices among them. If only one device exists, it is used by all contexts, which shouldn't be a problem. Then I call viennacl::ocl::switch_context() for every equation system and allocate all the required VCL vectors and matrices right afterwards. This makes sure they are allocated in the context they actually belong to. This all still takes place in serial mode and doesn't get me into trouble :-)

The actual problems begin when the solver routine of our program is called in parallel mode.

1.) It fills a preallocated STL matrix with the current coefficients and updates the RHS vector. These are then copied to their VCL counterparts using viennacl::copy(stlmat,vclmat); What happens during this call is that viennacl::copy() -> viennacl::detail::copy_impl() -> gpu_matrix.set() -> viennacl::backend::memory_create() (in compressed_matrix.hpp at line 519 ff.) allocates 3 new buffers, which should be created in the context they belong to. This could be achieved by calling viennacl::ocl::switch_context() before the copy takes place. Unfortunately this isn't an option, because I would have to use an OpenMP critical region to make sure that the parallel calls to switch_context() and copy() do not overlap and cause race conditions.
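
To make the issue in 1.) concrete, here is a minimal sketch of what the serialized variant would look like. It is only a model and does not call ViennaCL: g_active_context, switch_context_model(), allocate_model() and copy_all_systems() are made-up names, and I am assuming that the active context is state shared by all threads. The pragmas are ignored when compiling without OpenMP, in which case the loop simply runs serially.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the library's shared "active context" id.
static long g_active_context = 0;

// Hypothetical stand-in for viennacl::ocl::switch_context().
void switch_context_model(long id) { g_active_context = id; }

// Hypothetical stand-in for a buffer allocation inside copy(): the new
// buffer is tagged with whichever context is active at the time of the call.
long allocate_model() { return g_active_context; }

// Runs the "switch, then copy" sequence for n systems and returns, for each
// system, the id of the context its buffer ended up in.
std::vector<long> copy_all_systems(int n) {
  assert(n >= 0);
  std::vector<long> owner(static_cast<std::size_t>(n), -1);
  #pragma omp parallel for
  for (int i = 0; i < n; ++i) {
    // Without this critical region another thread could switch the active
    // context between the two calls below.
    #pragma omp critical(context_switch)
    {
      switch_context_model(i);
      owner[static_cast<std::size_t>(i)] = allocate_model();
    }
  }
  return owner;
}
```

With the critical region every buffer lands in its own context (owner[i] == i), but the copies run one after another, so nothing is gained from the parallel loop.
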
So what I need is a way to specify the context in which the copy() takes place without having to change the active context.

2.) The same problem occurs afterwards when calling the actual preconditioner and solve() routines. They too allocate memory in the currently active context.

I built a temporary workaround by putting the whole sequence into an OpenMP critical region, which prevents context errors in ViennaCL but makes it impossible to solve the equation systems in parallel. Any suggestions on that?

Kind regards
Andreas Rost

_______________________________________________
ViennaCL-devel mailing list
ViennaCL-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/viennacl-devel