Hi again,

here's another problem which just occurred to me while working with the latest
developer version from GitHub.

I have a program which has to iteratively solve a number of independent 
equation systems.
The actual number of systems may vary between 1 and 10.
Because they are completely independent of each other, we assign each system a
number of OpenMP threads so that all systems can be solved in parallel.

Now I tried to integrate ViennaCL into that scheme and used the 
multithreaded.cpp example as a guide.
The main difference is that I use OpenMP threads instead of Boost ones.

My approach is to first create as many OpenCL contexts as there are equation
systems and to distribute the available OpenCL devices among them.
If only one device exists, it is used by all contexts, which shouldn't
be a problem.
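
As a rough sketch of the device distribution described above: the snippet
below assumes a simple round-robin policy (the text only says the devices are
"split" over the contexts, so the policy is an assumption). The function name
assign_devices() is hypothetical and not part of ViennaCL; it only computes
which device index each system's context would use.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not a ViennaCL function): returns, for each equation
// system, the index of the OpenCL device its context should be created on.
// Assumed policy: round-robin. With a single device, every system maps to 0.
std::vector<std::size_t> assign_devices(std::size_t num_systems,
                                        std::size_t num_devices)
{
  assert(num_devices > 0 && "at least one OpenCL device is required");
  std::vector<std::size_t> device_of_system(num_systems);
  for (std::size_t i = 0; i < num_systems; ++i)
    device_of_system[i] = i % num_devices;
  return device_of_system;
}
```

For example, assign_devices(5, 2) maps the five systems to devices
0, 1, 0, 1, 0. The resulting index would then select the device used when
setting up context number i.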
Then I call

  viennacl::ocl::switch_context()

for every equation system and allocate all the required VCL vectors and
matrices immediately afterwards.
This makes sure they are assigned and allocated for the context they actually 
belong to.
This all still takes place in serial mode and causes no trouble :-)

The actual problems begin when the solver routine of our program is called in
parallel mode.

1.) It fills a preallocated STL matrix with the current coefficients and 
updates the RHS vector.
    These are then copied to their VCL counterparts using
        viennacl::copy(stlmat,vclmat);
    What happens during this call is that the chain

    viennacl::copy()->
      viennacl::detail::copy_impl()->
        gpu_matrix.set()->
          viennacl::backend::memory_create()
          (in compressed_matrix.hpp, line 519 ff.)

    allocates 3 new buffers, which should be created in the context they
    belong to.
    This could be achieved by calling
         viennacl::ocl::switch_context()
    before the copy takes place.
    Unfortunately this isn't an option, because I would have to use an OpenMP
    critical region to make sure that the parallel calls to switch_context()
    and copy() do not overlap and cause race conditions.
    So what I need is a way to specify the context in which the copy() takes
    place without having to change the currently active context.

2.) The same problem happens afterwards when calling the actual preconditioner
and solve() routines.
    They, too, allocate memory in the currently active context.
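
To illustrate the kind of interface asked for in 1.), here is a small
self-contained model. It does not use ViennaCL: Buffer, copy_to_context() and
copy_all() are hypothetical stand-ins. The point is only that when the target
context is passed as an argument, no process-wide "current context" is read or
changed, so the parallel loop needs no critical region.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a device buffer: it only records which context
// owns it, plus the copied data.
struct Buffer {
  long context_id;
  std::vector<double> data;
};

// Hypothetical copy routine: the target context is an explicit argument
// instead of being taken from a global "current context".
Buffer copy_to_context(const std::vector<double>& host, long context_id)
{
  assert(context_id >= 0);
  return Buffer{context_id, host};
}

// Every system copies into its own context (system i -> context i).
// No shared state is touched, so the iterations may run concurrently.
std::vector<Buffer> copy_all(const std::vector<std::vector<double>>& host_data)
{
  std::vector<Buffer> buffers(host_data.size());
  #pragma omp parallel for
  for (long i = 0; i < static_cast<long>(host_data.size()); ++i)
  {
    const std::size_t idx = static_cast<std::size_t>(i);
    buffers[idx] = copy_to_context(host_data[idx], i);
  }
  return buffers;
}
```

After copy_all() returns, buffer i belongs to context i regardless of how the
threads were scheduled.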

I built a temporary workaround by putting the whole sequence into an OpenMP
critical region, which prevents context errors in ViennaCL
but makes it impossible to solve the equation systems in parallel.
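
The structure of this workaround can be modelled roughly as follows. The code
is a self-contained mock and does not call ViennaCL: g_current_context,
mock_switch_context() and allocate_in_current_context() are hypothetical
stand-ins for a process-wide active context and for allocations that bind to
it. The switch and the allocation have to happen as one unit, so both sit
inside the critical region and only one thread at a time can execute them.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a process-wide "currently active context".
static long g_current_context = 0;

void mock_switch_context(long id)
{
  assert(id >= 0);
  g_current_context = id;
}

// Hypothetical allocation: the new buffer is bound to whichever context is
// active at the time of the call. Returns the owning context id.
long allocate_in_current_context() { return g_current_context; }

// Returns, for each system, the context its buffer ended up in.
std::vector<long> solve_all_serialized(std::size_t num_systems)
{
  std::vector<long> owner(num_systems, -1);
  #pragma omp parallel for
  for (long i = 0; i < static_cast<long>(num_systems); ++i)
  {
    // Switch + allocate must not interleave with another thread's switch,
    // so the pair is serialized. This is correct but removes the parallelism.
    #pragma omp critical
    {
      mock_switch_context(i);
      owner[static_cast<std::size_t>(i)] = allocate_in_current_context();
    }
  }
  return owner;
}
```

With the critical region in place, every buffer lands in its own context, but
the body of the loop effectively runs serially.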

Any suggestions on that?
Kind regards
Andreas Rost




_______________________________________________
ViennaCL-devel mailing list
ViennaCL-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/viennacl-devel
