Hi Pierre,

> I tried to rebuild the optimized PETSc library by changing several
> options and ran:
>
> mpirun -np 2 ./ex19 -cuda_show_devices -dm_mat_type aijcusp
>   -dm_vec_type cusp -ksp_type fgmres -ksp_view -log_summary
>   -pc_type none -snes_monitor_short -snes_rtol 1.e-5
>
> Options used:
> --with-pthread=1 -O3  -> crash
> --with-pthread=0 -O2  -> crash
> --with-debugging=1 --with-pthread=1 -O2 -> OK
>
> So --with-debugging=1 is the key to avoid the crash. Not good for the
> performance of course...
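
For reference, the -O2/-O3 levels above would normally be passed via
COPTFLAGS/CXXOPTFLAGS, so a full configure line for a CUDA/CUSP-enabled
build of that era might look roughly like the sketch below (the
--with-cuda/--with-cusp flags are assumptions and depend on the PETSc
version):

  # optimized build (sketch of the configuration that crashes)
  ./configure --with-debugging=0 --with-pthread=1 --with-cuda=1 \
      --with-cusp=1 COPTFLAGS="-O3" CXXOPTFLAGS="-O3"

  # debug build (sketch of the configuration that runs OK)
  ./configure --with-debugging=1 --with-pthread=1 --with-cuda=1 \
      --with-cusp=1 COPTFLAGS="-O2" CXXOPTFLAGS="-O2"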

Thanks for the input. I get crashes even when using --with-debugging=1, and valgrind spits out a couple of errors as soon as
  -dm_vec_type cusp
is provided. I'll keep digging; the error is somewhere in the VecScatter routines when using CUSP...
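
For reference, the valgrind run can be reproduced with something along
these lines (a sketch, assuming valgrind is installed on both ranks;
--track-origins=yes makes valgrind report where uninitialized values
originate, which helps narrow down the offending VecScatter path):

  mpirun -np 2 valgrind --track-origins=yes --num-callers=20 \
      ./ex19 -cuda_show_devices -dm_mat_type aijcusp -dm_vec_type cusp \
      -ksp_type fgmres -ksp_view -log_summary -pc_type none \
      -snes_monitor_short -snes_rtol 1.e-5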

Best regards,
Karli



> If it can help,
>
> Pierre

Previously, I had noticed strange behaviour when running the GPU code
with the threadComm package. It might be worth trying to disable that
code in the build to see if the problem persists?
-Paul
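
(To try Paul's suggestion, the threadComm code would be switched off at
configure time. A rough sketch; --with-threadcomm=0 is an assumed option
name here and should be checked against ./configure --help for the PETSc
version in use:)

  # sketch: rebuild without pthread/threadcomm support
  # (--with-threadcomm=0 is an assumption, verify with ./configure --help)
  ./configure --with-debugging=0 --with-pthread=0 --with-threadcomm=0 \
      --with-cuda=1 --with-cusp=1 COPTFLAGS="-O2" CXXOPTFLAGS="-O2"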


On Tue, Jan 14, 2014 at 9:19 AM, Karl Rupp <[email protected]> wrote:

    Hi Pierre,


    >> I could reproduce the problem and also get some uninitialized
    >> variable warnings in Valgrind. The debug version detects these
    >> errors, hence you only see the errors in the debug build. For the
    >> optimized build, chances are good that the computed values are
    >> either wrong or may become wrong in other environments. I'll see
    >> what I can do when I'm again at GPU machine tomorrow (parallel
    >> GPU debugging via SSH is not great...)

    > Sorry, I mean:
    >
    > Parallel calculation on CPU or GPU runs well with the non-optimized
    > PETSc library.
    > Parallel calculation on GPU crashes with the optimized PETSc library
    > (on CPU it is OK).


    The fact that it happens to run in one mode out of {debug,
    optimized} but not in the other is at most a lucky coincidence,
    but it still means that this is a bug we need to solve :-)



    > I could add that the "mpirun -np 1 ex19" runs well for all builds
    > on CPU and GPU.


    I see valgrind warnings in the vector scatter routines, which is
    likely the reason why it doesn't work with multiple MPI ranks.

    Best regards,
    Karli




--
*Trio_U support team*
Marthe ROUX (Saclay)
Pierre LEDAC (Grenoble)
