On Fri, Jan 28, 2022 at 3:27 PM Mark Adams wrote:
> (Junchao), Ruipeng said this was an OOM error and suggested trying the
> in-house SpGEMM with:
>
> HYPRE_SetSpGemmUseCusparse(FALSE);
>
> Should I clone a hypre argument to make a -pc_hypre_use_tpl_spgemm or
> -pc_hypre_use_cusparse_spgemm ?
>
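For reference, a minimal sketch of how such a flag might be wired into
PETSc's hypre interface. The option name -pc_hypre_use_cusparse_spgemm is
only the proposal above, not an existing PETSc option, and the header
location for hypre's global GPU settings is assumed:

    #include <petscsys.h>
    #include <HYPRE_utilities.h>

    /* Forward a (hypothetical) options-database flag to hypre's global
       SpGEMM setting. Passing 0 selects hypre's in-house SpGEMM instead
       of the cuSPARSE one, per Ruipeng's suggestion above. */
    static PetscErrorCode PCHYPRESetSpGemmFromOptions(void)
    {
      PetscErrorCode ierr;
      PetscBool      use_cusparse = PETSC_TRUE, set = PETSC_FALSE;

      PetscFunctionBegin;
      ierr = PetscOptionsGetBool(NULL, NULL, "-pc_hypre_use_cusparse_spgemm",
                                 &use_cusparse, &set);CHKERRQ(ierr);
      if (set) HYPRE_SetSpGemmUseCusparse(use_cusparse ? 1 : 0);
      PetscFunctionReturn(0);
    }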
On Fri, Jan 28, 2022 at 1:42 PM Mark Adams wrote:
Moving this to the users list. (We cannot talk about Crusher on public
forums, but this is on Summit. I had to check this thread carefully!)
Treb is using hypre on Summit and getting this error:
CUSPARSE ERROR (code = 11, insufficient resources) at
csr_spgemm_device_cusparse.c:128
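For context: cuSPARSE status code 11 is CUSPARSE_STATUS_INSUFFICIENT_RESOURCES,
which is how a failed device-memory allocation for the SpGEMM workspace
surfaces, consistent with Ruipeng's OOM diagnosis. A sketch of the kind of
check that prints such a message (the macro is illustrative, not hypre's
actual code):

    #include <stdio.h>
    #include <cusparse.h>

    /* Illustrative error check; hypre's real one lives in
       csr_spgemm_device_cusparse.c. Code 11 maps to
       CUSPARSE_STATUS_INSUFFICIENT_RESOURCES: the device ran out of
       memory for the SpGEMM workspace. */
    #define CHECK_CUSPARSE(call)                                           \
      do {                                                                 \
        cusparseStatus_t s_ = (call);                                      \
        if (s_ != CUSPARSE_STATUS_SUCCESS)                                 \
          fprintf(stderr, "CUSPARSE ERROR (code = %d, %s) at %s:%d\n",     \
                  (int)s_, cusparseGetErrorString(s_), __FILE__, __LINE__);\
      } while (0)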
Let's move Crusher stuff to petsc-maint. If top/htop doesn't make it obvious
why there is no memory, I think you should follow up with OLCF support.
Mark Adams writes:
> Something is very messed up on Crusher. I've never seen this "Cannot
> allocate memory", but see it for everything:
And building in my home directory works fine, so this looks like a problem
with the scratch directory.
On Fri, Jan 28, 2022 at 1:15 PM Mark Adams wrote:
Something is very messed up on Crusher. I've never seen this "Cannot
allocate memory", but see it for everything:
13:11 1 main= crusher:/gpfs/alpine/csc314/scratch/adams/petsc2$ ll
ls: cannot read symbolic link 'configure.log.bkp': Cannot allocate memory
ls: cannot read symbolic link 'make.log': Cannot allocate memory
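Aside: that message is strerror(ENOMEM) as printed by ls when readlink(2)
fails, i.e. the allocation failure is coming back from the kernel/filesystem
side (GPFS here), not from ls itself. A minimal reproduction of that code
path:

    #include <errno.h>
    #include <stdio.h>
    #include <string.h>
    #include <unistd.h>

    /* What ls does per symlink, stripped down: a readlink(2) failure with
       errno == ENOMEM prints exactly "Cannot allocate memory". */
    int main(int argc, char **argv)
    {
      char target[4096];
      ssize_t n;

      if (argc < 2) { fprintf(stderr, "usage: %s <symlink>\n", argv[0]); return 2; }
      n = readlink(argv[1], target, sizeof target - 1);
      if (n < 0) {
        fprintf(stderr, "ls: cannot read symbolic link '%s': %s\n",
                argv[1], strerror(errno));
        return 1;
      }
      target[n] = '\0';
      printf("%s -> %s\n", argv[1], target);
      return 0;
    }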
Crusher has been giving me fits and now I get this error (empty log)
13:01 main *= crusher:/gpfs/alpine/csc314/scratch/adams/petsc$
../arch-olcf-crusher.py
Traceback (most recent call last):
  File "../arch-olcf-crusher.py", line 55, in <module>
    configure.petsc_configure(configure_options)
Hi Hao,
Strange… these must be errors internal to CUDA, as we don’t call any of
these directly. FYI, CUDA has two APIs:
1. The simpler “runtime” API (which we use), which handles things like CUDA
context management behind the scenes.
2. The more advanced “driver” API, where initialization and context
management are explicit.
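A side-by-side sketch of the two (plain CUDA, not PETSc code; error checks
omitted for brevity):

    #include <cuda.h>          /* driver API:  cu...   */
    #include <cuda_runtime.h>  /* runtime API: cuda... */

    int main(void)
    {
      /* Runtime API: the first call implicitly initializes a context. */
      void *p;
      cudaMalloc(&p, 1024);
      cudaFree(p);

      /* Driver API: initialization and context management are explicit. */
      CUdevice    dev;
      CUcontext   ctx;
      CUdeviceptr dp;
      cuInit(0);
      cuDeviceGet(&dev, 0);
      cuCtxCreate(&ctx, 0, dev);
      cuMemAlloc(&dp, 1024);
      cuMemFree(dp);
      cuCtxDestroy(ctx);
      return 0;
    }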