Re: [petsc-users] Cannot eagerly initialize cuda, as doing so results in cuda error 35 (cudaErrorInsufficientDriver) : CUDA driver version is insufficient for CUDA runtime version

2022-01-31 Thread Fande Kong
Sorry for the confusion. I thought I explained pretty well :-)

Good:

PETSc was linked to  /usr/lib64/libcuda for libcuda

Bad:

PETSc was linked
to 
/apps/local/spack/software/gcc-7.5.0/cuda-10.1.243-v4ymjqcrr7f72qfiuzsstuy5jiajbuey/lib64/stubs
for libcuda

My question would be: where should I look for libcuda?

Our HPC admin told me that I should use the one from  /usr/lib64/libcuda

I am trying to understand: why do we need to link to "stubs" at all?

Just to be clear, I am fine with PETSc-main as is since I can use a compute
node to compile PETSc.  However, here I am trying really hard to
understand where I should look for the right libcuda.
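
A quick run-time check for this: the sketch below (not from the thread; it uses only standard CUDA runtime calls, and the file name and build line are illustrative) prints the driver and runtime versions a binary actually sees. A reported driver version of 0, or an error, is a strong hint that the stub libcuda was resolved instead of the real driver in /usr/lib64; error 35 (cudaErrorInsufficientDriver) is what the runtime reports when the driver version is lower than the runtime version.

```c
/* Hedged diagnostic sketch (not from this thread): print the CUDA driver and
 * runtime versions that the running binary actually sees.  Only standard CUDA
 * runtime API calls are used; build with e.g.:
 *   nvcc check_versions.c -o check_versions                                   */
#include <stdio.h>
#include <cuda_runtime.h>

int main(void)
{
  int driverVersion = 0, runtimeVersion = 0;

  /* Version supported by the libcuda that got loaded; 0 usually means no usable
   * driver, e.g. the stub library was resolved instead of the real one.        */
  cudaError_t derr = cudaDriverGetVersion(&driverVersion);
  /* Version of the CUDA runtime this binary was built against. */
  cudaError_t rerr = cudaRuntimeGetVersion(&runtimeVersion);

  printf("driver  version: %d (status %d: %s)\n", driverVersion, (int)derr, cudaGetErrorString(derr));
  printf("runtime version: %d (status %d: %s)\n", runtimeVersion, (int)rerr, cudaGetErrorString(rerr));
  return 0;
}
```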

Thanks for your help

Fande


On Mon, Jan 31, 2022 at 9:19 AM Junchao Zhang 
wrote:

> Fande,
>   From your configure_main.log
>
> cuda:
>   Version:  10.1
>   Includes:
> -I/apps/local/spack/software/gcc-7.5.0/cuda-10.1.243-v4ymjqcrr7f72qfiuzsstuy5jiajbuey/include
>   Library:
>  
> -Wl,-rpath,/apps/local/spack/software/gcc-7.5.0/cuda-10.1.243-v4ymjqcrr7f72qfiuzsstuy5jiajbuey/lib64
> -L/apps/local/spack/software/gcc-7.5.0/cuda-10.1.243-v4ymjqcrr7f72qfiuzsstuy5jiajbuey/lib64
> -L/apps/local/spack/software/gcc-7.5.0/cuda-10.1.243-v4ymjqcrr7f72qfiuzsstuy5jiajbuey/lib64/stubs
> -lcudart -lcufft -lcublas -lcusparse -lcusolver -lcurand -lcuda
>
>
> You can see the `stubs` directory is not in the rpath. We put a lot of
> effort into achieving that. You need to double-check the reason.
>
> --Junchao Zhang
>
>
> On Mon, Jan 31, 2022 at 9:40 AM Fande Kong  wrote:
>
>> OK,
>>
>> Finally we resolved the issue.  The issue was that there were two libcuda
>> libs on a GPU compute node:  /usr/lib64/libcuda
>> and 
>> /apps/local/spack/software/gcc-7.5.0/cuda-10.1.243-v4ymjqcrr7f72qfiuzsstuy5jiajbuey/lib64/stubs/libcuda.
>> But on a login node there is only one libcuda lib:
>> /apps/local/spack/software/gcc-7.5.0/cuda-10.1.243-v4ymjqcrr7f72qfiuzsstuy5jiajbuey/lib64/stubs/libcuda.
>> We cannot see /usr/lib64/libcuda from a login node, where I was compiling
>> the code.
>>
>> Before Junchao's commit, we did not have "-Wl,-rpath" to force PETSc to
>> take
>> /apps/local/spack/software/gcc-7.5.0/cuda-10.1.243-v4ymjqcrr7f72qfiuzsstuy5jiajbuey/lib64/stubs/libcuda.
>> A code compiled on a login node could correctly pick up the cuda lib
>> from /usr/lib64/libcuda at runtime.  With "-Wl,-rpath", the code
>> always takes the cuda lib from
>> /apps/local/spack/software/gcc-7.5.0/cuda-10.1.243-v4ymjqcrr7f72qfiuzsstuy5jiajbuey/lib64/stubs/libcuda,
>> which is a bad lib.
>>
>> Right now, I just compiled the code on a compute node instead of a login
>> node; PETSc was able to pick up the correct lib from /usr/lib64/libcuda,
>> and everything ran fine.
>>
>> I am not sure whether it is a good idea to search for "stubs"
>> since the system might have the correct libraries in other places.  Shouldn't I
>> do batch compiling instead?
>>
>> Thanks,
>>
>> Fande
>>
>>
>> On Wed, Jan 26, 2022 at 1:49 PM Fande Kong  wrote:
>>
>>> Yes, please see the attached file.
>>>
>>> Fande
>>>
>>> On Wed, Jan 26, 2022 at 11:49 AM Junchao Zhang 
>>> wrote:
>>>
>>>> Do you have the configure.log with main?
>>>>
>>>> --Junchao Zhang
>>>>
>>>>
>>>> On Wed, Jan 26, 2022 at 12:26 PM Fande Kong 
>>>> wrote:
>>>>
>>>>> I am on the petsc-main
>>>>>
>>>>> commit 1390d3a27d88add7d79c9b38bf1a895ae5e67af6
>>>>>
>>>>> Merge: 96c919c d5f3255
>>>>>
>>>>> Author: Satish Balay 
>>>>>
>>>>> Date:   Wed Jan 26 10:28:32 2022 -0600
>>>>>
>>>>>
>>>>> Merge remote-tracking branch 'origin/release'
>>>>>
>>>>>
>>>>> It is still broken.
>>>>>
>>>>> Thanks,
>>>>>
>>>>>
>>>>> Fande
>>>>>
>>>>> On Wed, Jan 26, 2022 at 7:40 AM Junchao Zhang 
>>>>> wrote:
>>>>>
>>>>>> The good one uses the compiler's default library/header path.  The bad one
>>>>>> searches the cuda toolkit path and uses rpath linking.
>>>>>> Though the paths look the same on the login node, they could have
>>>>>> different behavior on a compute node depending on its environment.
>>>>>> I think we fixed the issue in cuda.py (i.e., first try the compiler's
>

Re: [petsc-users] Cannot eagerly initialize cuda, as doing so results in cuda error 35 (cudaErrorInsufficientDriver) : CUDA driver version is insufficient for CUDA runtime version

2022-01-31 Thread Fande Kong
OK,

Finally we resolved the issue.  The issue was that there were two libcuda
libs on a GPU compute node:  /usr/lib64/libcuda
and 
/apps/local/spack/software/gcc-7.5.0/cuda-10.1.243-v4ymjqcrr7f72qfiuzsstuy5jiajbuey/lib64/stubs/libcuda.
But on a login node there is only one libcuda lib:
/apps/local/spack/software/gcc-7.5.0/cuda-10.1.243-v4ymjqcrr7f72qfiuzsstuy5jiajbuey/lib64/stubs/libcuda.
We cannot see /usr/lib64/libcuda from a login node, where I was compiling
the code.

Before Junchao's commit, we did not have "-Wl,-rpath" to force PETSc to
take
/apps/local/spack/software/gcc-7.5.0/cuda-10.1.243-v4ymjqcrr7f72qfiuzsstuy5jiajbuey/lib64/stubs/libcuda.
A code compiled on a login node could correctly pick up the cuda lib
from /usr/lib64/libcuda at runtime.  With "-Wl,-rpath", the code
always takes the cuda lib from
/apps/local/spack/software/gcc-7.5.0/cuda-10.1.243-v4ymjqcrr7f72qfiuzsstuy5jiajbuey/lib64/stubs/libcuda,
which is a bad lib.

Right now, I just compiled the code on a compute node instead of a login node;
PETSc was able to pick up the correct lib from /usr/lib64/libcuda, and
everything ran fine.

I am not sure whether it is a good idea to search for "stubs", since
the system might have the correct libraries in other places.  Shouldn't I do
batch compiling instead?

Thanks,

Fande


On Wed, Jan 26, 2022 at 1:49 PM Fande Kong  wrote:

> Yes, please see the attached file.
>
> Fande
>
> On Wed, Jan 26, 2022 at 11:49 AM Junchao Zhang 
> wrote:
>
>> Do you have the configure.log with main?
>>
>> --Junchao Zhang
>>
>>
>> On Wed, Jan 26, 2022 at 12:26 PM Fande Kong  wrote:
>>
>>> I am on the petsc-main
>>>
>>> commit 1390d3a27d88add7d79c9b38bf1a895ae5e67af6
>>>
>>> Merge: 96c919c d5f3255
>>>
>>> Author: Satish Balay 
>>>
>>> Date:   Wed Jan 26 10:28:32 2022 -0600
>>>
>>>
>>> Merge remote-tracking branch 'origin/release'
>>>
>>>
>>> It is still broken.
>>>
>>> Thanks,
>>>
>>>
>>> Fande
>>>
>>> On Wed, Jan 26, 2022 at 7:40 AM Junchao Zhang 
>>> wrote:
>>>
>>>> The good one uses the compiler's default library/header path.  The bad one
>>>> searches the cuda toolkit path and uses rpath linking.
>>>> Though the paths look the same on the login node, they could have
>>>> different behavior on a compute node depending on its environment.
>>>> I think we fixed the issue in cuda.py (i.e., first try the compiler's
>>>> default, then toolkit).  That's why I wanted Fande to use petsc/main.
>>>>
>>>> --Junchao Zhang
>>>>
>>>>
>>>> On Tue, Jan 25, 2022 at 11:59 PM Barry Smith  wrote:
>>>>
>>>>>
>>>>> bad has extra
>>>>>
>>>>> -L/apps/local/spack/software/gcc-7.5.0/cuda-10.1.243-v4ymjqcrr7f72qfiuzsstuy5jiajbuey/lib64/stubs
>>>>>  -lcuda
>>>>>
>>>>> good does not.
>>>>>
>>>>> Try removing the stubs directory and -lcuda from the bad
>>>>> $PETSC_ARCH/lib/petsc/conf/variables and likely the bad will start 
>>>>> working.
>>>>>
>>>>> Barry
>>>>>
>>>>> I never liked the stubs stuff.
>>>>>
>>>>> On Jan 25, 2022, at 11:29 PM, Fande Kong  wrote:
>>>>>
>>>>> Hi Junchao,
>>>>>
>>>>> I attached a "bad" configure log and a "good" configure log.
>>>>>
>>>>> The "bad" one was on produced
>>>>> at 246ba74192519a5f34fb6e227d1c64364e19ce2c
>>>>>
>>>>> and the "good" one at 384645a00975869a1aacbd3169de62ba40cad683
>>>>>
>>>>> This good hash is the last good one, right before the
>>>>> bad one.
>>>>>
>>>>> I think you could do a comparison between these two logs and check
>>>>> what the differences are.
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Fande
>>>>>
>>>>> On Tue, Jan 25, 2022 at 8:21 PM Junchao Zhang 
>>>>> wrote:
>>>>>
>>>>>> Fande, could you send the configure.log that works (i.e., before this
>>>>>> offending commit)?
>>>>>> --Junchao Zhang
>>>>>>
>>>>>>
>>>>>> On Tue, Jan 25, 2022 at 8:21 PM Fande Kong 
>>>>>> wrote

Re: [petsc-users] Cannot eagerly initialize cuda, as doing so results in cuda error 35 (cudaErrorInsufficientDriver) : CUDA driver version is insufficient for CUDA runtime version

2022-01-26 Thread Fande Kong
I am on the petsc-main

commit 1390d3a27d88add7d79c9b38bf1a895ae5e67af6

Merge: 96c919c d5f3255

Author: Satish Balay 

Date:   Wed Jan 26 10:28:32 2022 -0600


Merge remote-tracking branch 'origin/release'


It is still broken.

Thanks,


Fande

On Wed, Jan 26, 2022 at 7:40 AM Junchao Zhang 
wrote:

> The good one uses the compiler's default library/header path.  The bad one
> searches the cuda toolkit path and uses rpath linking.
> Though the paths look the same on the login node, they could have
> different behavior on a compute node depending on its environment.
> I think we fixed the issue in cuda.py (i.e., first try the compiler's
> default, then toolkit).  That's why I wanted Fande to use petsc/main.
>
> --Junchao Zhang
>
>
> On Tue, Jan 25, 2022 at 11:59 PM Barry Smith  wrote:
>
>>
>> bad has extra
>>
>> -L/apps/local/spack/software/gcc-7.5.0/cuda-10.1.243-v4ymjqcrr7f72qfiuzsstuy5jiajbuey/lib64/stubs
>>  -lcuda
>>
>> good does not.
>>
>> Try removing the stubs directory and -lcuda from the bad
>> $PETSC_ARCH/lib/petsc/conf/variables and likely the bad will start working.
>>
>> Barry
>>
>> I never liked the stubs stuff.
>>
>> On Jan 25, 2022, at 11:29 PM, Fande Kong  wrote:
>>
>> Hi Junchao,
>>
>> I attached a "bad" configure log and a "good" configure log.
>>
>> The "bad" one was on produced at 246ba74192519a5f34fb6e227d1c64364e19ce2c
>>
>> and the "good" one at 384645a00975869a1aacbd3169de62ba40cad683
>>
>> This good hash is the last good one, right before the
>> bad one.
>>
>> I think you could do a comparison between these two logs and check what
>> the differences are.
>>
>> Thanks,
>>
>> Fande
>>
>> On Tue, Jan 25, 2022 at 8:21 PM Junchao Zhang 
>> wrote:
>>
>>> Fande, could you send the configure.log that works (i.e., before this
>>> offending commit)?
>>> --Junchao Zhang
>>>
>>>
>>> On Tue, Jan 25, 2022 at 8:21 PM Fande Kong  wrote:
>>>
>>>> Not sure if this is helpful. I did "git bisect", and here was the
>>>> result:
>>>>
>>>> [kongf@sawtooth2 petsc]$ git bisect bad
>>>> 246ba74192519a5f34fb6e227d1c64364e19ce2c is the first bad commit
>>>> commit 246ba74192519a5f34fb6e227d1c64364e19ce2c
>>>> Author: Junchao Zhang 
>>>> Date:   Wed Oct 13 05:32:43 2021 +
>>>>
>>>> Config: fix CUDA library and header dirs
>>>>
>>>> :04 04 187c86055adb80f53c1d0565a704fec43a96
>>>> ea1efd7f594fd5e8df54170bc1bc7b00f35e4d5f M config
>>>>
>>>>
>>>> Starting from this commit, GPU did not work for me on our HPC.
>>>>
>>>> Thanks,
>>>> Fande
>>>>
>>>> On Tue, Jan 25, 2022 at 7:18 PM Fande Kong  wrote:
>>>>
>>>>>
>>>>>
>>>>> On Tue, Jan 25, 2022 at 9:04 AM Jacob Faibussowitsch <
>>>>> jacob....@gmail.com> wrote:
>>>>>
>>>>>> Configure should not have an impact here I think. The reason I had
>>>>>> you run `cudaGetDeviceCount()` is because this is the CUDA call (and in
>>>>>> fact the only CUDA call) in the initialization sequence that returns the
>>>>>> error code. There should be no prior CUDA calls. Maybe this is a problem
>>>>>> with oversubscribing GPU’s? In the runs that crash, how many ranks are
>>>>>> using any given GPU  at once? Maybe MPS is required.
>>>>>>
>>>>>
>>>>> I used one MPI rank.
>>>>>
>>>>> Fande
>>>>>
>>>>>
>>>>>
>>>>>>
>>>>>> Best regards,
>>>>>>
>>>>>> Jacob Faibussowitsch
>>>>>> (Jacob Fai - booss - oh - vitch)
>>>>>>
>>>>>> On Jan 21, 2022, at 12:01, Fande Kong  wrote:
>>>>>>
>>>>>> Thanks Jacob,
>>>>>>
>>>>>> On Thu, Jan 20, 2022 at 6:25 PM Jacob Faibussowitsch <
>>>>>> jacob@gmail.com> wrote:
>>>>>>
>>>>>>> Segfault is caused by the following check at
>>>>>>> src/sys/objects/device/impls/cupm/cupmdevice.cxx:349 being a
>>>>>>> PetscUnlikelyDebug() rather than just PetscUnlikely():
>>>>>>&g

Re: [petsc-users] Cannot eagerly initialize cuda, as doing so results in cuda error 35 (cudaErrorInsufficientDriver) : CUDA driver version is insufficient for CUDA runtime version

2022-01-26 Thread Fande Kong
On Tue, Jan 25, 2022 at 10:59 PM Barry Smith  wrote:

>
> bad has extra
>
> -L/apps/local/spack/software/gcc-7.5.0/cuda-10.1.243-v4ymjqcrr7f72qfiuzsstuy5jiajbuey/lib64/stubs
>  -lcuda
>
> good does not.
>
> Try removing the stubs directory and -lcuda from the bad
> $PETSC_ARCH/lib/petsc/conf/variables and likely the bad will start working.
>

It seems I still got the same issue after removing stubs directory and
-lcuda.

Thanks,
Fande


>
> Barry
>
> I never liked the stubs stuff.
>
> On Jan 25, 2022, at 11:29 PM, Fande Kong  wrote:
>
> Hi Junchao,
>
> I attached a "bad" configure log and a "good" configure log.
>
> The "bad" one was on produced at 246ba74192519a5f34fb6e227d1c64364e19ce2c
>
> and the "good" one at 384645a00975869a1aacbd3169de62ba40cad683
>
> This good hash is the last good one, right before the bad
> one.
>
> I think you could do a comparison between these two logs and check what
> the differences are.
>
> Thanks,
>
> Fande
>
> On Tue, Jan 25, 2022 at 8:21 PM Junchao Zhang 
> wrote:
>
>> Fande, could you send the configure.log that works (i.e., before this
>> offending commit)?
>> --Junchao Zhang
>>
>>
>> On Tue, Jan 25, 2022 at 8:21 PM Fande Kong  wrote:
>>
>>> Not sure if this is helpful. I did "git bisect", and here was the result:
>>>
>>> [kongf@sawtooth2 petsc]$ git bisect bad
>>> 246ba74192519a5f34fb6e227d1c64364e19ce2c is the first bad commit
>>> commit 246ba74192519a5f34fb6e227d1c64364e19ce2c
>>> Author: Junchao Zhang 
>>> Date:   Wed Oct 13 05:32:43 2021 +
>>>
>>> Config: fix CUDA library and header dirs
>>>
>>> :04 04 187c86055adb80f53c1d0565a704fec43a96
>>> ea1efd7f594fd5e8df54170bc1bc7b00f35e4d5f M config
>>>
>>>
>>> Starting from this commit, GPU did not work for me on our HPC.
>>>
>>> Thanks,
>>> Fande
>>>
>>> On Tue, Jan 25, 2022 at 7:18 PM Fande Kong  wrote:
>>>
>>>>
>>>>
>>>> On Tue, Jan 25, 2022 at 9:04 AM Jacob Faibussowitsch <
>>>> jacob@gmail.com> wrote:
>>>>
>>>>> Configure should not have an impact here I think. The reason I had you
>>>>> run `cudaGetDeviceCount()` is because this is the CUDA call (and in fact
>>>>> the only CUDA call) in the initialization sequence that returns the error
>>>>> code. There should be no prior CUDA calls. Maybe this is a problem with
>>>>> oversubscribing GPU’s? In the runs that crash, how many ranks are using 
>>>>> any
>>>>> given GPU  at once? Maybe MPS is required.
>>>>>
>>>>
>>>> I used one MPI rank.
>>>>
>>>> Fande
>>>>
>>>>
>>>>
>>>>>
>>>>> Best regards,
>>>>>
>>>>> Jacob Faibussowitsch
>>>>> (Jacob Fai - booss - oh - vitch)
>>>>>
>>>>> On Jan 21, 2022, at 12:01, Fande Kong  wrote:
>>>>>
>>>>> Thanks Jacob,
>>>>>
>>>>> On Thu, Jan 20, 2022 at 6:25 PM Jacob Faibussowitsch <
>>>>> jacob@gmail.com> wrote:
>>>>>
>>>>>> Segfault is caused by the following check at
>>>>>> src/sys/objects/device/impls/cupm/cupmdevice.cxx:349 being a
>>>>>> PetscUnlikelyDebug() rather than just PetscUnlikely():
>>>>>>
>>>>>> ```
>>>>>> if (PetscUnlikelyDebug(_defaultDevice < 0)) { // _defaultDevice is in
>>>>>> fact < 0 here and uncaught
>>>>>> ```
>>>>>>
>>>>>> To clarify:
>>>>>>
>>>>>> “lazy” initialization is not that lazy after all, it still does some
>>>>>> 50% of the initialization that “eager” initialization does. It stops 
>>>>>> short
>>>>>> of initializing the CUDA runtime, checking CUDA-aware MPI, gathering device
>>>>>> data, and initializing cublas and friends. Lazy also importantly swallows
>>>>>> any errors that crop up during initialization, storing the resulting 
>>>>>> error
>>>>>> code for later (specifically _defaultDevice = -init_error_value;).
>>>>>>
>>>>>> So whether you initialize lazily or eagerly makes no difference here,
>>>>&g

Re: [petsc-users] Cannot eagerly initialize cuda, as doing so results in cuda error 35 (cudaErrorInsufficientDriver) : CUDA driver version is insufficient for CUDA runtime version

2022-01-25 Thread Fande Kong
Not sure if this is helpful. I did "git bisect", and here was the result:

[kongf@sawtooth2 petsc]$ git bisect bad
246ba74192519a5f34fb6e227d1c64364e19ce2c is the first bad commit
commit 246ba74192519a5f34fb6e227d1c64364e19ce2c
Author: Junchao Zhang 
Date:   Wed Oct 13 05:32:43 2021 +

Config: fix CUDA library and header dirs

:04 04 187c86055adb80f53c1d0565a704fec43a96
ea1efd7f594fd5e8df54170bc1bc7b00f35e4d5f M config


Starting from this commit, GPU did not work for me on our HPC.

Thanks,
Fande

On Tue, Jan 25, 2022 at 7:18 PM Fande Kong  wrote:

>
>
> On Tue, Jan 25, 2022 at 9:04 AM Jacob Faibussowitsch 
> wrote:
>
>> Configure should not have an impact here I think. The reason I had you
>> run `cudaGetDeviceCount()` is because this is the CUDA call (and in fact
>> the only CUDA call) in the initialization sequence that returns the error
>> code. There should be no prior CUDA calls. Maybe this is a problem with
>> oversubscribing GPU’s? In the runs that crash, how many ranks are using any
>> given GPU  at once? Maybe MPS is required.
>>
>
> I used one MPI rank.
>
> Fande
>
>
>
>>
>> Best regards,
>>
>> Jacob Faibussowitsch
>> (Jacob Fai - booss - oh - vitch)
>>
>> On Jan 21, 2022, at 12:01, Fande Kong  wrote:
>>
>> Thanks Jacob,
>>
>> On Thu, Jan 20, 2022 at 6:25 PM Jacob Faibussowitsch 
>> wrote:
>>
>>> Segfault is caused by the following check at
>>> src/sys/objects/device/impls/cupm/cupmdevice.cxx:349 being a
>>> PetscUnlikelyDebug() rather than just PetscUnlikely():
>>>
>>> ```
>>> if (PetscUnlikelyDebug(_defaultDevice < 0)) { // _defaultDevice is in
>>> fact < 0 here and uncaught
>>> ```
>>>
>>> To clarify:
>>>
>>> “lazy” initialization is not that lazy after all, it still does some 50%
>>> of the initialization that “eager” initialization does. It stops short
>>> of initializing the CUDA runtime, checking CUDA-aware MPI, gathering device
>>> data, and initializing cublas and friends. Lazy also importantly swallows
>>> any errors that crop up during initialization, storing the resulting error
>>> code for later (specifically _defaultDevice = -init_error_value;).
>>>
>>> So whether you initialize lazily or eagerly makes no difference here, as
>>> _defaultDevice will always contain -35.
>>>
>>> The bigger question is why cudaGetDeviceCount() is returning
>>> cudaErrorInsufficientDriver. Can you compile and run
>>>
>>> ```
>>> #include <cuda_runtime.h>
>>>
>>> int main()
>>> {
>>>   int ndev;
>>>   return cudaGetDeviceCount(&ndev);
>>> }
>>> ```
>>>
>>> Then show the value of "echo $?”?
>>>
>>
>> I modified your code a little to get more information.
>>
>> #include <cuda_runtime.h>
>> #include <stdio.h>
>>
>> int main()
>> {
>>   int ndev;
>>   int error = cudaGetDeviceCount(&ndev);
>>   printf("ndev %d \n", ndev);
>>   printf("error %d \n", error);
>>   return 0;
>> }
>>
>> Results:
>>
>> $ ./a.out
>> ndev 4
>> error 0
>>
>>
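
On a node where this test fails (as in the original report), it may also help to print the symbolic error name alongside the number; a small variant of the test above, assuming only the standard CUDA runtime API:

```c
/* Small variant of the test above (an illustrative sketch, not from the
 * thread): also print the symbolic name and description of the error code.  */
#include <stdio.h>
#include <cuda_runtime.h>

int main(void)
{
  int ndev = 0;
  cudaError_t err = cudaGetDeviceCount(&ndev);
  printf("ndev  %d\n", ndev);
  printf("error %d (%s: %s)\n", (int)err, cudaGetErrorName(err), cudaGetErrorString(err));
  return 0;
}
```
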
>> I have not read the PETSc cuda initialization code yet. If I need to
>> guess at what was happening, I would naively think that PETSc did not get
>> correct GPU information during configuration because the compile node does
>> not have GPUs, so there was no way to get any GPU device information.
>>
>> At runtime on GPU nodes, PETSc might then use the incorrect information
>> grabbed during configuration and give this kind of false error message.
>>
>> Thanks,
>>
>> Fande
>>
>>
>>
>>>
>>> Best regards,
>>>
>>> Jacob Faibussowitsch
>>> (Jacob Fai - booss - oh - vitch)
>>>
>>> On Jan 20, 2022, at 17:47, Matthew Knepley  wrote:
>>>
>>> On Thu, Jan 20, 2022 at 6:44 PM Fande Kong  wrote:
>>>
>>>> Thanks, Jed
>>>>
>>>> On Thu, Jan 20, 2022 at 4:34 PM Jed Brown  wrote:
>>>>
>>>>> You can't create CUDA or Kokkos Vecs if you're running on a node
>>>>> without a GPU.
>>>>
>>>>
>>>> I am running the code on compute nodes that do have GPUs.
>>>>
>>>
>>> If you are actually running on GPUs, why would you need lazy
>>> initialization? It would not break with GPUs present.
>>>
>>>Matt
>>>
>>>
>>>>

Re: [petsc-users] Cannot eagerly initialize cuda, as doing so results in cuda error 35 (cudaErrorInsufficientDriver) : CUDA driver version is insufficient for CUDA runtime version

2022-01-25 Thread Fande Kong
On Tue, Jan 25, 2022 at 9:04 AM Jacob Faibussowitsch 
wrote:

> Configure should not have an impact here I think. The reason I had you run
> `cudaGetDeviceCount()` is because this is the CUDA call (and in fact the
> only CUDA call) in the initialization sequence that returns the error code.
> There should be no prior CUDA calls. Maybe this is a problem with
> oversubscribing GPU’s? In the runs that crash, how many ranks are using any
> given GPU  at once? Maybe MPS is required.
>

I used one MPI rank.

Fande



>
> Best regards,
>
> Jacob Faibussowitsch
> (Jacob Fai - booss - oh - vitch)
>
> On Jan 21, 2022, at 12:01, Fande Kong  wrote:
>
> Thanks Jacob,
>
> On Thu, Jan 20, 2022 at 6:25 PM Jacob Faibussowitsch 
> wrote:
>
>> Segfault is caused by the following check at
>> src/sys/objects/device/impls/cupm/cupmdevice.cxx:349 being a
>> PetscUnlikelyDebug() rather than just PetscUnlikely():
>>
>> ```
>> if (PetscUnlikelyDebug(_defaultDevice < 0)) { // _defaultDevice is in
>> fact < 0 here and uncaught
>> ```
>>
>> To clarify:
>>
>> “lazy” initialization is not that lazy after all, it still does some 50%
>> of the initialization that “eager” initialization does. It stops short
>> of initializing the CUDA runtime, checking CUDA-aware MPI, gathering device
>> data, and initializing cublas and friends. Lazy also importantly swallows
>> any errors that crop up during initialization, storing the resulting error
>> code for later (specifically _defaultDevice = -init_error_value;).
>>
>> So whether you initialize lazily or eagerly makes no difference here, as
>> _defaultDevice will always contain -35.
>>
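
A rough sketch of the pattern described above (an illustration only, not the actual PETSc source; the names defaultDevice, lazyInitialize, and getDevice are invented for the example): the initialization error is swallowed and stored as a negative value, and it only resurfaces when a device is actually requested.

```c
/* Illustrative sketch only; not the actual PETSc implementation. */
#include <stdio.h>
#include <cuda_runtime.h>

static int initialized   = 0;
static int defaultDevice = -1;  /* negative after init means "init failed, value is -error" */

static void lazyInitialize(void)
{
  int ndev = 0;
  cudaError_t err = cudaGetDeviceCount(&ndev);           /* the only CUDA call made here     */
  defaultDevice = (err != cudaSuccess) ? -(int)err : 0;  /* swallow the error, remember -35  */
  initialized = 1;
}

/* Returns 0 and sets *device on success, otherwise the stored CUDA error code. */
static int getDevice(int *device)
{
  if (!initialized) lazyInitialize();
  if (defaultDevice < 0) return -defaultDevice;          /* 35 = cudaErrorInsufficientDriver */
  *device = defaultDevice;
  return 0;
}

int main(void)
{
  int dev = -1;
  int err = getDevice(&dev);
  if (err) printf("device initialization failed with CUDA error %d\n", err);
  else     printf("using device %d\n", dev);
  return 0;
}
```
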
>> The bigger question is why cudaGetDeviceCount() is returning
>> cudaErrorInsufficientDriver. Can you compile and run
>>
>> ```
>> #include <cuda_runtime.h>
>>
>> int main()
>> {
>>   int ndev;
>>   return cudaGetDeviceCount(&ndev);
>> }
>> ```
>>
>> Then show the value of "echo $?”?
>>
>
> I modified your code a little to get more information.
>
> #include <cuda_runtime.h>
> #include <stdio.h>
>
> int main()
> {
>   int ndev;
>   int error = cudaGetDeviceCount(&ndev);
>   printf("ndev %d \n", ndev);
>   printf("error %d \n", error);
>   return 0;
> }
>
> Results:
>
> $ ./a.out
> ndev 4
> error 0
>
>
> I have not read the PETSc cuda initialization code yet. If I need to guess
> at what was happening, I would naively think that PETSc did not get correct
> GPU information during configuration because the compile node does not
> have GPUs, so there was no way to get any GPU device information.
>
> At runtime on GPU nodes, PETSc might then use the incorrect information
> grabbed during configuration and give this kind of false error message.
>
> Thanks,
>
> Fande
>
>
>
>>
>> Best regards,
>>
>> Jacob Faibussowitsch
>> (Jacob Fai - booss - oh - vitch)
>>
>> On Jan 20, 2022, at 17:47, Matthew Knepley  wrote:
>>
>> On Thu, Jan 20, 2022 at 6:44 PM Fande Kong  wrote:
>>
>>> Thanks, Jed
>>>
>>> On Thu, Jan 20, 2022 at 4:34 PM Jed Brown  wrote:
>>>
>>>> You can't create CUDA or Kokkos Vecs if you're running on a node
>>>> without a GPU.
>>>
>>>
>>> I am running the code on compute nodes that do have GPUs.
>>>
>>
>> If you are actually running on GPUs, why would you need lazy
>> initialization? It would not break with GPUs present.
>>
>>Matt
>>
>>
>>> With PETSc-3.16.1, I got good speedup by running GAMG on GPUs.  That
>>> might be a bug in PETSc-main.
>>>
>>> Thanks,
>>>
>>> Fande
>>>
>>>
>>>
>>> KSPSetUp  13 1.0 6.4400e-01 1.0 2.02e+09 1.0 0.0e+00 0.0e+00
>>> 0.0e+00  0  5  0  0  0   0  5  0  0  0  3140   64630 15 1.05e+025
>>> 3.49e+01 100
>>> KSPSolve   1 1.0 1.0109e+00 1.0 3.49e+10 1.0 0.0e+00 0.0e+00
>>> 0.0e+00  0 87  0  0  0   0 87  0  0  0 34522   69556  4 4.35e-031
>>> 2.38e-03 100
>>> KSPGMRESOrthog   142 1.0 1.2674e-01 1.0 1.06e+10 1.0 0.0e+00 0.0e+00
>>> 0.0e+00  0 27  0  0  0   0 27  0  0  0 83755   87801  0 0.00e+000
>>> 0.00e+00 100
>>> SNESSolve  1 1.0 4.4402e+01 1.0 4.00e+10 1.0 0.0e+00 0.0e+00
>>> 0.0e+00 21100  0  0  0  21100  0  0  0   901   51365 57 1.10e+03   52
>>> 8.78e+02 100
>>> SNESSetUp  1 1.0 3.9101e-05 1.0 0.00e+00 0.0 0.0e+00 0.0e+00
>>> 0.0e+00  0  0  0  0  0

Re: [petsc-users] Cannot eagerly initialize cuda, as doing so results in cuda error 35 (cudaErrorInsufficientDriver) : CUDA driver version is insufficient for CUDA runtime version

2022-01-20 Thread Fande Kong
04 1.0 1.77e+04 1.0 0.0e+00 0.0e+00
0.0e+00  0  0  0  0  0   0  0  0  0  043 165  1 4.49e-031
1.19e-02 100
PCSetUp2 1.0 9.9632e+00 1.0 4.95e+09 1.0 0.0e+00 0.0e+00
0.0e+00  5 12  0  0  0   5 12  0  0  0   496   17826 55 1.03e+03   45
6.54e+02 98
PCSetUpOnBlocks   44 1.0 9.9087e-04 1.0 2.88e+03 1.0






> The point of lazy initialization is to make it possible to run a solve
> that doesn't use a GPU in PETSC_ARCH that supports GPUs, regardless of
> whether a GPU is actually present.
>
> Fande Kong  writes:
>
> > I spoke too soon. It seems that we have trouble creating cuda/kokkos vecs
> > now. Got Segmentation fault.
> >
> > Thanks,
> >
> > Fande
> >
> > Program received signal SIGSEGV, Segmentation fault.
> > 0x2aaab5558b11 in
> >
> Petsc::CUPMDevice<(Petsc::CUPMDeviceType)0>::CUPMDeviceInternal::initialize
> > (this=0x1) at
> >
> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/sys/objects/device/impls/cupm/cupmdevice.cxx:54
> > 54 PetscErrorCode CUPMDevice::CUPMDeviceInternal::initialize()
> noexcept
> > Missing separate debuginfos, use: debuginfo-install
> > bzip2-libs-1.0.6-13.el7.x86_64 elfutils-libelf-0.176-5.el7.x86_64
> > elfutils-libs-0.176-5.el7.x86_64 glibc-2.17-325.el7_9.x86_64
> > libX11-1.6.7-4.el7_9.x86_64 libXau-1.0.8-2.1.el7.x86_64
> > libattr-2.4.46-13.el7.x86_64 libcap-2.22-11.el7.x86_64
> > libibmad-5.4.0.MLNX20190423.1d917ae-0.1.49224.x86_64
> > libibumad-43.1.1.MLNX20200211.078947f-0.1.49224.x86_64
> > libibverbs-41mlnx1-OFED.4.9.0.0.7.49224.x86_64
> > libmlx4-41mlnx1-OFED.4.7.3.0.3.49224.x86_64
> > libmlx5-41mlnx1-OFED.4.9.0.1.2.49224.x86_64 libnl3-3.2.28-4.el7.x86_64
> > librdmacm-41mlnx1-OFED.4.7.3.0.6.49224.x86_64
> > librxe-41mlnx1-OFED.4.4.2.4.6.49224.x86_64 libxcb-1.13-1.el7.x86_64
> > libxml2-2.9.1-6.el7_9.6.x86_64 numactl-libs-2.0.12-5.el7.x86_64
> > systemd-libs-219-78.el7_9.3.x86_64 xz-libs-5.2.2-1.el7.x86_64
> > zlib-1.2.7-19.el7_9.x86_64
> > (gdb) bt
> > #0  0x2aaab5558b11 in
> >
> Petsc::CUPMDevice<(Petsc::CUPMDeviceType)0>::CUPMDeviceInternal::initialize
> > (this=0x1) at
> >
> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/sys/objects/device/impls/cupm/cupmdevice.cxx:54
> > #1  0x2aaab5558db7 in
> > Petsc::CUPMDevice<(Petsc::CUPMDeviceType)0>::getDevice
> > (this=this@entry=0x2aaab7f37b70
> > , device=0x115da00, id=-35, id@entry=-1) at
> >
> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/sys/objects/device/impls/cupm/cupmdevice.cxx:344
> > #2  0x2aaab55577de in PetscDeviceCreate (type=type@entry
> =PETSC_DEVICE_CUDA,
> > devid=devid@entry=-1, device=device@entry=0x2aaab7f37b48
> > ) at
> >
> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/sys/objects/device/interface/device.cxx:107
> > #3  0x2aaab5557b3a in PetscDeviceInitializeDefaultDevice_Internal
> > (type=type@entry=PETSC_DEVICE_CUDA,
> defaultDeviceId=defaultDeviceId@entry=-1)
> > at
> >
> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/sys/objects/device/interface/device.cxx:273
> > #4  0x2aaab5557bf6 in PetscDeviceInitialize
> > (type=type@entry=PETSC_DEVICE_CUDA)
> > at
> >
> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/sys/objects/device/interface/device.cxx:234
> > #5  0x2aaab5661fcd in VecCreate_SeqCUDA (V=0x115d150) at
> >
> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/vec/vec/impls/seq/seqcuda/veccuda.c:244
> > #6  0x2aaab5649b40 in VecSetType (vec=vec@entry=0x115d150,
> > method=method@entry=0x2aaab70b45b8 "seqcuda") at
> >
> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/vec/vec/interface/vecreg.c:93
> > #7  0x2aaab579c33f in VecCreate_CUDA (v=0x115d150) at
> >
> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/vec/vec/impls/mpi/mpicuda/
> > mpicuda.cu:214
> > #8  0x2aaab5649b40 in VecSetType (vec=vec@entry=0x115d150,
> > method=method@entry=0x7fff9260 "cuda") at
> >
> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/vec/vec/interface/vecreg.c:93
> > #9  0x2aaab5648bf1 in VecSetTypeFromOptions_Private (vec=0x115d150,
> > PetscOptionsObject=0x7fff9210) at
> >
> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/vec/vec/interface/vector.c:1263
> > #10 VecSetFromOptions (vec=0x115d150) at
> >
> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/vec/vec/interface/vector.c:1297
> > #11 0x2aaab02ef227 in libMesh::PetscVector::init
> > (this=0x11cd1a0, n=441, n_local=441, fast=false, ptype=libMesh::PARALLEL)
> > at
> >
> /home/kongf/workhome/sawtooth/moosegpu/scripts/../libmesh/installe

Re: [petsc-users] Cannot eagerly initialize cuda, as doing so results in cuda error 35 (cudaErrorInsufficientDriver) : CUDA driver version is insufficient for CUDA runtime version

2022-01-20 Thread Fande Kong
I spoke too soon. It seems that we have trouble creating cuda/kokkos vecs
now. Got Segmentation fault.

Thanks,

Fande

Program received signal SIGSEGV, Segmentation fault.
0x2aaab5558b11 in
Petsc::CUPMDevice<(Petsc::CUPMDeviceType)0>::CUPMDeviceInternal::initialize
(this=0x1) at
/home/kongf/workhome/sawtooth/moosegpu/petsc/src/sys/objects/device/impls/cupm/cupmdevice.cxx:54
54 PetscErrorCode CUPMDevice::CUPMDeviceInternal::initialize() noexcept
Missing separate debuginfos, use: debuginfo-install
bzip2-libs-1.0.6-13.el7.x86_64 elfutils-libelf-0.176-5.el7.x86_64
elfutils-libs-0.176-5.el7.x86_64 glibc-2.17-325.el7_9.x86_64
libX11-1.6.7-4.el7_9.x86_64 libXau-1.0.8-2.1.el7.x86_64
libattr-2.4.46-13.el7.x86_64 libcap-2.22-11.el7.x86_64
libibmad-5.4.0.MLNX20190423.1d917ae-0.1.49224.x86_64
libibumad-43.1.1.MLNX20200211.078947f-0.1.49224.x86_64
libibverbs-41mlnx1-OFED.4.9.0.0.7.49224.x86_64
libmlx4-41mlnx1-OFED.4.7.3.0.3.49224.x86_64
libmlx5-41mlnx1-OFED.4.9.0.1.2.49224.x86_64 libnl3-3.2.28-4.el7.x86_64
librdmacm-41mlnx1-OFED.4.7.3.0.6.49224.x86_64
librxe-41mlnx1-OFED.4.4.2.4.6.49224.x86_64 libxcb-1.13-1.el7.x86_64
libxml2-2.9.1-6.el7_9.6.x86_64 numactl-libs-2.0.12-5.el7.x86_64
systemd-libs-219-78.el7_9.3.x86_64 xz-libs-5.2.2-1.el7.x86_64
zlib-1.2.7-19.el7_9.x86_64
(gdb) bt
#0  0x2aaab5558b11 in
Petsc::CUPMDevice<(Petsc::CUPMDeviceType)0>::CUPMDeviceInternal::initialize
(this=0x1) at
/home/kongf/workhome/sawtooth/moosegpu/petsc/src/sys/objects/device/impls/cupm/cupmdevice.cxx:54
#1  0x2aaab5558db7 in
Petsc::CUPMDevice<(Petsc::CUPMDeviceType)0>::getDevice
(this=this@entry=0x2aaab7f37b70
, device=0x115da00, id=-35, id@entry=-1) at
/home/kongf/workhome/sawtooth/moosegpu/petsc/src/sys/objects/device/impls/cupm/cupmdevice.cxx:344
#2  0x2aaab55577de in PetscDeviceCreate (type=type@entry=PETSC_DEVICE_CUDA,
devid=devid@entry=-1, device=device@entry=0x2aaab7f37b48
) at
/home/kongf/workhome/sawtooth/moosegpu/petsc/src/sys/objects/device/interface/device.cxx:107
#3  0x2aaab5557b3a in PetscDeviceInitializeDefaultDevice_Internal
(type=type@entry=PETSC_DEVICE_CUDA, defaultDeviceId=defaultDeviceId@entry=-1)
at
/home/kongf/workhome/sawtooth/moosegpu/petsc/src/sys/objects/device/interface/device.cxx:273
#4  0x2aaab5557bf6 in PetscDeviceInitialize
(type=type@entry=PETSC_DEVICE_CUDA)
at
/home/kongf/workhome/sawtooth/moosegpu/petsc/src/sys/objects/device/interface/device.cxx:234
#5  0x2aaab5661fcd in VecCreate_SeqCUDA (V=0x115d150) at
/home/kongf/workhome/sawtooth/moosegpu/petsc/src/vec/vec/impls/seq/seqcuda/veccuda.c:244
#6  0x2aaab5649b40 in VecSetType (vec=vec@entry=0x115d150,
method=method@entry=0x2aaab70b45b8 "seqcuda") at
/home/kongf/workhome/sawtooth/moosegpu/petsc/src/vec/vec/interface/vecreg.c:93
#7  0x2aaab579c33f in VecCreate_CUDA (v=0x115d150) at
/home/kongf/workhome/sawtooth/moosegpu/petsc/src/vec/vec/impls/mpi/mpicuda/
mpicuda.cu:214
#8  0x2aaab5649b40 in VecSetType (vec=vec@entry=0x115d150,
method=method@entry=0x7fff9260 "cuda") at
/home/kongf/workhome/sawtooth/moosegpu/petsc/src/vec/vec/interface/vecreg.c:93
#9  0x2aaab5648bf1 in VecSetTypeFromOptions_Private (vec=0x115d150,
PetscOptionsObject=0x7fff9210) at
/home/kongf/workhome/sawtooth/moosegpu/petsc/src/vec/vec/interface/vector.c:1263
#10 VecSetFromOptions (vec=0x115d150) at
/home/kongf/workhome/sawtooth/moosegpu/petsc/src/vec/vec/interface/vector.c:1297
#11 0x2aaab02ef227 in libMesh::PetscVector::init
(this=0x11cd1a0, n=441, n_local=441, fast=false, ptype=libMesh::PARALLEL)
at
/home/kongf/workhome/sawtooth/moosegpu/scripts/../libmesh/installed/include/libmesh/petsc_vector.h:693

On Thu, Jan 20, 2022 at 1:09 PM Fande Kong  wrote:

> Thanks, Jed,
>
> This worked!
>
> Fande
>
> On Wed, Jan 19, 2022 at 11:03 PM Jed Brown  wrote:
>
>> Fande Kong  writes:
>>
>> > On Wed, Jan 19, 2022 at 11:39 AM Jacob Faibussowitsch <
>> jacob@gmail.com>
>> > wrote:
>> >
>> >> Are you running on login nodes or compute nodes (I can’t seem to tell
>> from
>> >> the configure.log)?
>> >>
>> >
>> > I was compiling codes on login nodes, and running codes on compute
>> nodes.
>> > Login nodes do not have GPUs, but compute nodes do have GPUs.
>> >
>> > Just to be clear, the same thing (code, machine) with PETSc-3.16.1
>> worked
>> > perfectly. I have this trouble with PETSc-main.
>>
>> I assume you can
>>
>> export PETSC_OPTIONS='-device_enable lazy'
>>
>> and it'll work.
>>
>> I think this should be the default. The main complaint is that timing the
>> first GPU-using event isn't accurate if it includes initialization, but I
>> think this is mostly hypothetical because you can't trust any timing that
>> doesn't preload in some form and the first GPU-using event will almost
>> always be something uninteresting so I think it will rarely lead to
>> confusion. Meanwhile, eager initialization is viscerally disruptive for
>> lots of people.
>>
>


Re: [petsc-users] Cannot eagerly initialize cuda, as doing so results in cuda error 35 (cudaErrorInsufficientDriver) : CUDA driver version is insufficient for CUDA runtime version

2022-01-20 Thread Fande Kong
Thanks, Jed,

This worked!

Fande

On Wed, Jan 19, 2022 at 11:03 PM Jed Brown  wrote:

> Fande Kong  writes:
>
> > On Wed, Jan 19, 2022 at 11:39 AM Jacob Faibussowitsch <
> jacob@gmail.com>
> > wrote:
> >
> >> Are you running on login nodes or compute nodes (I can’t seem to tell
> from
> >> the configure.log)?
> >>
> >
> > I was compiling codes on login nodes, and running codes on compute nodes.
> > Login nodes do not have GPUs, but compute nodes do have GPUs.
> >
> > Just to be clear, the same thing (code, machine) with PETSc-3.16.1 worked
> > perfectly. I have this trouble with PETSc-main.
>
> I assume you can
>
> export PETSC_OPTIONS='-device_enable lazy'
>
> and it'll work.
>
> I think this should be the default. The main complaint is that timing the
> first GPU-using event isn't accurate if it includes initialization, but I
> think this is mostly hypothetical because you can't trust any timing that
> doesn't preload in some form and the first GPU-using event will almost
> always be something uninteresting so I think it will rarely lead to
> confusion. Meanwhile, eager initialization is viscerally disruptive for
> lots of people.
>
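
As a side note on this workaround: besides exporting PETSC_OPTIONS in the environment, the same option can be set from code before PetscInitialize(). A minimal sketch, assuming the option name from this thread and keeping error handling minimal:

```c
/* Minimal sketch: set the option from code rather than the environment.
 * PetscOptionsSetValue() is one of the few calls documented as usable before
 * PetscInitialize(); the option "-device_enable lazy" is the one from this
 * thread.                                                                    */
#include <petscsys.h>

int main(int argc, char **argv)
{
  PetscErrorCode ierr;

  ierr = PetscOptionsSetValue(NULL, "-device_enable", "lazy");if (ierr) return (int)ierr;
  ierr = PetscInitialize(&argc, &argv, NULL, NULL);if (ierr) return (int)ierr;

  /* ... create Vec/Mat objects, solve, etc. ... */

  ierr = PetscFinalize();
  return (int)ierr;
}
```
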


Re: [petsc-users] Does mpiaijkok intend to support 64-bit integers?

2022-01-20 Thread Fande Kong
Thanks, Mark,

PETSc-main has no issue.

Fande

On Thu, Jan 20, 2022 at 9:14 AM Fande Kong  wrote:

>
>
> On Thu, Jan 20, 2022 at 6:49 AM Mark Adams  wrote:
>
>> Humm, I was not able to reproduce this on my Mac. Trying Crusher now.
>> Are you using main? or even a recent release.
>>
>
> I am working with PETSc-3.16.1
>
> I will try main now
>
> Thanks,
> Fande
>
>
>
>> We did fix a 64 bit int bug recently in mpiaijkok.
>>
>> Thanks,
>> Mark
>>
>> On Thu, Jan 20, 2022 at 12:12 AM Fande Kong  wrote:
>>
>>> Hi All,
>>>
>>> It seems that mpiaijkok does not support 64-bit integers at this time.
>>> Do we have any motivation for this? Or is it just a bug?
>>>
>>> Thanks,
>>>
>>> Fande
>>>
>>> petsc/src/mat/impls/aij/mpi/kokkos/mpiaijkok.kokkos.cxx(306): error: a
>>> value of type "MatColumnIndexType *" cannot be assigned to an entity of
>>> type "int *"
>>>
>>>
>>> petsc/src/mat/impls/aij/mpi/kokkos/mpiaijkok.kokkos.cxx(308): error: a
>>> value of type "PetscInt *" cannot be assigned to an entity of type "int *"
>>>
>>>
>>> petsc/src/mat/impls/aij/mpi/kokkos/mpiaijkok.kokkos.cxx(310): error: a
>>> value of type "PetscInt *" cannot be assigned to an entity of type "int *"
>>>
>>>
>>> petsc/src/mat/impls/aij/mpi/kokkos/mpiaijkok.kokkos.cxx(316): error: a
>>> value of type "MatColumnIndexType *" cannot be assigned to an entity of
>>> type "int *"
>>>
>>>
>>> petsc/src/mat/impls/aij/mpi/kokkos/mpiaijkok.kokkos.cxx(329): error: a
>>> value of type "PetscInt *" cannot be assigned to an entity of type "int *"
>>>
>>>
>>> petsc/src/mat/impls/aij/mpi/kokkos/mpiaijkok.kokkos.cxx(331): error: a
>>> value of type "PetscInt *" cannot be assigned to an entity of type "int *"
>>>
>>>
>>> 6 errors detected in the compilation of
>>> "/tmp/tmpxft_00017e46_-6_mpiaijkok.kokkos.cpp1.ii".
>>>
>>> gmake[3]: ***
>>> [arch-linux-c-opt/obj/mat/impls/aij/mpi/kokkos/mpiaijkok.o] Error 1
>>>
>>


Re: [petsc-users] Does mpiaijkok intend to support 64-bit integers?

2022-01-20 Thread Fande Kong
On Thu, Jan 20, 2022 at 6:49 AM Mark Adams  wrote:

> Humm, I was not able to reproduce this on my Mac. Trying Crusher now.
> Are you using main? or even a recent release.
>

I am working with PETSc-3.16.1

I will try main now

Thanks,
Fande



> We did fix a 64 bit int bug recently in mpiaijkok.
>
> Thanks,
> Mark
>
> On Thu, Jan 20, 2022 at 12:12 AM Fande Kong  wrote:
>
>> Hi All,
>>
>> It seems that mpiaijkok does not support 64-bit integers at this time. Do
>> we have any motivation for this? Or is it just a bug?
>>
>> Thanks,
>>
>> Fande
>>
>> petsc/src/mat/impls/aij/mpi/kokkos/mpiaijkok.kokkos.cxx(306): error: a
>> value of type "MatColumnIndexType *" cannot be assigned to an entity of
>> type "int *"
>>
>>
>> petsc/src/mat/impls/aij/mpi/kokkos/mpiaijkok.kokkos.cxx(308): error: a
>> value of type "PetscInt *" cannot be assigned to an entity of type "int *"
>>
>>
>> petsc/src/mat/impls/aij/mpi/kokkos/mpiaijkok.kokkos.cxx(310): error: a
>> value of type "PetscInt *" cannot be assigned to an entity of type "int *"
>>
>>
>> petsc/src/mat/impls/aij/mpi/kokkos/mpiaijkok.kokkos.cxx(316): error: a
>> value of type "MatColumnIndexType *" cannot be assigned to an entity of
>> type "int *"
>>
>>
>> petsc/src/mat/impls/aij/mpi/kokkos/mpiaijkok.kokkos.cxx(329): error: a
>> value of type "PetscInt *" cannot be assigned to an entity of type "int *"
>>
>>
>> petsc/src/mat/impls/aij/mpi/kokkos/mpiaijkok.kokkos.cxx(331): error: a
>> value of type "PetscInt *" cannot be assigned to an entity of type "int *"
>>
>>
>> 6 errors detected in the compilation of
>> "/tmp/tmpxft_00017e46_-6_mpiaijkok.kokkos.cpp1.ii".
>>
>> gmake[3]: *** [arch-linux-c-opt/obj/mat/impls/aij/mpi/kokkos/mpiaijkok.o]
>> Error 1
>>
>


[petsc-users] Does mpiaijkok intend to support 64-bit integers?

2022-01-19 Thread Fande Kong
Hi All,

It seems that mpiaijkok does not support 64-bit integers at this time. Do
we have any motivation for this? Or is it just a bug?

Thanks,

Fande

petsc/src/mat/impls/aij/mpi/kokkos/mpiaijkok.kokkos.cxx(306): error: a
value of type "MatColumnIndexType *" cannot be assigned to an entity of
type "int *"


petsc/src/mat/impls/aij/mpi/kokkos/mpiaijkok.kokkos.cxx(308): error: a
value of type "PetscInt *" cannot be assigned to an entity of type "int *"


petsc/src/mat/impls/aij/mpi/kokkos/mpiaijkok.kokkos.cxx(310): error: a
value of type "PetscInt *" cannot be assigned to an entity of type "int *"


petsc/src/mat/impls/aij/mpi/kokkos/mpiaijkok.kokkos.cxx(316): error: a
value of type "MatColumnIndexType *" cannot be assigned to an entity of
type "int *"


petsc/src/mat/impls/aij/mpi/kokkos/mpiaijkok.kokkos.cxx(329): error: a
value of type "PetscInt *" cannot be assigned to an entity of type "int *"


petsc/src/mat/impls/aij/mpi/kokkos/mpiaijkok.kokkos.cxx(331): error: a
value of type "PetscInt *" cannot be assigned to an entity of type "int *"


6 errors detected in the compilation of
"/tmp/tmpxft_00017e46_-6_mpiaijkok.kokkos.cpp1.ii".

gmake[3]: *** [arch-linux-c-opt/obj/mat/impls/aij/mpi/kokkos/mpiaijkok.o]
Error 1
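
For context on what these compile errors mean: with --with-64-bit-indices, PetscInt is a 64-bit type, so a PetscInt* cannot be passed where a code path expects a plain int*. A minimal stand-alone illustration (the typedef and the function consume_int32 are invented for this sketch, not taken from the PETSc source):

```c
/* Stand-alone illustration; the typedef and consume_int32() are invented for
 * this sketch and are not the PETSc/Kokkos code in question.                 */
#include <stdint.h>
#include <stdio.h>

typedef int64_t PetscInt;   /* assumption: the type --with-64-bit-indices selects */

/* Stand-in for an interface that only accepts 32-bit indices. */
static void consume_int32(const int *idx, int n)
{
  for (int i = 0; i < n; i++) printf("%d ", idx[i]);
  printf("\n");
}

int main(void)
{
  PetscInt rows[3] = {0, 1, 2};

  /* consume_int32(rows, 3);   would give "a value of type PetscInt* cannot be
     assigned to an entity of type int*", just like the errors in the log      */

  int rows32[3];                               /* explicit, checked conversion */
  for (int i = 0; i < 3; i++) rows32[i] = (int)rows[i];
  consume_int32(rows32, 3);
  return 0;
}
```
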


Re: [petsc-users] Cannot eagerly initialize cuda, as doing so results in cuda error 35 (cudaErrorInsufficientDriver) : CUDA driver version is insufficient for CUDA runtime version

2022-01-19 Thread Fande Kong
On Wed, Jan 19, 2022 at 11:39 AM Jacob Faibussowitsch 
wrote:

> Are you running on login nodes or compute nodes (I can’t seem to tell from
> the configure.log)?
>

I was compiling codes on login nodes, and running codes on compute nodes.
Login nodes do not have GPUs, but compute nodes do have GPUs.

Just to be clear, the same thing (code, machine) with PETSc-3.16.1 worked
perfectly. I have this trouble with PETSc-main.

I might do "git bisect" when I have time

Thanks,

Fande


If running from login nodes, do they support running with GPU’s? Some
> clusters will install stub versions of cuda runtime on login nodes (such
> that configuration can find them), but that won’t actually work in
> practice.
>
> If this is the case then CUDA will fail to initialize with this exact
> error. IIRC It wasn’t until CUDA 11.1 that they created a specific error
> code (cudaErrorStubLibrary) for it.
>
> Best regards,
>
> Jacob Faibussowitsch
> (Jacob Fai - booss - oh - vitch)
>
> On Jan 19, 2022, at 12:07, Fande Kong  wrote:
>
> Thanks, Jacob, and Junchao
>
> The log was attached.  I am using Sawtooth at INL
> https://hpc.inl.gov/SitePages/Home.aspx
>
>
> Thanks,
>
> Fande
>
> On Wed, Jan 19, 2022 at 10:32 AM Jacob Faibussowitsch 
> wrote:
>
>> Hi Fande,
>>
>> What machine are you running this on? Please attach configure.log so I
>> can troubleshoot this.
>>
>> Best regards,
>>
>> Jacob Faibussowitsch
>> (Jacob Fai - booss - oh - vitch)
>>
>> On Jan 19, 2022, at 10:04, Fande Kong  wrote:
>>
>> Hi All,
>>
>> Upgraded PETSc from 3.16.1 to the current main branch. I suddenly got the
>> following error message:
>>
>> 2d_diffusion]$ ../../../moose_test-dbg -i 2d_diffusion_test.i
>> -use_gpu_aware_mpi 0 -gpu_mat_type aijcusparse -gpu_vec_type cuda
>>  -log_view
>> [0]PETSC ERROR: - Error Message
>> --
>> [0]PETSC ERROR: Missing or incorrect user input
>> [0]PETSC ERROR: Cannot eagerly initialize cuda, as doing so results in
>> cuda error 35 (cudaErrorInsufficientDriver) : CUDA driver version is
>> insufficient for CUDA runtime version
>> [0]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting.
>> [0]PETSC ERROR: Petsc Development GIT revision: v3.16.3-618-gad32f7e  GIT
>> Date: 2022-01-18 16:04:31 +
>> [0]PETSC ERROR: ../../../moose_test-dbg on a arch-linux-c-opt named
>> r8i3n0 by kongf Wed Jan 19 08:30:13 2022
>> [0]PETSC ERROR: Configure options --with-debugging=no
>> --with-shared-libraries=1 --download-fblaslapack=1 --download-metis=1
>> --download-ptscotch=1 --download-parmetis=1 --download-mumps=1
>> --download-strumpack=1 --download-scalapack=1 --download-slepc=1
>> --with-mpi=1 --with-cxx-dialect=C++14 --with-fortran-bindings=0
>> --with-sowing=0 --with-64-bit-indices --with-make-np=24 --with-cuda
>> --with-cudac=nvcc --with-cuda-arch=70 --download-kokkos=1
>> [0]PETSC ERROR: #1 initialize() at
>> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/sys/objects/device/impls/cupm/cupmdevice.cxx:298
>> [0]PETSC ERROR: #2 PetscDeviceInitializeTypeFromOptions_Private() at
>> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/sys/objects/device/interface/device.cxx:299
>> [0]PETSC ERROR: #3 PetscDeviceInitializeFromOptions_Internal() at
>> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/sys/objects/device/interface/device.cxx:425
>> [0]PETSC ERROR: #4 PetscInitialize_Common() at
>> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/sys/objects/pinit.c:963
>> [0]PETSC ERROR: #5 PetscInitialize() at
>> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/sys/objects/pinit.c:1238
>> [0]PETSC ERROR: #6 SlepcInitialize() at
>> /home/kongf/workhome/sawtooth/moosegpu/petsc/arch-linux-c-opt/externalpackages/git.slepc/src/sys/slepcinit.c:275
>> [0]PETSC ERROR: #7 LibMeshInit() at ../src/base/libmesh.C:522
>> [r8i3n0:mpi_rank_0][MPIDI_CH3_Abort] application called
>> MPI_Abort(MPI_COMM_WORLD, 95) - process 0: No such file or directory (2)
>>
>> Thanks,
>>
>> Fande
>>
>>
>> 
>
>
>


[petsc-users] Cannot eagerly initialize cuda, as doing so results in cuda error 35 (cudaErrorInsufficientDriver) : CUDA driver version is insufficient for CUDA runtime version

2022-01-19 Thread Fande Kong
Hi All,

Upgraded PETSc from 3.16.1 to the current main branch. I suddenly got the
following error message:

2d_diffusion]$ ../../../moose_test-dbg -i 2d_diffusion_test.i
-use_gpu_aware_mpi 0 -gpu_mat_type aijcusparse -gpu_vec_type cuda
 -log_view
[0]PETSC ERROR: - Error Message
--
[0]PETSC ERROR: Missing or incorrect user input
[0]PETSC ERROR: Cannot eagerly initialize cuda, as doing so results in cuda
error 35 (cudaErrorInsufficientDriver) : CUDA driver version is
insufficient for CUDA runtime version
[0]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting.
[0]PETSC ERROR: Petsc Development GIT revision: v3.16.3-618-gad32f7e  GIT
Date: 2022-01-18 16:04:31 +
[0]PETSC ERROR: ../../../moose_test-dbg on a arch-linux-c-opt named r8i3n0
by kongf Wed Jan 19 08:30:13 2022
[0]PETSC ERROR: Configure options --with-debugging=no
--with-shared-libraries=1 --download-fblaslapack=1 --download-metis=1
--download-ptscotch=1 --download-parmetis=1 --download-mumps=1
--download-strumpack=1 --download-scalapack=1 --download-slepc=1
--with-mpi=1 --with-cxx-dialect=C++14 --with-fortran-bindings=0
--with-sowing=0 --with-64-bit-indices --with-make-np=24 --with-cuda
--with-cudac=nvcc --with-cuda-arch=70 --download-kokkos=1
[0]PETSC ERROR: #1 initialize() at
/home/kongf/workhome/sawtooth/moosegpu/petsc/src/sys/objects/device/impls/cupm/cupmdevice.cxx:298
[0]PETSC ERROR: #2 PetscDeviceInitializeTypeFromOptions_Private() at
/home/kongf/workhome/sawtooth/moosegpu/petsc/src/sys/objects/device/interface/device.cxx:299
[0]PETSC ERROR: #3 PetscDeviceInitializeFromOptions_Internal() at
/home/kongf/workhome/sawtooth/moosegpu/petsc/src/sys/objects/device/interface/device.cxx:425
[0]PETSC ERROR: #4 PetscInitialize_Common() at
/home/kongf/workhome/sawtooth/moosegpu/petsc/src/sys/objects/pinit.c:963
[0]PETSC ERROR: #5 PetscInitialize() at
/home/kongf/workhome/sawtooth/moosegpu/petsc/src/sys/objects/pinit.c:1238
[0]PETSC ERROR: #6 SlepcInitialize() at
/home/kongf/workhome/sawtooth/moosegpu/petsc/arch-linux-c-opt/externalpackages/git.slepc/src/sys/slepcinit.c:275
[0]PETSC ERROR: #7 LibMeshInit() at ../src/base/libmesh.C:522
[r8i3n0:mpi_rank_0][MPIDI_CH3_Abort] application called
MPI_Abort(MPI_COMM_WORLD, 95) - process 0: No such file or directory (2)

Thanks,

Fande


Re: [petsc-users] Downloaded superlu_dist could not be used. Please check install in $PREFIX

2022-01-19 Thread Fande Kong
Thanks, Sherry, and Satish,

I will try your suggestion, and report back to you as soon as possible.

Thanks,

Fande

On Tue, Jan 18, 2022 at 10:48 PM Satish Balay  wrote:

> Sherry,
>
> This is with superlu-dist-7.1.1 [not master branch]
>
>
> Fande,
>
> >>>>>>
> Executing: mpifort  -o /tmp/petsc-UYa6A8/config.compilers/conftest
> -fopenmp -fopenmp  -I$PREFIX/include -fPIC -O3  -fopenmp
> /tmp/petsc-UYa6A8/config.compilers/conftest.o
> /tmp/petsc-UYa6A8/config.compilers/confc.o  -Wl,-rpath,$PREFIX/lib
> -L$PREFIX/lib -lsuperlu_dist -lpthread -Wl,-rpath,$PREFIX/lib -L$PREFIX/lib
> -lparmetis -Wl,-rpath,$PREFIX/lib -L$PREFIX/lib -lmetis
> -Wl,-rpath,$PREFIX/lib -L$PREFIX/lib -lflapack -Wl,-rpath,$PREFIX/lib
> -L$PREFIX/lib -lfblas -lm -Wl,-rpath,$BUILD_PREFIX/lib -L$BUILD_PREFIX/lib
> -lstdc++ -ldl -lmpifort -lmpi -lgfortran -lm
> -Wl,-rpath,$BUILD_PREFIX/lib/gcc/x86_64-conda-linux-gnu/9.3.0
> -L$BUILD_PREFIX/lib/gcc/x86_64-conda-linux-gnu/9.3.0
> -Wl,-rpath,$BUILD_PREFIX/lib/gcc -L$BUILD_PREFIX/lib/gcc
> -Wl,-rpath,$BUILD_PREFIX/x86_64-conda-linux-gnu/lib
> -L$BUILD_PREFIX/x86_64-conda-linux-gnu/lib -Wl,-rpath,$BUILD_PREFIX/lib
> -lgfortran -lm -lgcc_s -lquadmath -lrt -lquadmath -lstdc++ -ldl
> Possible ERROR while running linker:
> stderr:
> $BUILD_PREFIX/bin/../lib/gcc/x86_64-conda-linux-gnu/9.3.0/../../../../x86_64-conda-linux-gnu/bin/ld:
> warning: libmpicxx.so.12, needed by $PREFIX/lib/libsuperlu_dist.so, not
> found (try using -rpath or -rpath-link)
> <<<
>
> I don't really understand why this error comes up [as with shared
> libraries we should be able to link with -lsuperlu_dist - without having to
> link with libmpicxx.so.12]
>
> What do you get for:
>
> ldd $PREFIX/lib/libstdc++.so
>
>
> BTW: is configure.log modified to replace realpaths with $PREFIX
> $BUILD_PREFIX etc?
>
> Can you try additional configure option LIBS=-lmpicxx and see if that
> works around this problem?
>
> Satish
>
> On Tue, 18 Jan 2022, Xiaoye S. Li wrote:
>
> > There was a merge error in the master branch. I fixed it today. Not sure
> > whether that's causing your problem.   Can you try now?
> >
> > Sherry
> >
> > On Mon, Jan 17, 2022 at 11:55 AM Fande Kong  wrote:
> >
> > > I am trying to port PETSc-3.16.3 to the MOOSE ecosystem. I got an error
> > > that PETSc could not build  superlu_dist.  The log file was attached.
> > >
> > > PETSc-3.15.x worked correctly in the same environment.
> > >
> > > Thanks,
> > > Fande
> > >
> >
>
>


Re: [petsc-users] Finite difference approximation of Jacobian

2022-01-11 Thread Fande Kong
This is something I almost started a while ago. 

https://gitlab.com/petsc/petsc/-/issues/852

It would be a very interesting addition to us. 


Fande


> On Jan 12, 2022, at 12:04 AM, Barry Smith  wrote:
> 
> 
>  Why does it need to handle values? 
> 
>> On Jan 12, 2022, at 12:43 AM, Jed Brown  wrote:
>> 
>> I agree with this and even started a branch jed/mat-hash in (yikes!) 2017. I 
>> think it should be default if no preallocation functions are called. But it 
>> isn't exactly the MATPREALLOCATOR code because it needs to handle values 
>> too. Should not be a lot of code and will essentially remove this FAQ and 
>> one of the most irritating subtle aspects of new codes using PETSc matrices.
>> 
>> https://petsc.org/release/faq/#assembling-large-sparse-matrices-takes-a-long-time-what-can-i-do-to-make-this-process-faster-or-matsetvalues-is-so-slow-what-can-i-do-to-speed-it-up
>> 
>> Barry Smith  writes:
>> 
>>> I think the MATPREALLOCATOR as a MatType is a cumbersome strange thing and 
>>> would prefer it was just functionality that Mat provided directly; for 
>>> example MatSetOption(mat, preallocator_mode,true); matsetvalues,... 
>>> MatAssemblyBegin/End. Now the matrix has its proper nonzero structure of 
>>> whatever type the user set initially, aij, baij, sbaij,  And the user 
>>> can now use it efficiently.
>>> 
>>> Barry
>>> 
>>> So turning on the option just swaps out temporarily the operations for 
>>> MatSetValues and AssemblyBegin/End to be essentially those in 
>>> MATPREALLOCATOR. The refactorization should take almost no time and would 
>>> be faster than trying to rig dmstag to use MATPREALLOCATOR as is.
>>> 
>>> 
 On Jan 11, 2022, at 9:43 PM, Matthew Knepley  wrote:
 
 On Tue, Jan 11, 2022 at 12:09 PM Patrick Sanan wrote:
 Working on doing this incrementally, in progress here: 
 https://gitlab.com/petsc/petsc/-/merge_requests/4712 
 
 
 This works in 1D for AIJ matrices, assembling a matrix with a maximal 
 number of zero entries as dictated by the stencil width (which is intended 
 to be very very close to what DMDA would do if you 
 associated all the unknowns with a particular grid point, which is the way 
 DMStag largely works under the hood).
 
 Dave, before I get into it, am I correct in my understanding that 
 MATPREALLOCATOR would be better here because you would avoid superfluous 
 zeros in the sparsity pattern,
 because this routine wouldn't have to assemble the Mat returned by 
 DMCreateMatrix()?
 
 Yes, here is how it works. You throw in all the nonzeros you come across. 
 Preallocator is a hash table that can check for duplicates. At the end, it 
 returns the sparsity pattern.
 
 Thanks,
 
Matt
 
 If this seems like a sane way to go, I will continue to add some more 
 tests (in particular periodic BCs not tested yet) and add the code for 2D 
 and 3D. 
 
 
 
 On Mon, Dec 13, 2021 at 20:17, Dave May wrote:
 
 
 On Mon, 13 Dec 2021 at 20:13, Matthew Knepley wrote:
 On Mon, Dec 13, 2021 at 1:52 PM Dave May wrote:
 On Mon, 13 Dec 2021 at 19:29, Matthew Knepley wrote:
 On Mon, Dec 13, 2021 at 1:16 PM Dave May wrote:
 
 
 On Sat 11. Dec 2021 at 22:28, Matthew Knepley wrote:
 On Sat, Dec 11, 2021 at 1:58 PM Tang, Qi wrote:
 Hi,
 Does anyone have comment on finite difference coloring with DMStag? We are 
 using DMStag and TS to evolve some nonlinear equations implicitly. It 
 would be helpful to have the coloring Jacobian option with that.
 
 Since DMStag produces the Jacobian connectivity,
 
 This is incorrect.
 The DMCreateMatrix implementation for DMSTAG only sets the number of 
 nonzeros (very inaccurately). It does not insert any zero values and thus 
 the nonzero structure is actually not defined. 
 That is why coloring doesn’t work.
 
 Ah, thanks Dave.
 
 Okay, we should fix that. It is perfectly possible to compute the nonzero 
 pattern from the DMStag information.
 
 Agreed. The API for DMSTAG is complete enough to enable one to
 loop over the cells, and for all quantities defined on the cell (centre, 
 face, vertex), 
 insert values into the appropriate slot in the matrix. 
 Combined with MATPREALLOCATOR, I believe a compact and readable
 code should be possible to write for the preallocation (cf DMDA).
 
 I think the only caveat with the approach of using all quantities defined 
 on the cell is 
 It may slightly over allocate depending on how the user 
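
To make the two-pass MATPREALLOCATOR flow discussed in this thread concrete, here is a minimal sketch (the diagonal-only pattern is purely illustrative and is not the DMStag code under discussion; error handling follows the usual CHKERRQ style):

```c
#include <petscmat.h>

int main(int argc, char **argv)
{
  Mat            A, prealloc;
  PetscInt       i, n = 10;
  PetscErrorCode ierr;

  ierr = PetscInitialize(&argc, &argv, NULL, NULL);if (ierr) return (int)ierr;

  /* Pass 1: record the nonzero pattern in a MATPREALLOCATOR (a hash table;
     duplicate insertions are fine and the values themselves are ignored).   */
  ierr = MatCreate(PETSC_COMM_WORLD, &prealloc);CHKERRQ(ierr);
  ierr = MatSetSizes(prealloc, PETSC_DECIDE, PETSC_DECIDE, n, n);CHKERRQ(ierr);
  ierr = MatSetType(prealloc, MATPREALLOCATOR);CHKERRQ(ierr);
  ierr = MatSetUp(prealloc);CHKERRQ(ierr);
  for (i = 0; i < n; i++) {
    ierr = MatSetValue(prealloc, i, i, 0.0, INSERT_VALUES);CHKERRQ(ierr); /* toy diagonal pattern */
  }
  ierr = MatAssemblyBegin(prealloc, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
  ierr = MatAssemblyEnd(prealloc, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);

  /* Pass 2: preallocate the real matrix from the recorded sparsity pattern. */
  ierr = MatCreate(PETSC_COMM_WORLD, &A);CHKERRQ(ierr);
  ierr = MatSetSizes(A, PETSC_DECIDE, PETSC_DECIDE, n, n);CHKERRQ(ierr);
  ierr = MatSetType(A, MATAIJ);CHKERRQ(ierr);
  ierr = MatPreallocatorPreallocate(prealloc, PETSC_TRUE, A);CHKERRQ(ierr);

  /* Subsequent MatSetValues() into A hit preallocated slots (no mallocs). */
  for (i = 0; i < n; i++) {
    ierr = MatSetValue(A, i, i, 2.0, INSERT_VALUES);CHKERRQ(ierr);
  }
  ierr = MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
  ierr = MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);

  ierr = MatDestroy(&prealloc);CHKERRQ(ierr);
  ierr = MatDestroy(&A);CHKERRQ(ierr);
  ierr = PetscFinalize();
  return (int)ierr;
}
```
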

Re: [petsc-users] Error running make on MUMPS

2021-11-11 Thread Fande Kong
On Thu, Nov 11, 2021 at 1:59 PM Matthew Knepley  wrote:

> On Thu, Nov 11, 2021 at 3:44 PM Fande Kong  wrote:
>
>> Thanks Matt,
>>
>> I understand completely, the actual error should be
>>
>> "
>> ln -s libHYPRE_parcsr_ls-2.20.0.so libHYPRE_parcsr_ls.so gmake[1]:
>> Leaving directory
>> `/beegfs1/home/anovak/cardinal/contrib/moose/petsc/arch-moose/externalpackages/git.hypre/src/parcsr_ls'
>>   Error running make; make install on HYPRE: Could not execute
>> "['/usr/bin/gmake install']":
>> "
>>
>> What you saw was hypre automatically taking a second try at parcsr_ls
>> after the first "gmake install" failed. Because the first try had
>> already done "ln -s libHYPRE_parcsr_ls-2.20.0.so libHYPRE_parcsr_ls.so",
>> the second try saw that "libHYPRE_parcsr_ls.so" already existed.
>>
>
> Are you completely sure? This just looks like a cascading make error,
> namely that one shell thing failed (ln) and then the make reports an error.
>

Oh, sorry. It was a typo. I intended to say "If I understand correctly, the
actual should be balabala .."

Thanks for the explanation!  I was confused by the double outputs. I got it
now.

Thanks, again

Fande


>
>   Thanks,
>
>  Matt
>
>
>> Thanks,
>> Fande
>>
>> On Thu, Nov 11, 2021 at 1:29 PM Matthew Knepley 
>> wrote:
>>
>>> On Thu, Nov 11, 2021 at 3:25 PM Fande Kong  wrote:
>>>
>>>> Thanks, Satish
>>>>
>>>> "--with-make-np=1"  did help us on MUMPS, but we had new trouble with
>>>> hypre now.
>>>>
>>>> It is hard to understand why "gmake install" even failed.
>>>>
>>>
>>> Because HYPRE thinks it is better to use 'ln' than the 'install' script
>>> that handles things like the target existing:
>>>
>>> gmake[1]: Leaving directory
>>> `/beegfs1/home/anovak/cardinal/contrib/moose/petsc/arch-moose/externalpackages/git.hypre/src/parcsr_ls'
>>> ln: failed to create symbolic link ‘libHYPRE_parcsr_ls.so’: File exists
>>> gmake[1]: *** [libHYPRE_parcsr_ls.so] Error 1
>>> gmake: *** [all] Error 1
>>>
>>> I think everything needs to be cleaned out for Hypre to reinstall.
>>>
>>>   Thanks,
>>>
>>>  Matt
>>>
>>>
>>>> Please see the attachment for the log file.
>>>>
>>>> Thanks,
>>>>
>>>> Fande
>>>>
>>>> On Wed, Nov 10, 2021 at 12:16 PM Satish Balay 
>>>> wrote:
>>>>
>>>>> You are using petsc-3.15.1 - and likely the mumps build change between
>>>>> then and current 3.16.
>>>>>
>>>>> Can you use latest PETSc release?
>>>>>
>>>>> If not - Suggest removing  --download-mumps=
>>>>> https://bitbucket.org/petsc/pkg-mumps.git
>>>>> --download-mumps-commit=v5.4.1-p1 options [and PETSC_ARCH] and going back
>>>>> to your old
>>>>> build.
>>>>>
>>>>> If it fails [as before] - retry with: --with-make-np=1
>>>>>
>>>>> Satish
>>>>>
>>>>> On Wed, 10 Nov 2021, Novak, April via petsc-users wrote:
>>>>>
>>>>> > Hi Barry,
>>>>> >
>>>>> > Thank you for your assistance - I’ve attached the latest
>>>>> configure.log. I still encounter issues building, though some of the MUMPS
>>>>> errors do seem to have been fixed with the --download-mumps-commit option.
>>>>> Do you have a recommendation for addressing these other errors?
>>>>> >
>>>>> > Thanks,
>>>>> >
>>>>> > -April
>>>>> >
>>>>> > From: Fande Kong 
>>>>> > Date: Wednesday, November 10, 2021 at 9:44 AM
>>>>> > To: Barry Smith 
>>>>> > Cc: PETSc users list , Novak, April <
>>>>> ano...@anl.gov>
>>>>> > Subject: Re: [petsc-users] Error running make on MUMPS
>>>>> > Thanks, Barry,
>>>>> >
>>>>> > We will try this, and report back
>>>>> >
>>>>> >
>>>>> > Fande
>>>>> >
>>>>> > On Tue, Nov 9, 2021 at 5:41 PM Barry Smith >>>> bsm...@petsc.dev>> wrote:
>>>>> >
>>>>> >This version of MUMPS 

Re: [petsc-users] Error running make on MUMPS

2021-11-11 Thread Fande Kong
Thanks Matt,

I understand completely, the actual error should be

"
ln -s libHYPRE_parcsr_ls-2.20.0.so libHYPRE_parcsr_ls.so gmake[1]: Leaving
directory
`/beegfs1/home/anovak/cardinal/contrib/moose/petsc/arch-moose/externalpackages/git.hypre/src/parcsr_ls'
  Error running make; make install on HYPRE: Could not execute
"['/usr/bin/gmake install']":
"

The one you saw was that hypre automatically took the second try for
parcsr_ls after the first failed gmake install. Because the first try
already did  "ln -s libHYPRE_parcsr_ls-2.20.0.so libHYPRE_parcsr_ls.so",
and then the second try would see that "libHYPRE_parcsr_ls.so" already existed.

Thanks,
Fande

On Thu, Nov 11, 2021 at 1:29 PM Matthew Knepley  wrote:

> On Thu, Nov 11, 2021 at 3:25 PM Fande Kong  wrote:
>
>> Thanks, Satish
>>
>> "--with-make-np=1"  did help us on MUMPS, but we had new trouble with
>> hypre now.
>>
>> It is hard to understand why "gmake install" even failed.
>>
>
> Because HYPRE thinks it is better to use 'ln' than the 'install' script
> that handles things like the target existing:
>
> gmake[1]: Leaving directory
> `/beegfs1/home/anovak/cardinal/contrib/moose/petsc/arch-moose/externalpackages/git.hypre/src/parcsr_ls'
> ln: failed to create symbolic link ‘libHYPRE_parcsr_ls.so’: File exists
> gmake[1]: *** [libHYPRE_parcsr_ls.so] Error 1
> gmake: *** [all] Error 1
>
> I think everything needs to be cleaned out for Hypre to reinstall.
>
>   Thanks,
>
>  Matt
>
>
>> Please see the attachment for the log file.
>>
>> Thanks,
>>
>> Fande
>>
>> On Wed, Nov 10, 2021 at 12:16 PM Satish Balay  wrote:
>>
>>> You are using petsc-3.15.1 - and likely the mumps build change between
>>> then and current 3.16.
>>>
>>> Can you use latest PETSc release?
>>>
>>> If not - Suggest removing  --download-mumps=
>>> https://bitbucket.org/petsc/pkg-mumps.git
>>> --download-mumps-commit=v5.4.1-p1 options [and PETSC_ARCH] and going back
>>> to your old
>>> build.
>>>
>>> If it fails [as before] - retry with: --with-make-np=1
>>>
>>> Satish
>>>
>>> On Wed, 10 Nov 2021, Novak, April via petsc-users wrote:
>>>
>>> > Hi Barry,
>>> >
>>> > Thank you for your assistance - I’ve attached the latest
>>> configure.log. I still encounter issues building, though some of the MUMPS
>>> errors do seem to have been fixed with the --download-mumps-commit option.
>>> Do you have a recommendation for addressing these other errors?
>>> >
>>> > Thanks,
>>> >
>>> > -April
>>> >
>>> > From: Fande Kong 
>>> > Date: Wednesday, November 10, 2021 at 9:44 AM
>>> > To: Barry Smith 
>>> > Cc: PETSc users list , Novak, April <
>>> ano...@anl.gov>
>>> > Subject: Re: [petsc-users] Error running make on MUMPS
>>> > Thanks, Barry,
>>> >
>>> > We will try this, and report back
>>> >
>>> >
>>> > Fande
>>> >
>>> > On Tue, Nov 9, 2021 at 5:41 PM Barry Smith >> bsm...@petsc.dev>> wrote:
>>> >
>>> >This version of MUMPS has a bug in its build system; it does not
>>> have all the dependencies on Fortran modules properly listed so Fortran
>>> files can get compiled too early causing "random" failures during some
>>> builds, especially on machines with lots of cores for compiling.
>>> >
>>> >I think you should be able to use --download-mumps=
>>> https://bitbucket.org/petsc/pkg-mumps.git
>>> --download-mumps-commit=v5.4.1-p1 to get a patched version.
>>> >
>>> > Barry
>>> >
>>> >
>>> >
>>> > On Nov 9, 2021, at 6:10 PM, Fande Kong >> fdkong...@gmail.com>> wrote:
>>> >
>>> > Hi All,
>>> >
>>> > We encountered a configuration error when running the PETSc
>>> configuration on a HPC system.  Went through the log file, but could not
>>> find much. The log file was attached.
>>> >
>>> > Any thoughts?
>>> >
>>> > Thanks for your help, as always.
>>> >
>>> > Fande
>>> >
>>> >
>>> >
>>> >
>>> >
>>> >
>>> > 
>>> >
>>> >
>>>
>>
>
> --
> What most experimenters take for granted before they begin their
> experiments is infinitely more interesting than any results to which their
> experiments lead.
> -- Norbert Wiener
>
> https://www.cse.buffalo.edu/~knepley/
> <http://www.cse.buffalo.edu/~knepley/>
>


Re: [petsc-users] Error running make on MUMPS

2021-11-10 Thread Fande Kong
Thanks, Barry,

We will try this, and report back


Fande

On Tue, Nov 9, 2021 at 5:41 PM Barry Smith  wrote:

>
>This version of MUMPS has a bug in its build system; it does not have
> all the dependencies on Fortran modules properly listed so Fortran files
> can get compiled too early causing "random" failures during some builds,
> especially on machines with lots of cores for compiling.
>
>I think you should be able to use --download-mumps=
> https://bitbucket.org/petsc/pkg-mumps.git --download-mumps-commit=v5.4.1-p1
> to get a patched version.
>
> Barry
>
>
> On Nov 9, 2021, at 6:10 PM, Fande Kong  wrote:
>
> Hi All,
>
> We encountered a configuration error when running the PETSc
> configuration on a HPC system.  Went through the log file, but could not
> find much. The log file was attached.
>
> Any thoughts?
>
> Thanks for your help, as always.
>
> Fande
>
>
>
>
>
>
> 
>
>
>


Re: [petsc-users] MatZeroRows changes my sparsity pattern

2021-07-15 Thread Fande Kong
"if (a->keepnonzeropattern)" branch does not change ilen so that
A->ops->assemblyend will be fine. It would help if you made sure that
elements have been inserted for these rows before you call MatZeroRows.

However, I am not sure it is necessary to call  A->ops->assemblyend if we
already require a->keepnonzeropattern. That being said, we might have
something like this


diff --git a/src/mat/impls/aij/seq/aij.c b/src/mat/impls/aij/seq/aij.c
index 42c93a82b1..3f20a599d6 100644
--- a/src/mat/impls/aij/seq/aij.c
+++ b/src/mat/impls/aij/seq/aij.c
@@ -2203,7 +2203,9 @@ PetscErrorCode MatZeroRows_SeqAIJ(Mat A,PetscInt N,const PetscInt rows[],PetscSc
 #if defined(PETSC_HAVE_DEVICE)
   if (A->offloadmask != PETSC_OFFLOAD_UNALLOCATED) A->offloadmask = PETSC_OFFLOAD_CPU;
 #endif
-  ierr = (*A->ops->assemblyend)(A,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
+  if (!a->keepnonzeropattern) {
+    ierr = (*A->ops->assemblyend)(A,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
+  }
   PetscFunctionReturn(0);
 }
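
For completeness, the user-side pattern this is meant to support, as I
understand it (a sketch only, untested; ZeroRowsKeepPattern is a made-up
helper, A is an assembled AIJ matrix, and rows/nrows are whatever you want
zeroed; petscmat.h assumed):

static PetscErrorCode ZeroRowsKeepPattern(Mat A, PetscInt nrows, const PetscInt rows[])
{
  PetscErrorCode ierr;

  PetscFunctionBegin;
  /* keep explicit zeros in place so the sparsity pattern survives MatZeroRows */
  ierr = MatSetOption(A, MAT_KEEP_NONZERO_PATTERN, PETSC_TRUE);CHKERRQ(ierr);
  /* put 1.0 on the zeroed diagonals; no right-hand-side fix-up here */
  ierr = MatZeroRows(A, nrows, rows, 1.0, NULL, NULL);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}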


Fande

On Thu, Jul 15, 2021 at 9:30 AM Stefano Zampini 
wrote:

> Alexander
>
> Do you have a small code to reproduce the issue?
>
> Below is the output using a PETSc example (src/mat/tests/ex11). The
> pattern is kept.
>
> kl-18448:tests szampini$ ./ex11
> Mat Object: 1 MPI processes
>   type: seqaij
> row 0: (0, 5.)
> row 1: (0, -1.)  (1, 4.)  (2, -1.)  (6, -1.)
> row 2: (2, 5.)
> row 3: (2, -1.)  (3, 4.)  (4, -1.)  (8, -1.)
> row 4: (4, 5.)
> row 5: (0, -1.)  (5, 4.)  (6, -1.)  (10, -1.)
> row 6: (6, 5.)
> row 7: (2, -1.)  (6, -1.)  (7, 4.)  (8, -1.)  (12, -1.)
> row 8: (8, 5.)
> row 9: (4, -1.)  (8, -1.)  (9, 4.)  (14, -1.)
> row 10: (10, 5.)
> row 11: (6, -1.)  (10, -1.)  (11, 4.)  (12, -1.)  (16, -1.)
> row 12: (12, 5.)
> row 13: (8, -1.)  (12, -1.)  (13, 4.)  (14, -1.)  (18, -1.)
> row 14: (14, 5.)
> row 15: (10, -1.)  (15, 4.)  (16, -1.)  (20, -1.)
> row 16: (16, 5.)
> row 17: (12, -1.)  (16, -1.)  (17, 4.)  (18, -1.)  (22, -1.)
> row 18: (18, 5.)
> row 19: (14, -1.)  (18, -1.)  (19, 4.)  (24, -1.)
> row 20: (20, 5.)
> row 21: (16, -1.)  (20, -1.)  (21, 4.)  (22, -1.)
> row 22: (22, 5.)
> row 23: (18, -1.)  (22, -1.)  (23, 4.)  (24, -1.)
> row 24: (19, -1.)  (23, -1.)  (24, 4.)
> kl-18448:tests szampini$ ./ex11 -keep_nonzero_pattern
> Mat Object: 1 MPI processes
>   type: seqaij
> row 0: (0, 5.)  (1, 0.)  (5, 0.)
> row 1: (0, -1.)  (1, 4.)  (2, -1.)  (6, -1.)
> row 2: (1, 0.)  (2, 5.)  (3, 0.)  (7, 0.)
> row 3: (2, -1.)  (3, 4.)  (4, -1.)  (8, -1.)
> row 4: (3, 0.)  (4, 5.)  (9, 0.)
> row 5: (0, -1.)  (5, 4.)  (6, -1.)  (10, -1.)
> row 6: (1, 0.)  (5, 0.)  (6, 5.)  (7, 0.)  (11, 0.)
> row 7: (2, -1.)  (6, -1.)  (7, 4.)  (8, -1.)  (12, -1.)
> row 8: (3, 0.)  (7, 0.)  (8, 5.)  (9, 0.)  (13, 0.)
> row 9: (4, -1.)  (8, -1.)  (9, 4.)  (14, -1.)
> row 10: (5, 0.)  (10, 5.)  (11, 0.)  (15, 0.)
> row 11: (6, -1.)  (10, -1.)  (11, 4.)  (12, -1.)  (16, -1.)
> row 12: (7, 0.)  (11, 0.)  (12, 5.)  (13, 0.)  (17, 0.)
> row 13: (8, -1.)  (12, -1.)  (13, 4.)  (14, -1.)  (18, -1.)
> row 14: (9, 0.)  (13, 0.)  (14, 5.)  (19, 0.)
> row 15: (10, -1.)  (15, 4.)  (16, -1.)  (20, -1.)
> row 16: (11, 0.)  (15, 0.)  (16, 5.)  (17, 0.)  (21, 0.)
> row 17: (12, -1.)  (16, -1.)  (17, 4.)  (18, -1.)  (22, -1.)
> row 18: (13, 0.)  (17, 0.)  (18, 5.)  (19, 0.)  (23, 0.)
> row 19: (14, -1.)  (18, -1.)  (19, 4.)  (24, -1.)
> row 20: (15, 0.)  (20, 5.)  (21, 0.)
> row 21: (16, -1.)  (20, -1.)  (21, 4.)  (22, -1.)
> row 22: (17, 0.)  (21, 0.)  (22, 5.)  (23, 0.)
> row 23: (18, -1.)  (22, -1.)  (23, 4.)  (24, -1.)
> row 24: (19, -1.)  (23, -1.)  (24, 4.)
>
> On Jul 15, 2021, at 4:41 PM, Alexander Lindsay 
> wrote:
>
> My interpretation of the documentation page of MatZeroRows is that if I've
> set MAT_KEEP_NONZERO_PATTERN to true, then my sparsity pattern shouldn't be
> changed by a call to it, e.g. a->imax should not change. However, at least
> for sequential matrices, MatAssemblyEnd is called with MAT_FINAL_ASSEMBLY
> at the end of MatZeroRows_SeqAIJ and that does indeed change my sparsity
> pattern. Is my interpretation of the documentation page wrong?
>
> Alex
>
>
>


Re: [petsc-users] MUMPS failure

2021-03-27 Thread Fande Kong
There are some statements from MUMPS user manual
http://mumps.enseeiht.fr/doc/userguide_5.3.5.pdf

"
A full 64-bit integer version can be obtained compiling MUMPS with C
preprocessing flag -DINTSIZE64 and Fortran compiler option -i8,
-fdefault-integer-8 or something equivalent depending on your compiler, and
compiling all libraries including MPI, BLACS, ScaLAPACK, LAPACK and BLAS
also with 64-bit integers. We refer the reader to the “INSTALL” file
provided with the package for details and explanations of the compilation
flags controlling integer sizes.
"

It seems possible to build a full-64-bit-integer version of MUMPS. However,
I do not understand how to build MPI with 64-bit integer support. From my
understanding, MPI is hard coded with an integer type (int), and there is
no way to make "int" become "long".
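
To be concrete about "hard coded": the MPI C bindings fix the count arguments
to plain int, e.g. (prototype paraphrased from mpi.h)

  /* count is a C int no matter which Fortran flags (-i8, -fdefault-integer-8) are used */
  int MPI_Send(const void *buf, int count, MPI_Datatype datatype, int dest, int tag, MPI_Comm comm);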


Thanks,

Fande


On Tue, Mar 23, 2021 at 12:20 PM Sanjay Govindjee  wrote:

> I agree.  If you are mixing C and Fortran, everything is *nota bene*.  It
> is easy to miss argument mismatches.
> -sanjay
>
> On 3/23/21 11:04 AM, Barry Smith wrote:
>
>
>In a pure Fortran code using -fdefault-integer-8 is probably fine. But
> MUMPS is a mixture of Fortran and C code and PETSc uses MUMPs C interface.
> The  -fdefault-integer-8 doesn't magically fix anything in the C parts of
> MUMPS.  I also don't know about MPI calls and if they would need editing.
>
>I am not saying it is impossible to get it to work but one needs are to
> insure the C portions also switch to 64 bit integers in a consistent way.
> This may be all doable bit is not simply using -fdefault-integer-8 on MUMPS.
>
>   Barry
>
>
> On Mar 23, 2021, at 12:07 AM, Sanjay Govindjee  wrote:
>
> Barry,
> I am curious about your statement "does not work generically".  If I
> compile with -fdefault-integer-8,
> I would assume that this produces objects/libraries that will use 64bit
> integers.  As long as I have not declared
> explicit kind=4 integers, what else could go wrong.
> -sanjay
>
> PS: I am not advocating this as a great idea, but I am curious if there or
> other obscure compiler level things that could go wrong.
>
>
> On 3/22/21 8:53 PM, Barry Smith wrote:
>
>
>
> On Mar 22, 2021, at 3:24 PM, Junchao Zhang 
> wrote:
>
>
>
>
> On Mon, Mar 22, 2021 at 1:39 PM Barry Smith  wrote:
>
>>
>>Version of PETSc and MUMPS? We fixed a bug in MUMPs a couple years ago
>> that produced error messages as below. Please confirm you are using the
>> latest PETSc and MUMPS.
>>
>>You can run your production version with the option -malloc_debug ;
>> this will slow it down a bit but if there is memory corruption it may
>> detect it and indicate the problematic error.
>>
>> One also has to be careful about the size of the problem passed to
>> MUMPs since PETSc/MUMPs does not fully support using all 64 bit integers.
>> Is it only crashing for problems near 2 billion entries in the sparse
>> matrix?
>>
>  "problems near 2 billion entries"?  I don't understand. Should not be an
> issue if building petsc with 64-bit indices.
>
>
>   MUMPS does not have proper support for 64 bit indices. It relies on
> ad hoc Fortran compiler command line options to support converting
> integers to 64 bit integers and does not work generically. Yes, Fortran
> lovers have been doing this for 30 years inside their applications but it
> does not really work in a library environment. But then a big feature of
> Fortran is "who needs libraries, we just write all the code we need"
> (except Eispack,Linpack,LAPACK :=-).
>
>
>
>>  valgrind is the gold standard for detecting memory corruption.
>>
>> Barry
>>
>>
>> On Mar 22, 2021, at 12:56 PM, Chris Hewson  wrote:
>>
>> Hi All,
>>
>> I have been having a problem with MUMPS randomly crashing in our program
>> and causing the entire program to crash. I am compiling in -O2 optimization
>> mode and using --download-mumps etc. to compile PETSc. If I rerun the
>> program, 95%+ of the time I can't reproduce the error. It seems to be a
>> similar issue to this thread:
>>
>> https://lists.mcs.anl.gov/pipermail/petsc-users/2018-October/036372.html
>>
>> Similar to the resolution there I am going to try and increase icntl_14
>> and see if that resolves the issue. Any other thoughts on this?
>>
>> Thanks,
>>
>> *Chris Hewson*
>> Senior Reservoir Simulation Engineer
>> ResFrac
>> +1.587.575.9792
>>
>>
>>
>
>
>
>


Re: [petsc-users] Exhausted all shared linker guesses. Could not determine how to create a shared library!

2021-03-10 Thread Fande Kong
Thanks Barry,

Your branch works very well. Thanks for your help!!!

Could you merge it to upstream?

Fande

On Wed, Mar 10, 2021 at 6:30 PM Barry Smith  wrote:

>
>   Fande,
>
>  Before send the files I requested in my last email could you try with
> the branch *barry/2021-03-10/handle-pie-flag-conda/release *and send its
> configure.log if it fails.
>
>Thanks
>
> Barry
>
>
> On Mar 10, 2021, at 5:59 PM, Fande Kong  wrote:
>
> Do not know what the fix should look like, but this works for me
>
>
>  @staticmethod
> @@ -1194,7 +1194,6 @@ class Configure(config.base.Configure):
>  output.find('unrecognized command line option') >= 0 or
> output.find('unrecognized option') >= 0 or output.find('unrecognised
> option') >= 0 or
>  output.find('not recognized') >= 0 or output.find('not recognised')
> >= 0 or
>  output.find('unknown option') >= 0 or output.find('unknown flag') >=
> 0 or output.find('Unknown switch') >= 0 or
> -output.find('ignoring option') >= 0 or output.find('ignored') >= 0 or
>  output.find('argument unused') >= 0 or output.find('not supported')
> >= 0 or
>  # When checking for the existence of 'attribute'
>      output.find('is unsupported and will be skipped') >= 0 or
>
>
>
> Thanks,
>
> Fande
>
> On Wed, Mar 10, 2021 at 4:21 PM Fande Kong  wrote:
>
>>
>>
>> On Wed, Mar 10, 2021 at 1:36 PM Satish Balay  wrote:
>>
>>> Can you use a different MPI for this conda install?
>>>
>>
>> We control how to build MPI. If I take "-pie" options out of LDFLAGS,
>> conda can not compile mpich.
>>
>>
>>
>>
>>>
>>> Alternative:
>>>
>>> ./configure CC=x86_64-apple-darwin13.4.0-clang COPTFLAGS="-march=core2
>>> -mtune=haswell" CPPFLAGS=-I/Users/kongf/miniconda3/envs/testpetsc/include
>>>  LDFLAGS="-Wl,-headerpad_max_install_names -Wl,-dead_strip_dylibs
>>> -Wl,-commons,use_dylibs"
>>> LIBS="-Wl,-rpath,/Users/kongf/miniconda3/envs/testpetsc/lib -lmpi -lpmpi"
>>>
>>
>> MPI can not generate an executable because we took out "-pie".
>>
>> Thanks,
>>
>> Fande
>>
>>
>>>
>>> etc.. [don't know if you really need LDFLAGS options]
>>>
>>> Satish
>>>
>>> On Wed, 10 Mar 2021, Fande Kong wrote:
>>>
>>> > I guess it was encoded in mpicc
>>> >
>>> > petsc % mpicc -show
>>> > x86_64-apple-darwin13.4.0-clang -march=core2 -mtune=haswell -Wl,-pie
>>> > -Wl,-headerpad_max_install_names -Wl,-dead_strip_dylibs
>>> > -Wl,-rpath,/Users/kongf/miniconda3/envs/testpetsc/lib
>>> > -L/Users/kongf/miniconda3/envs/testpetsc/lib -Wl,-commons,use_dylibs
>>> > -I/Users/kongf/miniconda3/envs/testpetsc/include
>>> > -L/Users/kongf/miniconda3/envs/testpetsc/lib -lmpi -lpmpi
>>> >
>>> >
>>> > Thanks,
>>> >
>>> > Fande
>>> >
>>> > On Wed, Mar 10, 2021 at 12:51 PM Satish Balay 
>>> wrote:
>>> >
>>> > > > LDFLAGS_LD=-pie -headerpad_max_install_names -dead_strip_dylibs
>>> -rpath
>>> > > /Users/kongf/miniconda3/envs/testpetsc/lib
>>> > > -L/Users/kongf/miniconda3/envs/testpetsc/lib
>>> > >
>>> > > Does conda compiler pick up '-pie' from this env variable? If so -
>>> perhaps
>>> > > its easier to just modify it?
>>> > >
>>> > > Or is it encoded in mpicc wrapper? [mpicc -show]
>>> > >
>>> > > Satish
>>> > >
>>> > > On Wed, 10 Mar 2021, Fande Kong wrote:
>>> > >
>>> > > > Thanks Barry,
>>> > > >
>>> > > > Got the same result, but  "-pie" was not filtered out somehow.
>>> > > >
>>> > > > I did changes like this:
>>> > > >
>>> > > > kongf@x86_64-apple-darwin13 petsc % git diff
>>> > > > diff --git a/config/BuildSystem/config/framework.py
>>> > > > b/config/BuildSystem/config/framework.py
>>> > > > index beefe82956..c31fbeb95e 100644
>>> > > > --- a/config/BuildSystem/config/framework.py
>>> > > > +++ b/config/BuildSystem/config/framework.py
>>> > > > @@ -504,6 +504,8 @@ class Framework(config.base.Configure,
>>> > > > script.LanguageProcessor):
>>> > 

Re: [petsc-users] Exhausted all shared linker guesses. Could not determine how to create a shared library!

2021-03-10 Thread Fande Kong
Do not know what the fix should look like, but this works for me


 @staticmethod
@@ -1194,7 +1194,6 @@ class Configure(config.base.Configure):
 output.find('unrecognized command line option') >= 0 or
output.find('unrecognized option') >= 0 or output.find('unrecognised
option') >= 0 or
 output.find('not recognized') >= 0 or output.find('not recognised') >=
0 or
 output.find('unknown option') >= 0 or output.find('unknown flag') >= 0
or output.find('Unknown switch') >= 0 or
-output.find('ignoring option') >= 0 or output.find('ignored') >= 0 or
 output.find('argument unused') >= 0 or output.find('not supported') >=
0 or
 # When checking for the existence of 'attribute'
 output.find('is unsupported and will be skipped') >= 0 or



Thanks,

Fande

On Wed, Mar 10, 2021 at 4:21 PM Fande Kong  wrote:

>
>
> On Wed, Mar 10, 2021 at 1:36 PM Satish Balay  wrote:
>
>> Can you use a different MPI for this conda install?
>>
>
> We control how to build MPI. If I take "-pie" options out of LDFLAGS,
> conda can not compile mpich.
>
>
>
>
>>
>> Alternative:
>>
>> ./configure CC=x86_64-apple-darwin13.4.0-clang COPTFLAGS="-march=core2
>> -mtune=haswell" CPPFLAGS=-I/Users/kongf/miniconda3/envs/testpetsc/include
>>  LDFLAGS="-Wl,-headerpad_max_install_names -Wl,-dead_strip_dylibs
>> -Wl,-commons,use_dylibs"
>> LIBS="-Wl,-rpath,/Users/kongf/miniconda3/envs/testpetsc/lib -lmpi -lpmpi"
>>
>
> MPI can not generate an executable because we took out "-pie".
>
> Thanks,
>
> Fande
>
>
>>
>> etc.. [don't know if you really need LDFLAGS options]
>>
>> Satish
>>
>> On Wed, 10 Mar 2021, Fande Kong wrote:
>>
>> > I guess it was encoded in mpicc
>> >
>> > petsc % mpicc -show
>> > x86_64-apple-darwin13.4.0-clang -march=core2 -mtune=haswell -Wl,-pie
>> > -Wl,-headerpad_max_install_names -Wl,-dead_strip_dylibs
>> > -Wl,-rpath,/Users/kongf/miniconda3/envs/testpetsc/lib
>> > -L/Users/kongf/miniconda3/envs/testpetsc/lib -Wl,-commons,use_dylibs
>> > -I/Users/kongf/miniconda3/envs/testpetsc/include
>> > -L/Users/kongf/miniconda3/envs/testpetsc/lib -lmpi -lpmpi
>> >
>> >
>> > Thanks,
>> >
>> > Fande
>> >
>> > On Wed, Mar 10, 2021 at 12:51 PM Satish Balay 
>> wrote:
>> >
>> > > > LDFLAGS_LD=-pie -headerpad_max_install_names -dead_strip_dylibs
>> -rpath
>> > > /Users/kongf/miniconda3/envs/testpetsc/lib
>> > > -L/Users/kongf/miniconda3/envs/testpetsc/lib
>> > >
>> > > Does conda compiler pick up '-pie' from this env variable? If so -
>> perhaps
>> > > its easier to just modify it?
>> > >
>> > > Or is it encoded in mpicc wrapper? [mpicc -show]
>> > >
>> > > Satish
>> > >
>> > > On Wed, 10 Mar 2021, Fande Kong wrote:
>> > >
>> > > > Thanks Barry,
>> > > >
>> > > > Got the same result, but  "-pie" was not filtered out somehow.
>> > > >
>> > > > I did changes like this:
>> > > >
>> > > > kongf@x86_64-apple-darwin13 petsc % git diff
>> > > > diff --git a/config/BuildSystem/config/framework.py
>> > > > b/config/BuildSystem/config/framework.py
>> > > > index beefe82956..c31fbeb95e 100644
>> > > > --- a/config/BuildSystem/config/framework.py
>> > > > +++ b/config/BuildSystem/config/framework.py
>> > > > @@ -504,6 +504,8 @@ class Framework(config.base.Configure,
>> > > > script.LanguageProcessor):
>> > > > lines = [s for s in lines if s.find('Load a valid targeting
>> module or
>> > > > set CRAY_CPU_TARGET') < 0]
>> > > > # pgi dumps filename on stderr - but returns 0 errorcode'
>> > > > lines = [s for s in lines if lines != 'conftest.c:']
>> > > > +   # in case -pie is always being passed to linker
>> > > > +   lines = [s for s in lines if s.find('-pie being ignored. It is
>> only
>> > > > used when linking a main executable') < 0]
>> > > > if lines: output = reduce(lambda s, t: s+t, lines, '\n')
>> > > > else: output = ''
>> > > > log.write("Linker stderr after filtering:\n"+output+":\n")
>> > > >
>> > > > The log was attached again.
>> > > >
>> > > > Th

Re: [petsc-users] Exhausted all shared linker guesses. Could not determine how to create a shared library!

2021-03-10 Thread Fande Kong
On Wed, Mar 10, 2021 at 1:36 PM Satish Balay  wrote:

> Can you use a different MPI for this conda install?
>

We control how to build MPI. If I take "-pie" options out of LDFLAGS, conda
can not compile mpich.




>
> Alternative:
>
> ./configure CC=x86_64-apple-darwin13.4.0-clang COPTFLAGS="-march=core2
> -mtune=haswell" CPPFLAGS=-I/Users/kongf/miniconda3/envs/testpetsc/include
>  LDFLAGS="-Wl,-headerpad_max_install_names -Wl,-dead_strip_dylibs
> -Wl,-commons,use_dylibs"
> LIBS="-Wl,-rpath,/Users/kongf/miniconda3/envs/testpetsc/lib -lmpi -lpmpi"
>

MPI can not generate an executable because we took out "-pie".

Thanks,

Fande


>
> etc.. [don't know if you really need LDFLAGS options]
>
> Satish
>
> On Wed, 10 Mar 2021, Fande Kong wrote:
>
> > I guess it was encoded in mpicc
> >
> > petsc % mpicc -show
> > x86_64-apple-darwin13.4.0-clang -march=core2 -mtune=haswell -Wl,-pie
> > -Wl,-headerpad_max_install_names -Wl,-dead_strip_dylibs
> > -Wl,-rpath,/Users/kongf/miniconda3/envs/testpetsc/lib
> > -L/Users/kongf/miniconda3/envs/testpetsc/lib -Wl,-commons,use_dylibs
> > -I/Users/kongf/miniconda3/envs/testpetsc/include
> > -L/Users/kongf/miniconda3/envs/testpetsc/lib -lmpi -lpmpi
> >
> >
> > Thanks,
> >
> > Fande
> >
> > On Wed, Mar 10, 2021 at 12:51 PM Satish Balay  wrote:
> >
> > > > LDFLAGS_LD=-pie -headerpad_max_install_names -dead_strip_dylibs
> -rpath
> > > /Users/kongf/miniconda3/envs/testpetsc/lib
> > > -L/Users/kongf/miniconda3/envs/testpetsc/lib
> > >
> > > Does conda compiler pick up '-pie' from this env variable? If so -
> perhaps
> > > its easier to just modify it?
> > >
> > > Or is it encoded in mpicc wrapper? [mpicc -show]
> > >
> > > Satish
> > >
> > > On Wed, 10 Mar 2021, Fande Kong wrote:
> > >
> > > > Thanks Barry,
> > > >
> > > > Got the same result, but  "-pie" was not filtered out somehow.
> > > >
> > > > I did changes like this:
> > > >
> > > > kongf@x86_64-apple-darwin13 petsc % git diff
> > > > diff --git a/config/BuildSystem/config/framework.py
> > > > b/config/BuildSystem/config/framework.py
> > > > index beefe82956..c31fbeb95e 100644
> > > > --- a/config/BuildSystem/config/framework.py
> > > > +++ b/config/BuildSystem/config/framework.py
> > > > @@ -504,6 +504,8 @@ class Framework(config.base.Configure,
> > > > script.LanguageProcessor):
> > > > lines = [s for s in lines if s.find('Load a valid targeting
> module or
> > > > set CRAY_CPU_TARGET') < 0]
> > > > # pgi dumps filename on stderr - but returns 0 errorcode'
> > > > lines = [s for s in lines if lines != 'conftest.c:']
> > > > +   # in case -pie is always being passed to linker
> > > > +   lines = [s for s in lines if s.find('-pie being ignored. It is
> only
> > > > used when linking a main executable') < 0]
> > > > if lines: output = reduce(lambda s, t: s+t, lines, '\n')
> > > > else: output = ''
> > > > log.write("Linker stderr after filtering:\n"+output+":\n")
> > > >
> > > > The log was attached again.
> > > >
> > > > Thanks,
> > > >
> > > > Fande
> > > >
> > > >
> > > > On Wed, Mar 10, 2021 at 12:05 PM Barry Smith 
> wrote:
> > > >
> > > > >  Fande,
> > > > >
> > > > >Please add in config/BuildSystem/config/framework.py line 528
> two
> > > new
> > > > > lines
> > > > >
> > > > >   # pgi dumps filename on stderr - but returns 0 errorcode'
> > > > >   lines = [s for s in lines if lines != 'conftest.c:']
> > > > >   # in case -pie is always being passed to linker
> > > > >   lines = [s for s in lines if s.find('-pie being ignored. It
> is
> > > only
> > > > > used when linking a main executable') < 0]
> > > > >
> > > > >Barry
> > > > >
> > > > >You have (another of Conda's "take over the world my way"
> approach)
> > > > >
> > > > >LDFLAGS_LD=-pie -headerpad_max_install_names -dead_strip_dylibs
> > > -rpath
> > > > > /Users/kongf/miniconda3/envs/testpetsc/lib
> > > > > -L/Users/kongf/mi

Re: [petsc-users] Exhausted all shared linker guesses. Could not determine how to create a shared library!

2021-03-10 Thread Fande Kong
On Wed, Mar 10, 2021 at 12:05 PM Barry Smith  wrote:

>  Fande,
>
>Please add in config/BuildSystem/config/framework.py line 528 two new
> lines
>
>   # pgi dumps filename on stderr - but returns 0 errorcode'
>   lines = [s for s in lines if lines != 'conftest.c:']
>   # in case -pie is always being passed to linker
>   lines = [s for s in lines if s.find('-pie being ignored. It is only
> used when linking a main executable') < 0]
>
>Barry
>
>You have (another of Conda's "take over the world my way" approach)
>
>LDFLAGS_LD=-pie -headerpad_max_install_names -dead_strip_dylibs -rpath
> /Users/kongf/miniconda3/envs/testpetsc/lib
> -L/Users/kongf/miniconda3/envs/testpetsc/lib
>
> Executing: mpicc  -o
> /var/folders/tv/ljnkj46x3nq45cp9tbkc000cgn/T/petsc-pkset22y/config.setCompilers/conftest
>  -dynamiclib -single_module
> /var/folders/tv/ljnkj46x3nq45cp9tbkc000cgn/T/petsc-pkset22y/config.setCompilers/conftest.o
> Possible ERROR while running linker:
> stderr:
> ld: warning: -pie being ignored. It is only used when linking a main
> executable
> Rejecting C linker flag -dynamiclib -single_module due to
>
> ld: warning: -pie being ignored. It is only used when linking a main
> executable
>
> This is the correct link command for the Mac but it is being rejected due
> to the warning message.
>

Could we somehow skip warning messages?



Fande


>
>
> On Mar 10, 2021, at 10:11 AM, Fande Kong  wrote:
>
> Thanks, Barry,
>
> It seems PETSc works fine with manually built compilers. We are pretty
> much sure that the issue is related to conda. Conda might introduce extra
> flags.
>
> We still need to make it work with conda because we deliver our package
> via conda for users.
>
>
> I unset all flags from conda, and got slightly different results this
> time.  The log was attached. Could anyone explain the motivation for trying
> to build an executable without a main function?
>
> Thanks,
>
> Fande
>
> Executing: mpicc -c -o
> /var/folders/tv/ljnkj46x3nq45cp9tbkc000cgn/T/petsc-pkset22y/config.setCompilers/conftest.o
> -I/var/folders/tv/ljnkj46x3nq45cp9tbkc000cgn/T/petsc-pkset22y/config.setCompilers
>  -fPIC
>  
> /var/folders/tv/ljnkj46x3nq45cp9tbkc000cgn/T/petsc-pkset22y/config.setCompilers/conftest.c
>
> Successful compile:
> Source:
> #include "confdefs.h"
> #include "conffix.h"
> #include <stdio.h>
> int (*fprintf_ptr)(FILE*,const char*,...) = fprintf;
> void  foo(void){
>   fprintf_ptr(stdout,"hello");
>   return;
> }
> void bar(void){foo();}
> Running Executable WITHOUT threads to time it out
> Executing: mpicc  -o
> /var/folders/tv/ljnkj46x3nq45cp9tbkc000cgn/T/petsc-pkset22y/config.setCompilers/libconftest.so
>  -dynamic  -fPIC
> /var/folders/tv/ljnkj46x3nq45cp9tbkc000cgn/T/petsc-pkset22y/config.setCompilers/conftest.o
>
> Possible ERROR while running linker: exit code 1
> stderr:
> Undefined symbols for architecture x86_64:
>   "_main", referenced from:
>  implicit entry/start for main executable
> ld: symbol(s) not found for architecture x86_64
> clang-11: error: linker command failed with exit code 1 (use -v to see
> invocation)
>   Rejected C compiler flag -fPIC because it was not compatible
> with shared linker mpicc using flags ['-dynamic']
>
>
> On Mon, Mar 8, 2021 at 7:28 PM Barry Smith  wrote:
>
>>
>>   Fande,
>>
>>  I see you are using CONDA, this can cause issues since it sticks all
>> kinds of things into the environment. PETSc tries to remove some of them
>> but perhaps not enough. If you run printenv you will see all the mess it is
>> dumping in.
>>
>> Can you trying the same build without CONDA environment?
>>
>>   Barry
>>
>>
>> On Mar 8, 2021, at 7:31 PM, Matthew Knepley  wrote:
>>
>> On Mon, Mar 8, 2021 at 8:23 PM Fande Kong  wrote:
>>
>>> Thanks Matthew,
>>>
>>> Hmm, we still have the same issue after shutting off all unknown flags.
>>>
>>
>> Oh, I was misinterpreting the error message:
>>
>>   ld: can't link with a main executable file
>> '/var/folders/tv/ljnkj46x3nq45cp9tbkc000cgn/T/petsc-6v1w4q4u/config.setCompilers/libconftest.dylib'
>>
>> So clang did not _actually_ make a shared library, it made an executable.
>> Did clang-11 change the options it uses to build a shared library?
>>
>> Satish, do we test with clang-11?
>>
>>   Thanks,
>>
>>   Matt
>>
>> Thanks,
>>>
>>> Fande
>>>
>>> On Mon, Mar 8, 2

Re: [petsc-users] Exhausted all shared linker guesses. Could not determine how to create a shared library!

2021-03-10 Thread Fande Kong
I guess it was encoded in mpicc

petsc % mpicc -show
x86_64-apple-darwin13.4.0-clang -march=core2 -mtune=haswell -Wl,-pie
-Wl,-headerpad_max_install_names -Wl,-dead_strip_dylibs
-Wl,-rpath,/Users/kongf/miniconda3/envs/testpetsc/lib
-L/Users/kongf/miniconda3/envs/testpetsc/lib -Wl,-commons,use_dylibs
-I/Users/kongf/miniconda3/envs/testpetsc/include
-L/Users/kongf/miniconda3/envs/testpetsc/lib -lmpi -lpmpi


Thanks,

Fande

On Wed, Mar 10, 2021 at 12:51 PM Satish Balay  wrote:

> > LDFLAGS_LD=-pie -headerpad_max_install_names -dead_strip_dylibs -rpath
> /Users/kongf/miniconda3/envs/testpetsc/lib
> -L/Users/kongf/miniconda3/envs/testpetsc/lib
>
> Does conda compiler pick up '-pie' from this env variable? If so - perhaps
> its easier to just modify it?
>
> Or is it encoded in mpicc wrapper? [mpicc -show]
>
> Satish
>
> On Wed, 10 Mar 2021, Fande Kong wrote:
>
> > Thanks Barry,
> >
> > Got the same result, but  "-pie" was not filtered out somehow.
> >
> > I did changes like this:
> >
> > kongf@x86_64-apple-darwin13 petsc % git diff
> > diff --git a/config/BuildSystem/config/framework.py
> > b/config/BuildSystem/config/framework.py
> > index beefe82956..c31fbeb95e 100644
> > --- a/config/BuildSystem/config/framework.py
> > +++ b/config/BuildSystem/config/framework.py
> > @@ -504,6 +504,8 @@ class Framework(config.base.Configure,
> > script.LanguageProcessor):
> > lines = [s for s in lines if s.find('Load a valid targeting module or
> > set CRAY_CPU_TARGET') < 0]
> > # pgi dumps filename on stderr - but returns 0 errorcode'
> > lines = [s for s in lines if lines != 'conftest.c:']
> > +   # in case -pie is always being passed to linker
> > +   lines = [s for s in lines if s.find('-pie being ignored. It is only
> > used when linking a main executable') < 0]
> > if lines: output = reduce(lambda s, t: s+t, lines, '\n')
> > else: output = ''
> > log.write("Linker stderr after filtering:\n"+output+":\n")
> >
> > The log was attached again.
> >
> > Thanks,
> >
> > Fande
> >
> >
> > On Wed, Mar 10, 2021 at 12:05 PM Barry Smith  wrote:
> >
> > >  Fande,
> > >
> > >Please add in config/BuildSystem/config/framework.py line 528 two
> new
> > > lines
> > >
> > >   # pgi dumps filename on stderr - but returns 0 errorcode'
> > >   lines = [s for s in lines if lines != 'conftest.c:']
> > >   # in case -pie is always being passed to linker
> > >   lines = [s for s in lines if s.find('-pie being ignored. It is
> only
> > > used when linking a main executable') < 0]
> > >
> > >Barry
> > >
> > >You have (another of Conda's "take over the world my way" approach)
> > >
> > >LDFLAGS_LD=-pie -headerpad_max_install_names -dead_strip_dylibs
> -rpath
> > > /Users/kongf/miniconda3/envs/testpetsc/lib
> > > -L/Users/kongf/miniconda3/envs/testpetsc/lib
> > >
> > > Executing: mpicc  -o
> > >
> /var/folders/tv/ljnkj46x3nq45cp9tbkc000cgn/T/petsc-pkset22y/config.setCompilers/conftest
> > >  -dynamiclib -single_module
> > >
> /var/folders/tv/ljnkj46x3nq45cp9tbkc000cgn/T/petsc-pkset22y/config.setCompilers/conftest.o
> > > Possible ERROR while running linker:
> > > stderr:
> > > ld: warning: -pie being ignored. It is only used when linking a main
> > > executable
> > > Rejecting C linker flag -dynamiclib -single_module due to
> > >
> > > ld: warning: -pie being ignored. It is only used when linking a main
> > > executable
> > >
> > > This is the correct link command for the Mac but it is being rejected
> due
> > > to the warning message.
> > >
> > >
> > > On Mar 10, 2021, at 10:11 AM, Fande Kong  wrote:
> > >
> > > Thanks, Barry,
> > >
> > > It seems PETSc works fine with manually built compilers. We are pretty
> > > much sure that the issue is related to conda. Conda might introduce
> extra
> > > flags.
> > >
> > > We still need to make it work with conda because we deliver our package
> > > via conda for users.
> > >
> > >
> > > I unset all flags from conda, and got slightly different results this
> > > time.  The log was attached. Anyone could  explain the motivation that
> we
> > > try to build executable without a main function?
> > >
> > > Thanks,
> > >
> > > Fan

Re: [petsc-users] PetscAllreduceBarrierCheck is valgrind clean?

2021-01-13 Thread Fande Kong
On Wed, Jan 13, 2021 at 11:49 AM Barry Smith  wrote:

>
>   Fande,
>
>  Look at
> https://scm.mvapich.cse.ohio-state.edu/svn/mpi/mvapich2/trunk/src/mpid/ch3/channels/common/src/detect/arch/mv2_arch_detect.c
>
>  cpubind_set = hwloc_bitmap_alloc();
>
>  but I don't find a corresponding hwloc_bitmap_free(cpubind_set ); in
> get_socket_bound_info().
>

Thanks. I added hwloc_bitmap_free(cpubind_set) to the end of
get_socket_bound_info(), and these valgrind messages disappeared.
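
(In patch form the local change was roughly the following one line, with the
exact surrounding lines omitted:

  /* free the bitmap obtained from hwloc_bitmap_alloc() earlier in get_socket_bound_info() */
  hwloc_bitmap_free(cpubind_set);
)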

Will ask mvapich developers to fix this.

Thanks,

Fande,


>
>
>   Barry
>
>
> >
>
> > On Jan 13, 2021, at 12:32 PM, Fande Kong  wrote:
> >
> > Hi All,
> >
> > I ran valgrind with mvapich-2.3.5 for a moose simulation.  The
> motivation was that we have a few non-deterministic parallel simulations in
> moose. I want to check if we have any memory issues. I got some complaints
> from PetscAllreduceBarrierCheck
> >
> > Thanks,
> >
> >
> > Fande
> >
> >
> >
> > ==98001== 88 (24 direct, 64 indirect) bytes in 1 blocks are definitely
> lost in loss record 31 of 54
> > ==98001==at 0x4C29F73: malloc (vg_replace_malloc.c:307)
> > ==98001==by 0xDAE1D5E: hwloc_bitmap_alloc (bitmap.c:74)
> > ==98001==by 0xDA7523F: get_socket_bound_info (mv2_arch_detect.c:898)
> > ==98001==by 0xD93C87A: create_intra_sock_comm
> (create_2level_comm.c:593)
> > ==98001==by 0xD93BEBA: create_2level_comm (create_2level_comm.c:1762)
> > ==98001==by 0xD59A894: mv2_increment_shmem_coll_counter
> (ch3_shmem_coll.c:2183)
> > ==98001==by 0xD4E4CBB: PMPI_Allreduce (allreduce.c:912)
> > ==98001==by 0x99F1766: PetscAllreduceBarrierCheck (pbarrier.c:26)
> > ==98001==by 0x99F70BE: PetscSplitOwnership (psplit.c:84)
> > ==98001==by 0x9C5C26B: PetscLayoutSetUp (pmap.c:262)
> > ==98001==by 0xA08C66B: MatMPIAdjSetPreallocation_MPIAdj
> (mpiadj.c:630)
> > ==98001==by 0xA08EB9A: MatMPIAdjSetPreallocation (mpiadj.c:856)
> > ==98001==by 0xA08F6D3: MatCreateMPIAdj (mpiadj.c:904)
> >
> >
> > ==98001== 88 (24 direct, 64 indirect) bytes in 1 blocks are definitely
> lost in loss record 32 of 54
> > ==98001==at 0x4C29F73: malloc (vg_replace_malloc.c:307)
> > ==98001==by 0xDAE1D5E: hwloc_bitmap_alloc (bitmap.c:74)
> > ==98001==by 0xDA7523F: get_socket_bound_info (mv2_arch_detect.c:898)
> > ==98001==by 0xD93C87A: create_intra_sock_comm
> (create_2level_comm.c:593)
> > ==98001==by 0xD93BEBA: create_2level_comm (create_2level_comm.c:1762)
> > ==98001==by 0xD59A9A4: mv2_increment_allgather_coll_counter
> (ch3_shmem_coll.c:2218)
> > ==98001==by 0xD4E4CE4: PMPI_Allreduce (allreduce.c:917)
> > ==98001==by 0xCD9D74D: libparmetis__gkMPI_Allreduce (gkmpi.c:103)
> > ==98001==by 0xCDBB663: libparmetis__ComputeParallelBalance
> (stat.c:87)
> > ==98001==by 0xCDA4FE0: libparmetis__KWayFM (kwayrefine.c:352)
> > ==98001==by 0xCDA21ED: libparmetis__Global_Partition (kmetis.c:222)
> > ==98001==by 0xCDA20B2: libparmetis__Global_Partition (kmetis.c:191)
> > ==98001==by 0xCDA20B2: libparmetis__Global_Partition (kmetis.c:191)
> > ==98001==by 0xCDA20B2: libparmetis__Global_Partition (kmetis.c:191)
> > ==98001==by 0xCDA20B2: libparmetis__Global_Partition (kmetis.c:191)
> > ==98001==by 0xCDA2748: ParMETIS_V3_PartKway (kmetis.c:94)
> > ==98001==by 0xA2D6B39: MatPartitioningApply_Parmetis_Private
> (pmetis.c:145)
> > ==98001==by 0xA2D77D9: MatPartitioningApply_Parmetis (pmetis.c:219)
> > ==98001==by 0xA2CD46A: MatPartitioningApply (partition.c:332)
> >
> >
> > ==98001== 88 (24 direct, 64 indirect) bytes in 1 blocks are definitely
> lost in loss record 33 of 54
> > ==98001==at 0x4C29F73: malloc (vg_replace_malloc.c:307)
> > ==98001==by 0xDAE1D5E: hwloc_bitmap_alloc (bitmap.c:74)
> > ==98001==by 0xDA7523F: get_socket_bound_info (mv2_arch_detect.c:898)
> > ==98001==by 0xD93C87A: create_intra_sock_comm
> (create_2level_comm.c:593)
> > ==98001==by 0xD93BEBA: create_2level_comm (create_2level_comm.c:1762)
> > ==98001==by 0xD59A894: mv2_increment_shmem_coll_counter
> (ch3_shmem_coll.c:2183)
> > ==98001==by 0xD4E4CBB: PMPI_Allreduce (allreduce.c:912)
> > ==98001==by 0x99F1766: PetscAllreduceBarrierCheck (pbarrier.c:26)
> > ==98001==by 0x99F733E: PetscSplitOwnership (psplit.c:91)
> > ==98001==by 0x9C5C26B: PetscLayoutSetUp (pmap.c:262)
> > ==98001==by 0x9C5DB0D: PetscLayoutCreateFromSizes (pmap.c:112)
> > ==98001==by 0x9D9A018: ISGeneralSetIndices_Gen

[petsc-users] PetscAllreduceBarrierCheck is valgrind clean?

2021-01-13 Thread Fande Kong
Hi All,

I ran valgrind with mvapich-2.3.5 for a moose simulation.  The motivation
was that we have a few non-deterministic parallel simulations in moose. I
want to check if we have any memory issues. I got some complaints from
PetscAllreduceBarrierCheck

Thanks,


Fande



==98001== 88 (24 direct, 64 indirect) bytes in 1 blocks are definitely lost
in loss record 31 of 54
==98001==at 0x4C29F73: malloc (vg_replace_malloc.c:307)
==98001==by 0xDAE1D5E: hwloc_bitmap_alloc (bitmap.c:74)
==98001==by 0xDA7523F: get_socket_bound_info (mv2_arch_detect.c:898)
==98001==by 0xD93C87A: create_intra_sock_comm (create_2level_comm.c:593)
==98001==by 0xD93BEBA: create_2level_comm (create_2level_comm.c:1762)
==98001==by 0xD59A894: mv2_increment_shmem_coll_counter
(ch3_shmem_coll.c:2183)
==98001==by 0xD4E4CBB: PMPI_Allreduce (allreduce.c:912)
==98001==by 0x99F1766: PetscAllreduceBarrierCheck (pbarrier.c:26)
==98001==by 0x99F70BE: PetscSplitOwnership (psplit.c:84)
==98001==by 0x9C5C26B: PetscLayoutSetUp (pmap.c:262)
==98001==by 0xA08C66B: MatMPIAdjSetPreallocation_MPIAdj (mpiadj.c:630)
==98001==by 0xA08EB9A: MatMPIAdjSetPreallocation (mpiadj.c:856)
==98001==by 0xA08F6D3: MatCreateMPIAdj (mpiadj.c:904)


==98001== 88 (24 direct, 64 indirect) bytes in 1 blocks are definitely lost
in loss record 32 of 54
==98001==at 0x4C29F73: malloc (vg_replace_malloc.c:307)
==98001==by 0xDAE1D5E: hwloc_bitmap_alloc (bitmap.c:74)
==98001==by 0xDA7523F: get_socket_bound_info (mv2_arch_detect.c:898)
==98001==by 0xD93C87A: create_intra_sock_comm (create_2level_comm.c:593)
==98001==by 0xD93BEBA: create_2level_comm (create_2level_comm.c:1762)
==98001==by 0xD59A9A4: mv2_increment_allgather_coll_counter
(ch3_shmem_coll.c:2218)
==98001==by 0xD4E4CE4: PMPI_Allreduce (allreduce.c:917)
==98001==by 0xCD9D74D: libparmetis__gkMPI_Allreduce (gkmpi.c:103)
==98001==by 0xCDBB663: libparmetis__ComputeParallelBalance (stat.c:87)
==98001==by 0xCDA4FE0: libparmetis__KWayFM (kwayrefine.c:352)
==98001==by 0xCDA21ED: libparmetis__Global_Partition (kmetis.c:222)
==98001==by 0xCDA20B2: libparmetis__Global_Partition (kmetis.c:191)
==98001==by 0xCDA20B2: libparmetis__Global_Partition (kmetis.c:191)
==98001==by 0xCDA20B2: libparmetis__Global_Partition (kmetis.c:191)
==98001==by 0xCDA20B2: libparmetis__Global_Partition (kmetis.c:191)
==98001==by 0xCDA2748: ParMETIS_V3_PartKway (kmetis.c:94)
==98001==by 0xA2D6B39: MatPartitioningApply_Parmetis_Private
(pmetis.c:145)
==98001==by 0xA2D77D9: MatPartitioningApply_Parmetis (pmetis.c:219)
==98001==by 0xA2CD46A: MatPartitioningApply (partition.c:332)


==98001== 88 (24 direct, 64 indirect) bytes in 1 blocks are definitely lost
in loss record 33 of 54
==98001==at 0x4C29F73: malloc (vg_replace_malloc.c:307)
==98001==by 0xDAE1D5E: hwloc_bitmap_alloc (bitmap.c:74)
==98001==by 0xDA7523F: get_socket_bound_info (mv2_arch_detect.c:898)
==98001==by 0xD93C87A: create_intra_sock_comm (create_2level_comm.c:593)
==98001==by 0xD93BEBA: create_2level_comm (create_2level_comm.c:1762)
==98001==by 0xD59A894: mv2_increment_shmem_coll_counter
(ch3_shmem_coll.c:2183)
==98001==by 0xD4E4CBB: PMPI_Allreduce (allreduce.c:912)
==98001==by 0x99F1766: PetscAllreduceBarrierCheck (pbarrier.c:26)
==98001==by 0x99F733E: PetscSplitOwnership (psplit.c:91)
==98001==by 0x9C5C26B: PetscLayoutSetUp (pmap.c:262)
==98001==by 0x9C5DB0D: PetscLayoutCreateFromSizes (pmap.c:112)
==98001==by 0x9D9A018: ISGeneralSetIndices_General (general.c:568)
==98001==by 0x9D9AB44: ISGeneralSetIndices (general.c:554)
==98001==by 0x9D9ADC4: ISCreateGeneral (general.c:529)
==98001==by 0x9B431E6: VecCreateGhostWithArray (pbvec.c:692)
==98001==by 0x9B43A33: VecCreateGhost (pbvec.c:748)


==98001== 88 (24 direct, 64 indirect) bytes in 1 blocks are definitely lost
in loss record 34 of 54
==98001==at 0x4C29F73: malloc (vg_replace_malloc.c:307)
==98001==by 0xDAE1D5E: hwloc_bitmap_alloc (bitmap.c:74)
==98001==by 0xDA7523F: get_socket_bound_info (mv2_arch_detect.c:898)
==98001==by 0xD93C87A: create_intra_sock_comm (create_2level_comm.c:593)
==98001==by 0xD93BEBA: create_2level_comm (create_2level_comm.c:1762)
==98001==by 0xD59A894: mv2_increment_shmem_coll_counter
(ch3_shmem_coll.c:2183)
==98001==by 0xD4E4CBB: PMPI_Allreduce (allreduce.c:912)
==98001==by 0x9B0B5F3: VecSetSizes (vector.c:1318)
==98001==by 0x9B42DDC: VecCreateMPIWithArray (pbvec.c:625)
==98001==by 0xA7CF280: PCSetUp_Redundant (redundant.c:125)
==98001==by 0xA7BB0CE: PCSetUp (precon.c:1009)
==98001==by 0xAA2A9B9: KSPSetUp (itfunc.c:406)
==98001==by 0xA92C490: PCSetUp_MG (mg.c:907)
==98001==by 0xA93CAE9: PCSetUp_HMG (hmg.c:220)
==98001==by 0xA7BB0CE: PCSetUp (precon.c:1009)
==98001==by 0xAA2A9B9: KSPSetUp (itfunc.c:406)
==98001==by 0xAA2B2E9: KSPSolve_Private (itfunc.c:658)

Re: [petsc-users] counter->tag = *maxval - 128

2021-01-13 Thread Fande Kong
On Tue, Jan 12, 2021 at 6:49 PM Barry Smith  wrote:

>
>Fande,
>
>/* hope that any still active tags were issued right at the beginning
> of the run */
>
>PETSc actually starts with *maxval (see line 130). It is only when it
> runs out that it does this silly thing for the reason indicated in the
> comment.
>
>PETSc should actually keep track of which tags have been
> "returned" and if the counter gets to zero use those returned tags instead
> of starting again at the top which could clash with the same value used for
> another reason. In other words the current code is buggy,  but it has
> always been "good enough".
>

I agreed that it is "good enough" for most people for most cases. However,
I was worried that there is almost no way to debug once we reuse active
tags. At least it is not easy to debug.

Thanks,

Fande


>
>   Barry
>
>
>
> On Jan 12, 2021, at 10:41 AM, Fande Kong  wrote:
>
> Hi All,
>
> I am curious about why we subtract 128 from the max value of tag? Can we
> directly use the max tag value?
>
> Thanks,
>
> Fande,
>
>
> PetscErrorCode  PetscCommGetNewTag(MPI_Comm comm,PetscMPIInt *tag)
> {
>   PetscErrorCode   ierr;
>   PetscCommCounter *counter;
>PetscMPIInt  *maxval,flg;
>
>
>   MPI_Comm_get_attr(comm,Petsc_Counter_keyval,&counter,&flg);
>   if (!flg) SETERRQ(PETSC_COMM_SELF,PETSC_ERR_ARG_CORRUPT,"Bad MPI
> communicator supplied; must be a PETSc communicator");
>
>  if (counter->tag < 1) {
>   PetscInfo1(NULL,"Out of tags for object, starting to recycle. Comm
> reference count %d\n",counter->refcount);
>   MPI_Comm_get_attr(MPI_COMM_WORLD,MPI_TAG_UB,&maxval,&flg);
> if (!flg) SETERRQ(PETSC_COMM_SELF,PETSC_ERR_LIB,"MPI error:
> MPI_Comm_get_attr() is not returning a MPI_TAG_UB");
> counter->tag = *maxval - 128; /* hope that any still active tags were
> issued right at the beginning of the run */
>   }
>
>   *tag = counter->tag--;
>if (PetscDefined(USE_DEBUG)) {
>  /*
>  Hanging here means that some processes have called
> PetscCommGetNewTag() and others have not.
>   */
> MPI_Barrier(comm);
>   }
>   return(0);
> }
>
>
>


[petsc-users] counter->tag = *maxval - 128

2021-01-12 Thread Fande Kong
Hi All,

I am curious about why we subtract 128 from the max value of tag? Can we
directly use the max tag value?

Thanks,

Fande,


PetscErrorCode  PetscCommGetNewTag(MPI_Comm comm,PetscMPIInt *tag)
{
  PetscErrorCode   ierr;
  PetscCommCounter *counter;
   PetscMPIInt  *maxval,flg;


  MPI_Comm_get_attr(comm,Petsc_Counter_keyval,&counter,&flg);
  if (!flg) SETERRQ(PETSC_COMM_SELF,PETSC_ERR_ARG_CORRUPT,"Bad MPI
communicator supplied; must be a PETSc communicator");

 if (counter->tag < 1) {
  PetscInfo1(NULL,"Out of tags for object, starting to recycle. Comm
reference count %d\n",counter->refcount);
  MPI_Comm_get_attr(MPI_COMM_WORLD,MPI_TAG_UB,&maxval,&flg);
if (!flg) SETERRQ(PETSC_COMM_SELF,PETSC_ERR_LIB,"MPI error:
MPI_Comm_get_attr() is not returning a MPI_TAG_UB");
counter->tag = *maxval - 128; /* hope that any still active tags were
issued right at the beginning of the run */
  }

  *tag = counter->tag--;
   if (PetscDefined(USE_DEBUG)) {
 /*
 Hanging here means that some processes have called
PetscCommGetNewTag() and others have not.
  */
MPI_Barrier(comm);
  }
  return(0);
}


Re: [petsc-users] valgrind with petscmpiexec

2020-12-15 Thread Fande Kong
Thanks so much, Satish,



On Tue, Dec 15, 2020 at 9:33 AM Satish Balay via petsc-users <
petsc-users@mcs.anl.gov> wrote:

> For one - I think using '--log-file=valgrind-%q{HOSTNAME}-%p.log' might
> help [to keep the logs from each process separate]
>
> And I think the TMPDIR recommendation is to have a different value for
> each of the nodes [where the "pid" clash comes from] and perhaps
> "TMPDIR=/tmp" might work


"TMPDIR=/tmp" worked out.


Fande


> - as this would be local disk on each node [vs /var/tmp/ - which is
> probably a shared TMP across nodes]
>
> But then - PBS or this MPI requires a shared TMP?
>
> Satish
>
> On Tue, 15 Dec 2020, Yaqi Wang wrote:
>
> > Fande,
> >
> > Did you try setting TMPDIR for valgrind?
> >
> > Sent from my iPhone
> >
> > > On Dec 15, 2020, at 1:23 AM, Barry Smith  wrote:
> > >
> > >
> > >   No idea. Perhaps petscmpiexec could be modified so it only ran
> valgrind on the first 10 ranks? Not clear how to do that. Or valgrind
> should get a MR that removes this small arbitrary limitation on the number
> of processes. 576 is so 2000 :-)
> > >
> > >
> > >   Barry
> > >
> > >
> > >> On Dec 14, 2020, at 11:59 PM, Fande Kong  wrote:
> > >>
> > >> Hi All,
> > >>
> > >> I tried to use valgrind to check if the simulation is valgrind clean
> because I saw some random communication failures during the simulation.
> > >>
> > >> I tried this command-line
> > >>
> > >> petscmpiexec -valgrind -n 576  ../../../moose-app-oprof  -i input.i
> -log_view -snes_view
> > >>
> > >>
> > >> But I got the following error messages:
> > >>
> > >> valgrind: Unable to start up properly.  Giving up.
> > >> ==75586== VG_(mkstemp): failed to create temp file:
> /var/tmp/pbs.3110013.sawtoothpbs/valgrind_proc_75586_cmdline_8c3fabf2
> > >> ==75586== VG_(mkstemp): failed to create temp file:
> /var/tmp/pbs.3110013.sawtoothpbs/valgrind_proc_75586_cmdline_8cac2243
> > >> ==75586== VG_(mkstemp): failed to create temp file:
> /var/tmp/pbs.3110013.sawtoothpbs/valgrind_proc_75586_cmdline_da8d30c0
> > >> ==75586== VG_(mkstemp): failed to create temp file:
> /var/tmp/pbs.3110013.sawtoothpbs/valgrind_proc_75586_cmdline_877871f9
> > >> ==75586== VG_(mkstemp): failed to create temp file:
> /var/tmp/pbs.3110013.sawtoothpbs/valgrind_proc_75586_cmdline_c098953e
> > >> ==75586== VG_(mkstemp): failed to create temp file:
> /var/tmp/pbs.3110013.sawtoothpbs/valgrind_proc_75586_cmdline_aa649f9f
> > >> ==75586== VG_(mkstemp): failed to create temp file:
> /var/tmp/pbs.3110013.sawtoothpbs/valgrind_proc_75586_cmdline_097498ec
> > >> ==75586== VG_(mkstemp): failed to create temp file:
> /var/tmp/pbs.3110013.sawtoothpbs/valgrind_proc_75586_cmdline_bfc534b5
> > >> ==75586== VG_(mkstemp): failed to create temp file:
> /var/tmp/pbs.3110013.sawtoothpbs/valgrind_proc_75586_cmdline_7604c74a
> > >> ==75586== VG_(mkstemp): failed to create temp file:
> /var/tmp/pbs.3110013.sawtoothpbs/valgrind_proc_75586_cmdline_a1fd96bb
> > >> ==75586== VG_(mkstemp): failed to create temp file:
> /var/tmp/pbs.3110013.sawtoothpbs/valgrind_proc_75586_cmdline_4c8857d8
> > >> valgrind: Startup or configuration error:
> > >> valgrind:Can't create client cmdline file in
> /var/tmp/pbs.3110013.sawtoothpbs/valgrind_proc_75586_cmdline_4c8857d8
> > >> valgrind: Unable to start up properly.  Giving up.
> > >> ==75596== VG_(mkstemp): failed to create temp file:
> /var/tmp/pbs.3110013.sawtoothpbs/valgrind_proc_75596_cmdline_bc5492bb
> > >> ==75596== VG_(mkstemp): failed to create temp file:
> /var/tmp/pbs.3110013.sawtoothpbs/valgrind_proc_75596_cmdline_ec59a3d8
> > >> valgrind: Startup or configuration error:
> > >> valgrind:Can't create client cmdline file in
> /var/tmp/pbs.3110013.sawtoothpbs/valgrind_proc_75596_cmdline_ec59a3d8
> > >> valgrind: Unable to start up properly.  Giving up.
> > >> ==75597== VG_(mkstemp): failed to create temp file:
> /var/tmp/pbs.3110013.sawtoothpbs/valgrind_proc_75597_cmdline_b036bdf2
> > >> ==75597== VG_(mkstemp): failed to create temp file:
> /var/tmp/pbs.3110013.sawtoothpbs/valgrind_proc_75597_cmdline_105acc43
> > >> ==75597== VG_(mkstemp): failed to create temp file:
> /var/tmp/pbs.3110013.sawtoothpbs/valgrind_proc_75597_cmdline_9fb792c0
> > >> ==75597== VG_(mkstemp): failed to create temp file:
> /var/tmp/pbs.3110013.sawtoothpb

[petsc-users] valgrind with petscmpiexec

2020-12-14 Thread Fande Kong
Hi All,

I tried to use valgrind to check if the simulation is valgrind clean
because I saw some random communication failures during the simulation.

I tried this command-line

petscmpiexec -valgrind -n 576  ../../../moose-app-oprof  -i input.i
-log_view -snes_view


But I got the following error messages:

valgrind: Unable to start up properly.  Giving up.
==75586== VG_(mkstemp): failed to create temp file:
/var/tmp/pbs.3110013.sawtoothpbs/valgrind_proc_75586_cmdline_8c3fabf2
==75586== VG_(mkstemp): failed to create temp file:
/var/tmp/pbs.3110013.sawtoothpbs/valgrind_proc_75586_cmdline_8cac2243
==75586== VG_(mkstemp): failed to create temp file:
/var/tmp/pbs.3110013.sawtoothpbs/valgrind_proc_75586_cmdline_da8d30c0
==75586== VG_(mkstemp): failed to create temp file:
/var/tmp/pbs.3110013.sawtoothpbs/valgrind_proc_75586_cmdline_877871f9
==75586== VG_(mkstemp): failed to create temp file:
/var/tmp/pbs.3110013.sawtoothpbs/valgrind_proc_75586_cmdline_c098953e
==75586== VG_(mkstemp): failed to create temp file:
/var/tmp/pbs.3110013.sawtoothpbs/valgrind_proc_75586_cmdline_aa649f9f
==75586== VG_(mkstemp): failed to create temp file:
/var/tmp/pbs.3110013.sawtoothpbs/valgrind_proc_75586_cmdline_097498ec
==75586== VG_(mkstemp): failed to create temp file:
/var/tmp/pbs.3110013.sawtoothpbs/valgrind_proc_75586_cmdline_bfc534b5
==75586== VG_(mkstemp): failed to create temp file:
/var/tmp/pbs.3110013.sawtoothpbs/valgrind_proc_75586_cmdline_7604c74a
==75586== VG_(mkstemp): failed to create temp file:
/var/tmp/pbs.3110013.sawtoothpbs/valgrind_proc_75586_cmdline_a1fd96bb
==75586== VG_(mkstemp): failed to create temp file:
/var/tmp/pbs.3110013.sawtoothpbs/valgrind_proc_75586_cmdline_4c8857d8
valgrind: Startup or configuration error:
valgrind:Can't create client cmdline file in
/var/tmp/pbs.3110013.sawtoothpbs/valgrind_proc_75586_cmdline_4c8857d8
valgrind: Unable to start up properly.  Giving up.
==75596== VG_(mkstemp): failed to create temp file:
/var/tmp/pbs.3110013.sawtoothpbs/valgrind_proc_75596_cmdline_bc5492bb
==75596== VG_(mkstemp): failed to create temp file:
/var/tmp/pbs.3110013.sawtoothpbs/valgrind_proc_75596_cmdline_ec59a3d8
valgrind: Startup or configuration error:
valgrind:Can't create client cmdline file in
/var/tmp/pbs.3110013.sawtoothpbs/valgrind_proc_75596_cmdline_ec59a3d8
valgrind: Unable to start up properly.  Giving up.
==75597== VG_(mkstemp): failed to create temp file:
/var/tmp/pbs.3110013.sawtoothpbs/valgrind_proc_75597_cmdline_b036bdf2
==75597== VG_(mkstemp): failed to create temp file:
/var/tmp/pbs.3110013.sawtoothpbs/valgrind_proc_75597_cmdline_105acc43
==75597== VG_(mkstemp): failed to create temp file:
/var/tmp/pbs.3110013.sawtoothpbs/valgrind_proc_75597_cmdline_9fb792c0
==75597== VG_(mkstemp): failed to create temp file:
/var/tmp/pbs.3110013.sawtoothpbs/valgrind_proc_75597_cmdline_30602bf9
==75597== VG_(mkstemp): failed to create temp file:
/var/tmp/pbs.3110013.sawtoothpbs/valgrind_proc_75597_cmdline_21eec73e
==75597== VG_(mkstemp): failed to create temp file:
/var/tmp/pbs.3110013.sawtoothpbs/valgrind_proc_75597_cmdline_0b53e99f
==75597== VG_(mkstemp): failed to create temp file:
/var/tmp/pbs.3110013.sawtoothpbs/valgrind_proc_75597_cmdline_73e31aec
==75597== VG_(mkstemp): failed to create temp file:
/var/tmp/pbs.3110013.sawtoothpbs/valgrind_proc_75597_cmdline_486e8eb5
==75597== VG_(mkstemp): failed to create temp file:
/var/tmp/pbs.3110013.sawtoothpbs/valgrind_proc_75597_cmdline_db8c194a
==75597== VG_(mkstemp): failed to create temp file:
/var/tmp/pbs.3110013.sawtoothpbs/valgrind_proc_75597_cmdline_839780bb


I did a bit search online, and found something related
https://stackoverflow.com/questions/13707211/what-causes-mkstemp-to-fail-when-running-many-simultaneous-valgrind-processes

But do not know what is the right way to fix the issue.

Thanks so much,

Fande,


[petsc-users] -snes_no_convergence_test was removed from PETSc?

2020-11-05 Thread Fande Kong
Hi All,

I cannot find the actual implementation for it, and
"-snes_no_convergence_test" does not have any effect for me.
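
In code form, what I am effectively trying to get is something like the
following (my guess, not tested):

  /* skip the usual convergence checks; SNES then just runs to the iteration limit */
  ierr = SNESSetConvergenceTest(snes, SNESConvergedSkip, NULL, NULL);CHKERRQ(ierr);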

Thanks,

Fande,


Re: [petsc-users] EPSMonitorSet

2020-08-31 Thread Fande Kong
Oh, cool.

Thanks, Jose,

I will try that.
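
Roughly what I plan to try (a sketch, not tested; MyMonitor is my own
placeholder callback, error handling abbreviated, slepceps.h assumed):

static PetscErrorCode MyMonitor(EPS eps, PetscInt its, PetscInt nconv, PetscScalar *eigr, PetscScalar *eigi, PetscReal *errest, PetscInt nest, void *ctx)
{
  PetscFunctionBegin;
  /* report whatever we need; the first nconv of the nest error estimates are converged */
  PetscFunctionReturn(0);
}

and then, before EPSSolve():

  ierr = EPSMonitorCancel(eps);CHKERRQ(ierr);                      /* drop monitors that are already attached */
  ierr = EPSMonitorSet(eps, MyMonitor, NULL, NULL);CHKERRQ(ierr);  /* now only MyMonitor gets called */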

Fande,

On Mon, Aug 31, 2020 at 11:11 AM Jose E. Roman  wrote:

> Call EPSMonitorCancel() before EPSMonitorSet().
> Jose
>
>
> > El 31 ago 2020, a las 18:33, Fande Kong  escribió:
> >
> > Hi All,
> >
> > There is a statement on API EPSMonitorSet:
> >
> > "Sets an ADDITIONAL function to be called at every iteration to monitor
> the error estimates for each requested eigenpair."
> >
> > I was wondering how to replace SLEPc EPS monitors instead of adding one?
> I want to use my monitor only.
> >
> > Thanks,
> >
> > Fande,
>
>
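
A minimal sketch of what Jose suggests, with a hypothetical user monitor
MyEPSMonitor (the EPS object "eps" comes from the application; error checking
abbreviated):

  /* Custom monitor: print only the iteration count and converged pairs. */
  PetscErrorCode MyEPSMonitor(EPS eps, PetscInt its, PetscInt nconv,
                              PetscScalar *eigr, PetscScalar *eigi,
                              PetscReal *errest, PetscInt nest, void *ctx)
  {
    return PetscPrintf(PETSC_COMM_WORLD, "EPS it %D: %D converged pairs\n", its, nconv);
  }

  /* Drop every monitor already attached (including ones set via options),
     then install only the user monitor. */
  ierr = EPSMonitorCancel(eps);CHKERRQ(ierr);
  ierr = EPSMonitorSet(eps, MyEPSMonitor, NULL, NULL);CHKERRQ(ierr);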


[petsc-users] EPSMonitorSet

2020-08-31 Thread Fande Kong
Hi All,

There is a statement on API EPSMonitorSet:

"Sets an ADDITIONAL function to be called at every iteration to monitor the
error estimates for each requested eigenpair."

I was wondering how to replace SLEPc EPS monitors instead of adding one? I
want to use my monitor only.

Thanks,

Fande,


Re: [petsc-users] Using edge-weights for partitioning

2020-08-30 Thread Fande Kong
I agreed, Barry.

A year ago, I enabled edge-weights and vertex weights for only ParMETIS and
PTScotch. I did not do the same thing for Chaco, Party, etc.

It is straightforward to do that, and I could add an MR if needed.

Thanks,

Fande,

On Sun, Aug 30, 2020 at 4:20 PM Barry Smith  wrote:

>
>
> On Aug 30, 2020, at 7:33 AM, Mark Adams  wrote:
>
>
>>
>>
>> So, if ParMETIS gives a different edge cut, as expected, then
>> MatPartitioningGetUseEdgeWeights and MatPartitioningSetUseEdgeWeights work
>> correctly. Why can't CHACO?
>>
>>>
>>>
> Chaco does not support using edge weights.
>
>
>   The package interfaces  that do not support edge weights should error if
> one requests partitioning with edge weights. Not everyone is born with the
> innate knowledge that the Chaco PETSc interface doesn't support edge
> weights.
>
>
>
>
> https://gitlab.com/petsc/petsc/-/merge_requests/3119
>
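
For reference, a minimal sketch of the edge-weight path from application code,
assuming the adjacency matrix was created with MatCreateMPIAdj and its values
array carries the edge weights (variable names are illustrative; per the thread,
only the ParMETIS and PTScotch interfaces currently honor edge weights):

  ierr = MatPartitioningCreate(PETSC_COMM_WORLD,&part);CHKERRQ(ierr);
  ierr = MatPartitioningSetAdjacency(part,adj);CHKERRQ(ierr);   /* adj values = edge weights */
  ierr = MatPartitioningSetUseEdgeWeights(part,PETSC_TRUE);CHKERRQ(ierr);
  ierr = MatPartitioningSetFromOptions(part);CHKERRQ(ierr);
  ierr = MatPartitioningApply(part,&is);CHKERRQ(ierr);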


Re: [petsc-users] Disable PETSC_HAVE_CLOSURE

2020-08-24 Thread Fande Kong
Thanks for your reply, Jed.


Barry, do you have any comment?

Fande,
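
For reference, a common way to keep such a declaration away from compilers that
do not understand the block syntax is to guard it. This is only a sketch of the
idea, not necessarily how PETSc resolved it, and relying on the __BLOCKS__
predefined macro is an assumption about the compilers involved:

  /* Declare the closure-based API only when the compiler supports blocks. */
  #if defined(PETSC_HAVE_CLOSURE) && defined(__BLOCKS__)
  PETSC_EXTERN PetscErrorCode PetscVFPrintfSetClosure(int (^)(const char*));
  #endif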

On Thu, Aug 20, 2020 at 9:19 AM Jed Brown  wrote:

> Barry, this is a side-effect of your Swift experiment.  Does that need to
> be in a header (even if it's a private header)?
>
> The issue may be that you test with a C compiler and it gets included in
> C++ source.
>
> Fande Kong  writes:
>
> > Hi All,
> >
> > We (moose team) hit an error message when compiling PETSc, recently. The
> > error is related to "PETSC_HAVE_CLOSURE." Everything runs well if I am
> > going to turn this flag off by making the following changes:
> >
> >
> > git diff
> > diff --git a/config/BuildSystem/config/utilities/closure.py
> > b/config/BuildSystem/config/utilities/closure.py
> > index 6341ddf271..930e5b3b1b 100644
> > --- a/config/BuildSystem/config/utilities/closure.py
> > +++ b/config/BuildSystem/config/utilities/closure.py
> > @@ -19,8 +19,8 @@ class Configure(config.base.Configure):
> >   includes = '#include \n'
> >   body = 'int (^closure)(int);'
> >   self.pushLanguage('C')
> > - if self.checkLink(includes, body):
> > - self.addDefine('HAVE_CLOSURE','1')
> > +# if self.checkLink(includes, body):
> > +# self.addDefine('HAVE_CLOSURE','1')
> >  def configure(self):
> >   self.executeTest(self.configureClosure)
> >
> >
> > I was wondering if there exists a configuration option to disable
> "Closure"
> > C syntax?  I did not find one by running "configuration --help"
> >
> > Please let me know if you need more information.
> >
> >
> > Thanks,
> >
> > Fande,
> >
> >
> > In file included from
> >
> /Users/milljm/projects/moose/scripts/../libmesh/src/solvers/petscdmlibmesh.C:25:
> >
> /Users/milljm/projects/moose/petsc/include/petsc/private/petscimpl.h:15:29:
> > warning: 'PetscVFPrintfSetClosure' initialized and declared 'extern'
> >   15 | PETSC_EXTERN PetscErrorCode PetscVFPrintfSetClosure(int (^)(const
> > char*));
> >|  ^~~
> >
> /Users/milljm/projects/moose/petsc/include/petsc/private/petscimpl.h:15:53:
> > error: expected primary-expression before 'int'
> >   15 | PETSC_EXTERN PetscErrorCode PetscVFPrintfSetClosure(int (^)(const
> > char*));
> >|  ^~~
> >  CXX   src/systems/libmesh_opt_la-equation_systems_io.lo
> > In file included from
> > /Users/milljm/projects/moose/petsc/include/petsc/private/dmimpl.h:7,
> >  from
> >
> /Users/milljm/projects/moose/scripts/../libmesh/src/solvers/petscdmlibmeshimpl.C:26:
> >
> /Users/milljm/projects/moose/petsc/include/petsc/private/petscimpl.h:15:29:
> > warning: 'PetscVFPrintfSetClosure' initialized and declared 'extern'
> >   15 | PETSC_EXTERN PetscErrorCode PetscVFPrintfSetClosure(int (^)(const
> > char*));
> >|  ^~~
> >
> /Users/milljm/projects/moose/petsc/include/petsc/private/petscimpl.h:15:53:
> > error: expected primary-expression before 'int'
> >   15 | PETSC_EXTERN PetscErrorCode PetscVFPrintfSetClosure(int (^)(const
> > char*));
>


[petsc-users] Disable PETSC_HAVE_CLOSURE

2020-08-20 Thread Fande Kong
Hi All,

We (the MOOSE team) recently hit an error when compiling PETSc. The
error is related to "PETSC_HAVE_CLOSURE." Everything builds fine if I
turn this flag off by making the following changes:


git diff
diff --git a/config/BuildSystem/config/utilities/closure.py
b/config/BuildSystem/config/utilities/closure.py
index 6341ddf271..930e5b3b1b 100644
--- a/config/BuildSystem/config/utilities/closure.py
+++ b/config/BuildSystem/config/utilities/closure.py
@@ -19,8 +19,8 @@ class Configure(config.base.Configure):
  includes = '#include \n'
  body = 'int (^closure)(int);'
  self.pushLanguage('C')
- if self.checkLink(includes, body):
- self.addDefine('HAVE_CLOSURE','1')
+# if self.checkLink(includes, body):
+# self.addDefine('HAVE_CLOSURE','1')
 def configure(self):
  self.executeTest(self.configureClosure)


I was wondering if there exists a configure option to disable the "Closure"
C syntax?  I did not find one by running "configure --help".

Please let me know if you need more information.


Thanks,

Fande,


In file included from
/Users/milljm/projects/moose/scripts/../libmesh/src/solvers/petscdmlibmesh.C:25:
/Users/milljm/projects/moose/petsc/include/petsc/private/petscimpl.h:15:29:
warning: 'PetscVFPrintfSetClosure' initialized and declared 'extern'
  15 | PETSC_EXTERN PetscErrorCode PetscVFPrintfSetClosure(int (^)(const
char*));
   |  ^~~
/Users/milljm/projects/moose/petsc/include/petsc/private/petscimpl.h:15:53:
error: expected primary-expression before 'int'
  15 | PETSC_EXTERN PetscErrorCode PetscVFPrintfSetClosure(int (^)(const
char*));
   |  ^~~
 CXX   src/systems/libmesh_opt_la-equation_systems_io.lo
In file included from
/Users/milljm/projects/moose/petsc/include/petsc/private/dmimpl.h:7,
 from
/Users/milljm/projects/moose/scripts/../libmesh/src/solvers/petscdmlibmeshimpl.C:26:
/Users/milljm/projects/moose/petsc/include/petsc/private/petscimpl.h:15:29:
warning: 'PetscVFPrintfSetClosure' initialized and declared 'extern'
  15 | PETSC_EXTERN PetscErrorCode PetscVFPrintfSetClosure(int (^)(const
char*));
   |  ^~~
/Users/milljm/projects/moose/petsc/include/petsc/private/petscimpl.h:15:53:
error: expected primary-expression before 'int'
  15 | PETSC_EXTERN PetscErrorCode PetscVFPrintfSetClosure(int (^)(const
char*));


Re: [petsc-users] ParMETIS vs. CHACO when no partitioning is made

2020-08-17 Thread Fande Kong
For this particular case (one subdomain), it may be easy to fix in PETSc.
We could create a partitioning index filled with zeros.

Fande,
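
A sketch of that idea on the user side, until it is handled inside PETSc: when
only one subdomain is requested, skip the partitioner and build the all-zeros
index set directly (names such as "nparts", "nlocal", and "is" are illustrative):

  if (nparts == 1) {
    /* every locally owned row belongs to subdomain 0 */
    ierr = ISCreateStride(PETSC_COMM_WORLD,nlocal,0,0,&is);CHKERRQ(ierr);
  } else {
    ierr = MatPartitioningApply(part,&is);CHKERRQ(ierr);
  }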

On Mon, Aug 17, 2020 at 5:04 PM Fande Kong  wrote:

> IIRC, Chaco does not produce an arbitrary number of subdomains. The number
> needs to be like 2^n.
>
> ParMETIS and PTScotch are much better, and they are production-level code.
> If there is no particular reason, I would like to suggest staying with
> ParMETIS and PTScotch.
>
> Thanks,
>
> Fande,
>
>
>
> On Fri, Aug 14, 2020 at 10:07 AM Eda Oktay  wrote:
>
>> Dear Barry,
>>
>> Thank you for answering. I am sending a sample code and a binary file.
>>
>> Thanks!
>>
>> Eda
>>
>> Barry Smith , 14 Ağu 2020 Cum, 18:49 tarihinde şunu
>> yazdı:
>>
>>>
>>>Could be a bug in Chaco or its call from PETSc for the special case
>>> of one process. Could you send a sample code that demonstrates the problem?
>>>
>>>   Barry
>>>
>>>
>>> > On Aug 14, 2020, at 8:53 AM, Eda Oktay  wrote:
>>> >
>>> > Hi all,
>>> >
>>> > I am trying to try something. I am using the same MatPartitioning
>>> codes for both CHACO and ParMETIS:
>>> >
>>> > ierr = MatConvert(SymmA,MATMPIADJ,MAT_INITIAL_MATRIX,&AL);CHKERRQ(ierr);
>>> >   ierr = MatPartitioningCreate(MPI_COMM_WORLD,&part);CHKERRQ(ierr);
>>> >   ierr = MatPartitioningSetAdjacency(part,AL);CHKERRQ(ierr);
>>> >
>>> >   ierr = MatPartitioningSetFromOptions(part);CHKERRQ(ierr);
>>> >   ierr = MatPartitioningApply(part,&is);CHKERRQ(ierr);
>>> >
>>> > After obtaining the IS, I apply this to my original nonsymmetric
>>> matrix and try to get an approximate edge cut.
>>> >
>>> > Except for 1 partitioning, my program completely works for 2,4 and 16
>>> partitionings. However, for 1, ParMETIS gives results where CHACO I guess
>>> doesn't since I am getting errors about the index set.
>>> >
>>> > What is the difference between CHACO and ParMETIS that one works for 1
>>> partitioning and one doesn't?
>>> >
>>> > Thanks!
>>> >
>>> > Eda
>>>
>>>


Re: [petsc-users] ParMETIS vs. CHACO when no partitioning is made

2020-08-17 Thread Fande Kong
IIRC, Chaco does not produce an arbitrary number of subdomains. The number
needs to be like 2^n.

ParMETIS and PTScotch are much better, and they are production-level code.
If there is no particular reason, I would like to suggest staying with
ParMETIS and PTScotch.

Thanks,

Fande,



On Fri, Aug 14, 2020 at 10:07 AM Eda Oktay  wrote:

> Dear Barry,
>
> Thank you for answering. I am sending a sample code and a binary file.
>
> Thanks!
>
> Eda
>
> Barry Smith , 14 Ağu 2020 Cum, 18:49 tarihinde şunu
> yazdı:
>
>>
>>Could be a bug in Chaco or its call from PETSc for the special case of
>> one process. Could you send a sample code that demonstrates the problem?
>>
>>   Barry
>>
>>
>> > On Aug 14, 2020, at 8:53 AM, Eda Oktay  wrote:
>> >
>> > Hi all,
>> >
>> > I am trying to try something. I am using the same MatPartitioning codes
>> for both CHACO and ParMETIS:
>> >
>> > ierr = MatConvert(SymmA,MATMPIADJ,MAT_INITIAL_MATRIX,&AL);CHKERRQ(ierr);
>> >   ierr = MatPartitioningCreate(MPI_COMM_WORLD,&part);CHKERRQ(ierr);
>> >   ierr = MatPartitioningSetAdjacency(part,AL);CHKERRQ(ierr);
>> >
>> >   ierr = MatPartitioningSetFromOptions(part);CHKERRQ(ierr);
>> >   ierr = MatPartitioningApply(part,&is);CHKERRQ(ierr);
>> >
>> > After obtaining the IS, I apply this to my original nonsymmetric matrix
>> and try to get an approximate edge cut.
>> >
>> > Except for 1 partitioning, my program completely works for 2,4 and 16
>> partitionings. However, for 1, ParMETIS gives results where CHACO I guess
>> doesn't since I am getting errors about the index set.
>> >
>> > What is the difference between CHACO and ParMETIS that one works for 1
>> partitioning and one doesn't?
>> >
>> > Thanks!
>> >
>> > Eda
>>
>>


Re: [petsc-users] Only print converged reason when not converged

2020-07-28 Thread Fande Kong
One alternative is to support a pluggable KSP/SNESReasonView system. We then
could hook up KSP/SNESReasonView_MOOSE.

We could also call our viewers right after SNESSolve/KSPSolve are done, if such
a system is not affordable.  What are the final functions we should call, where
we are guaranteed that SNES/KSP is already done?

Thanks,

Fande,

On Tue, Jul 28, 2020 at 12:02 PM Barry Smith  wrote:

>
>   Alex,
>
> The actual printing is done with SNESReasonView() and KSPReasonView()
> I would suggest copying those files to Moose with a name change and
> removing all the code you don't want. Then you can call your versions
> immediately after SNESSolve() and KSPSolve().
>
>Barry
>
>
> > On Jul 28, 2020, at 10:43 AM, Alexander Lindsay <
> alexlindsay...@gmail.com> wrote:
> >
> > To help debug the many emails we get about solves that fail to converge,
> in MOOSE we recently appended `-snes_converged_reason
> -ksp_converged_reason` for every call to `SNESSolve`. Of course, now we
> have users complaining about the new text printed to their screens that
> they didn't have before. Some of them have made a reasonable request to
> only print the convergence reason when the solve has actually failed to
> converge. Is there some way we can only print the reason if we've diverged,
> e.g. if reason < 0 ?
> >
> > Alex
>
>
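
A minimal sketch of the simplest variant (print only when the solve failed),
done in the application right after the solve rather than through a pluggable
viewer; "snes" and "x" come from the application and error checking is
abbreviated:

  SNESConvergedReason reason;
  ierr = SNESSolve(snes,NULL,x);CHKERRQ(ierr);
  ierr = SNESGetConvergedReason(snes,&reason);CHKERRQ(ierr);
  if (reason < 0) {   /* diverged: report it, otherwise stay quiet */
    ierr = PetscPrintf(PETSC_COMM_WORLD,"Nonlinear solve failed: %s\n",
                       SNESConvergedReasons[reason]);CHKERRQ(ierr);
  }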


Re: [petsc-users] Tough to reproduce petsctablefind error

2020-07-20 Thread Fande Kong
On Mon, Jul 20, 2020 at 1:14 PM Mark Adams  wrote:

> This is indeed a nasty bug, but having two separate cases should be useful.
>
> Chris is using Haswell, what MPI are you using? I trust you are not using
> Moose.
>
> Fande what machine/MPI are you using?
>

#define PETSC_MPICC_SHOW
"/apps/local/spack/software/gcc-4.8.5/gcc-9.2.0-bxc7mvbmrfcrusa6ij7ux3exfqabmq5y/bin/gcc
-I/apps/local/mvapich2/2.3.3-gcc-9.2.0/include
-L/apps/local/mvapich2/2.3.3-gcc-9.2.0/lib -Wl,-rpath
-Wl,/apps/local/mvapich2/2.3.3-gcc-9.2.0/lib -Wl,--enable-new-dtags -lmpi"

I guess it is mvapich2-2.3.3.

Here is the machine configuration https://www.top500.org/system/179708/


BTW (if you missed my earlier posts), if I switch to MPT-MPI (a
vendor-installed MPI), everything runs well so far.

I will stick with MPT from now on.

Thanks,

Fande,



>
> On Mon, Jul 20, 2020 at 3:04 PM Chris Hewson  wrote:
>
>> Hi Mark,
>>
>> Chris: It sounds like you just have one matrix that you give to MUMPS.
>> You seem to be creating a matrix in the middle of your run. Are you doing
>> dynamic adaptivity?
>> - I have 2 separate matrices I give to mumps, but as this is happening in
>> the production build of my code, I can't determine with certainty what call
>> to MUMPS it's happening or what call to KSPBCGS or UMFPACK it's happening
>> in.
>>
>> I do destroy and recreate matrices in the middle of my runs, but this
>> happens multiple times before the fault happens and in (presumably) the
>> same way. I also do checks on matrix sizes and what I am sending to PETSc
>> and those all pass, just at some point there are size mismatches
>> somewhere, understandably this is not a lot to go on. I am not doing
>> dynamic adaptivity, the mesh is instead changing its size.
>>
>> And I agree with Fande, the most frustrating part is that it's not
>> reproducible, but yah not 100% sure that the problem lies within the PETSc
>> code base either.
>>
>> Current working theories are:
>> 1. Some sort of MPI problem with the sending of one the matrix elements
>> (using mpich version 3.3a2)
>> 2. Some of the memory of static pointers gets corrupted, although I would
>> expect a garbage number and not something that could possibly make sense.
>>
>> *Chris Hewson*
>> Senior Reservoir Simulation Engineer
>> ResFrac
>> +1.587.575.9792
>>
>>
>> On Mon, Jul 20, 2020 at 12:41 PM Mark Adams  wrote:
>>
>>>
>>>
>>> On Mon, Jul 20, 2020 at 2:36 PM Fande Kong  wrote:
>>>
>>>> Hi Mark,
>>>>
>>>> Just to be clear, I do not think it is related to GAMG or PtAP. It is a
>>>> communication issue:
>>>>
>>>
>>> Youe stack trace was from PtAP, but Chris's problem is not.
>>>
>>>
>>>>
>>>> Reran the same code, and I just got :
>>>>
>>>> [252]PETSC ERROR: - Error Message
>>>> --
>>>> [252]PETSC ERROR: Petsc has generated inconsistent data
>>>> [252]PETSC ERROR: Received vector entry 4469094877509280860 out of
>>>> local range [255426072,256718616)]
>>>>
>>>
>>> OK, now this (4469094877509280860) is clearly garbage. That is the
>>> important thing.  I have to think your MPI is buggy.
>>>
>>>
>>>


Re: [petsc-users] Tough to reproduce petsctablefind error

2020-07-20 Thread Fande Kong
The most frustrating part is that the issue is not reproducible.

Fande,

On Mon, Jul 20, 2020 at 12:36 PM Fande Kong  wrote:

> Hi Mark,
>
> Just to be clear, I do not think it is related to GAMG or PtAP. It is a
> communication issue:
>
> Reran the same code, and I just got :
>
> [252]PETSC ERROR: - Error Message
> --
> [252]PETSC ERROR: Petsc has generated inconsistent data
> [252]PETSC ERROR: Received vector entry 4469094877509280860 out of local
> range [255426072,256718616)]
> [252]PETSC ERROR: See https://www.mcs.anl.gov/petsc/documentation/faq.html
> for trouble shooting.
> [252]PETSC ERROR: Petsc Release Version 3.13.3, unknown
> [252]PETSC ERROR: ../../griffin-opt on a arch-moose named r5i4n13 by kongf
> Mon Jul 20 12:16:47 2020
> [252]PETSC ERROR: Configure options --download-hypre=1 --with-debugging=no
> --with-shared-libraries=1 --download-fblaslapack=1 --download-metis=1
> --download-ptscotch=1 --download-parmetis=1 --download-superlu_dist=1
> --download-mumps=1 --download-scalapack=1 --download-slepc=1 --with-mpi=1
> --with-cxx-dialect=C++11 --with-fortran-bindings=0 --with-sowing=0
> --with-64-bit-indices --download-mumps=0
> [252]PETSC ERROR: #1 VecAssemblyEnd_MPI_BTS() line 324 in
> /home/kongf/workhome/sawtooth/moosers/petsc/src/vec/vec/impls/mpi/pbvec.c
> [252]PETSC ERROR: #2 VecAssemblyEnd() line 171 in
> /home/kongf/workhome/sawtooth/moosers/petsc/src/vec/vec/interface/vector.c
> [cli_252]: aborting job:
> application called MPI_Abort(MPI_COMM_WORLD, 1) - process 252
>
>
> Thanks,
>
> Fande,
>
> On Mon, Jul 20, 2020 at 12:24 PM Mark Adams  wrote:
>
>> OK, so this is happening in MatProductNumeric_PtAP. This must be in
>> constructing the coarse grid.
>>
>> GAMG sort of wants to coarse at a rate of 30:1 but that needs to be
>> verified. With that your index is at about the size of the first coarse
>> grid. I'm trying to figure out if the index is valid. But the size of the
>> max-index is 740521. This is about what I would guess is the size of the
>> second coarse grid.
>>
>> So it kinda looks like it has a "fine" grid index in the "coarse" grid
>> (2nd - 3rd coarse grids).
>>
>> But Chris is not using GAMG.
>>
>> Chris: It sounds like you just have one matrix that you give to MUMPS.
>> You seem to be creating a matrix in the middle of your run. Are you doing
>> dynamic adaptivity?
>>
>> I think we generate unique tags for each operation but it sounds like
>> maybe a message is getting mixed up in some way.
>>
>>
>>
>> On Mon, Jul 20, 2020 at 12:35 PM Fande Kong  wrote:
>>
>>> Hi Mark,
>>>
>>> Thanks for your reply.
>>>
>>> On Mon, Jul 20, 2020 at 7:13 AM Mark Adams  wrote:
>>>
>>>> Fande,
>>>> do you know if your 45226154 was out of range in the real  matrix?
>>>>
>>>
>>> I do not know since it was in building the AMG hierarchy.  The size of
>>> the original system is 1,428,284,880
>>>
>>>
>>>> What size integers do you use?
>>>>
>>>
>>> We are using 64-bit via "--with-64-bit-indices"
>>>
>>>
>>> I am trying to catch the cause of this issue by running more simulations
>>> with different configurations.
>>>
>>> Thanks,
>>>
>>> Fande,
>>>
>>>
>>> Thanks,
>>>> Mark
>>>>
>>>> On Mon, Jul 20, 2020 at 1:17 AM Fande Kong  wrote:
>>>>
>>>>> Trace could look like this:
>>>>>
>>>>> [640]PETSC ERROR: - Error Message
>>>>> --
>>>>>
>>>>> [640]PETSC ERROR: Argument out of range
>>>>>
>>>>> [640]PETSC ERROR: key 45226154 is greater than largest key allowed
>>>>> 740521
>>>>>
>>>>> [640]PETSC ERROR: See
>>>>> https://www.mcs.anl.gov/petsc/documentation/faq.html for trouble
>>>>> shooting.
>>>>>
>>>>> [640]PETSC ERROR: Petsc Release Version 3.13.3, unknown
>>>>>
>>>>> [640]PETSC ERROR: ../../griffin-opt on a arch-moose named r6i5n18 by
>>>>> wangy2 Sun Jul 19 17:14:28 2020
>>>>>
>>>>> [640]PETSC ERROR: Configure options --download-hypre=1
>>>>> --with-debugging=no --with-shared-libraries=1 --download-fblasla

Re: [petsc-users] Tough to reproduce petsctablefind error

2020-07-20 Thread Fande Kong
Hi Mark,

Just to be clear, I do not think it is related to GAMG or PtAP. It is a
communication issue:

Reran the same code, and I just got :

[252]PETSC ERROR: - Error Message
--
[252]PETSC ERROR: Petsc has generated inconsistent data
[252]PETSC ERROR: Received vector entry 4469094877509280860 out of local
range [255426072,256718616)]
[252]PETSC ERROR: See https://www.mcs.anl.gov/petsc/documentation/faq.html
for trouble shooting.
[252]PETSC ERROR: Petsc Release Version 3.13.3, unknown
[252]PETSC ERROR: ../../griffin-opt on a arch-moose named r5i4n13 by kongf
Mon Jul 20 12:16:47 2020
[252]PETSC ERROR: Configure options --download-hypre=1 --with-debugging=no
--with-shared-libraries=1 --download-fblaslapack=1 --download-metis=1
--download-ptscotch=1 --download-parmetis=1 --download-superlu_dist=1
--download-mumps=1 --download-scalapack=1 --download-slepc=1 --with-mpi=1
--with-cxx-dialect=C++11 --with-fortran-bindings=0 --with-sowing=0
--with-64-bit-indices --download-mumps=0
[252]PETSC ERROR: #1 VecAssemblyEnd_MPI_BTS() line 324 in
/home/kongf/workhome/sawtooth/moosers/petsc/src/vec/vec/impls/mpi/pbvec.c
[252]PETSC ERROR: #2 VecAssemblyEnd() line 171 in
/home/kongf/workhome/sawtooth/moosers/petsc/src/vec/vec/interface/vector.c
[cli_252]: aborting job:
application called MPI_Abort(MPI_COMM_WORLD, 1) - process 252


Thanks,

Fande,

On Mon, Jul 20, 2020 at 12:24 PM Mark Adams  wrote:

> OK, so this is happening in MatProductNumeric_PtAP. This must be in
> constructing the coarse grid.
>
> GAMG sort of wants to coarse at a rate of 30:1 but that needs to be
> verified. With that your index is at about the size of the first coarse
> grid. I'm trying to figure out if the index is valid. But the size of the
> max-index is 740521. This is about what I would guess is the size of the
> second coarse grid.
>
> So it kinda looks like it has a "fine" grid index in the "coarse" grid
> (2nd - 3rd coarse grids).
>
> But Chris is not using GAMG.
>
> Chris: It sounds like you just have one matrix that you give to MUMPS. You
> seem to be creating a matrix in the middle of your run. Are you doing
> dynamic adaptivity?
>
> I think we generate unique tags for each operation but it sounds like
> maybe a message is getting mixed up in some way.
>
>
>
> On Mon, Jul 20, 2020 at 12:35 PM Fande Kong  wrote:
>
>> Hi Mark,
>>
>> Thanks for your reply.
>>
>> On Mon, Jul 20, 2020 at 7:13 AM Mark Adams  wrote:
>>
>>> Fande,
>>> do you know if your 45226154 was out of range in the real  matrix?
>>>
>>
>> I do not know since it was in building the AMG hierarchy.  The size of
>> the original system is 1,428,284,880
>>
>>
>>> What size integers do you use?
>>>
>>
>> We are using 64-bit via "--with-64-bit-indices"
>>
>>
>> I am trying to catch the cause of this issue by running more simulations
>> with different configurations.
>>
>> Thanks,
>>
>> Fande,
>>
>>
>> Thanks,
>>> Mark
>>>
>>> On Mon, Jul 20, 2020 at 1:17 AM Fande Kong  wrote:
>>>
>>>> Trace could look like this:
>>>>
>>>> [640]PETSC ERROR: - Error Message
>>>> --
>>>>
>>>> [640]PETSC ERROR: Argument out of range
>>>>
>>>> [640]PETSC ERROR: key 45226154 is greater than largest key allowed
>>>> 740521
>>>>
>>>> [640]PETSC ERROR: See
>>>> https://www.mcs.anl.gov/petsc/documentation/faq.html for trouble
>>>> shooting.
>>>>
>>>> [640]PETSC ERROR: Petsc Release Version 3.13.3, unknown
>>>>
>>>> [640]PETSC ERROR: ../../griffin-opt on a arch-moose named r6i5n18 by
>>>> wangy2 Sun Jul 19 17:14:28 2020
>>>>
>>>> [640]PETSC ERROR: Configure options --download-hypre=1
>>>> --with-debugging=no --with-shared-libraries=1 --download-fblaslapack=1
>>>> --download-metis=1 --download-ptscotch=1 --download-parmetis=1
>>>> --download-superlu_dist=1 --download-mumps=1 --download-scalapack=1
>>>> --download-slepc=1 --with-mpi=1 --with-cxx-dialect=C++11
>>>> --with-fortran-bindings=0 --with-sowing=0 --with-64-bit-indices
>>>> --download-mumps=0
>>>>
>>>> [640]PETSC ERROR: #1 PetscTableFind() line 132 in
>>>> /home/wangy2/trunk/sawtooth/griffin/moose/petsc/include/petscctable.h
>>>>
>>>> [640]PETSC ERROR: #2 MatSetUpMultiply_MPIAIJ() line 33 in
>>>> 

Re: [petsc-users] Tough to reproduce petsctablefind error

2020-07-20 Thread Fande Kong
Hi Mark,

Thanks for your reply.

On Mon, Jul 20, 2020 at 7:13 AM Mark Adams  wrote:

> Fande,
> do you know if your 45226154 was out of range in the real  matrix?
>

I do not know since it was in building the AMG hierarchy.  The size of the
original system is 1,428,284,880


> What size integers do you use?
>

We are using 64-bit via "--with-64-bit-indices"


I am trying to catch the cause of this issue by running more simulations
with different configurations.

Thanks,

Fande,


Thanks,
> Mark
>
> On Mon, Jul 20, 2020 at 1:17 AM Fande Kong  wrote:
>
>> Trace could look like this:
>>
>> [640]PETSC ERROR: - Error Message
>> --
>>
>> [640]PETSC ERROR: Argument out of range
>>
>> [640]PETSC ERROR: key 45226154 is greater than largest key allowed 740521
>>
>> [640]PETSC ERROR: See
>> https://www.mcs.anl.gov/petsc/documentation/faq.html for trouble
>> shooting.
>>
>> [640]PETSC ERROR: Petsc Release Version 3.13.3, unknown
>>
>> [640]PETSC ERROR: ../../griffin-opt on a arch-moose named r6i5n18 by
>> wangy2 Sun Jul 19 17:14:28 2020
>>
>> [640]PETSC ERROR: Configure options --download-hypre=1
>> --with-debugging=no --with-shared-libraries=1 --download-fblaslapack=1
>> --download-metis=1 --download-ptscotch=1 --download-parmetis=1
>> --download-superlu_dist=1 --download-mumps=1 --download-scalapack=1
>> --download-slepc=1 --with-mpi=1 --with-cxx-dialect=C++11
>> --with-fortran-bindings=0 --with-sowing=0 --with-64-bit-indices
>> --download-mumps=0
>>
>> [640]PETSC ERROR: #1 PetscTableFind() line 132 in
>> /home/wangy2/trunk/sawtooth/griffin/moose/petsc/include/petscctable.h
>>
>> [640]PETSC ERROR: #2 MatSetUpMultiply_MPIAIJ() line 33 in
>> /home/wangy2/trunk/sawtooth/griffin/moose/petsc/src/mat/impls/aij/mpi/mmaij.c
>>
>> [640]PETSC ERROR: #3 MatAssemblyEnd_MPIAIJ() line 876 in
>> /home/wangy2/trunk/sawtooth/griffin/moose/petsc/src/mat/impls/aij/mpi/mpiaij.c
>>
>> [640]PETSC ERROR: #4 MatAssemblyEnd() line 5347 in
>> /home/wangy2/trunk/sawtooth/griffin/moose/petsc/src/mat/interface/matrix.c
>>
>> [640]PETSC ERROR: #5 MatPtAPNumeric_MPIAIJ_MPIXAIJ_allatonce() line 901
>> in
>> /home/wangy2/trunk/sawtooth/griffin/moose/petsc/src/mat/impls/aij/mpi/mpiptap.c
>>
>> [640]PETSC ERROR: #6 MatPtAPNumeric_MPIAIJ_MPIMAIJ_allatonce() line 3180
>> in /home/wangy2/trunk/sawtooth/griffin/moose/petsc/src/mat/impls/maij/maij.c
>>
>> [640]PETSC ERROR: #7 MatProductNumeric_PtAP() line 704 in
>> /home/wangy2/trunk/sawtooth/griffin/moose/petsc/src/mat/interface/matproduct.c
>>
>> [640]PETSC ERROR: #8 MatProductNumeric() line 759 in
>> /home/wangy2/trunk/sawtooth/griffin/moose/petsc/src/mat/interface/matproduct.c
>>
>> [640]PETSC ERROR: #9 MatPtAP() line 9199 in
>> /home/wangy2/trunk/sawtooth/griffin/moose/petsc/src/mat/interface/matrix.c
>>
>> [640]PETSC ERROR: #10 MatGalerkin() line 10236 in
>> /home/wangy2/trunk/sawtooth/griffin/moose/petsc/src/mat/interface/matrix.c
>>
>> [640]PETSC ERROR: #11 PCSetUp_MG() line 745 in
>> /home/wangy2/trunk/sawtooth/griffin/moose/petsc/src/ksp/pc/impls/mg/mg.c
>>
>> [640]PETSC ERROR: #12 PCSetUp_HMG() line 220 in
>> /home/wangy2/trunk/sawtooth/griffin/moose/petsc/src/ksp/pc/impls/hmg/hmg.c
>>
>> [640]PETSC ERROR: #13 PCSetUp() line 898 in
>> /home/wangy2/trunk/sawtooth/griffin/moose/petsc/src/ksp/pc/interface/precon.c
>>
>> [640]PETSC ERROR: #14 KSPSetUp() line 376 in
>> /home/wangy2/trunk/sawtooth/griffin/moose/petsc/src/ksp/ksp/interface/itfunc.c
>>
>> [640]PETSC ERROR: #15 KSPSolve_Private() line 633 in
>> /home/wangy2/trunk/sawtooth/griffin/moose/petsc/src/ksp/ksp/interface/itfunc.c
>>
>> [640]PETSC ERROR: #16 KSPSolve() line 853 in
>> /home/wangy2/trunk/sawtooth/griffin/moose/petsc/src/ksp/ksp/interface/itfunc.c
>>
>> [640]PETSC ERROR: #17 SNESSolve_NEWTONLS() line 225 in
>> /home/wangy2/trunk/sawtooth/griffin/moose/petsc/src/snes/impls/ls/ls.c
>>
>> [640]PETSC ERROR: #18 SNESSolve() line 4519 in
>> /home/wangy2/trunk/sawtooth/griffin/moose/petsc/src/snes/interface/snes.c
>>
>> On Sun, Jul 19, 2020 at 11:13 PM Fande Kong  wrote:
>>
>>> I am not entirely sure what is happening, but we encountered similar
>>> issues recently.  It was not reproducible. It might occur at different
>>> stages, and errors could be weird other than "ctable stuff." Our code was
>>> Valgrind clean since every PR in moose needs to go through rigorous

Re: [petsc-users] Tough to reproduce petsctablefind error

2020-07-19 Thread Fande Kong
Trace could look like this:

[640]PETSC ERROR: - Error Message
--

[640]PETSC ERROR: Argument out of range

[640]PETSC ERROR: key 45226154 is greater than largest key allowed 740521

[640]PETSC ERROR: See https://www.mcs.anl.gov/petsc/documentation/faq.html for
trouble shooting.

[640]PETSC ERROR: Petsc Release Version 3.13.3, unknown

[640]PETSC ERROR: ../../griffin-opt on a arch-moose named r6i5n18 by wangy2
Sun Jul 19 17:14:28 2020

[640]PETSC ERROR: Configure options --download-hypre=1 --with-debugging=no
--with-shared-libraries=1 --download-fblaslapack=1 --download-metis=1
--download-ptscotch=1 --download-parmetis=1 --download-superlu_dist=1
--download-mumps=1 --download-scalapack=1 --download-slepc=1 --with-mpi=1
--with-cxx-dialect=C++11 --with-fortran-bindings=0 --with-sowing=0
--with-64-bit-indices --download-mumps=0

[640]PETSC ERROR: #1 PetscTableFind() line 132 in
/home/wangy2/trunk/sawtooth/griffin/moose/petsc/include/petscctable.h

[640]PETSC ERROR: #2 MatSetUpMultiply_MPIAIJ() line 33 in
/home/wangy2/trunk/sawtooth/griffin/moose/petsc/src/mat/impls/aij/mpi/mmaij.c

[640]PETSC ERROR: #3 MatAssemblyEnd_MPIAIJ() line 876 in
/home/wangy2/trunk/sawtooth/griffin/moose/petsc/src/mat/impls/aij/mpi/mpiaij.c

[640]PETSC ERROR: #4 MatAssemblyEnd() line 5347 in
/home/wangy2/trunk/sawtooth/griffin/moose/petsc/src/mat/interface/matrix.c

[640]PETSC ERROR: #5 MatPtAPNumeric_MPIAIJ_MPIXAIJ_allatonce() line 901 in
/home/wangy2/trunk/sawtooth/griffin/moose/petsc/src/mat/impls/aij/mpi/mpiptap.c

[640]PETSC ERROR: #6 MatPtAPNumeric_MPIAIJ_MPIMAIJ_allatonce() line 3180 in
/home/wangy2/trunk/sawtooth/griffin/moose/petsc/src/mat/impls/maij/maij.c

[640]PETSC ERROR: #7 MatProductNumeric_PtAP() line 704 in
/home/wangy2/trunk/sawtooth/griffin/moose/petsc/src/mat/interface/matproduct.c

[640]PETSC ERROR: #8 MatProductNumeric() line 759 in
/home/wangy2/trunk/sawtooth/griffin/moose/petsc/src/mat/interface/matproduct.c

[640]PETSC ERROR: #9 MatPtAP() line 9199 in
/home/wangy2/trunk/sawtooth/griffin/moose/petsc/src/mat/interface/matrix.c

[640]PETSC ERROR: #10 MatGalerkin() line 10236 in
/home/wangy2/trunk/sawtooth/griffin/moose/petsc/src/mat/interface/matrix.c

[640]PETSC ERROR: #11 PCSetUp_MG() line 745 in
/home/wangy2/trunk/sawtooth/griffin/moose/petsc/src/ksp/pc/impls/mg/mg.c

[640]PETSC ERROR: #12 PCSetUp_HMG() line 220 in
/home/wangy2/trunk/sawtooth/griffin/moose/petsc/src/ksp/pc/impls/hmg/hmg.c

[640]PETSC ERROR: #13 PCSetUp() line 898 in
/home/wangy2/trunk/sawtooth/griffin/moose/petsc/src/ksp/pc/interface/precon.c

[640]PETSC ERROR: #14 KSPSetUp() line 376 in
/home/wangy2/trunk/sawtooth/griffin/moose/petsc/src/ksp/ksp/interface/itfunc.c

[640]PETSC ERROR: #15 KSPSolve_Private() line 633 in
/home/wangy2/trunk/sawtooth/griffin/moose/petsc/src/ksp/ksp/interface/itfunc.c

[640]PETSC ERROR: #16 KSPSolve() line 853 in
/home/wangy2/trunk/sawtooth/griffin/moose/petsc/src/ksp/ksp/interface/itfunc.c

[640]PETSC ERROR: #17 SNESSolve_NEWTONLS() line 225 in
/home/wangy2/trunk/sawtooth/griffin/moose/petsc/src/snes/impls/ls/ls.c

[640]PETSC ERROR: #18 SNESSolve() line 4519 in
/home/wangy2/trunk/sawtooth/griffin/moose/petsc/src/snes/interface/snes.c

On Sun, Jul 19, 2020 at 11:13 PM Fande Kong  wrote:

> I am not entirely sure what is happening, but we encountered similar
> issues recently.  It was not reproducible. It might occur at different
> stages, and errors could be weird other than "ctable stuff." Our code was
> Valgrind clean since every PR in moose needs to go through rigorous
> Valgrind checks before it reaches the devel branch.  The errors happened
> when we used mvapich.
>
> We changed to use HPE-MPT (a vendor-installed MPI), then everything was
> smooth.  Could you try a different MPI? It is better to try a system-provided
> one.
>
> We have not gotten to the bottom of this problem yet, but we at least know
> it is somehow MPI-related.
>
> Thanks,
>
> Fande,
>
>
> On Sun, Jul 19, 2020 at 3:28 PM Chris Hewson  wrote:
>
>> Hi,
>>
>> I am having a bug that is occurring in PETSC with the return string:
>>
>> [7]PETSC ERROR: PetscTableFind() line 132 in
>> /home/chewson/petsc-3.13.2/include/petscctable.h key 7556 is greater than
>> largest key allowed 5693
>>
>> This is using petsc-3.13.2, compiled and running using mpich with -O3 and
>> debugging turned off tuned to the haswell architecture and occurring either
>> before or during a KSPBCGS solve/setup or during a MUMPS factorization
>> solve (I haven't been able to replicate this issue with the same set of
>> instructions etc.).
>>
>> This is a terrible way to ask a question, I know, and not very helpful
>> from your side, but this is what I have from a user's run and can't
>> reproduce on my end (either with the optimiza

Re: [petsc-users] Tough to reproduce petsctablefind error

2020-07-19 Thread Fande Kong
I am not entirely sure what is happening, but we encountered similar issues
recently.  It was not reproducible. It might occur at different stages, and
errors could be weird other than "ctable stuff." Our code was Valgrind
clean since every PR in moose needs to go through rigorous Valgrind checks
before it reaches the devel branch.  The errors happened when we used
mvapich.

We changed to use HPE-MPT (a vendor-installed MPI), and then everything was
smooth.  Could you try a different MPI? It is better to try a system-provided
one.

We have not gotten to the bottom of this problem yet, but we at least know it
is somehow MPI-related.

Thanks,

Fande,


On Sun, Jul 19, 2020 at 3:28 PM Chris Hewson  wrote:

> Hi,
>
> I am having a bug that is occurring in PETSC with the return string:
>
> [7]PETSC ERROR: PetscTableFind() line 132 in
> /home/chewson/petsc-3.13.2/include/petscctable.h key 7556 is greater than
> largest key allowed 5693
>
> This is using petsc-3.13.2, compiled and running using mpich with -O3 and
> debugging turned off tuned to the haswell architecture and occurring either
> before or during a KSPBCGS solve/setup or during a MUMPS factorization
> solve (I haven't been able to replicate this issue with the same set of
> instructions etc.).
>
> This is a terrible way to ask a question, I know, and not very helpful
> from your side, but this is what I have from a user's run and can't
> reproduce on my end (either with the optimization compilation or with
> debugging turned on). This happens when the code has run for quite some
> time and is happening somewhat rarely.
>
> More than likely I am using a static variable (code is written in c++)
> that I'm not updating when the matrix size is changing or something silly
> like that, but any help or guidance on this would be appreciated.
>
> *Chris Hewson*
> Senior Reservoir Simulation Engineer
> ResFrac
> +1.587.575.9792
>


[petsc-users] VecAssemblyEnd_MPI_BTS

2020-07-14 Thread Fande Kong
Hi All,


I was doing a large-scale simulation using 12288 cores and had the
following error. The code ran fine using less than 12288 cores.

Any quick suggestions to track down this issue?

Thanks,

Fande,


[3342]PETSC ERROR: - Error Message
--
[3342]PETSC ERROR: Petsc has generated inconsistent data
[3342]PETSC ERROR: Received vector entry 0 out of local range
[344829312,344964096)]
[3342]PETSC ERROR: See https://www.mcs.anl.gov/petsc/documentation/faq.html
for trouble shooting.
[3342]PETSC ERROR: Petsc Release Version 3.13.3, unknown
[3342]PETSC ERROR: /home/kongf/workhome/sawtooth/griffin/griffin-opt on a
arch-moose named r1i4n34 by kongf Tue Jul 14 08:44:02 2020
[3342]PETSC ERROR: Configure options --download-hypre=1 --with-debugging=no
--with-shared-libraries=1 --download-fblaslapack=1 --download-metis=1
--download-ptscotch=1 --download-parmetis=1 --download-superlu_dist=1
--download-mumps=1 --download-scalapack=1 --download-slepc=1 --with-mpi=1
--with-cxx-dialect=C++11 --with-fortran-bindings=0 --with-sowing=0
--with-64-bit-indices --download-mumps=0
[3342]PETSC ERROR: #1 VecAssemblyEnd_MPI_BTS() line 324 in
/home/kongf/workhome/sawtooth/moosers/petsc/src/vec/vec/impls/mpi/pbvec.c
[3342]PETSC ERROR: #2 VecAssemblyEnd() line 171 in
/home/kongf/workhome/sawtooth/moosers/petsc/src/vec/vec/interface/vector.c
[cli_3342]: aborting job:
application called MPI_Abort(MPI_COMM_WORLD, 1) - process 3342


Re: [petsc-users] Make stream

2020-06-16 Thread Fande Kong
Thanks, Jed,

That is fascinating. I will check whether I can do anything to get this kind of
improvement as well.

Thanks,

Fande,

On Fri, Jun 12, 2020 at 7:43 PM Jed Brown  wrote:

> Jed Brown  writes:
>
> > Fande Kong  writes:
> >
> >>> There's a lot more to AMG setup than memory bandwidth (architecture
> >>> matters a lot, even between different generation CPUs).
> >>
> >>
> >> Could you elaborate a bit more on this? From my understanding, one big
> part
> >> of AMG SetUp is RAP that should be pretty much bandwidth.
> >
> > The RAP isn't "pretty much bandwidth".  See below for some
> > Skylake/POWER9/EPYC results and analysis (copied from an off-list
> > thread).  I'll leave in some other bandwidth comments that may or may
> > not be relevant to you.  The short story is that Skylake and EPYC are
> > both much better than POWER9 at MatPtAP despite POWER9 having similar
> > bandwidth as EPYC and thus being significantly faster than Skylake for
> > MatMult/smoothing.
> >
> >
> > Jed Brown  writes:
> >
> >> I'm attaching a log from my machine (Noether), which is 2-socket EPYC
> >> 7452 (32 cores each).  Each socket has 8xDDR4-3200 and 128 MB of L3
> >> cache.  This is the same node architecture as the new BER/E3SM machine
> >> being installed at Argonne (though that one will probably have
> >> higher-clocked and/or more cores per socket).  Note that these CPUs are
> >> about $2k each while Skylake 8180 are about $10k.
> >>
> >> Some excerpts/comments below.
> >>
> >
> >  [...]
> >
> >  In addition to the notes below, I'd like to call out how important
> >  streaming stores are on EPYC.  With vanilla code or _mm256_store_pd, we
> >  get the following performance
> >
> >$ mpiexec -n 64 --bind-to core --map-by core:1
> src/benchmarks/streams/MPIVersion
> >Copy 162609.2392   Scale 159119.8259   Add 174687.6250   Triad
> 175840.1587
> >
> >  but replacing _mm256_store_pd with _mm256_stream_pd gives this
> >
> >$ mpiexec -n 64 --bind-to core --map-by core:1
> src/benchmarks/streams/MPIVersion
> >Copy 259951.9936   Scale 259381.0589   Add 250216.3389   Triad
> 249292.9701
>
> I turned on NPS4 (a BIOS setting that creates a NUMA node for each pair
> of memory channels) and get a modest performance boost.
>
> $ mpiexec -n 64 --bind-to core --map-by core:1
> src/benchmarks/streams/MPIVersion
>
> Copy 289645.3776   Scale 289186.2783   Add 273220.0133   Triad 272911.2263
>
> On this architecture, best performance comes from one process per 4-core
> CCX (shared L3).
>
> $ mpiexec -n 16 --bind-to core --map-by core:4
> src/benchmarks/streams/MPIVersion
>
> Copy 300704.8859   Scale 304556.3380   Add 295970.1132   Triad 298891.3821
>
> >  This is just preposterously huge, but very repeatable using gcc and
> >  clang, and inspecting the assembly.  This suggests that it would be
> >  useful for vector kernels to have streaming and non-streaming variants.
> >  That is, if I drop the vector length by 20 (so the working set is 2.3
> >  MB/core instead of 46 MB in the default version), then we get 2.4 TB/s
> >  Triad with _mm256_store_pd:
> >
> >$ mpiexec -n 64 --bind-to core --map-by core:1
> src/benchmarks/streams/MPIVersion
> >Copy 2159915.7058   Scale 2212671.7087   Add 2414758.2757   Triad
> 2402671.1178
> >
> >  and a thoroughly embarrassing 353 GB/s with _mm256_stream_pd:
> >
> >$ mpiexec -n 64 --bind-to core --map-by core:1
> src/benchmarks/streams/MPIVersion
> >Copy 235934.6653   Scale 237446.8507   Add 352805.7288   Triad
> 352992.9692
> >
> >
> >  I don't know a good way to automatically determine whether to expect the
> >  memory to be in cache, but we could make it a global (or per-object)
> >  run-time selection.
> >
> >> Jed Brown  writes:
> >>
> >>> "Smith, Barry F."  writes:
> >>>
> >>>>Thanks. The PowerPC is pretty crappy compared to Skylake.
> >>>
> >>> Compare the MGSmooth times.  The POWER9 is faster than the Skylake
> >>> because it has more memory bandwidth.
> >>>
> >>> $ rg 'MGInterp Level 4|MGSmooth Level 4' ex56*
> >>> ex56-JLSE-skylake-56ranks-converged.txt
> >>> 254:MGSmooth Level 4  68 1.0 1.8808e+00 1.2 7.93e+08 1.3 3.6e+04
> 1.9e+04 3.4e+01  8 29 10 16  3  62 60 18 54 25 22391
> >>> 256:MGInterp Level 4  68 1.0 4.0043e-01 1.8 1.45e+08 1.3 2.2e+04
> 2.5e+03 0.0e+00  1  5

Re: [petsc-users] Make stream

2020-06-09 Thread Fande Kong
Thanks, Jed,

On Tue, Jun 9, 2020 at 3:19 PM Jed Brown  wrote:

> Fande Kong  writes:
>
> > Hi All,
> >
> > I am trying to interpret the results from "make stream" on two compute
> > nodes, where each node has 48 cores.
> >
> > If my calculations are memory bandwidth limited, such as AMG, MatVec,
> > GMRES, etc..
>
> There's a lot more to AMG setup than memory bandwidth (architecture
> matters a lot, even between different generation CPUs).


Could you elaborate a bit more on this? From my understanding, one big part
of AMG SetUp is RAP, which should be pretty much bandwidth-limited.

So the graph coarsening part is what is affected by the architecture?

MatMult and
> Krylov are almost pure bandwidth.
>
> > The best speedup I could get is 16.6938 if I start from one core?? The
> > speedup for function evaluations and Jacobian evaluations can be better
> > than16.6938?
>
> Residual and Jacobians can be faster, especially if your code is slow
> (poorly vectorized, branchy, or has a lot of arithmetic).
>

It will be branchy when we handle complicated multiphysics.


>
> Are you trying to understand perf on current hardware or make decisions
> about new hardware?
>

The nodes are INL supercomputer nodes. I am trying to understand the best
speedup I could expect when running MOOSE/PETSc on that machine for the linear
algebra part.


Thanks,

Fande,


Re: [petsc-users] Make stream

2020-06-09 Thread Fande Kong
Thanks so much, Barry,

On Tue, Jun 9, 2020 at 3:08 PM Barry Smith  wrote:

>
>You might look at the notes about MPI binding. It might give you a bit
> better performance.
> https://www.mcs.anl.gov/petsc/documentation/faq.html#computers
>

I am using mvapich2, and I am still trying to figure out which binding
command-line options I can use.




>
>    The streams is exactly the DAXPY operation, so this is the speedup you
> should expect for VecAXPY(), which has 2 loads and 1 store per 1 multiply and
> 1 add
>
>VecDot() has 2 loads per 1 multiply and 1 add but also a global
> reduction
>
>Sparse multiply with AIJ has an integer load, 2 double loads plus 1
> store per row with 1 multiply and 1 add plus communication needed for
> off-process portion
>
>Function evaluations often have higher arithmetic intensity so should
> give a bit higher speedup
>
>Jacobian evaluations often have higher arithmetic intensity but they
> may have MatSetValues() which is slow because no arithmetic intensity just
> memory motion
>

Got it.

Thanks,

Fande,
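
As a back-of-the-envelope check against the counts Barry gives above (my own
arithmetic, not from the thread): VecAXPY moves 2 loads + 1 store = 24 bytes of
vector data per multiply-add pair, i.e. per 2 flops, so the streams rates in
the table below bound its flop rate at roughly

  (2 flops / 24 bytes) * 324 GB/s  ~ 27 GFlop/s on all 96 cores, and
  (2 flops / 24 bytes) * 19.4 GB/s ~ 1.6 GFlop/s on one core,

the same 16.7x ratio the benchmark reports. Kernels with higher arithmetic
intensity (residuals, Jacobians) can scale better because they do more flops
per byte moved.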


>
>Barry
>
>
>
> On Jun 9, 2020, at 3:43 PM, Fande Kong  wrote:
>
> Hi All,
>
> I am trying to interpret the results from "make stream" on two compute
> nodes, where each node has 48 cores.
>
> If my calculations are memory bandwidth limited, such as AMG, MatVec,
> GMRES, etc..
> The best speedup I could get is 16.6938 if I start from one core?? The
> speedup for function evaluations and Jacobian evaluations can be better
> than16.6938?
>
> Thanks,
>
> Fande,
>
>
>
> Running streams with 'mpiexec ' using 'NPMAX=96'
> 1 19412.4570  Rate (MB/s)
> 2 29457.3988  Rate (MB/s) 1.51744
> 3 40483.9318  Rate (MB/s) 2.08546
> 4 51429.3431  Rate (MB/s) 2.64929
> 5 59849.5168  Rate (MB/s) 3.08304
> 6 66124.3461  Rate (MB/s) 3.40628
> 7 70888.1170  Rate (MB/s) 3.65167
> 8 73436.2374  Rate (MB/s) 3.78294
> 9 77441.7622  Rate (MB/s) 3.98927
> 10 78115.3114  Rate (MB/s) 4.02397
> 11 81449.3315  Rate (MB/s) 4.19572
> 12 82812.3471  Rate (MB/s) 4.26593
> 13 81442.2114  Rate (MB/s) 4.19535
> 14 83404.1657  Rate (MB/s) 4.29642
> 15 84165.8536  Rate (MB/s) 4.33565
> 16 83739.2910  Rate (MB/s) 4.31368
> 17 83724.8109  Rate (MB/s) 4.31293
> 18 83225.0743  Rate (MB/s) 4.28719
> 19 81668.2002  Rate (MB/s) 4.20699
> 20 83678.8007  Rate (MB/s) 4.31056
> 21 81400.4590  Rate (MB/s) 4.1932
> 22 81944.8975  Rate (MB/s) 4.22124
> 23 81359.8615  Rate (MB/s) 4.19111
> 24 80674.5064  Rate (MB/s) 4.1558
> 25 83761.3316  Rate (MB/s) 4.31481
> 26 87567.4876  Rate (MB/s) 4.51088
> 27 89605.4435  Rate (MB/s) 4.61586
> 28 94984.9755  Rate (MB/s) 4.89298
> 29 98260.5283  Rate (MB/s) 5.06171
> 30 99852.8790  Rate (MB/s) 5.14374
> 31 102736.3576  Rate (MB/s) 5.29228
> 32 108638.7488  Rate (MB/s) 5.59633
> 33 110431.2938  Rate (MB/s) 5.68867
> 34 112824.2031  Rate (MB/s) 5.81194
> 35 116908.3009  Rate (MB/s) 6.02232
> 36 121312.6574  Rate (MB/s) 6.2492
> 37 122507.3172  Rate (MB/s) 6.31074
> 38 127456.2504  Rate (MB/s) 6.56568
> 39 130098.7056  Rate (MB/s) 6.7018
> 40 134956.4461  Rate (MB/s) 6.95204
> 41 138309.2465  Rate (MB/s) 7.12475
> 42 141779.7997  Rate (MB/s) 7.30353
> 43 145653.3687  Rate (MB/s) 7.50307
> 44 149131.2087  Rate (MB/s) 7.68223
> 45 151611.6104  Rate (MB/s) 7.81
> 46 14.6394  Rate (MB/s) 8.01312
> 47 159033.1938  Rate (MB/s) 8.19231
> 48 162216.5600  Rate (MB/s) 8.35629
> 49 165034.8116  Rate (MB/s) 8.50147
> 50 168001.4823  Rate (MB/s) 8.65429
> 51 170899.9045  Rate (MB/s) 8.8036
> 52 175687.8033  Rate (MB/s) 9.05024
> 53 178203.9203  Rate (MB/s) 9.17985
> 54 179973.3914  Rate (MB/s) 9.27101
> 55 182207.3495  Rate (MB/s) 9.38608
> 56 185712.9643  Rate (MB/s) 9.56667
> 57 188805.5696  Rate (MB/s) 9.72598
> 58 193360.9158  Rate (MB/s) 9.96064
> 59 198160.8016  Rate (MB/s) 10.2079
> 60 201297.0129  Rate (MB/s) 10.3695
> 61 203618.7672  Rate (MB/s) 10.4891
> 62 209599.2783  Rate (MB/s) 10.7971
> 63 211651.1587  Rate (MB/s) 10.9028
> 64 210254.5035  Rate (MB/s) 10.8309
> 65 218576.4938  Rate (MB/s) 11.2596
> 66 220280.0853  Rate (MB/s) 11.3473
> 67 221281.1867  Rate (MB/s) 11.3989
> 68 228941.1872  Rate (MB/s) 11.7935
> 69 232206.2708  Rate (MB/s) 11.9617
> 70 233569.5866  Rate (MB/s) 12.0319
> 71 238293.6355  Rate (MB/s) 12.2753
> 72 238987.0729  Rate (MB/s) 12.311
> 73 246013.4684  Rate (MB/s) 12.6729
> 74 248850.8942  Rate (MB/s) 12.8191
> 75 249355.6899  Rate (MB/s) 12.8451
> 76 252515.6110  Rate (MB/s) 13.0079
> 77 257489.4268  Rate (MB/s) 13.2641
> 78 260884.2771  Rate (MB/s) 13.439
> 79 264341.8661  Rate (MB/s) 13.6171
> 80 269329.1376  Rate (MB/s) 13.874
> 81 272286.4070  Rate (MB/s) 1

[petsc-users] Make stream

2020-06-09 Thread Fande Kong
Hi All,

I am trying to interpret the results from "make stream" on two compute
nodes, where each node has 48 cores.

If my calculations are memory-bandwidth limited, such as AMG, MatVec,
GMRES, etc., is the best speedup I could get 16.6938 if I start from one core?
Can the speedup for function evaluations and Jacobian evaluations be better
than 16.6938?

Thanks,

Fande,



Running streams with 'mpiexec ' using 'NPMAX=96'
1 19412.4570  Rate (MB/s)
2 29457.3988  Rate (MB/s) 1.51744
3 40483.9318  Rate (MB/s) 2.08546
4 51429.3431  Rate (MB/s) 2.64929
5 59849.5168  Rate (MB/s) 3.08304
6 66124.3461  Rate (MB/s) 3.40628
7 70888.1170  Rate (MB/s) 3.65167
8 73436.2374  Rate (MB/s) 3.78294
9 77441.7622  Rate (MB/s) 3.98927
10 78115.3114  Rate (MB/s) 4.02397
11 81449.3315  Rate (MB/s) 4.19572
12 82812.3471  Rate (MB/s) 4.26593
13 81442.2114  Rate (MB/s) 4.19535
14 83404.1657  Rate (MB/s) 4.29642
15 84165.8536  Rate (MB/s) 4.33565
16 83739.2910  Rate (MB/s) 4.31368
17 83724.8109  Rate (MB/s) 4.31293
18 83225.0743  Rate (MB/s) 4.28719
19 81668.2002  Rate (MB/s) 4.20699
20 83678.8007  Rate (MB/s) 4.31056
21 81400.4590  Rate (MB/s) 4.1932
22 81944.8975  Rate (MB/s) 4.22124
23 81359.8615  Rate (MB/s) 4.19111
24 80674.5064  Rate (MB/s) 4.1558
25 83761.3316  Rate (MB/s) 4.31481
26 87567.4876  Rate (MB/s) 4.51088
27 89605.4435  Rate (MB/s) 4.61586
28 94984.9755  Rate (MB/s) 4.89298
29 98260.5283  Rate (MB/s) 5.06171
30 99852.8790  Rate (MB/s) 5.14374
31 102736.3576  Rate (MB/s) 5.29228
32 108638.7488  Rate (MB/s) 5.59633
33 110431.2938  Rate (MB/s) 5.68867
34 112824.2031  Rate (MB/s) 5.81194
35 116908.3009  Rate (MB/s) 6.02232
36 121312.6574  Rate (MB/s) 6.2492
37 122507.3172  Rate (MB/s) 6.31074
38 127456.2504  Rate (MB/s) 6.56568
39 130098.7056  Rate (MB/s) 6.7018
40 134956.4461  Rate (MB/s) 6.95204
41 138309.2465  Rate (MB/s) 7.12475
42 141779.7997  Rate (MB/s) 7.30353
43 145653.3687  Rate (MB/s) 7.50307
44 149131.2087  Rate (MB/s) 7.68223
45 151611.6104  Rate (MB/s) 7.81
46 14.6394  Rate (MB/s) 8.01312
47 159033.1938  Rate (MB/s) 8.19231
48 162216.5600  Rate (MB/s) 8.35629
49 165034.8116  Rate (MB/s) 8.50147
50 168001.4823  Rate (MB/s) 8.65429
51 170899.9045  Rate (MB/s) 8.8036
52 175687.8033  Rate (MB/s) 9.05024
53 178203.9203  Rate (MB/s) 9.17985
54 179973.3914  Rate (MB/s) 9.27101
55 182207.3495  Rate (MB/s) 9.38608
56 185712.9643  Rate (MB/s) 9.56667
57 188805.5696  Rate (MB/s) 9.72598
58 193360.9158  Rate (MB/s) 9.96064
59 198160.8016  Rate (MB/s) 10.2079
60 201297.0129  Rate (MB/s) 10.3695
61 203618.7672  Rate (MB/s) 10.4891
62 209599.2783  Rate (MB/s) 10.7971
63 211651.1587  Rate (MB/s) 10.9028
64 210254.5035  Rate (MB/s) 10.8309
65 218576.4938  Rate (MB/s) 11.2596
66 220280.0853  Rate (MB/s) 11.3473
67 221281.1867  Rate (MB/s) 11.3989
68 228941.1872  Rate (MB/s) 11.7935
69 232206.2708  Rate (MB/s) 11.9617
70 233569.5866  Rate (MB/s) 12.0319
71 238293.6355  Rate (MB/s) 12.2753
72 238987.0729  Rate (MB/s) 12.311
73 246013.4684  Rate (MB/s) 12.6729
74 248850.8942  Rate (MB/s) 12.8191
75 249355.6899  Rate (MB/s) 12.8451
76 252515.6110  Rate (MB/s) 13.0079
77 257489.4268  Rate (MB/s) 13.2641
78 260884.2771  Rate (MB/s) 13.439
79 264341.8661  Rate (MB/s) 13.6171
80 269329.1376  Rate (MB/s) 13.874
81 272286.4070  Rate (MB/s) 14.0263
82 273325.7822  Rate (MB/s) 14.0799
83 277334.6699  Rate (MB/s) 14.2864
84 280254.7286  Rate (MB/s) 14.4368
85 282219.8194  Rate (MB/s) 14.538
86 289039.2677  Rate (MB/s) 14.8893
87 291234.4715  Rate (MB/s) 15.0024
88 295941.1159  Rate (MB/s) 15.2449
89 298136.3163  Rate (MB/s) 15.358
90 302820.9080  Rate (MB/s) 15.5993
91 306387.5008  Rate (MB/s) 15.783
92 310127.0223  Rate (MB/s) 15.9756
93 310219.3643  Rate (MB/s) 15.9804
94 317089.5971  Rate (MB/s) 16.3343
95 315457.0938  Rate (MB/s) 16.2502
96 324068.8172  Rate (MB/s) 16.6938


Re: [petsc-users] SuperLU + GPUs

2020-04-19 Thread Fande Kong
Hi Mark,

This should help:  -pc_factor_mat_solver_type superlu_dist


Thanks,

Fande 
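
The same choice can also be made in code instead of on the command line; a
minimal sketch, assuming the application already has a KSP whose PC is (or will
be) LU (error checking abbreviated):

  ierr = KSPGetPC(ksp,&pc);CHKERRQ(ierr);
  ierr = PCSetType(pc,PCLU);CHKERRQ(ierr);
  ierr = PCFactorSetMatSolverType(pc,MATSOLVERSUPERLU_DIST);CHKERRQ(ierr);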


> On Apr 19, 2020, at 9:41 AM, Mark Adams  wrote:
> 
> 
>> 
>> 
>> > > --download-superlu --download-superlu_dist 
>> 
>> You are installing with both superlu and superlu_dist. To verify - remove 
>> superlu - and keep only superlu_dist
> 
> I tried this earlier. Here is the error message:
> 
>0 SNES Function norm 1.511918966798e-02
> [0]PETSC ERROR: - Error Message 
> --
> [0]PETSC ERROR: See 
> https://www.mcs.anl.gov/petsc/documentation/linearsolvertable.html for 
> possible LU and Cholesky solvers
> [0]PETSC ERROR: Could not locate solver package superlu for factorization 
> type LU and matrix type seqaij. Perhaps you must ./configure with 
> --download-superlu
> [0]PETSC ERROR: See https://www.mcs.anl.gov/petsc/documentation/faq.html for 
> trouble shooting.
> [0]PETSC ERROR: Petsc Development GIT revision: v3.13-163-g4c71feb  GIT Date: 
> 2020-04-18 15:35:50 -0400
> [0]PETSC ERROR: ./ex112d on a arch-summit-opt-gnu-cuda-omp-2db named h23n05 
> by adams Sun Apr 19 11:39:05 2020
> [0]PETSC ERROR: Configure options --with-fc=0 --COPTFLAGS="-g -O2 -fPIC 
> -fopenmp -DFP_DIM=2" --CXXOPTFLAGS="-g -O2 -fPIC -fopenmp" --FOPTFLAGS="-g 
> -O2 -fPIC -fopenmp" --CUDAOPTFLAGS="-O2 -g" --with-ssl=0 --with-batch=0 
> --with-cxx=mpicxx --with-mpiexec="jsrun -g1" --with-cuda=1 --with-cudac=nvcc 
> --download-p4est=1 --download-zlib --download-hdf5=1 --download-metis 
> --download-superlu_dist --with-make-np=16 --download-parmetis 
> --download-triangle 
> --with-blaslapack-lib="-L/autofs/nccs-svm1_sw/summit/.swci/1-compute/opt/spack/20180914/linux-rhel7-ppc64le/gcc-6.4.0/netlib-lapack-3.8.0-wcabdyqhdi5rooxbkqa6x5d7hxyxwdkm/lib64
>  -lblas -llapack" --with-cc=mpicc --with-shared-libraries=1 --with-x=0 
> --with-64-bit-indices=0 --with-debugging=0 
> PETSC_ARCH=arch-summit-opt-gnu-cuda-omp-2db --with-openmp=1 
> --with-threadsaftey=1 --with-log=1
> [0]PETSC ERROR: #1 MatGetFactor() line 4490 in 
> /autofs/nccs-svm1_home1/adams/petsc/src/mat/interface/matrix.c
> [0]PETSC ERROR: #2 PCSetUp_LU() line 88 in 
> /autofs/nccs-svm1_home1/adams/petsc/src/ksp/pc/impls/factor/lu/lu.c
> [0]PETSC ERROR: #3 PCSetUp() line 894 in 
> /autofs/nccs-svm1_home1/adams/petsc/src/ksp/pc/interface/precon.c
> [0]PETSC ERROR: #4 KSPSetUp() line 376 in 
> /autofs/nccs-svm1_home1/adams/petsc/src/ksp/ksp/interface/itfunc.c
> [0]PETSC ERROR: #5 KSPSolve_Private() line 633 in 
> /autofs/nccs-svm1_home1/adams/petsc/src/ksp/ksp/interface/itfunc.c
> [0]PETSC ERROR: #6 KSPSolve() line 853 in 
> /autofs/nccs-svm1_home1/adams/petsc/src/ksp/ksp/interface/itfunc.c
> [0]PETSC ERROR: #7 SNESSolve_NEWTONLS() line 225 in 
> /autofs/nccs-svm1_home1/adams/petsc/src/snes/impls/ls/ls.c
> [0]PETSC ERROR: #8 SNESSolve() line 4520 in 
> /autofs/nccs-svm1_home1/adams/petsc/src/snes/interface/snes.c
> [0]PETSC ERROR: #9 TSStep_ARKIMEX() line 811 in 
> /autofs/nccs-svm1_home1/adams/petsc/src/ts/impls/arkimex/arkimex.c
> [0]PETSC ERROR: #10 TSStep() line 3721 in 
> /autofs/nccs-svm1_home1/adams/petsc/src/ts/interface/ts.c
> [0]PETSC ERROR: #11 TSSolve() line 4127 in 
> /autofs/nccs-svm1_home1/adams/petsc/src/ts/interface/ts.c
> [0]PETSC ERROR: #12 main() line 955 in ex11.c
>  
>> 
>> Satish
>> 
>> 
>> > 
>> > 
>> > >
>> > > SuperLU:
>> > >   Version:  5.2.1
>> > >   Includes: -I/ccs/home/adams/petsc/arch-summit-opt-gnu-cuda-omp/include
>> > >   Library:  
>> > > -Wl,-rpath,/ccs/home/adams/petsc/arch-summit-opt-gnu-cuda-omp/lib
>> > > -L/ccs/home/adams/petsc/arch-summit-opt-gnu-cuda-omp/lib -lsuperlu
>> > >
>> > > which is serial superlu, not superlu_dist.   These are 2 different codes.
>> > >
>> > > Sherry
>> > >
>> > > On Sat, Apr 18, 2020 at 4:54 PM Mark Adams  wrote:
>> > >
>> > >>
>> > >>
>> > >> On Sat, Apr 18, 2020 at 3:05 PM Xiaoye S. Li  wrote:
>> > >>
>> > >>> Mark,
>> > >>>
>> > >>> It seems you are talking about serial superlu?   There is no GPU 
>> > >>> support
>> > >>> in it.  Only superlu_dist has GPU.
>> > >>>
>> > >>
>> > >> I am using superlu_dist on one processor. Should that work?
>> > >>
>> > >>
>> > >>>
>> > >>> But I don't know why there is a crash.
>> > >>>
>> > >>> Sherry
>> > >>>
>> > >>> On Sat, Apr 18, 2020 at 11:44 AM Mark Adams  wrote:
>> > >>>
>> >  Sherry, I did rebase with master this week:
>> > 
>> >  SuperLU:
>> >    Version:  5.2.1
>> >    Includes: 
>> >  -I/ccs/home/adams/petsc/arch-summit-opt-gnu-cuda-omp/include
>> >    Library:
>> >   -Wl,-rpath,/ccs/home/adams/petsc/arch-summit-opt-gnu-cuda-omp/lib
>> >  -L/ccs/home/adams/petsc/arch-summit-opt-gnu-cuda-omp/lib -lsuperlu
>> > 
>> >  I see the same thing with a debug build.
>> > 
>> >  If anyone is interested in looking at this, I was also able to see 
>> >  that
>> >  plex/ex10 in my branch, which is a very simple test , also does not 
>> >  crash

Re: [petsc-users] AIJ vs BAIJ when using ILU factorization

2020-04-17 Thread Fande Kong
Thanks, Hong,

I will try the code with bs=1, and report back to you.

Fande,

On Tue, Mar 31, 2020 at 9:51 PM Zhang, Hong  wrote:

> Fande,
> Checking aij.result:
> Mat Object: () 1 MPI processes
>   type: seqaij
>   rows=25816, cols=25816, bs=4
>   total: nonzeros=1297664, allocated nonzeros=1297664
>   total number of mallocs used during MatSetValues calls=0
> using I-node routines: found 6454 nodes, limit used is 5
>
> i.e., it uses bs=4 with I-node. The implementation of MatSolve() is
> similar to baij with bs=4. What happens if you try aij with
> '-matload_block_size 1 -mat_no_inode true'?
> Hong
>
> ------
> *From:* petsc-users  on behalf of Fande
> Kong 
> *Sent:* Monday, March 30, 2020 12:25 PM
> *To:* PETSc users list 
> *Subject:* [petsc-users] AIJ vs BAIJ when using ILU factorization
>
> Hi All,
>
> There is a system of equations arising from the discretization of the 3D
> incompressible Navier-Stokes equations using a finite element method. 4
> unknowns are placed on each mesh point, and then there is a 4x4 saddle
> point block on each mesh vertex.  I was planning to solve the linear
> equations using an incomplete LU factorization (which will eventually be
> used as a subdomain solver for ASM).
>
> Right now, I am trying to study the ILU performance using AIJ and BAIJ,
> respectively. From my understanding, BAIJ should give me better results
> since it inverts the 4x4 blocks exactly, while AIJ does not. However, I
> found that both BAIJ and AIJ gave me identical results in terms of the
> number of iterations.  Was that just a coincidence?  Or, in theory, are
> they just identical?  I understand the runtimes may be different because
> BAIJ has better data locality.
>
>
> Please see the attached files for the results and solver configuration.
>
>
> Thanks,
>
> Fande,
>


Re: [petsc-users] How to set an initial guess for TS

2020-04-17 Thread Fande Kong
Thanks Jed,

I will try and let you know,

Thanks again!

Fande,
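A minimal sketch of what Jed suggests below (pull the SNES out of the TS and
register an initial-guess callback); MyPicardGuess, AppCtx, and user are
hypothetical names, and whether TS ends up overriding this guess is exactly
the open question:

typedef struct { Vec picard_guess; } AppCtx;   /* hypothetical context carrying \bar{U}_n */

static PetscErrorCode MyPicardGuess(SNES snes, Vec x, void *ctx)
{
  AppCtx        *user = (AppCtx*)ctx;
  PetscErrorCode ierr;

  PetscFunctionBeginUser;
  ierr = VecCopy(user->picard_guess,x);CHKERRQ(ierr);  /* use \bar{U}_n instead of U_{n-1} */
  PetscFunctionReturn(0);
}

/* ... before each TSSolve()/TSStep(): */
SNES snes;
ierr = TSGetSNES(ts,&snes);CHKERRQ(ierr);
ierr = SNESSetComputeInitialGuess(snes,MyPicardGuess,&user);CHKERRQ(ierr);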

On Fri, Apr 3, 2020 at 4:29 PM Jed Brown  wrote:

> Oh, you just want an initial guess for SNES?  Does it work to pull out the
> SNES and SNESSetComputeInitialGuess?
>
> Fande Kong  writes:
>
> > No. I am working on a transient loosely coupled multiphysics simulation.
> > Assume there are two physics problems: problem A and problem B. During
> each
> > time step, there is a Picard iteration between problem A and problem B.
> > During each Picard step, you solve problem A (or B) with the solution
> > (U_{n-1}) of the previous time step as the initial condition. In the
> Picard
> > solve stage, I know the solution (\bar{U}_{n}) of the current time step
> but
> > from the previous Picard iteration. Using \bar{U}_{n} instead of U_{n-1}
> as
> > the initial guess for SNES gives better convergence for me.
> >
> > Thanks,
> >
> > Fande,
> >
> >
> > On Fri, Apr 3, 2020 at 1:10 PM Jed Brown  wrote:
> >
> >> This sounds like you're talking about a starting procedure for a DAE (or
> >> near-singular ODE)?
> >>
> >> Fande Kong  writes:
> >>
> >> > Hi All,
> >> >
> >> > TSSetSolution will set an initial condition for the current TSSolve().
> >> What
> >> > should I do if I want to set an initial guess for the current solution
> >> that
> >> > is different from the initial condition?  The initial guess is
> supposed
> >> to
> >> > be really close to the current solution, and then will accelerate my
> >> solver.
> >> >
> >> > In other words, TSSetSolution will set "U_{n-1}", and now we call
> TSSolve
> >> > to figure out "U_{n}". If I know something about "U_{n}", and I want
> to
> >> set
> >> > "\bar{U}_{n}" as the initial guess of "U_{n}" when computing "U_{n}".
> >> >
> >> >
> >> > Thanks,
> >> >
> >> > Fande,
> >>
>


Re: [petsc-users] How to set an initial guess for TS

2020-04-03 Thread Fande Kong
No. I am working on a transient loosely coupled multiphysics simulation.
Assume there are two physics problems: problem A and problem B. During each
time step, there is a Picard iteration between problem A and problem B.
During each Picard step, you solve problem A (or B) with the solution
(U_{n-1}) of the previous time step as the initial condition. In the Picard
solve stage, I know the solution (\bar{U}_{n}) of the current time step but
from the previous Picard iteration. Using \bar{U}_{n} instead of U_{n-1} as
the initial guess for SNES gives better convergence for me.

Thanks,

Fande,


On Fri, Apr 3, 2020 at 1:10 PM Jed Brown  wrote:

> This sounds like you're talking about a starting procedure for a DAE (or
> near-singular ODE)?
>
> Fande Kong  writes:
>
> > Hi All,
> >
> > TSSetSolution will set an initial condition for the current TSSolve().
> What
> > should I do if I want to set an initial guess for the current solution
> that
> > is different from the initial condition?  The initial guess is supposed
> to
> > be really close to the current solution, and then will accelerate my
> solver.
> >
> > In other words, TSSetSolution will set "U_{n-1}", and now we call TSSolve
> > to figure out "U_{n}". If I know something about "U_{n}", and I want to
> set
> > "\bar{U}_{n}" as the initial guess of "U_{n}" when computing "U_{n}".
> >
> >
> > Thanks,
> >
> > Fande,
>


[petsc-users] How to set an initial guess for TS

2020-04-03 Thread Fande Kong
Hi All,

TSSetSolution will set an initial condition for the current TSSolve(). What
should I do if I want to set an initial guess for the current solution that
is different from the initial condition?  The initial guess is supposed to
be really close to the current solution, and should therefore accelerate my
solver.

In other words, TSSetSolution sets "U_{n-1}", and we then call TSSolve
to figure out "U_{n}". If I know something about "U_{n}", I would like to set
"\bar{U}_{n}" as the initial guess for "U_{n}" when computing it.


Thanks,

Fande,


[petsc-users] AIJ vs BAIJ when using ILU factorization

2020-03-30 Thread Fande Kong
Hi All,

There is a system of equations arising from the discretization of the 3D
incompressible Navier-Stokes equations using a finite element method. 4
unknowns are placed on each mesh point, and then there is a 4x4 saddle
point block on each mesh vertex.  I was thinking of solving the linear
equations using an incomplete LU factorization (that will eventually be
used as a subdomain solver for ASM).

Right now, I am trying to study the ILU performance using AIJ and BAIJ,
respectively. From my understanding, BAIJ should give me better results
since it inverts the 4x4 blocks exactly, while AIJ does not. However, I
found that both BAIJ and AIJ gave me identical results in terms of the
number of iterations.  Was that just a coincidence?  Or are they, in theory,
just identical?  I understand the runtimes may be different because BAIJ
has better data locality.


Please see the attached files for the results and solver configuration.
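For reference, a minimal sketch of the kind of comparison described above (a
reconstruction under assumptions, not the attached input: the file name
"system.bin" is made up; the format is switched at run time with -mat_type aij
or -mat_type baij plus -matload_block_size 4, and the subdomain solver with
-sub_pc_type ilu, which is also the ASM default):

#include <petscksp.h>

int main(int argc,char **argv)
{
  Mat            A;
  Vec            b,x;
  KSP            ksp;
  PC             pc;
  PetscViewer    fd;
  PetscInt       its;
  PetscErrorCode ierr;

  ierr = PetscInitialize(&argc,&argv,NULL,NULL);if (ierr) return ierr;
  /* load the same binary system either as AIJ or as BAIJ (-mat_type ...) */
  ierr = PetscViewerBinaryOpen(PETSC_COMM_WORLD,"system.bin",FILE_MODE_READ,&fd);CHKERRQ(ierr);
  ierr = MatCreate(PETSC_COMM_WORLD,&A);CHKERRQ(ierr);
  ierr = MatSetFromOptions(A);CHKERRQ(ierr);
  ierr = MatLoad(A,fd);CHKERRQ(ierr);
  ierr = MatCreateVecs(A,&x,&b);CHKERRQ(ierr);
  ierr = VecLoad(b,fd);CHKERRQ(ierr);
  ierr = PetscViewerDestroy(&fd);CHKERRQ(ierr);

  /* ILU is applied as the subdomain solver of ASM, as described above */
  ierr = KSPCreate(PETSC_COMM_WORLD,&ksp);CHKERRQ(ierr);
  ierr = KSPSetOperators(ksp,A,A);CHKERRQ(ierr);
  ierr = KSPGetPC(ksp,&pc);CHKERRQ(ierr);
  ierr = PCSetType(pc,PCASM);CHKERRQ(ierr);
  ierr = KSPSetFromOptions(ksp);CHKERRQ(ierr);
  ierr = KSPSolve(ksp,b,x);CHKERRQ(ierr);
  ierr = KSPGetIterationNumber(ksp,&its);CHKERRQ(ierr);
  ierr = PetscPrintf(PETSC_COMM_WORLD,"iterations: %D\n",its);CHKERRQ(ierr);

  ierr = KSPDestroy(&ksp);CHKERRQ(ierr);
  ierr = MatDestroy(&A);CHKERRQ(ierr);
  ierr = VecDestroy(&x);CHKERRQ(ierr);
  ierr = VecDestroy(&b);CHKERRQ(ierr);
  ierr = PetscFinalize();
  return ierr;
}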


Thanks,

Fande,


aij.result
Description: Binary data


baij.result
Description: Binary data


Re: [petsc-users] Poor speed up for KSP example 45

2020-03-25 Thread Fande Kong
In case someone wants to learn more about the hierarchical partitioning
algorithm, here is a reference:

https://arxiv.org/pdf/1809.02666.pdf

Thanks 

Fande 


> On Mar 25, 2020, at 5:18 PM, Mark Adams  wrote:
> 
> 
> 
> 
>> On Wed, Mar 25, 2020 at 6:40 PM Fande Kong  wrote:
>>> 
>>> 
>>>> On Wed, Mar 25, 2020 at 12:18 PM Mark Adams  wrote:
>>>> Also, a better test is to see where streams pretty much saturates, then run 
>>>> that many processors per node and do the same test by increasing the 
>>>> nodes. This will tell you how well your network communication is doing.
>>>> 
>>>> But this result has a lot of stuff in "network communication" that can be 
>>>> further evaluated. The worst thing about this, I would think, is that the 
>>>> partitioning is blind to the memory hierarchy of inter and intra node 
>>>> communication.
>>> 
>>> Hierarchical partitioning was designed for this purpose. 
>>> https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/MatOrderings/MATPARTITIONINGHIERARCH.html#MATPARTITIONINGHIERARCH
>>> 
>> 
>> That's fantastic!
>>  
>> Fande,
>>  
>>> The next thing to do is run with an initial grid that puts one cell per 
>>> node and then do uniform refinement, until you have one cell per process 
>>> (eg, one refinement step using 8 processes per node), partition to get one 
>>> cell per process, then do uniform refinement to get a reasonable sized 
>>> local problem. Alas, this is not easy to do, but it is doable.
>>> 
>>>> On Wed, Mar 25, 2020 at 2:04 PM Mark Adams  wrote:
>>>> I would guess that you are saturating the memory bandwidth. After you make 
>>>> PETSc (make all) it will suggest that you test it (make test) and suggest 
>>>> that you run streams (make streams).
>>>> 
>>>> I see Matt answered but let me add that when you make streams you will 
>>>> see the memory rate for 1,2,3, ... NP processes. If your machine is 
>>>> decent you should see very good speed up at the beginning and then it will 
>>>> start to saturate. You are seeing about 50% of perfect speedup at 16 
>>>> process. I would expect that you will see something similar with streams. 
>>>> Without knowing your machine, your results look typical.
>>>> 
>>>>> On Wed, Mar 25, 2020 at 1:05 PM Amin Sadeghi  
>>>>> wrote:
>>>>> Hi,
>>>>> 
>>>>> I ran KSP example 45 on a single node with 32 cores and 125GB memory 
>>>>> using 1, 16 and 32 MPI processes. Here's a comparison of the time spent 
>>>>> during KSP.solve:
>>>>> 
>>>>> - 1 MPI process: ~98 sec, speedup: 1X
>>>>> - 16 MPI processes: ~12 sec, speedup: ~8X
>>>>> - 32 MPI processes: ~11 sec, speedup: ~9X
>>>>> 
>>>>> Since the problem size is large enough (8M unknowns), I expected a 
>>>>> speedup much closer to 32X, rather than 9X. Is this expected? If yes, how 
>>>>> can it be improved?
>>>>> 
>>>>> I've attached three log files for more details. 
>>>>> 
>>>>> Sincerely,
>>>>> Amin


Re: [petsc-users] Poor speed up for KSP example 45

2020-03-25 Thread Fande Kong
On Wed, Mar 25, 2020 at 12:18 PM Mark Adams  wrote:

> Also, a better test is to see where streams pretty much saturates, then run
> that many processors per node and do the same test by increasing the nodes.
> This will tell you how well your network communication is doing.
>
> But this result has a lot of stuff in "network communication" that can be
> further evaluated. The worst thing about this, I would think, is that the
> partitioning is blind to the memory hierarchy of inter and intra node
> communication.
>

Hierarchical partitioning was designed for this purpose.
https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/MatOrderings/MATPARTITIONINGHIERARCH.html#MATPARTITIONINGHIERARCH
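For anyone who wants to try it, a rough sketch of driving the hierarchical
partitioner directly (assuming an adjacency matrix Adj has already been
assembled; the two part counts are placeholders, and the setter names should
be checked against the manual page above):

MatPartitioning part;
IS              is;
ierr = MatPartitioningCreate(PETSC_COMM_WORLD,&part);CHKERRQ(ierr);
ierr = MatPartitioningSetAdjacency(part,Adj);CHKERRQ(ierr);
ierr = MatPartitioningSetType(part,MATPARTITIONINGHIERARCH);CHKERRQ(ierr);
ierr = MatPartitioningHierarchicalSetNcoarseparts(part,4);CHKERRQ(ierr);  /* e.g. number of nodes */
ierr = MatPartitioningHierarchicalSetNfineparts(part,32);CHKERRQ(ierr);   /* e.g. cores per node  */
ierr = MatPartitioningSetFromOptions(part);CHKERRQ(ierr);
ierr = MatPartitioningApply(part,&is);CHKERRQ(ierr);
/* ... migrate the mesh/matrix according to is ... */
ierr = ISDestroy(&is);CHKERRQ(ierr);
ierr = MatPartitioningDestroy(&part);CHKERRQ(ierr);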

Fande,


> The next thing to do is run with an initial grid that puts one cell per
> node and then do uniform refinement, until you have one cell per process
> (eg, one refinement step using 8 processes per node), partition to get one
> cell per process, then do uniform refinement to get a reasonable sized
> local problem. Alas, this is not easy to do, but it is doable.
>
> On Wed, Mar 25, 2020 at 2:04 PM Mark Adams  wrote:
>
>> I would guess that you are saturating the memory bandwidth. After
>> you make PETSc (make all) it will suggest that you test it (make test) and
>> suggest that you run streams (make streams).
>>
>> I see Matt answered but let me add that when you make streams you will
>> see the memory rate for 1,2,3, ... NP processes. If your machine is decent
>> you should see very good speed up at the beginning and then it will start
>> to saturate. You are seeing about 50% of perfect speedup at 16 process. I
>> would expect that you will see something similar with streams. Without
>> knowing your machine, your results look typical.
>>
>> On Wed, Mar 25, 2020 at 1:05 PM Amin Sadeghi 
>> wrote:
>>
>>> Hi,
>>>
>>> I ran KSP example 45 on a single node with 32 cores and 125GB memory
>>> using 1, 16 and 32 MPI processes. Here's a comparison of the time spent
>>> during KSP.solve:
>>>
>>> - 1 MPI process: ~98 sec, speedup: 1X
>>> - 16 MPI processes: ~12 sec, speedup: ~8X
>>> - 32 MPI processes: ~11 sec, speedup: ~9X
>>>
>>> Since the problem size is large enough (8M unknowns), I expected a
>>> speedup much closer to 32X, rather than 9X. Is this expected? If yes, how
>>> can it be improved?
>>>
>>> I've attached three log files for more details.
>>>
>>> Sincerely,
>>> Amin
>>>
>>


Re: [petsc-users] Rebuilding libmesh

2020-03-18 Thread Fande Kong
Hi Lin,

Do you have a Homebrew-installed MPI?

"
configure:6076: mpif90 -v >&5
mpifort for MPICH version 3.3
Reading specs from
/home/lin/.linuxbrew/Cellar/gcc/5.5.0_7/bin/../lib/gcc/x86_64-unknown-linux-gnu/5.5.0/specs
"

The MOOSE environment package should carry everything you need: compiler,
MPI, and PETSc.

You could uninstall your Homebrew MPI and retry.

Thanks,

Fande,

On Wed, Mar 18, 2020 at 5:57 PM Jed Brown  wrote:

> Alexander Lindsay  writes:
>
> > Does anyone have a suggestion for this compilation error from
> petscconf.h?
> > Sorry this is with a somewhat old PETSc version:
> >
> > configure:34535: checking whether we can compile a trivial PETSc program
> > configure:34564: mpicxx -c  -std=gnu++11
>
> What do you get with `mpicxx --version`?
>
> This is usually a result of configuring PETSc with a different compiler
> version than you use to run.
>
> > -I/opt/moose/petsc-3.11.4/mpich-3.3_gcc-9.2.0-opt/include
> > -I/opt/moose/petsc-3.11.4/mpich-3.3_gcc-9.2.0-opt//include
> > -I/opt/moose/petsc-3.11.4/mpich-3.3_gcc-9.2.0-opt/include
> > -I/opt/moose/petsc-3.11.4/mpich-3.3_gcc-9.2.0-opt/include  conftest.cpp
> >&5
> > In file included from
> > /opt/moose/petsc-3.11.4/mpich-3.3_gcc-9.2.0-opt/include/petscsys.h:14:0,
> >  from
> > /opt/moose/petsc-3.11.4/mpich-3.3_gcc-9.2.0-opt/include/petscbag.h:4,
> >  from
> > /opt/moose/petsc-3.11.4/mpich-3.3_gcc-9.2.0-opt/include/petsc.h:5,
> >  from conftest.cpp:144:
> >
> /opt/moose/petsc-3.11.4/mpich-3.3_gcc-9.2.0-opt/include/petscconf.h:85:36:
> > error: expected '}' before '__attribute'
> >  #define PETSC_DEPRECATED_ENUM(why) __attribute((deprecated))
> > ^
> >
> /opt/moose/petsc-3.11.4/mpich-3.3_gcc-9.2.0-opt/include/petscksp.h:430:76:
> > note: in expansion of macro 'PETSC_DEPRECATED_ENUM'
> >  #define KSP_DIVERGED_PCSETUP_FAILED_DEPRECATED
> KSP_DIVERGED_PCSETUP_FAILED
> > PETSC_DEPRECATED_ENUM("Use KSP_DIVERGED_PC_FAILED (since v3.11)")
> >
> > ^
> >
> /opt/moose/petsc-3.11.4/mpich-3.3_gcc-9.2.0-opt/include/petscksp.h:452:15:
> > note: in expansion of macro 'KSP_DIVERGED_PCSETUP_FAILED_DEPRECATED'
> >KSP_DIVERGED_PCSETUP_FAILED_DEPRECATED  = -11,
> >
> >
> > On Wed, Mar 18, 2020 at 2:55 PM Lin  wrote:
> >
> >> Hi, all,
> >>
> >>  I met a problem with
> >>
> >> error: *** PETSc was not found, but --enable-petsc-required was
> specified.
> >>
> >> when I reinstalled MOOSE. However, I had been using MOOSE with no issues
> >> previously. Does someone know how to solve it? My system is Ubuntu
> 18.04.
> >>
> >> The error is listed as following:
> >>
> >> Found valid MPI installation...
> >>
> >> note: using /opt/moose/mpich-3.3/gcc-9.2.0/include/mpi.h
> >>
> >> checking mpi.h usability... yes
> >>
> >> checking mpi.h presence... yes
> >>
> >> checking for mpi.h... yes
> >>
> >> checking
> /opt/moose/petsc-3.11.4/mpich-3.3_gcc-9.2.0-opt//include/petscversion.h
> >> usability... yes
> >>
> >> checking
> /opt/moose/petsc-3.11.4/mpich-3.3_gcc-9.2.0-opt//include/petscversion.h
> >> presence... yes
> >>
> >> checking for
> /opt/moose/petsc-3.11.4/mpich-3.3_gcc-9.2.0-opt//include/petscversion.h...
> >> yes
> >>
> >> <<< Found PETSc 3.11.4 installation in /opt/moose/petsc-3.11.4/mpich-3.3
> >> _gcc-9.2.0-opt ... >>>
> >>
> >> checking whether we can compile a trivial PETSc program... no
> >>
> >> checking for TAO support via PETSc... no
> >>
> >> configure: error: *** PETSc was not found, but --enable-petsc-required
> >> was specified.
> >> make: *** No targets specified and no makefile found.  Stop.
> >>
> >>
> >>
> >> Besides, I attached my libMesh configure log file in the email.
> >>
> >> Regards,
> >> Lin
> >>
> >> --
> >> You received this message because you are subscribed to the Google
> Groups
> >> "moose-users" group.
> >> To unsubscribe from this group and stop receiving emails from it, send
> an
> >> email to moose-users+unsubscr...@googlegroups.com.
> >> To view this discussion on the web visit
> >>
> https://groups.google.com/d/msgid/moose-users/db12322c-eae6-4ed4-b54f-3ab5e118f466%40googlegroups.com
> >> <
> https://groups.google.com/d/msgid/moose-users/db12322c-eae6-4ed4-b54f-3ab5e118f466%40googlegroups.com?utm_medium=email_source=footer
> >
> >> .
> >>
> >
> > --
> > You received this message because you are subscribed to the Google
> Groups "moose-users" group.
> > To unsubscribe from this group and stop receiving emails from it, send
> an email to moose-users+unsubscr...@googlegroups.com.
> > To view this discussion on the web visit
> https://groups.google.com/d/msgid/moose-users/CANFcJrE%2BURQoK0UiqBEsB9yZ2Qbbj24W_S_n8qYzxOBtD41Yzw%40mail.gmail.com
> .
>
> --
> You received this message because you are subscribed to the Google Groups
> "moose-users" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to moose-users+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> 

Re: [petsc-users] --download-fblaslapack libraries cannot be used

2020-03-18 Thread Fande Kong
A PR is here: https://gitlab.com/petsc/petsc/-/merge_requests/2612

On Wed, Mar 18, 2020 at 3:35 PM Fande Kong  wrote:

> Thanks, Satish,
>
> I kept investigating this issue. Now, I have more insights. The
> fundamental reason is that the Conda compilers (installed by: conda install -c
> conda-forge compilers) ship a bunch of system libs in */sysroot/lib and
> */sysroot/usr/lib. Most of them are related to glibc. These libs may or may
> not be compatible with the OS you are using.
>
> PETSc finds these libs, thinks they are just regular user libs, and then
> hard-codes them with "-rpath".
>
> If I make some changes to ignore sysroot/lib*, then everything runs
> smoothly (we do not need to install glibc because the OS already has the
> right one).
>
>  git diff
> diff --git a/config/BuildSystem/config/compilers.py
> b/config/BuildSystem/config/compilers.py
> index 5367383141..6b26594b4e 100644
> --- a/config/BuildSystem/config/compilers.py
> +++ b/config/BuildSystem/config/compilers.py
> @@ -1132,6 +1132,7 @@ Otherwise you need a different combination of C,
> C++, and Fortran compilers")
>  if m:
>arg = '-L'+os.path.abspath(arg[2:])
>if arg in ['-L/usr/lib','-L/lib','-L/usr/lib64','-L/lib64']:
> continue
> +  if 'sysroot/usr/lib' in arg or 'sysroot/lib' in arg: continue
>if not arg in lflags:
>  lflags.append(arg)
>  self.logPrint('Found library directory: '+arg, 4, 'compilers')
>
>
> Should PETSc treat these libs as system-level libs?
>
> I have a branch here: Fande-Kong/skip_sysroot_libs_maint
>
> Any suggestions are appreciated.
>
> Thanks,
>
> Fande,
>
>
> On Tue, Mar 17, 2020 at 2:17 PM Satish Balay  wrote:
>
>> Thanks for the update.
>>
>> Hopefully Matt can check on the issue with missing stuff in configure.log.
>>
>> The MR is at https://gitlab.com/petsc/petsc/-/merge_requests/2606
>>
>> Satish
>>
>>
>> On Tue, 17 Mar 2020, Fande Kong wrote:
>>
>> > On Tue, Mar 17, 2020 at 9:24 AM Satish Balay  wrote:
>> >
>> > > So what was the initial problem? Did conda install gcc without glibc?
>> Or
>> > > was it using the wrong glibc?
>> > >
>> >
>> > Looks like GCC installed by conda uses an old version of glibc (2.12).
>> >
>> >
>> > > Because the compiler appeared partly functional [well the build worked
>> > > with just LIBS="-lmpifort -lgfortran"]
>> > >
>> > > And after the correct glibc was installed - did current maint still
>> fail
>> > > to build?
>> > >
>> >
>> > Still failed because PETSc claimed that there were no needed Fortran
>> > libraries when using mpicc as the linker. But in fact, we need these
>> > Fortran libraries when linking blaslapack and mumps.
>> >
>> >
>> > >
>> > > Can you send configure.log for this?
>> > >
>> > > And its not clear to me why balay/fix-checkFortranLibraries/maint
>> broke
>> > > before this fix. [for one configure.log was incomplete]
>> > >
>> >
>> > I am not 100% sure, but I think the compiled and linked executable can
>> not
>> > run because of "glibc_2.14' not found". The version of glibc was too
>> low.
>> >
>> >
>> > So current solution for me is that: your branch + a new version of glibc
>> > (2.18).
>> >
>> > Thanks,
>> >
>> > Fande,
>> >
>> >
>> >
>> > >
>> > > Satish
>> > >
>> > > On Tue, 17 Mar 2020, Fande Kong wrote:
>> > >
>> > > > Hi Satish,
>> > > >
>> > > > Could you merge your branch, balay/fix-checkFortranLibraries/maint,
>> into
>> > > > maint?
>> > > >
>> > > > I added glibc to my conda environment (conda install -c
>> dan_blanchard
>> > > > glibc), and your branch ran well.
>> > > >
>> > > > If you are interested, I attached the successful log file here.
>> > > >
>> > > > Thanks,
>> > > >
>> > > > Fande
>> > > >
>> > > > On Sat, Mar 14, 2020 at 5:01 PM Fande Kong 
>> wrote:
>> > > >
>> > > > > Without touching the configuration file, the
>> > > > > option: --download-hypre-configure-arguments='LIBS="-lmpifort
>> > > -lgfortran"',
>

Re: [petsc-users] --download-fblaslapack libraries cannot be used

2020-03-18 Thread Fande Kong
Thanks, Satish,

I kept investigating this issue. Now, I have more insights. The
fundamental reason is that the Conda compilers (installed by: conda install -c
conda-forge compilers) ship a bunch of system libs in */sysroot/lib and
*/sysroot/usr/lib. Most of them are related to glibc. These libs may or may
not be compatible with the OS you are using.

PETSc finds these libs, thinks they are just regular user libs, and then
hard-codes them with "-rpath".

If I make some changes to ignore sysroot/lib*, then everything runs
smoothly (we do not need to install glibc because the OS already has the
right one).

 git diff
diff --git a/config/BuildSystem/config/compilers.py
b/config/BuildSystem/config/compilers.py
index 5367383141..6b26594b4e 100644
--- a/config/BuildSystem/config/compilers.py
+++ b/config/BuildSystem/config/compilers.py
@@ -1132,6 +1132,7 @@ Otherwise you need a different combination of C, C++,
and Fortran compilers")
 if m:
   arg = '-L'+os.path.abspath(arg[2:])
   if arg in ['-L/usr/lib','-L/lib','-L/usr/lib64','-L/lib64']:
continue
+  if 'sysroot/usr/lib' in arg or 'sysroot/lib' in arg: continue
   if not arg in lflags:
 lflags.append(arg)
 self.logPrint('Found library directory: '+arg, 4, 'compilers')


Should PETSc treat these libs as system-level libs?

I have a branch here: Fande-Kong/skip_sysroot_libs_maint

Any suggestions are appreciated.

Thanks,

Fande,


On Tue, Mar 17, 2020 at 2:17 PM Satish Balay  wrote:

> Thanks for the update.
>
> Hopefully Matt can check on the issue with missing stuff in configure.log.
>
> The MR is at https://gitlab.com/petsc/petsc/-/merge_requests/2606
>
> Satish
>
>
> On Tue, 17 Mar 2020, Fande Kong wrote:
>
> > On Tue, Mar 17, 2020 at 9:24 AM Satish Balay  wrote:
> >
> > > So what was the initial problem? Did conda install gcc without glibc?
> Or
> > > was it using the wrong glibc?
> > >
> >
> > Looks like GCC installed by conda uses an old version of glibc (2.12).
> >
> >
> > > Because the compiler appeared partly functional [well the build worked
> > > with just LIBS="-lmpifort -lgfortran"]
> > >
> > > And after the correct glibc was installed - did current maint still
> fail
> > > to build?
> > >
> >
> > Still failed because PETSc claimed that there were no needed Fortran
> > libraries when using mpicc as the linker. But in fact, we need these
> > Fortran libraries when linking blaslapack and mumps.
> >
> >
> > >
> > > Can you send configure.log for this?
> > >
> > > And its not clear to me why balay/fix-checkFortranLibraries/maint broke
> > > before this fix. [for one configure.log was incomplete]
> > >
> >
> > I am not 100% sure, but I think the compiled and linked executable can
> not
> > run because of "glibc_2.14' not found". The version of glibc was too
> low.
> >
> >
> > So current solution for me is that: your branch + a new version of glibc
> > (2.18).
> >
> > Thanks,
> >
> > Fande,
> >
> >
> >
> > >
> > > Satish
> > >
> > > On Tue, 17 Mar 2020, Fande Kong wrote:
> > >
> > > > Hi Satish,
> > > >
> > > > Could you merge your branch, balay/fix-checkFortranLibraries/maint,
> into
> > > > maint?
> > > >
> > > > I added glibc to my conda environment (conda install -c dan_blanchard
> > > > glibc), and your branch ran well.
> > > >
> > > > If you are interested, I attached the successful log file here.
> > > >
> > > > Thanks,
> > > >
> > > > Fande
> > > >
> > > > On Sat, Mar 14, 2020 at 5:01 PM Fande Kong 
> wrote:
> > > >
> > > > > Without touching the configuration file, the
> > > > > option: --download-hypre-configure-arguments='LIBS="-lmpifort
> > > -lgfortran"',
> > > > > also works.
> > > > >
> > > > >
> > > > > Thanks, Satish,
> > > > >
> > > > >
> > > > > Fande,
> > > > >
> > > > > On Sat, Mar 14, 2020 at 4:37 PM Fande Kong 
> > > wrote:
> > > > >
> > > > >> OK. I finally got PETSc compiled.
> > > > >>
> > > > >> "-lgfortran" was required by fblaslapack
> > > > >> "-lmpifort" was required by mumps.
> > > > >>
> > > > >> How

Re: [petsc-users] --download-fblaslapack libraries cannot be used

2020-03-14 Thread Fande Kong
Without touching the configuration file, the
option --download-hypre-configure-arguments='LIBS="-lmpifort -lgfortran"'
also works.


Thanks, Satish,


Fande,

On Sat, Mar 14, 2020 at 4:37 PM Fande Kong  wrote:

> OK. I finally got PETSc compiled.
>
> "-lgfortran" was required by fblaslapack
> "-lmpifort" was required by mumps.
>
> However, I had to manually add the same thing for hypre as well:
>
> git diff
> diff --git a/config/BuildSystem/config/packages/hypre.py
> b/config/BuildSystem/config/packages/hypre.py
> index 4d915c312f..f4300230a6 100644
> --- a/config/BuildSystem/config/packages/hypre.py
> +++ b/config/BuildSystem/config/packages/hypre.py
> @@ -66,6 +66,7 @@ class Configure(config.package.GNUPackage):
>  args.append('--with-lapack-lib=" "')
>  args.append('--with-blas=no')
>  args.append('--with-lapack=no')
> +args.append('LIBS="-lmpifort -lgfortran"')
>  if self.openmp.found:
>args.append('--with-openmp')
>self.usesopenmp = 'yes'
>
>
> Why could hypre not pick up the LIBS option automatically?
>
>
> Thanks,
>
> Fande,
>
>
>
>
> On Sat, Mar 14, 2020 at 2:49 PM Satish Balay via petsc-users <
> petsc-users@mcs.anl.gov> wrote:
>
>> Configure Options: --configModules=PETSc.Configure
>> --optionsModule=config.compilerOptions --download-hypre=1
>> --with-debugging=no --with-shared-libraries=1 --download-fblaslapack=1
>> --download-metis=1 --download-ptscotch=1 --download-parmetis=1
>> --download-superlu_dist=1 --download-mumps=1 --download-scalapack=1
>> --download-slepc=git://https://gitlab.com/slepc/slepc.git
>> --download-slepc-commit= 59ff81b --with-mpi=1 --with-cxx-dialect=C++11
>> --with-fortran-bindings=0 --with-sowing=0 CFLAGS=-march=nocona
>> -mtune=haswell -ftree-vectorize -fPIC -fstack-protector-strong -fno-plt -O2
>> -ffunction-sections -pipe -isystem
>> /home/kongf/workhome/rod/miniconda3/include CXXFLAGS= LDFLAGS=-Wl,-O2
>> -Wl,--sort-common -Wl,--as-needed -Wl,-z,relro -Wl,-z,now
>> -Wl,--with-new-dtags=0 -Wl,--gc-sections
>> -Wl,-rpath,/home/kongf/workhome/rod/miniconda3/lib
>> -Wl,-rpath-link,/home/kongf/workhome/rod/miniconda3/lib
>> -L/home/kongf/workhome/rod/miniconda3/lib
>> AR=/home/kongf/workhome/rod/miniconda3/bin/x86_64-conda_cos6-linux-gnu-ar
>> --with-mpi-dir=/home/kongf/workhome/rod/mpich LIBS=-lgfortran -lmpifort
>>
>> You are missing quotes with LIBS option - and likely the libraries in the
>> wrong order.
>>
>> Suggest using:
>>
>> LIBS="-lmpifort -lgfortran"
>> or
>> 'LIBS=-lmpifort -lgfortran'
>>
>> Assuming you are invoking configure from shell.
>>
>> Satish
>>
>> On Sat, 14 Mar 2020, Satish Balay via petsc-users wrote:
>>
>> > to work around - you can try:
>> >
>> > LIBS="-lmpifort -lgfortran"
>> >
>> > Satish
>> >
>> > On Sat, 14 Mar 2020, Satish Balay via petsc-users wrote:
>> >
>> > > Its the same location as before. For some reason configure is not
>> saving the relevant logs.
>> > >
>> > > I don't understand saveLog() restoreLog() stuff. Matt, can you check
>> on this?
>> > >
>> > > Satish
>> > >
>> > > On Sat, 14 Mar 2020, Fande Kong wrote:
>> > >
>> > > > The configuration crashed earlier than before with your changes.
>> > > >
>> > > > Please see the attached log file when using your branch. The
>> trouble lines
>> > > > should be:
>> > > >
>> > > >  "asub=self.mangleFortranFunction("asub")
>> > > > cbody = "extern void "+asub+"(void);\nint main(int argc,char
>> > > > **args)\n{\n  "+asub+"();\n  return 0;\n}\n";
>> > > > "
>> > > >
>> > > > Thanks,
>> > > >
>> > > > Fande,
>> > > >
>> > > > On Thu, Mar 12, 2020 at 7:06 PM Satish Balay 
>> wrote:
>> > > >
>> > > > > I can't figure out what the stack in the attached configure.log.
>> [likely
>> > > > > some stuff isn't getting logged in it]
>> > > > >
>> > > > > Can you retry with branch 'balay/fix-checkFortranLibraries/maint'?
>> > > > >
>> > > > > Satish
>> > > > >
>> > > > > On Thu, 12 Mar 2020, Fande Kong wrote:
>> > > > >

Re: [petsc-users] --download-fblaslapack libraries cannot be used

2020-03-14 Thread Fande Kong
OK. I finally got PETSc compiled.

"-lgfortran" was required by fblaslapack
"-lmpifort" was required by mumps.

However, I had to manually add the same thing for hypre as well:

git diff
diff --git a/config/BuildSystem/config/packages/hypre.py
b/config/BuildSystem/config/packages/hypre.py
index 4d915c312f..f4300230a6 100644
--- a/config/BuildSystem/config/packages/hypre.py
+++ b/config/BuildSystem/config/packages/hypre.py
@@ -66,6 +66,7 @@ class Configure(config.package.GNUPackage):
 args.append('--with-lapack-lib=" "')
 args.append('--with-blas=no')
 args.append('--with-lapack=no')
+args.append('LIBS="-lmpifort -lgfortran"')
 if self.openmp.found:
   args.append('--with-openmp')
   self.usesopenmp = 'yes'


Why could hypre not pick up the LIBS option automatically?


Thanks,

Fande,




On Sat, Mar 14, 2020 at 2:49 PM Satish Balay via petsc-users <
petsc-users@mcs.anl.gov> wrote:

> Configure Options: --configModules=PETSc.Configure
> --optionsModule=config.compilerOptions --download-hypre=1
> --with-debugging=no --with-shared-libraries=1 --download-fblaslapack=1
> --download-metis=1 --download-ptscotch=1 --download-parmetis=1
> --download-superlu_dist=1 --download-mumps=1 --download-scalapack=1
> --download-slepc=git://https://gitlab.com/slepc/slepc.git
> --download-slepc-commit= 59ff81b --with-mpi=1 --with-cxx-dialect=C++11
> --with-fortran-bindings=0 --with-sowing=0 CFLAGS=-march=nocona
> -mtune=haswell -ftree-vectorize -fPIC -fstack-protector-strong -fno-plt -O2
> -ffunction-sections -pipe -isystem
> /home/kongf/workhome/rod/miniconda3/include CXXFLAGS= LDFLAGS=-Wl,-O2
> -Wl,--sort-common -Wl,--as-needed -Wl,-z,relro -Wl,-z,now
> -Wl,--with-new-dtags=0 -Wl,--gc-sections
> -Wl,-rpath,/home/kongf/workhome/rod/miniconda3/lib
> -Wl,-rpath-link,/home/kongf/workhome/rod/miniconda3/lib
> -L/home/kongf/workhome/rod/miniconda3/lib
> AR=/home/kongf/workhome/rod/miniconda3/bin/x86_64-conda_cos6-linux-gnu-ar
> --with-mpi-dir=/home/kongf/workhome/rod/mpich LIBS=-lgfortran -lmpifort
>
> You are missing quotes with LIBS option - and likely the libraries in the
> wrong order.
>
> Suggest using:
>
> LIBS="-lmpifort -lgfortran"
> or
> 'LIBS=-lmpifort -lgfortran'
>
> Assuming you are invoking configure from shell.
>
> Satish
>
> On Sat, 14 Mar 2020, Satish Balay via petsc-users wrote:
>
> > to work around - you can try:
> >
> > LIBS="-lmpifort -lgfortran"
> >
> > Satish
> >
> > On Sat, 14 Mar 2020, Satish Balay via petsc-users wrote:
> >
> > > Its the same location as before. For some reason configure is not
> saving the relevant logs.
> > >
> > > I don't understand saveLog() restoreLog() stuff. Matt, can you check
> on this?
> > >
> > > Satish
> > >
> > > On Sat, 14 Mar 2020, Fande Kong wrote:
> > >
> > > > The configuration crashed earlier than before with your changes.
> > > >
> > > > Please see the attached log file when using your branch. The trouble
> lines
> > > > should be:
> > > >
> > > >  "asub=self.mangleFortranFunction("asub")
> > > > cbody = "extern void "+asub+"(void);\nint main(int argc,char
> > > > **args)\n{\n  "+asub+"();\n  return 0;\n}\n";
> > > > "
> > > >
> > > > Thanks,
> > > >
> > > > Fande,
> > > >
> > > > On Thu, Mar 12, 2020 at 7:06 PM Satish Balay 
> wrote:
> > > >
> > > > > I can't figure out what the stack in the attached configure.log.
> [likely
> > > > > some stuff isn't getting logged in it]
> > > > >
> > > > > Can you retry with branch 'balay/fix-checkFortranLibraries/maint'?
> > > > >
> > > > > Satish
> > > > >
> > > > > On Thu, 12 Mar 2020, Fande Kong wrote:
> > > > >
> > > > > > Thanks, Satish,
> > > > > >
> > > > > > But still have the problem. Please see the attached log file.
> > > > > >
> > > > > > Thanks,
> > > > > >
> > > > > > Fande.
> > > > > >
> > > > > > On Thu, Mar 12, 2020 at 3:42 PM Satish Balay 
> wrote:
> > > > > >
> > > > > > > Can you retry with the attached patch?
> > > > > > >
> > > > > > > BTW: Its best to use the latest patched version - i.e
> > > > > petsc-3.12.4.tar.gz
> > > >

Re: [petsc-users] --download-fblaslapack libraries cannot be used

2020-03-12 Thread Fande Kong
This did not help.  Made no difference.

Thanks,

Fande,

On Thu, Mar 12, 2020 at 1:50 PM Satish Balay  wrote:

> Does the attached patch make a difference?
>
> Satish
>
> On Thu, 12 Mar 2020, Satish Balay via petsc-users wrote:
>
> > For some reason - the fortran compiler libraries check worked fine
> without -lgfortran.
> >
> > But now - flbaslapack check is failing without it.
> >
> > To work arround - you can use option LIBS=-lgfortran
> >
> > Satish
> >
> > On Thu, 12 Mar 2020, Fande Kong wrote:
> >
> > > Hi All,
> > >
> > > I had an issue when configuring petsc on a linux machine. I have the
> > > following error message:
> > >
> > >  Compiling FBLASLAPACK; this may take several minutes
> > >
> > >
> ===
> > >
> > >TESTING: checkLib from
> > >
> config.packages.BlasLapack(config/BuildSystem/config/packages/BlasLapack.py:120)
> > >
> > >
> ***
> > >  UNABLE to CONFIGURE with GIVEN OPTIONS(see configure.log
> for
> > > details):
> > >
> ---
> > > --download-fblaslapack libraries cannot be used
> > >
> ***
> > >
> > >
> > > The configuration log was attached.
> > >
> > > Thanks,
> > >
> > > Fande,
> > >
> >
>


Re: [petsc-users] Condition Number and GMRES iteration

2020-02-07 Thread Fande Kong
On Fri, Feb 7, 2020 at 11:43 AM Victor Eijkhout 
wrote:

>
>
> On , 2020Feb7, at 12:31, Mark Adams  wrote:
>
> BTW, one of my earliest talks, in grad school before I had any real
> results, was called "condition number does not matter”
>
>
> After you learn that the condition number gives an _upper_bound_ on the
> number of iterations, you learn that if a few eigenvalues are separated
> from a cluster of other eigenvalues, your number of iterations is 1 for
> each separated one, and then a bound based on the remaining cluster.
>
> (Condition number predicts a number of iterations based on Chebychev
> polynomials. Since the CG polynomials are optimal, they are at least as
> good as Chebychev. Hence the number of iterations is at most what you got
> from Chebychev, which is the condition number bound.)
>

I like this explanation for normal matrices. Thanks so much, Victor,


Fande,


>
> Victor.
>
>
>


Re: [petsc-users] Condition Number and GMRES iteration

2020-02-07 Thread Fande Kong
Thanks, Matt,

It is a great paper. According to the paper, here is my understanding: for
normal matrices, the eigenvalues of the matrix together with the
initial residual completely determine the GMRES convergence rate. For
non-normal matrices, eigenvalues are NOT the relevant quantities in
determining the behavior of GMRES.

What quantities should we look at for non-normal matrices? In other words,
how do we know that one matrix is easier to solve than another?  Perhaps these
are still open problems.

Thanks,

Fande,

On Fri, Feb 7, 2020 at 6:51 AM Matthew Knepley  wrote:

> On Thu, Feb 6, 2020 at 7:37 PM Fande Kong  wrote:
>
>> Hi All,
>>
>> MOOSE team, Alex and I are working on some variable scaling techniques to
>> improve the condition number of the matrix of our linear systems. The goal of
>> variable scaling is to make the diagonal of the matrix as close to unity as
>> possible. After scaling (for a certain example), the condition number of the
>> linear system is actually reduced, but the GMRES iteration count does not
>> decrease at all.
>>
>> From my understanding, the condition number gives the worst-case estimate for
>> GMRES convergence. That is, the GMRES iteration count should not increase when
>> the condition number decreases. This could actually explain what we saw:
>> the improved condition number does not necessarily lead to a decrease in
>> GMRES iterations. We are trying to understand this a bit more, and we guess
>> that the number of eigenvalue clusters of the matrix of the linear system
>> may be related to the convergence rate of GMRES.  We plotted the eigenvalues
>> of the scaled and unscaled systems, and the clusters look different from
>> each other, but the GMRES iterations are the same.
>>
>> Does anyone know the right relationship between the condition number
>> and the GMRES iteration count? How does the number of eigenvalue clusters
>> affect GMRES iterations?  How does one count eigenvalue clusters? For example,
>> how many eigenvalue clusters do we have in each of the attached images?
>>
>> If you need more details, please let us know. Alex and I are happy to
>> provide any details you are interested in.
>>
>
> Hi Fande,
>
> This is one of my favorite papers of all time:
>
>   https://epubs.siam.org/doi/abs/10.1137/S0895479894275030
>
> It shows that the spectrum alone tells you nothing at all about GMRES
> convergence. You need other things, like symmetry (almost
> everything is known) or normality (a little bit is known).
>
>   Thanks,
>
>   Matt
>
>
>> Thanks,
>>
>> Fande Kong,
>>
>>
>>
>
> --
> What most experimenters take for granted before they begin their
> experiments is infinitely more interesting than any results to which their
> experiments lead.
> -- Norbert Wiener
>
> https://www.cse.buffalo.edu/~knepley/
> <http://www.cse.buffalo.edu/~knepley/>
>


[petsc-users] Condition Number and GMRES iteration

2020-02-06 Thread Fande Kong
Hi All,

MOOSE team, Alex and I are working on some variable scaling techniques to
improve the condition number of the matrix of our linear systems. The goal of
variable scaling is to make the diagonal of the matrix as close to unity as
possible. After scaling (for a certain example), the condition number of the
linear system is actually reduced, but the GMRES iteration count does not
decrease at all.

From my understanding, the condition number gives the worst-case estimate for
GMRES convergence. That is, the GMRES iteration count should not increase when
the condition number decreases. This could actually explain what we saw:
the improved condition number does not necessarily lead to a decrease in
GMRES iterations. We are trying to understand this a bit more, and we guess
that the number of eigenvalue clusters of the matrix of the linear system may
be related to the convergence rate of GMRES.  We plotted the eigenvalues of
the scaled and unscaled systems, and the clusters look different from each
other, but the GMRES iterations are the same.

Does anyone know the right relationship between the condition number and the
GMRES iteration count? How does the number of eigenvalue clusters affect GMRES
iterations?  How does one count eigenvalue clusters? For example, how many
eigenvalue clusters do we have in each of the attached images?

If you need more details, please let us know. Alex and I are happy to
provide any details you are interested in.
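As one way to quantify this during the solve itself: GMRES can report the
extreme singular values of the preconditioned operator, which gives a
condition-number estimate. A minimal sketch, assuming ksp, b, and x are the
solver and vectors already set up:

PetscReal emax,emin;
ierr = KSPSetComputeSingularValues(ksp,PETSC_TRUE);CHKERRQ(ierr); /* must be set before KSPSolve() */
ierr = KSPSolve(ksp,b,x);CHKERRQ(ierr);
ierr = KSPComputeExtremeSingularValues(ksp,&emax,&emin);CHKERRQ(ierr);
ierr = PetscPrintf(PETSC_COMM_WORLD,"estimated condition number of the preconditioned operator: %g\n",(double)(emax/emin));CHKERRQ(ierr);
/* or, at run time: -ksp_monitor_singular_value */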


Thanks,

Fande Kong,


Re: [petsc-users] Fwd: Running moose/scripts/update_and_rebuild_petsc.sh on HPC

2020-01-30 Thread Fande Kong
Bring conversation to the MOOSE list as well.

Fande,

On Thu, Jan 30, 2020 at 3:26 PM Satish Balay via petsc-users <
petsc-users@mcs.anl.gov> wrote:

> Pushed one more change - move duplicate/similar code into a function.
>
> Satish
>
> On Thu, 30 Jan 2020, Satish Balay via petsc-users wrote:
>
> > Ah - missed that part. I've updated the branch/MR.
> >
> > Thanks!
> > Satish
> >
> > On Thu, 30 Jan 2020, Tomas Mondragon wrote:
> >
> > > Just to be extra safe, that fix should also be applied to the
> > > 'with-executables-search-path' section as well, but your fix did help
> me
> > > get past the checks for lgrind and c2html.
> > >
> > > On Thu, Jan 30, 2020, 3:47 PM Satish Balay  wrote:
> > >
> > > > I pushed a fix to branch balay/fix-check-files-in-path - please give
> it a
> > > > try.
> > > >
> > > > https://gitlab.com/petsc/petsc/-/merge_requests/2490
> > > >
> > > > Satish
> > > >
> > > > On Thu, 30 Jan 2020, Satish Balay via petsc-users wrote:
> > > >
> > > > > The issue is:
> > > > >
> > > > > >>>
> > > > > [Errno 13] Permission denied: '/pbs/SLB'
> > > > > <<<
> > > > >
> > > > > Try removing this from PATH - and rerun configure.
> > > > >
> > > > > This part of configure code should be fixed.. [or protected with
> 'try']
> > > > >
> > > > > Satish
> > > > >
> > > > > On Thu, 30 Jan 2020, Fande Kong wrote:
> > > > >
> > > > > > Hi All,
> > > > > >
> > > > > > It looks like a bug to me.
> > > > > >
> > > > > > PETSc was still trying to detect lgrind even though we set
> "--with-lgrind=0".
> > > > The
> > > > > > configuration log is attached. Is there any way to disable lgrind
> detection?
> > > > > >
> > > > > > Thanks,
> > > > > >
> > > > > > Fande
> > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > > -- Forwarded message -
> > > > > > From: Tomas Mondragon 
> > > > > > Date: Thu, Jan 30, 2020 at 9:54 AM
> > > > > > Subject: Re: Running moose/scripts/update_and_rebuild_petsc.sh
> on HPC
> > > > > > To: moose-users 
> > > > > >
> > > > > >
> > > > > > Configuration log is attached
> > > > > >
> > > > > >
> > > > >
> > > >
> > > >
> > >
> >
>
>


[petsc-users] Fwd: Moose install troubleshooting help

2020-01-02 Thread Fande Kong
Satish,

Do you have any suggestions for this?

Chris,

It would be helpful if you could share the PETSc configuration log file with
us.



Fande,

-- Forwarded message -
From: Chris Thompson 
Date: Tue, Dec 31, 2019 at 9:53 AM
Subject: Moose install troubleshooting help
To: moose-users 


Dear All,
I could use some help or a pointer in the right direction.

I have been following the directions at
https://mooseframework.inl.gov/getting_started/installation/hpc_install_moose.html
All was going well until "make PETSC_DIR=$STACK_SRC/petsc-3.11.4
PETSC_ARCH=linux-opt install".  I tried running "make --debug=v
PETSC_DIR=$STACK_SRC/petsc-3.11.4 PETSC_ARCH=linux-opt install", but that
didn't produce any more helpful information.

This on a CentOS system running 7.6

Here is the error / output I am getting.

rogue /usr/local/neapps/moose/stack_temp/petsc-3.11.4 926$ make --debug=v
PETSC_DIR=/usr/local/neapps/moose/stack_temp/petsc-3.11.4
PETSC_ARCH=linux-opt install
GNU Make 3.82
Built for x86_64-redhat-linux-gnu
Copyright (C) 2010  Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later 
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Reading makefiles...
Reading makefile `makefile'...
Reading makefile `linux-opt/lib/petsc/conf/petscvariables' (search path)
(no ~ expansion)...
Reading makefile
`/usr/local/neapps/moose/stack_temp/petsc-3.11.4/lib/petsc/conf/variables'
(search path) (no ~ expansion)...
Reading makefile
`/usr/local/neapps/moose/stack_temp/petsc-3.11.4/linux-opt/lib/petsc/conf/petscvariables'
(search path) (no ~ expansion)...
Reading makefile
`/usr/local/neapps/moose/stack_temp/petsc-3.11.4/lib/petsc/conf/rules'
(search path) (no ~ expansion)...
Reading makefile
`/usr/local/neapps/moose/stack_temp/petsc-3.11.4/linux-opt/lib/petsc/conf/petscrules'
(search path) (no ~ expansion)...
Reading makefile
`/usr/local/neapps/moose/stack_temp/petsc-3.11.4/lib/petsc/conf/test.common'
(search path) (no ~ expansion)...
Updating goal targets
Considering target file `install'.
File `install' does not exist.
 Finished prerequisites of target file `install'.
Must remake target `install'.
Invoking recipe from makefile:250 to update target `install'.
*** Using PETSC_DIR=/usr/local/neapps/moose/stack_temp/petsc-3.11.4
PETSC_ARCH=linux-opt ***
*** Installing PETSc at prefix location:
/usr/local/neapps/moose/petsc-3.11.4  ***
Traceback (most recent call last):
  File "./config/install.py", line 434, in 
Installer(sys.argv[1:]).run()
  File "./config/install.py", line 428, in run
self.runcopy()
  File "./config/install.py", line 407, in runcopy
self.installIncludes()
  File "./config/install.py", line 305, in installIncludes
self.copies.extend(self.copytree(self.rootIncludeDir,
self.destIncludeDir,exclude = exclude))
  File "./config/install.py", line 246, in copytree
raise shutil.Error(errors)
shutil.Error:
['/usr/local/neapps/moose/stack_temp/petsc-3.11.4/include/petsc',
'/usr/local/neapps/moose/petsc-3.11.4/include/petsc',
 '[\'/usr/local/neapps/moose/stack_temp/petsc-3.11.4/include/petsc/private\',
\'/usr/local/neapps/moose/petsc-3.11.4/include/petsc/private\',
 
\'[\\\'/usr/local/neapps/moose/stack_temp/petsc-3.11.4/include/petsc/private/kernels\\\',
 \\\'/usr/local/neapps/moose/petsc-3.11.4/include/petsc/private/kernels\\\',
 
\\\'[\\\'/usr/local/neapps/moose/stack_temp/petsc-3.11.4/include/petsc/private/kernels\\\',
 
\\\'/usr/local/neapps/moose/petsc-3.11.4/include/petsc/private/kernels\\\',
 "[Errno 1] Operation not permitted:
\\\'/usr/local/neapps/moose/petsc-3.11.4/include/petsc/private/kernels\\\'"]\\\',
 \\\'/usr/local/neapps/moose/stack_temp/petsc-3.11.4/include/petsc/private\\\',
\\\'/usr/local/neapps/moose/petsc-3.11.4/include/petsc/private\\\',
 "[Errno 1] Operation not permitted:
\\\'/usr/local/neapps/moose/petsc-3.11.4/include/petsc/private\\\'"]\',
 \'/usr/local/neapps/moose/stack_temp/petsc-3.11.4/include/petsc/finclude\',
\'/usr/local/neapps/moose/petsc-3.11.4/include/petsc/finclude\',
 
\'[\\\'/usr/local/neapps/moose/stack_temp/petsc-3.11.4/include/petsc/finclude\\\',
\\\'/usr/local/neapps/moose/petsc-3.11.4/include/petsc/finclude\\\',
 "[Errno 1] Operation not permitted:
\\\'/usr/local/neapps/moose/petsc-3.11.4/include/petsc/finclude\\\'"]\',
\'/usr/local/neapps/moose/stack_temp/petsc-3.11.4/include/petsc\',
 \'/usr/local/neapps/moose/petsc-3.11.4/include/petsc\', "[Errno 1]
Operation not permitted:
\'/usr/local/neapps/moose/petsc-3.11.4/include/petsc\'"]',
 '/usr/local/neapps/moose/stack_temp/petsc-3.11.4/include',
'/usr/local/neapps/moose/petsc-3.11.4/include', "[Errno 1] Operation not
permitted: '/usr/local/neapps/moose/petsc-3.11.4/include'"]
make: *** [install] Error 1

I'm not sure how to proceed with this error.

Thank you,
Chris

-- 
You received this message because you are subscribed to the Google Groups

Re: [petsc-users] Running moose/scripts/update_and_rebuild_petsc.sh on HPC

2019-12-19 Thread Fande Kong
Did you try "--with-batch=1"? That suggestion was proposed by Satish earlier
(CCed here).

Fande,

On Wed, Dec 18, 2019 at 12:36 PM Tomas Mondragon <
tom.alex.mondra...@gmail.com> wrote:

> Yes, but now that I have tried this a couple of different ways with
> different --with-mpiexec options, I am beginning to suspect that I need to
> run this as a PBS job on compute nodes rather than a shell script on login
> nodes. Also, running it with --with-mpiexec="mpirun -n 1" and with
> --with-mpiexec="path/to/mpirun" don't get me results any different from
> using --with-mpiexec="mpirun"
>
> I have attached my modified update_and_rebuild_petsc.sh scripts from both
> machines plus their respective configure.log files.
>
> --
> You received this message because you are subscribed to the Google Groups
> "moose-users" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to moose-users+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/moose-users/641b2c64-0e88-47e7-b33a-e2528287095a%40googlegroups.com
> 
> .
>


Re: [petsc-users] Running moose/scripts/update_and_rebuild_petsc.sh on HPC

2019-12-17 Thread Fande Kong
Are you able to run your MPI code using "mpiexec_mpt -n 1 ./yourbinary"?
You need to use --with-mpiexec to specify exactly what command line you
can run, e.g., --with-mpiexec="mpirun -n 1".

I am also CCing the email to PETSc guys who may know the answer to these
questions.

Thanks,

Fande,

On Mon, Dec 16, 2019 at 3:52 PM Tomas Mondragon <
tom.alex.mondra...@gmail.com> wrote:

> I have attached the configure.log file from when I ran
> update_and_rebuild_petsc.sh on both machines. In the log from jim is from
> an attempt to rebuild after adding --with-mpiexec=mpiexec_mpt to the call
> to pets/configure in the update and rebuild script
>
> --
> You received this message because you are subscribed to the Google Groups
> "moose-users" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to moose-users+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/moose-users/edd3a169-5012-4e46-a060-5088a199603c%40googlegroups.com
> 
> .
>


Re: [petsc-users] Edge-cut information for CHACO

2019-12-05 Thread Fande Kong
On Thu, Dec 5, 2019 at 12:34 PM Mark Adams  wrote:

>
>
> On Thu, Dec 5, 2019 at 11:20 AM Eda Oktay  wrote:
>
>> Hello all,
>>
>> I am trying to find edge cut information of ParMETIS and CHACO. When I
>> use ParMETIS,
>> MatPartitioningViewImbalance(part,partitioning)
>> works and it gives also number of cuts.
>>
>> However, when I used CHACO, it only gives imbalance information, not edge
>> cut. I have index sets but I couldn't find how to calculate edge cut.
>>
>
> I've never heard of edge-cuts wrt Chaco. I'm sure it does not collect that
> information but you could look at the code.
>

I am not sure whether Chaco supports this or not, but we do not collect edge
cuts from Chaco. If you want, it is actually very easy to collect all of this data on
the PETSc side.
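For example, a sequential sketch of the kind of post-processing I mean
(assuming the adjacency matrix Adj given to MatPartitioningSetAdjacency and
the resulting index set part both live on a single process, and that Adj is
structurally symmetric):

const PetscInt *ranks,*cols;
PetscInt        nrows,ncols,i,j,cut = 0;
PetscErrorCode  ierr;

ierr = MatGetSize(Adj,&nrows,NULL);CHKERRQ(ierr);
ierr = ISGetIndices(part,&ranks);CHKERRQ(ierr);          /* ranks[i] = target part of vertex i */
for (i=0; i<nrows; i++) {
  ierr = MatGetRow(Adj,i,&ncols,&cols,NULL);CHKERRQ(ierr);
  for (j=0; j<ncols; j++) {
    if (ranks[cols[j]] != ranks[i]) cut++;               /* this edge crosses a part boundary */
  }
  ierr = MatRestoreRow(Adj,i,&ncols,&cols,NULL);CHKERRQ(ierr);
}
ierr = ISRestoreIndices(part,&ranks);CHKERRQ(ierr);
cut /= 2;                                                /* each undirected edge was counted twice */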


>
>
>>
>> Also, does ParMETIS calculate edge cuts according to the values of
>> weights or number of weights?
>>
>
> Good question, I would assume this is an integer, the number of edge cuts,
> and not the sum of the weights. If it prints and integer then that would be
> a hint.
>

The edge-cuts in ParMETIS take the edge weights into account.

For example, suppose the domain is partitioned along an edge whose weight is
10; that edge then contributes 10 to the edge-cut count instead of 1.


Fande,


>
>>
>> Thanks!
>>
>> Eda
>>
>


Re: [petsc-users] Select a preconditioner for SLEPc eigenvalue solver Jacobi-Davidson

2019-11-05 Thread Fande Kong via petsc-users
What if I want to determine the ST type at runtime?

 mpirun -n 1 ./ex3  -eps_type jd -st_ksp_type gmres  -st_pc_type none
-eps_view  -eps_target  0 -eps_monitor  -st_ksp_monitor

ST is indeed STPrecond, but the passed preconditioning matrix is still
ignored.

EPS Object: 1 MPI processes
  type: jd
search subspace is orthogonalized
block size=1
type of the initial subspace: non-Krylov
size of the subspace after restarting: 6
number of vectors after restarting from the previous iteration: 1
threshold for changing the target in the correction equation (fix): 0.01
  problem type: symmetric eigenvalue problem
  selected portion of the spectrum: closest to target: 0. (in magnitude)
  number of eigenvalues (nev): 1
  number of column vectors (ncv): 17
  maximum dimension of projected problem (mpd): 17
  maximum number of iterations: 1700
  tolerance: 1e-08
  convergence test: relative to the eigenvalue
BV Object: 1 MPI processes
  type: svec
  17 columns of global length 100
  vector orthogonalization method: classical Gram-Schmidt
  orthogonalization refinement: if needed (eta: 0.7071)
  block orthogonalization method: GS
  doing matmult as a single matrix-matrix product
DS Object: 1 MPI processes
  type: hep
  solving the problem with: Implicit QR method (_steqr)
ST Object: 1 MPI processes
  type: precond
  shift: 0.
  number of matrices: 1
  KSP Object: (st_) 1 MPI processes
type: gmres
  restart=30, using Classical (unmodified) Gram-Schmidt
Orthogonalization with no iterative refinement
  happy breakdown tolerance 1e-30
maximum iterations=90, initial guess is zero
tolerances:  relative=0.0001, absolute=1e-50, divergence=1.
left preconditioning
using PRECONDITIONED norm type for convergence test
  PC Object: (st_) 1 MPI processes
type: none
linear system matrix = precond matrix:
Mat Object: 1 MPI processes
  type: shell
  rows=100, cols=100
 Solution method: jd


The preconditioning matrix should be a SeqAIJ, not a shell matrix.


Fande,

On Tue, Nov 5, 2019 at 9:07 AM Jose E. Roman  wrote:

> Currently, the function that passes the preconditioner matrix is specific
> of STPRECOND, so you have to add
>   ierr = STSetType(st,STPRECOND);CHKERRQ(ierr);
> before
>   ierr = STPrecondSetMatForPC(st,B);CHKERRQ(ierr);
> otherwise this latter call is ignored.
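So, in the context of the attached example, the relevant calls would become
something like the following sketch (only the ordering changes; the rest of
the example stays the same):

ST st;
ierr = EPSGetST(eps,&st);CHKERRQ(ierr);
ierr = STSetType(st,STPRECOND);CHKERRQ(ierr);        /* must come before STPrecondSetMatForPC() */
ierr = STPrecondSetMatForPC(st,B);CHKERRQ(ierr);
ierr = EPSSetFromOptions(eps);CHKERRQ(ierr);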
>
> We may be changing a little bit the way in which ST is initialized, and
> maybe we modify this as well. It is not decided yet.
>
> Jose
>
>
> > El 5 nov 2019, a las 0:28, Fande Kong  escribió:
> >
> > Thanks Jose,
> >
> > I think I understand now. Another question: what is the right way to
> set up a linear preconditioning matrix for the inner linear solver of JD?
> >
> > I was trying to do something like this:
> >
> >   /*
> >  Create eigensolver context
> >   */
> >   ierr = EPSCreate(PETSC_COMM_WORLD,&eps);CHKERRQ(ierr);
> >
> >   /*
> >  Set operators. In this case, it is a standard eigenvalue problem
> >   */
> >   ierr = EPSSetOperators(eps,A,NULL);CHKERRQ(ierr);
> >   ierr = EPSSetProblemType(eps,EPS_HEP);CHKERRQ(ierr);
> >   ierr = EPSGetST(eps,&st);CHKERRQ(ierr);
> >   ierr = STPrecondSetMatForPC(st,B);CHKERRQ(ierr);
> >
> >   /*
> >  Set solver parameters at runtime
> >   */
> >   ierr = EPSSetFromOptions(eps);CHKERRQ(ierr);
> >
> >   /* - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
> >   Solve the eigensystem
> >  - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
> */
> >
> >   ierr = EPSSolve(eps);CHKERRQ(ierr);
> >
> >
> > But did not work. A complete example is attached.  I could try to dig
> into the code, but you may already know the answer.
> >
> >
> > On Wed, Oct 23, 2019 at 3:58 AM Jose E. Roman 
> wrote:
> > Yes, it is confusing. Here is the explanation: when you use a target,
> the preconditioner is built from matrix A-sigma*B. By default, instead of
> TARGET_MAGNITUDE we set LARGEST_MAGNITUDE, and in Jacobi-Davidson we treat
> this case by setting sigma=PETSC_MAX_REAL. In this case, the preconditioner
> is built from matrix B. The thing is that in a standard eigenproblem we
> have B=I, and hence there is no point in using a preconditioner, that is
> why we set PCNONE.
> >
> > Jose
> >
> >
> > > El 22 oct 2019, a las 19:57, Fande Kong via petsc-users <
> petsc-users@mcs.anl.gov> escribió:
> > >
> > > Hi All,
> > >
> > > It looks like the preconditioner is hard-coded in the Jacobi-Davidson
> solver. I could not select a preconditioner rather than the default setting.
> > >
> > > For example, I was trying to select LU,

Re: [petsc-users] Problem about Residual evaluation

2019-04-01 Thread Fande Kong via petsc-users
On Mon, Apr 1, 2019 at 10:24 AM Matthew Knepley via petsc-users <
petsc-users@mcs.anl.gov> wrote:

> On Mon, Apr 1, 2019 at 10:22 AM Yingjie Wu via petsc-users <
> petsc-users@mcs.anl.gov> wrote:
>
>> Dear PETSc developers:
>> Hi,
>>
>> I've been using -snes_mf_operator and I've customized a precondition
>> matrix to solve my problem.I have two questions about the residuals of
>> linear steps(KSP residual).
>>
>>
>> 1.Since I'm using a matrix-free method, how do we get KSP residuals in
>> PETSc?
>>
>> r_m = b - A*x_m
>>
>> Is finite difference used to approximate "A*x_m" ?
>>
>
> Yes.
>
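For context, the matrix-free operator behind -snes_mf_operator never forms A
explicitly; it approximates the product by differencing the nonlinear
residual. A self-contained sketch of the idea only (this is not PETSc's
actual MATMFFD code, which chooses the differencing parameter adaptively):

#include <stdlib.h>

/* Approximate the Jacobian action J(u)*a by one extra residual evaluation,
 * J(u)*a ~ (F(u + h*a) - F(u)) / h, given F(u) already computed in Fu. */
typedef void (*ResidualFn)(const double *u, double *f, int n);

static void jacobian_action_fd(ResidualFn F, const double *u, const double *Fu,
                               const double *a, double *Ja, int n)
{
  const double h  = 1.0e-7;                  /* naive fixed differencing parameter */
  double      *up = malloc((size_t)n * sizeof(double));
  double      *fp = malloc((size_t)n * sizeof(double));
  for (int i = 0; i < n; i++) up[i] = u[i] + h * a[i];
  F(up, fp, n);                              /* F(u + h*a) */
  for (int i = 0; i < n; i++) Ja[i] = (fp[i] - Fu[i]) / h;
  free(up);
  free(fp);
}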
>
>> 2.What is the difference between instruction ' -ksp_monitor ' and '
>> -ksp_monitor_true_residual ' in how they are calculated?
>>
>
> The true residual is the unpreconditioned residual.
>

I actually have a specific understanding of "-ksp_monitor_true_residual",
but I am not sure whether it is right.  If I am wrong, please correct me.

When the preconditioning matrix is severely ill-conditioned, the ``true
residual'' is not necessarily ``true'' for right preconditioning, since an
unwinding process is applied.  That is, "-ksp_monitor_true_residual" does
not print ||b-Ax||; instead, it prints the unpreconditioned residual obtained
by unpreconditioning the preconditioned residual.


Thanks,

Fande,



>
>   Matt
>
>
>> Thanks,
>>
>> Yingjie
>>
>>
>>
>>
>
> --
> What most experimenters take for granted before they begin their
> experiments is infinitely more interesting than any results to which their
> experiments lead.
> -- Norbert Wiener
>
> https://www.cse.buffalo.edu/~knepley/
> 
>


[petsc-users] How to use khash in PETSc?

2019-03-24 Thread Fande Kong via petsc-users
Hi All,

Since PetscTable will be replaced by khash at some point in the future, it is
better to use khash for new implementations. I was wondering where I can
find some examples that use khash. Do we have any PETSc wrappers for khash?

Thanks,

Fande,
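For reference, here is the kind of usage I am imagining (a sketch under
assumptions: my understanding is that the PETSC_HASH_MAP-generated types in
the private headers, e.g. PetscHMapI, a PetscInt-to-PetscInt map in
petsc/private/hashmapi.h, are the intended khash front end; the function
names below are from memory and may not be exact):

#include <petsc/private/hashmapi.h>   /* PetscHMapI: khash-based PetscInt -> PetscInt map */

PetscErrorCode example_hash_usage(void)
{
  PetscHMapI     ht;
  PetscInt       val;
  PetscErrorCode ierr;

  PetscFunctionBeginUser;
  ierr = PetscHMapICreate(&ht);CHKERRQ(ierr);
  ierr = PetscHMapISet(ht,42,7);CHKERRQ(ierr);      /* key 42 -> value 7 */
  ierr = PetscHMapIGet(ht,42,&val);CHKERRQ(ierr);   /* val should now be 7 */
  ierr = PetscHMapIDestroy(&ht);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}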


Re: [petsc-users] MPI Iterative solver crash on HPC

2019-01-14 Thread Fande Kong via petsc-users
Hi Hong,

According to this PR
https://bitbucket.org/petsc/petsc/pull-requests/1061/a_selinger-feature-faster-scalable/diff

Should we set the scalable algorithm as default?
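In the meantime, the option can also be set programmatically before the
preconditioner is set up; a one-line sketch of the equivalent of Hong's
suggestion below:

/* equivalent to passing '-mattransposematmult_via scalable' on the command line */
ierr = PetscOptionsSetValue(NULL,"-mattransposematmult_via","scalable");CHKERRQ(ierr);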

Thanks,

Fande Kong,

On Fri, Jan 11, 2019 at 10:34 AM Zhang, Hong via petsc-users <
petsc-users@mcs.anl.gov> wrote:

> Add option '-mattransposematmult_via scalable'
> Hong
>
> On Fri, Jan 11, 2019 at 9:52 AM Zhang, Junchao via petsc-users <
> petsc-users@mcs.anl.gov> wrote:
>
>> I saw the following error message in your first email.
>>
>> [0]PETSC ERROR: Out of memory. This could be due to allocating
>> [0]PETSC ERROR: too large an object or bleeding by not properly
>> [0]PETSC ERROR: destroying unneeded objects.
>>
>> Probably the matrix is too large. You can try with more compute nodes,
>> for example, use 8 nodes instead of 2, and see what happens.
>>
>> --Junchao Zhang
>>
>>
>> On Fri, Jan 11, 2019 at 7:45 AM Sal Am via petsc-users <
>> petsc-users@mcs.anl.gov> wrote:
>>
>>> Using a larger problem set with 2B non-zero elements and a matrix of 25M
>>> x 25M I get the following error:
>>> [4]PETSC ERROR:
>>> 
>>> [4]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation,
>>> probably memory access out of range
>>> [4]PETSC ERROR: Try option -start_in_debugger or
>>> -on_error_attach_debugger
>>> [4]PETSC ERROR: or see
>>> http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind
>>> [4]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac
>>> OS X to find memory corruption errors
>>> [4]PETSC ERROR: likely location of problem given in stack below
>>> [4]PETSC ERROR: -  Stack Frames
>>> 
>>> [4]PETSC ERROR: Note: The EXACT line numbers in the stack are not
>>> available,
>>> [4]PETSC ERROR:   INSTEAD the line number of the start of the
>>> function
>>> [4]PETSC ERROR:   is given.
>>> [4]PETSC ERROR: [4] MatCreateSeqAIJWithArrays line 4422
>>> /lustre/home/vef002/petsc/src/mat/impls/aij/seq/aij.c
>>> [4]PETSC ERROR: [4] MatMatMultSymbolic_SeqAIJ_SeqAIJ line 747
>>> /lustre/home/vef002/petsc/src/mat/impls/aij/seq/matmatmult.c
>>> [4]PETSC ERROR: [4]
>>> MatTransposeMatMultSymbolic_MPIAIJ_MPIAIJ_nonscalable line 1256
>>> /lustre/home/vef002/petsc/src/mat/impls/aij/mpi/mpimatmatmult.c
>>> [4]PETSC ERROR: [4] MatTransposeMatMult_MPIAIJ_MPIAIJ line 1156
>>> /lustre/home/vef002/petsc/src/mat/impls/aij/mpi/mpimatmatmult.c
>>> [4]PETSC ERROR: [4] MatTransposeMatMult line 9950
>>> /lustre/home/vef002/petsc/src/mat/interface/matrix.c
>>> [4]PETSC ERROR: [4] PCGAMGCoarsen_AGG line 871
>>> /lustre/home/vef002/petsc/src/ksp/pc/impls/gamg/agg.c
>>> [4]PETSC ERROR: [4] PCSetUp_GAMG line 428
>>> /lustre/home/vef002/petsc/src/ksp/pc/impls/gamg/gamg.c
>>> [4]PETSC ERROR: [4] PCSetUp line 894
>>> /lustre/home/vef002/petsc/src/ksp/pc/interface/precon.c
>>> [4]PETSC ERROR: [4] KSPSetUp line 304
>>> /lustre/home/vef002/petsc/src/ksp/ksp/interface/itfunc.c
>>> [4]PETSC ERROR: - Error Message
>>> --
>>> [4]PETSC ERROR: Signal received
>>> [4]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html
>>> for trouble shooting.
>>> [4]PETSC ERROR: Petsc Release Version 3.10.2, unknown
>>> [4]PETSC ERROR: ./solveCSys on a linux-cumulus-debug named r02g03 by
>>> vef002 Fri Jan 11 09:13:23 2019
>>> [4]PETSC ERROR: Configure options PETSC_ARCH=linux-cumulus-debug
>>> --with-cc=/usr/local/depot/openmpi-3.1.1-gcc-7.3.0/bin/mpicc
>>> --with-fc=/usr/local/depot/openmpi-3.1.1-gcc-7.3.0/bin/mpifort
>>> --with-cxx=/usr/local/depot/openmpi-3.1.1-gcc-7.3.0/bin/mpicxx
>>> --download-parmetis --download-metis --download-ptscotch
>>> --download-superlu_dist --download-mumps --with-scalar-type=complex
>>> --with-debugging=yes --download-scalapack --download-superlu
>>> --download-fblaslapack=1 --download-cmake
>>> [4]PETSC ERROR: #1 User provided function() line 0 in  unknown file
>>>
>>> --
>>> MPI_ABORT was invoked on rank 4 in communicator MPI_COMM_WORLD
>>> with errorcode 59.
>>>
>>> NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes.
>>> You may or 

Re: [petsc-users] Any reason for API change: DMGetWorkArray()

2019-01-10 Thread Fande Kong via petsc-users
OK...,

Thanks for the words.

Fande,

On Thu, Jan 10, 2019 at 3:36 PM Matthew Knepley  wrote:

> On Thu, Jan 10, 2019 at 5:31 PM Fande Kong  wrote:
>
>> Thanks, Matt,
>>
>> And then what is the reason to remove PetscDataType? I am out of
>> curiosity.
>>
>
> Occam's Razor: "one should not increase, beyond what is necessary, the
> number of entities required to explain anything"
>
>Matt
>
>
>> Taking an MPI_Datatype in DMGetWorkArray is a little misleading. It may
>> make people think the routine is related to MPI, but it does not have
>> anything to do with MPI.
>>
>> Thanks,
>>
>> Fande,
>>
>> On Thu, Jan 10, 2019 at 3:22 PM Matthew Knepley 
>> wrote:
>>
>>> We are trying to eliminate PetscDataType.
>>>
>>>   Matt
>>>
>>> On Thu, Jan 10, 2019 at 5:10 PM Fande Kong via petsc-users <
>>> petsc-users@mcs.anl.gov> wrote:
>>>
>>>> Hi All,
>>>>
>>>> The second parameter is changed from PetscDataType to MPI_Datatype
>>>> starting from PETSc-3.9.x
>>>>
>>>> Thanks,
>>>>
>>>> Fande Kong,
>>>>
>>>
>>>
>>> --
>>> What most experimenters take for granted before they begin their
>>> experiments is infinitely more interesting than any results to which their
>>> experiments lead.
>>> -- Norbert Wiener
>>>
>>> https://www.cse.buffalo.edu/~knepley/
>>> <http://www.cse.buffalo.edu/~knepley/>
>>>
>>
>
> --
> What most experimenters take for granted before they begin their
> experiments is infinitely more interesting than any results to which their
> experiments lead.
> -- Norbert Wiener
>
> https://www.cse.buffalo.edu/~knepley/
> <http://www.cse.buffalo.edu/~knepley/>
>


[petsc-users] Any reason for API change: DMGetWorkArray()

2019-01-10 Thread Fande Kong via petsc-users
Hi All,

The second parameter is changed from PetscDataType to MPI_Datatype starting
from PETSc-3.9.x
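
For reference, a minimal sketch of the new calling sequence (my own example,
not from the changelog; dm and n are assumed to exist, and MPIU_SCALAR is
PETSc's MPI datatype corresponding to PetscScalar):

  PetscScalar   *work;
  PetscErrorCode ierr;

  ierr = DMGetWorkArray(dm, n, MPIU_SCALAR, &work);CHKERRQ(ierr);
  /* ... use work[0..n-1] as scratch storage ... */
  ierr = DMRestoreWorkArray(dm, n, MPIU_SCALAR, &work);CHKERRQ(ierr);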

Thanks,

Fande Kong,


Re: [petsc-users] GAMG scaling

2018-12-21 Thread Fande Kong via petsc-users
Sorry, hit the wrong button.



On Fri, Dec 21, 2018 at 7:56 PM Fande Kong  wrote:

>
>
> On Fri, Dec 21, 2018 at 9:44 AM Mark Adams  wrote:
>
>> Also, you mentioned that you are using 10 levels. This is very strange
>> with GAMG. You can run with -info and grep on GAMG to see the sizes and the
>> number of non-zeros per level. You should coarsen at a rate of about 2^D to
>> 3^D with GAMG (with 10 levels this would imply a very large fine grid
>> problem so I suspect there is something strange going on with coarsening).
>> Mark
>>
>
> Hi Mark,
>
>
Thanks for your email. We have not tried GAMG much for our problems since we
still have trouble figuring out how to use GAMG effectively. Instead, we are
building our own customized AMG that needs PtAP to construct the coarse
matrices. The customized AMG works pretty well for our specific simulations.
The bottleneck right now is that PtAP may take too much memory, and the code
crashes within the function "PtAP". I definitely need a memory profiler to
confirm my statement here.
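
For context, the coarse operator is formed with a plain MatPtAP call; a
sketch of the pattern (not the production code -- the fill estimate 2.0 is
just a guess, and A/P are assumed to be created and assembled elsewhere):

  Mat            A, P, C;   /* A: fine-level operator, P: prolongator (assumed assembled) */
  PetscErrorCode ierr;

  ierr = MatPtAP(A, P, MAT_INITIAL_MATRIX, 2.0, &C);CHKERRQ(ierr);  /* C = P^T * A * P */
  /* ... use C as the coarse-level operator ... */
  ierr = MatDestroy(&C);CHKERRQ(ierr);  /* should also release the intermediate data cached with C */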

Thanks,

Fande Kong,



>
>
>
>>
>> On Fri, Dec 21, 2018 at 11:36 AM Zhang, Hong via petsc-users <
>> petsc-users@mcs.anl.gov> wrote:
>>
>>> Fande:
>>> I will explore it and get back to you.
>>> Does anyone know how to profile memory usage?
>>> Hong
>>>
>>> Thanks, Hong,
>>>>
>>>> I just briefly went through the code. I was wondering if it is possible
>>>> to destroy "c->ptap" (that caches a lot of intermediate data) to release
>>>> the memory after the coarse matrix is assembled. I understand you may still
>>>> want to reuse these data structures by default but for my simulation, the
>>>> preconditioner is fixed and there is no reason to keep the "c->ptap".
>>>>
>>>
>>>> It would be great, if we could have this optional functionality.
>>>>
>>>> Fande Kong,
>>>>
>>>> On Thu, Dec 20, 2018 at 9:45 PM Zhang, Hong  wrote:
>>>>
>>>>> We use nonscalable implementation as default, and switch to scalable
>>>>> for matrices over finer grids. You may use option '-matptap_via scalable'
>>>>> to force scalable PtAP  implementation for all PtAP. Let me know if it
>>>>> works.
>>>>> Hong
>>>>>
>>>>> On Thu, Dec 20, 2018 at 8:16 PM Smith, Barry F. 
>>>>> wrote:
>>>>>
>>>>>>
>>>>>>   See MatPtAP_MPIAIJ_MPIAIJ(). It switches to scalable automatically
>>>>>> for "large" problems, which is determined by some heuristic.
>>>>>>
>>>>>>Barry
>>>>>>
>>>>>>
>>>>>> > On Dec 20, 2018, at 6:46 PM, Fande Kong via petsc-users <
>>>>>> petsc-users@mcs.anl.gov> wrote:
>>>>>> >
>>>>>> >
>>>>>> >
>>>>>> > On Thu, Dec 20, 2018 at 4:43 PM Zhang, Hong 
>>>>>> wrote:
>>>>>> > Fande:
>>>>>> > Hong,
>>>>>> > Thanks for your improvements on PtAP that is critical for MG-type
>>>>>> algorithms.
>>>>>> >
>>>>>> > On Wed, May 3, 2017 at 10:17 AM Hong  wrote:
>>>>>> > Mark,
>>>>>> > Below is the copy of my email sent to you on Feb 27:
>>>>>> >
>>>>>> > I implemented scalable MatPtAP and did comparisons of three
>>>>>> implementations using ex56.c on alcf cetus machine (this machine has 
>>>>>> small
>>>>>> memory, 1GB/core):
>>>>>> > - nonscalable PtAP: use an array of length PN to do dense axpy
>>>>>> > - scalable PtAP:   do sparse axpy without use of PN array
>>>>>> >
>>>>>> > What PN means here?
>>>>>> > Global number of columns of P.
>>>>>> >
>>>>>> > - hypre PtAP.
>>>>>> >
>>>>>> > The results are attached. Summary:
>>>>>> > - nonscalable PtAP is 2x faster than scalable, 8x faster than hypre
>>>>>> PtAP
>>>>>> > - scalable PtAP is 4x faster than hypre PtAP
>>>>>> > - hypre uses less memory (see job.ne399.n63.np1000.sh)
>>>>>> >
>>>>>> > I was wondering how much more m

Re: [petsc-users] GAMG scaling

2018-12-21 Thread Fande Kong via petsc-users
Thanks so much, Hong,

If any new finding, please let me know.


On Fri, Dec 21, 2018 at 9:36 AM Zhang, Hong  wrote:

> Fande:
> I will explore it and get back to you.
> Does anyone know how to profile memory usage?
>

We are using gperftools
https://gperftools.github.io/gperftools/heapprofile.html

Fande,



> Hong
>
> Thanks, Hong,
>>
>> I just briefly went through the code. I was wondering if it is possible
>> to destroy "c->ptap" (that caches a lot of intermediate data) to release
>> the memory after the coarse matrix is assembled. I understand you may still
>> want to reuse these data structures by default but for my simulation, the
>> preconditioner is fixed and there is no reason to keep the "c->ptap".
>>
>
>> It would be great, if we could have this optional functionality.
>>
>> Fande Kong,
>>
>> On Thu, Dec 20, 2018 at 9:45 PM Zhang, Hong  wrote:
>>
>>> We use nonscalable implementation as default, and switch to scalable for
>>> matrices over finer grids. You may use option '-matptap_via scalable' to
>>> force scalable PtAP  implementation for all PtAP. Let me know if it works.
>>> Hong
>>>
>>> On Thu, Dec 20, 2018 at 8:16 PM Smith, Barry F. 
>>> wrote:
>>>
>>>>
>>>>   See MatPtAP_MPIAIJ_MPIAIJ(). It switches to scalable automatically
>>>> for "large" problems, which is determined by some heuristic.
>>>>
>>>>Barry
>>>>
>>>>
>>>> > On Dec 20, 2018, at 6:46 PM, Fande Kong via petsc-users <
>>>> petsc-users@mcs.anl.gov> wrote:
>>>> >
>>>> >
>>>> >
>>>> > On Thu, Dec 20, 2018 at 4:43 PM Zhang, Hong 
>>>> wrote:
>>>> > Fande:
>>>> > Hong,
>>>> > Thanks for your improvements on PtAP that is critical for MG-type
>>>> algorithms.
>>>> >
>>>> > On Wed, May 3, 2017 at 10:17 AM Hong  wrote:
>>>> > Mark,
>>>> > Below is the copy of my email sent to you on Feb 27:
>>>> >
>>>> > I implemented scalable MatPtAP and did comparisons of three
>>>> implementations using ex56.c on alcf cetus machine (this machine has small
>>>> memory, 1GB/core):
>>>> > - nonscalable PtAP: use an array of length PN to do dense axpy
>>>> > - scalable PtAP:   do sparse axpy without use of PN array
>>>> >
>>>> > What PN means here?
>>>> > Global number of columns of P.
>>>> >
>>>> > - hypre PtAP.
>>>> >
>>>> > The results are attached. Summary:
>>>> > - nonscalable PtAP is 2x faster than scalable, 8x faster than hypre
>>>> PtAP
>>>> > - scalable PtAP is 4x faster than hypre PtAP
>>>> > - hypre uses less memory (see job.ne399.n63.np1000.sh)
>>>> >
>>>> > I was wondering how much more memory PETSc PtAP uses than hypre? I am
>>>> implementing an AMG algorithm based on PETSc right now, and it is working
>>>> well. But we find a bottleneck with PtAP. For the same P and A, PETSc
>>>> PtAP fails to generate a coarse matrix due to out of memory, while hypre
>>>> can still generate the coarse matrix.
>>>> >
>>>> > I do not want to just use the HYPRE one because we had to duplicate
>>>> matrices if I used HYPRE PtAP.
>>>> >
>>>> > It would be nice if you guys have already done some comparisons of
>>>> these implementations for the memory usage.
>>>> > Do you encounter memory issue with  scalable PtAP?
>>>> >
>>>> > By default do we use the scalable PtAP?? Do we have to specify some
>>>> options to use the scalable version of PtAP?  If so, it would be nice to
>>>> use the scalable version by default.  I am totally missing something here.
>>>> >
>>>> > Thanks,
>>>> >
>>>> > Fande
>>>> >
>>>> >
>>>> > Karl had a student in the summer who improved MatPtAP(). Do you use
>>>> the latest version of petsc?
>>>> > HYPRE may use less memory than PETSc because it does not save and
>>>> reuse the matrices.
>>>> >
>>>> > I do not understand why generating coarse matrix fails due to out of
>>>> memory. Do you use direct solver at coarse grid?
>>>> > Hong
>>>> >
>>>> > Based on above observation, I set the default PtAP algorithm as
>>>> 'nonscalable'.
>>>> > When PN > local estimated nonzero of C=PtAP, then switch default to
>>>> 'scalable'.
>>>> > User can overwrite default.
>>>> >
>>>> > For the case of np=8000, ne=599 (see job.ne599.n500.np8000.sh), I get
>>>> > MatPtAP   3.6224e+01 (nonscalable for small mats,
>>>> scalable for larger ones)
>>>> > scalable MatPtAP 4.6129e+01
>>>> > hypre 1.9389e+02
>>>> >
>>>> > This work in on petsc-master. Give it a try. If you encounter any
>>>> problem, let me know.
>>>> >
>>>> > Hong
>>>> >
>>>> > On Wed, May 3, 2017 at 10:01 AM, Mark Adams  wrote:
>>>> > (Hong), what is the current state of optimizing RAP for scaling?
>>>> >
>>>> > Nate, is driving 3D elasticity problems at scaling with GAMG and we
>>>> are working out performance problems. They are hitting problems at ~1.5B
>>>> dof problems on a basic Cray (XC30 I think).
>>>> >
>>>> > Thanks,
>>>> > Mark
>>>> >
>>>>
>>>>


Re: [petsc-users] GAMG scaling

2018-12-20 Thread Fande Kong via petsc-users
Thanks, Hong,

I just briefly went through the code. I was wondering if it is possible to
destroy "c->ptap" (that caches a lot of intermediate data) to release the
memory after the coarse matrix is assembled. I understand you may still
want to reuse these data structures by default but for my simulation, the
preconditioner is fixed and there is no reason to keep the "c->ptap".

It would be great, if we could have this optional functionality.

Fande Kong,

On Thu, Dec 20, 2018 at 9:45 PM Zhang, Hong  wrote:

> We use nonscalable implementation as default, and switch to scalable for
> matrices over finer grids. You may use option '-matptap_via scalable' to
> force scalable PtAP  implementation for all PtAP. Let me know if it works.
> Hong
>
> On Thu, Dec 20, 2018 at 8:16 PM Smith, Barry F. 
> wrote:
>
>>
>>   See MatPtAP_MPIAIJ_MPIAIJ(). It switches to scalable automatically for
>> "large" problems, which is determined by some heuristic.
>>
>>Barry
>>
>>
>> > On Dec 20, 2018, at 6:46 PM, Fande Kong via petsc-users <
>> petsc-users@mcs.anl.gov> wrote:
>> >
>> >
>> >
>> > On Thu, Dec 20, 2018 at 4:43 PM Zhang, Hong  wrote:
>> > Fande:
>> > Hong,
>> > Thanks for your improvements on PtAP that is critical for MG-type
>> algorithms.
>> >
>> > On Wed, May 3, 2017 at 10:17 AM Hong  wrote:
>> > Mark,
>> > Below is the copy of my email sent to you on Feb 27:
>> >
>> > I implemented scalable MatPtAP and did comparisons of three
>> implementations using ex56.c on alcf cetus machine (this machine has small
>> memory, 1GB/core):
>> > - nonscalable PtAP: use an array of length PN to do dense axpy
>> > - scalable PtAP:   do sparse axpy without use of PN array
>> >
>> > What PN means here?
>> > Global number of columns of P.
>> >
>> > - hypre PtAP.
>> >
>> > The results are attached. Summary:
>> > - nonscalable PtAP is 2x faster than scalable, 8x faster than hypre PtAP
>> > - scalable PtAP is 4x faster than hypre PtAP
>> > - hypre uses less memory (see job.ne399.n63.np1000.sh)
>> >
>> > I was wondering how much more memory PETSc PtAP uses than hypre? I am
>> implementing an AMG algorithm based on PETSc right now, and it is working
>> well. But we find a bottleneck with PtAP. For the same P and A, PETSc
>> PtAP fails to generate a coarse matrix due to out of memory, while hypre
>> can still generate the coarse matrix.
>> >
>> > I do not want to just use the HYPRE one because we had to duplicate
>> matrices if I used HYPRE PtAP.
>> >
>> > It would be nice if you guys have already done some comparisons of
>> these implementations for the memory usage.
>> > Do you encounter memory issue with  scalable PtAP?
>> >
>> > By default do we use the scalable PtAP?? Do we have to specify some
>> options to use the scalable version of PtAP?  If so, it would be nice to
>> use the scalable version by default.  I am totally missing something here.
>> >
>> > Thanks,
>> >
>> > Fande
>> >
>> >
>> > Karl had a student in the summer who improved MatPtAP(). Do you use the
>> latest version of petsc?
>> > HYPRE may use less memory than PETSc because it does not save and reuse
>> the matrices.
>> >
>> > I do not understand why generating coarse matrix fails due to out of
>> memory. Do you use direct solver at coarse grid?
>> > Hong
>> >
>> > Based on above observation, I set the default PtAP algorithm as
>> 'nonscalable'.
>> > When PN > local estimated nonzero of C=PtAP, then switch default to
>> 'scalable'.
>> > User can overwrite default.
>> >
>> > For the case of np=8000, ne=599 (see job.ne599.n500.np8000.sh), I get
>> > MatPtAP   3.6224e+01 (nonscalable for small mats,
>> scalable for larger ones)
>> > scalable MatPtAP 4.6129e+01
>> > hypre 1.9389e+02
>> >
>> > This work in on petsc-master. Give it a try. If you encounter any
>> problem, let me know.
>> >
>> > Hong
>> >
>> > On Wed, May 3, 2017 at 10:01 AM, Mark Adams  wrote:
>> > (Hong), what is the current state of optimizing RAP for scaling?
>> >
>> > Nate, is driving 3D elasticity problems at scaling with GAMG and we are
>> working out performance problems. They are hitting problems at ~1.5B dof
>> problems on a basic Cray (XC30 I think).
>> >
>> > Thanks,
>> > Mark
>> >
>>
>>


Re: [petsc-users] Increasing norm with finer mesh

2018-10-16 Thread Fande Kong
Use -ksp_view to confirm the options are actually set.
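
For reference, the programmatic equivalents of the options discussed below,
as a rough sketch (my own; it assumes a KSP object named ksp and the usual
ierr/CHKERRQ error handling):

  ierr = KSPGMRESSetRestart(ksp, 100);CHKERRQ(ierr);          /* like -ksp_gmres_restart 100 */
  ierr = KSPSetTolerances(ksp, 1e-9, PETSC_DEFAULT, PETSC_DEFAULT, PETSC_DEFAULT);CHKERRQ(ierr); /* like -ksp_rtol 1e-9 */
  ierr = KSPSetFromOptions(ksp);CHKERRQ(ierr);                /* keep command-line options (e.g. -ksp_view) working */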

Fande 

Sent from my iPhone

> On Oct 16, 2018, at 7:40 PM, Ellen M. Price  
> wrote:
> 
> Maybe a stupid suggestion, but sometimes I forget to call the
> *SetFromOptions function on my object, and then get confused when
> changing the options has no effect. Just a thought from a fellow grad
> student.
> 
> Ellen
> 
> 
>> On 10/16/2018 09:36 PM, Matthew Knepley wrote:
>> On Tue, Oct 16, 2018 at 9:14 PM Weizhuo Wang > > wrote:
>> 
>>I just tried both, neither of them make a difference. I got exactly
>>the same curve with either combination.
>> 
>> 
>> I have a hard time believing you. If you make the residual tolerance
>> much finer, your error will definitely change.
>> I run tests every day that do exactly this. You can run them too, since
>> they are just examples.
>> 
>>   Thanks,
>> 
>>  Matt
>>  
>> 
>>Thanks!
>> 
>>Wang weizhuo
>> 
>>On Tue, Oct 16, 2018 at 8:06 PM Matthew Knepley >> wrote:
>> 
>>On Tue, Oct 16, 2018 at 7:26 PM Weizhuo Wang
>>mailto:weizh...@illinois.edu>> wrote:
>> 
>>Hello again!
>> 
>>After some tweaking the code is giving right answers now.
>>However it start to disagree with MATLAB results
>>('traditional' way using matrix inverse) when the grid is
>>larger than 100*100. My PhD advisor and I suspects that the
>>default dimension of the Krylov subspace is 100 in the test
>>case we are running. If so, is there a way to increase the
>>size of the subspace?
>> 
>> 
>>1) The default subspace size is 30, not 100. You can increase
>>the subspace size using
>> 
>>   -ksp_gmres_restart n
>> 
>>2) The problem is likely your tolerance. The default solver
>>tolerance is 1e-5. You can change it using
>> 
>>   -ksp_rtol 1e-9
>> 
>>  Thanks,
>> 
>> Matt
>> 
>> 
>> 
>>Disagrees.png
>> 
>>Thanks!
>> 
>>Wang Weizhuo
>> 
>>On Tue, Oct 9, 2018 at 2:50 AM Mark Adams >> wrote:
>> 
>>To reiterate what Matt is saying, you seem to have the
>>exact solution on a 10x10 grid. That makes no sense
>>unless the solution can be represented exactly by your
>>FE space (eg, u(x,y) = x + y).
>> 
>>On Mon, Oct 8, 2018 at 9:33 PM Matthew Knepley
>>mailto:knep...@gmail.com>> wrote:
>> 
>>On Mon, Oct 8, 2018 at 9:28 PM Weizhuo Wang
>>>> wrote:
>> 
>>The code is attached in case anyone wants to
>>take a look, I will try the high frequency
>>scenario later.
>> 
>> 
>>That is not the error. It is superconvergence at the
>>vertices. The real solution is trigonometric, so your
>>linear interpolants or whatever you use is not going
>>to get the right value in between mesh points. You
>>need to do a real integral over the whole interval
>>to get the L_2 error.
>> 
>>  Thanks,
>> 
>> Matt
>> 
>> 
>>On Mon, Oct 8, 2018 at 7:58 PM Mark Adams
>>mailto:mfad...@lbl.gov>> wrote:
>> 
>> 
>> 
>>On Mon, Oct 8, 2018 at 6:58 PM Weizhuo Wang
>>>> wrote:
>> 
>>The first plot is the norm with the flag
>>-pc_type lu with respect to number of
>>grids in one axis (n), and the second
>>plot is the norm without the flag
>>-pc_type lu. 
>> 
>> 
>>So you are using the default PC w/o LU. The
>>default is ILU. This will reduce high
>>frequency effectively but is not effective
>>on the low frequency error. Don't expect
>>your algebraic error reduction to be at the
>>same scale as the residual reduction (what
>>KSP measures). 
>> 
>> 
>> 
>> 
>>-- 
>>Wang Weizhuo
>> 
>> 
>> 
>>-- 
>>What most experimenters take for granted before they
>>begin their experiments is infinitely more
>>interesting than any results to which their
>>  

Re: [petsc-users] UCX ERROR KNEM inline copy failed

2018-10-02 Thread Fande Kong
The error messages may have nothing to do with PETSc and MOOSE.

It might be from a package for MPI communication
https://github.com/openucx/ucx.  I have no experiences on such things. It
may be helpful to contact your HPC administer.

Thanks,

Fande,

On Tue, Oct 2, 2018 at 9:24 AM Matthew Knepley  wrote:

> On Tue, Oct 2, 2018 at 11:16 AM Y. Yang <
> yangyiwei.y...@mfm.tu-darmstadt.de> wrote:
>
>> Dear PETSc team
>>
>> Recently I'm using MOOSE (http://www.mooseframework.org/) which is built
>> with PETSc and, Unfortunately, I encountered some problems with
>> following PETSc options:
>>
>
> I do not know what problem you are reporting.I don't know what package
> knem_ep.c is
> part of, but its not PETSc.
>
>   Thanks,
>
>  Matt
>
>
>> petsc_options_iname = '-pc_type -ksp_gmres_restart -sub_ksp_type
>> -sub_pc_type -pc_asm_overlap -pc_factor_mat_solver_package'
>>
>> petsc_options_value = 'asm  1201  preonly ilu
>> 4 superlu_dist'
>>
>>
>> the error message is:
>>
>> Time Step 1, time = 1
>>  dt = 1
>>
>>  |residual|_2 of individual variables:
>>  c:   779.034
>>  w:   0
>>  T:   6.57948e+07
>>  gr0: 211.617
>>  gr1: 206.973
>>  gr2: 209.382
>>  gr3: 191.089
>>  gr4: 185.242
>>  gr5: 157.361
>>  gr6: 128.473
>>  gr7: 87.6029
>>
>>   0 Nonlinear |R| =  [32m6.579482e+07 [39m
>> [1538482623.976180] [hpb0085:22501:0]knem_ep.c:84   UCX ERROR
>> KNEM inline copy failed, err = -1 Invalid argument
>> [1538482605.111342] [hpb0085:22502:0]knem_ep.c:84   UCX ERROR
>> KNEM inline copy failed, err = -1 Invalid argument
>> [1538482606.761138] [hpb0085:22502:0]knem_ep.c:84   UCX ERROR
>> KNEM inline copy failed, err = -1 Invalid argument
>> [1538482607.107478] [hpb0085:22502:0]knem_ep.c:84   UCX ERROR
>> KNEM inline copy failed, err = -1 Invalid argument
>> [1538482605.882817] [hpb0085:22503:0]knem_ep.c:84   UCX ERROR
>> KNEM inline copy failed, err = -1 Invalid argument
>> [1538482607.133543] [hpb0085:22503:0]knem_ep.c:84   UCX ERROR
>> KNEM inline copy failed, err = -1 Invalid argument
>> [1538482621.905475] [hpb0085:22510:0]knem_ep.c:84   UCX ERROR
>> KNEM inline copy failed, err = -1 Invalid argument
>> [1538482626.531234] [hpb0085:22510:0]knem_ep.c:84   UCX ERROR
>> KNEM inline copy failed, err = -1 Invalid argument
>> [1538482627.613343] [hpb0085:22515:0]knem_ep.c:84   UCX ERROR
>> KNEM inline copy failed, err = -1 Invalid argument
>> [1538482627.830489] [hpb0085:22515:0]knem_ep.c:84   UCX ERROR
>> KNEM inline copy failed, err = -1 Invalid argument
>> [1538482629.852351] [hpb0085:22515:0]knem_ep.c:84   UCX ERROR
>> KNEM inline copy failed, err = -1 Invalid argument
>> [1538482630.194620] [hpb0085:22515:0]knem_ep.c:84   UCX ERROR
>> KNEM inline copy failed, err = -1 Invalid argument
>> [1538482630.280636] [hpb0085:22515:0]knem_ep.c:84   UCX ERROR
>> KNEM inline copy failed, err = -1 Invalid argument
>> [1538482600.219314] [hpb0085:22516:0]knem_ep.c:84   UCX ERROR
>> KNEM inline copy failed, err = -1 Invalid argument
>> [1538482658.960350] [hpb0085:22516:0]knem_ep.c:84   UCX ERROR
>> KNEM inline copy failed, err = -1 Invalid argument
>> [1538482622.949471] [hpb0085:22517:0]knem_ep.c:84   UCX ERROR
>> KNEM inline copy failed, err = -1 Invalid argument
>> [1538482612.502017] [hpb0085:22500:0]knem_ep.c:84   UCX ERROR
>> KNEM inline copy failed, err = -1 Invalid argument
>> [1538482613.231970] [hpb0085:22500:0]knem_ep.c:84   UCX ERROR
>> KNEM inline copy failed, err = -1 Invalid argument
>> [1538482621.417530] [hpb0085:22520:0]knem_ep.c:84   UCX ERROR
>> KNEM inline copy failed, err = -1 Invalid argument
>> [1538482622.020998] [hpb0085:22520:0]knem_ep.c:84   UCX ERROR
>> KNEM inline copy failed, err = -1 Invalid argument
>> [1538482606.221292] [hpb0085:22521:0]knem_ep.c:84   UCX ERROR
>> KNEM inline copy failed, err = -1 Invalid argument
>> [1538482606.676987] [hpb0085:22521:0]knem_ep.c:84   UCX ERROR
>> KNEM inline copy failed, err = -1 Invalid argument
>> [1538482606.896865] [hpb0085:22521:0]knem_ep.c:84   UCX ERROR
>> KNEM inline copy failed, err = -1 Invalid argument
>> [1538482639.611427] [hpb0085:22522:0]knem_ep.c:84   UCX ERROR
>> KNEM inline copy failed, err = -1 Invalid argument
>> [1538482631.435277] [hpb0085:22523:0]knem_ep.c:84   UCX ERROR
>> KNEM inline copy failed, err = -1 Invalid argument
>> [1538482658.278343] [hpb0085:22512:0]knem_ep.c:84   UCX ERROR
>> KNEM inline copy failed, err = -1 Invalid argument
>> [1538482658.396945] [hpb0085:22512:0]knem_ep.c:84   UCX ERROR
>> 

Re: [petsc-users] MatPtAPNumeric_MPIAIJ_MPIAIJ_scalable

2018-10-01 Thread Fande Kong
Thanks, Jed. I figured it out. I simply made P sparser, and then the
product does not take too much memory.

Fande,

On Fri, Sep 28, 2018 at 9:59 PM Jed Brown  wrote:

> It depends entirely on your matrices.  For example, if A is an arrowhead
> matrix (graph of a star -- one hub and many leaves) then A^2 is dense.
> If you have particular stencils for A and P, then we could tell you the
> fill ratio.
>
> Fande Kong  writes:
>
> > Hi All,
> >
> > I was wondering how much memory is required to get PtAP done? Do you have
> > any simple formula to this? So that I can have an  estimate.
> >
> >
> > Fande,
> >
> >
> > [132]PETSC ERROR: - Error Message
> > --
> > [132]PETSC ERROR: Out of memory. This could be due to allocating
> > [132]PETSC ERROR: too large an object or bleeding by not properly
> > [132]PETSC ERROR: destroying unneeded objects.
> > [132]PETSC ERROR: Memory allocated 0 Memory used by process 3249920
> > [132]PETSC ERROR: Try running with -malloc_dump or -malloc_log for info.
> > [132]PETSC ERROR: Memory requested 89148704
> > [132]PETSC ERROR: See
> http://www.mcs.anl.gov/petsc/documentation/faq.html
> > for trouble shooting.
> > [132]PETSC ERROR: Petsc Release Version 3.9.4, unknown
> > [132]PETSC ERROR: ../../rattlesnake-opt on a arch-theta-avx512-64-opt
> named
> > nid03830 by fdkong Fri Sep 28 22:43:45 2018
> > [132]PETSC ERROR: Configure options --LIBS=-lstdc++
> > --known-64-bit-blas-indices=0 --known-bits-per-byte=8
> > --known-has-attribute-aligned=1 --known-level1-dcache-assoc=8
> > --known-level1-dcache-linesize=64 --known-level1-dcache-size=32768
> > --known-memcmp-ok=1 --known-mklspblas-supports-zero-based=0
> > --known-mpi-c-double-complex=1 --known-mpi-int64_t=1
> > --known-mpi-long-double=1 --known-mpi-shared-libraries=0
> > --known-sdot-returns-double=0 --known-sizeof-MPI_Comm=4
> > --known-sizeof-MPI_Fint=4 --known-sizeof-char=1 --known-sizeof-double=8
> > --known-sizeof-float=4 --known-sizeof-int=4 --known-sizeof-long-long=8
> > --known-sizeof-long=8 --known-sizeof-short=2 --known-sizeof-size_t=8
> > --known-sizeof-void-p=8 --known-snrm2-returns-double=0 --with-batch=1
> > --with-blaslapack-lib="-mkl
> > -L/opt/intel/compilers_and_libraries_2018.0.128/linux/mkl/lib/intel64"
> > --with-cc=cc --with-clib-autodetect=0 --with-cxx=CC
> > --with-cxxlib-autodetect=0 --with-debugging=0 --with-fc=ftn
> > --with-fortranlib-autodetect=0 --with-hdf5=0 --with-memalign=64
> > --with-mpiexec=aprun --with-shared-libraries=0 --download-metis=1
> > --download-parmetis=1 --download-superlu_dist=1 --download-hypre=1
> > --download-ptscotch=1 COPTFLAGS="-O3 -xMIC-AVX512" CXXOPTFLAGS="-O3
> > -xMIC-AVX512" FOPTFLAGS="-O3 -xMIC-AVX512"
> > PETSC_ARCH=arch-theta-avx512-64-opt --with-64-bit-indices=1
> > [132]PETSC ERROR: #1 PetscSegBufferCreate() line 64 in
> > /gpfs/mira-home/fdkong/petsc/src/sys/utils/segbuffer.c
> > [132]PETSC ERROR: #2 PetscSegBufferCreate() line 64 in
> > /gpfs/mira-home/fdkong/petsc/src/sys/utils/segbuffer.c
> > [132]PETSC ERROR: #3 PetscSegBufferExtractInPlace() line 227 in
> > /gpfs/mira-home/fdkong/petsc/src/sys/utils/segbuffer.c
> > [132]PETSC ERROR: #4 MatStashScatterBegin_BTS() line 854 in
> > /gpfs/mira-home/fdkong/petsc/src/mat/utils/matstash.c
> > [132]PETSC ERROR: #5 MatStashScatterBegin_Private() line 461 in
> > /gpfs/mira-home/fdkong/petsc/src/mat/utils/matstash.c
> > [132]PETSC ERROR: #6 MatAssemblyBegin_MPIAIJ() line 683 in
> > /gpfs/mira-home/fdkong/petsc/src/mat/impls/aij/mpi/mpiaij.c
> > [132]PETSC ERROR: #7 MatAssemblyBegin() line 5158 in
> > /gpfs/mira-home/fdkong/petsc/src/mat/interface/matrix.c
> > [132]PETSC ERROR: #8 MatPtAPNumeric_MPIAIJ_MPIAIJ_scalable() line 262 in
> > /gpfs/mira-home/fdkong/petsc/src/mat/impls/aij/mpi/mpiptap.c
> > [132]PETSC ERROR: #9 MatPtAP_MPIAIJ_MPIAIJ() line 172 in
> > /gpfs/mira-home/fdkong/petsc/src/mat/impls/aij/mpi/mpiptap.c
> > [132]PETSC ERROR: #10 MatPtAP() line 9182 in
> > /gpfs/mira-home/fdkong/petsc/src/mat/interface/matrix.c
> > [132]PETSC ERROR: #11 MatGalerkin() line 10615 in
> > /gpfs/mira-home/fdkong/petsc/src/mat/interface/matrix.c
> > [132]PETSC ERROR: #12 PCSetUp_MG() line 730 in
> > /gpfs/mira-home/fdkong/petsc/src/ksp/pc/impls/mg/mg.c
> > [132]PETSC ERROR: #13 PCSetUp_HMG() line 336 in
> > /gpfs/mira-home/fdkong/petsc/src/ksp/pc/impls/hmg/hmg.c
> > [132]PETSC ERROR: #14 PCSetUp() line 923 in
> > /gpfs/mira-home/fdkong/petsc/src/ksp/pc/interface/precon.c
> > [132]PETSC ERROR: #15 KSPSetUp() line 381 in
> > /gpfs/mira-home/fdkong/petsc/src/ksp/ksp/interface/itfunc.c
> > [136]PETSC ERROR: - Error Message
> > --
>


[petsc-users] MatPtAPNumeric_MPIAIJ_MPIAIJ_scalable

2018-09-28 Thread Fande Kong
Hi All,

I was wondering how much memory is required to get PtAP done. Do you have
any simple formula for this, so that I can have an estimate?


Fande,


[132]PETSC ERROR: - Error Message
--
[132]PETSC ERROR: Out of memory. This could be due to allocating
[132]PETSC ERROR: too large an object or bleeding by not properly
[132]PETSC ERROR: destroying unneeded objects.
[132]PETSC ERROR: Memory allocated 0 Memory used by process 3249920
[132]PETSC ERROR: Try running with -malloc_dump or -malloc_log for info.
[132]PETSC ERROR: Memory requested 89148704
[132]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html
for trouble shooting.
[132]PETSC ERROR: Petsc Release Version 3.9.4, unknown
[132]PETSC ERROR: ../../rattlesnake-opt on a arch-theta-avx512-64-opt named
nid03830 by fdkong Fri Sep 28 22:43:45 2018
[132]PETSC ERROR: Configure options --LIBS=-lstdc++
--known-64-bit-blas-indices=0 --known-bits-per-byte=8
--known-has-attribute-aligned=1 --known-level1-dcache-assoc=8
--known-level1-dcache-linesize=64 --known-level1-dcache-size=32768
--known-memcmp-ok=1 --known-mklspblas-supports-zero-based=0
--known-mpi-c-double-complex=1 --known-mpi-int64_t=1
--known-mpi-long-double=1 --known-mpi-shared-libraries=0
--known-sdot-returns-double=0 --known-sizeof-MPI_Comm=4
--known-sizeof-MPI_Fint=4 --known-sizeof-char=1 --known-sizeof-double=8
--known-sizeof-float=4 --known-sizeof-int=4 --known-sizeof-long-long=8
--known-sizeof-long=8 --known-sizeof-short=2 --known-sizeof-size_t=8
--known-sizeof-void-p=8 --known-snrm2-returns-double=0 --with-batch=1
--with-blaslapack-lib="-mkl
-L/opt/intel/compilers_and_libraries_2018.0.128/linux/mkl/lib/intel64"
--with-cc=cc --with-clib-autodetect=0 --with-cxx=CC
--with-cxxlib-autodetect=0 --with-debugging=0 --with-fc=ftn
--with-fortranlib-autodetect=0 --with-hdf5=0 --with-memalign=64
--with-mpiexec=aprun --with-shared-libraries=0 --download-metis=1
--download-parmetis=1 --download-superlu_dist=1 --download-hypre=1
--download-ptscotch=1 COPTFLAGS="-O3 -xMIC-AVX512" CXXOPTFLAGS="-O3
-xMIC-AVX512" FOPTFLAGS="-O3 -xMIC-AVX512"
PETSC_ARCH=arch-theta-avx512-64-opt --with-64-bit-indices=1
[132]PETSC ERROR: #1 PetscSegBufferCreate() line 64 in
/gpfs/mira-home/fdkong/petsc/src/sys/utils/segbuffer.c
[132]PETSC ERROR: #2 PetscSegBufferCreate() line 64 in
/gpfs/mira-home/fdkong/petsc/src/sys/utils/segbuffer.c
[132]PETSC ERROR: #3 PetscSegBufferExtractInPlace() line 227 in
/gpfs/mira-home/fdkong/petsc/src/sys/utils/segbuffer.c
[132]PETSC ERROR: #4 MatStashScatterBegin_BTS() line 854 in
/gpfs/mira-home/fdkong/petsc/src/mat/utils/matstash.c
[132]PETSC ERROR: #5 MatStashScatterBegin_Private() line 461 in
/gpfs/mira-home/fdkong/petsc/src/mat/utils/matstash.c
[132]PETSC ERROR: #6 MatAssemblyBegin_MPIAIJ() line 683 in
/gpfs/mira-home/fdkong/petsc/src/mat/impls/aij/mpi/mpiaij.c
[132]PETSC ERROR: #7 MatAssemblyBegin() line 5158 in
/gpfs/mira-home/fdkong/petsc/src/mat/interface/matrix.c
[132]PETSC ERROR: #8 MatPtAPNumeric_MPIAIJ_MPIAIJ_scalable() line 262 in
/gpfs/mira-home/fdkong/petsc/src/mat/impls/aij/mpi/mpiptap.c
[132]PETSC ERROR: #9 MatPtAP_MPIAIJ_MPIAIJ() line 172 in
/gpfs/mira-home/fdkong/petsc/src/mat/impls/aij/mpi/mpiptap.c
[132]PETSC ERROR: #10 MatPtAP() line 9182 in
/gpfs/mira-home/fdkong/petsc/src/mat/interface/matrix.c
[132]PETSC ERROR: #11 MatGalerkin() line 10615 in
/gpfs/mira-home/fdkong/petsc/src/mat/interface/matrix.c
[132]PETSC ERROR: #12 PCSetUp_MG() line 730 in
/gpfs/mira-home/fdkong/petsc/src/ksp/pc/impls/mg/mg.c
[132]PETSC ERROR: #13 PCSetUp_HMG() line 336 in
/gpfs/mira-home/fdkong/petsc/src/ksp/pc/impls/hmg/hmg.c
[132]PETSC ERROR: #14 PCSetUp() line 923 in
/gpfs/mira-home/fdkong/petsc/src/ksp/pc/interface/precon.c
[132]PETSC ERROR: #15 KSPSetUp() line 381 in
/gpfs/mira-home/fdkong/petsc/src/ksp/ksp/interface/itfunc.c
[136]PETSC ERROR: - Error Message
--


Re: [petsc-users] Implementation of Power Iteration method in PETSc

2018-09-27 Thread Fande Kong
Hi Yingjie,

For finite difference method, there are a lot of example in PETSc. For
instance,
https://www.mcs.anl.gov/petsc/petsc-current/src/ksp/ksp/examples/tutorials/ex29.c.html

For linear eigenvalue problems,  there is a list of examples at
http://slepc.upv.es/documentation/current/src/eps/examples/tutorials/index.html

Before you go on to implement your inverse power algorithm, I would like to
suggest that you go through the PETSc manual and the SLEPc manual:
https://www.mcs.anl.gov/petsc/petsc-current/docs/manual.pdf  and
http://slepc.upv.es/documentation/slepc.pdf
SLEPc has both nonlinear and linear inverse power iteration methods, and
you can pick any version you want.

It is a good idea to have the big picture before you do anything using PETSc
or SLEPc.
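
If you ever do want to drive it yourself with a plain KSP loop in PETSc, the
outer iteration is roughly the following (just a sketch in the notation of
your algorithm; vector creation, the convergence test, and the declarations
of maxit/M/F/phi are assumed and not shown):

  KSP            ksp;
  Vec            b, Fphi, Fphiold;            /* same layout as phi */
  PetscScalar    K = 1.0, Kold, num, den;
  PetscInt       n;
  PetscErrorCode ierr;

  ierr = KSPCreate(PETSC_COMM_WORLD, &ksp);CHKERRQ(ierr);
  ierr = KSPSetOperators(ksp, M, M);CHKERRQ(ierr);            /* M from step 5 */
  ierr = KSPSetFromOptions(ksp);CHKERRQ(ierr);

  for (n = 1; n <= maxit; ++n) {
    Kold = K;
    ierr = MatMult(F, phi, Fphiold);CHKERRQ(ierr);            /* F * phi(n-1) */
    ierr = VecCopy(Fphiold, b);CHKERRQ(ierr);
    ierr = VecScale(b, 1.0/Kold);CHKERRQ(ierr);               /* step 4: b = F*phi(n-1)/K(n-1) */
    ierr = KSPSolve(ksp, b, phi);CHKERRQ(ierr);               /* step 5: solve M*phi(n) = b */
    ierr = MatMult(F, phi, Fphi);CHKERRQ(ierr);
    ierr = VecDot(Fphi, Fphi, &num);CHKERRQ(ierr);
    ierr = VecDot(Fphiold, Fphiold, &den);CHKERRQ(ierr);
    K    = Kold*num/den;                                      /* step 6 */
    /* step 7: test |K - Kold| and the change in phi; break when converged */
  }
  ierr = KSPDestroy(&ksp);CHKERRQ(ierr);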

Fande,

On Wed, Sep 26, 2018 at 8:17 PM Yingjie Wu  wrote:

> Thank you, Fande.
>
> I've seen you in moose usergroup before.
>
>
> I've just learned about SLEPC, and I wonder if I want to do neutron
> eigenvalue calculations and use finite difference to discrete grids, would
> it be difficult to implement it in SLEPC? Is there such an example(finite
> difference + linear eigenvalue)?Because I don't know much about finite
> element.Or am I still using a loop of KSP in PETSc?I'm a newcomer to petsc,
> please give me some advice
>
>
> Thanks,
>
> Yingjie
>
>
> Fande Kong  于2018年9月27日周四 上午12:25写道:
>
>> I have implemented this algorithm in SLEPC. Take a look at this example
>> http://slepc.upv.es/documentation/current/src/eps/examples/tutorials/ex34.c.html
>>
>> The motivation of the algorithm is also for neutron calculations (a
>> moose-based application).
>>
>> Fande,
>>
>> On Wed, Sep 26, 2018 at 10:02 AM Yingjie Wu  wrote:
>>
>>> Dear Petsc developer:
>>> Hi,
>>>
>>> Thank you very much for your previous reply,  they really helped me a
>>> lot.
>>>
>>> I've been doing neutron calculations recently, and I want to use the
>>> power Iteration method to calculate the neutron flux, essentially to solve
>>> linear equations, but after each solution is done, a scalar *K* is
>>> updated to form a new right end term *b*, and the next linear equation
>>> system is solved until the convergence criterion is satisfied. The flow of
>>> the source iteration algorithm is as follows:
>>>
>>>
>>> 1: Φ(0) = arbitrary nonzero vector
>>> 2: K(0) = arbitrary nonzero constant
>>> 3: for n = 1, 2, 3, ... do
>>> 4:   b = 1/K(n-1) * F * Φ(n-1)
>>> 5:   solve M * Φ(n) = b
>>> 6:   K(n) = K(n-1) * (F*Φ(n), F*Φ(n)) / (F*Φ(n-1), F*Φ(n-1))
>>> 7:   check convergence of eigenvalue and eigenvector
>>> 8: end for
>>>
>>>
>>> (F , M are the coefficient matrix.)
>>>
>>>
>>> The difficulty is that I need to set up an extra loop to regulate KSP.
>>> Does PETSc have such an example? How do I implement this algorithm in
>>> PETsc? Please tell me the key functions if possible.
>>>
>>>
>>> Thanks,
>>>
>>> Yingjie
>>>
>>>
>>>


Re: [petsc-users] Implementation of Power Iteration method in PETSc

2018-09-26 Thread Fande Kong
I have implemented this algorithm in SLEPC. Take a look at this example
http://slepc.upv.es/documentation/current/src/eps/examples/tutorials/ex34.c.html

The motivation of the algorithm is also for neutron calculations (a
moose-based application).

Fande,

On Wed, Sep 26, 2018 at 10:02 AM Yingjie Wu  wrote:

> Dear Petsc developer:
> Hi,
>
> Thank you very much for your previous reply,  they really helped me a lot.
>
> I've been doing neutron calculations recently, and I want to use the power
> Iteration method to calculate the neutron flux, essentially to solve linear
> equations, but after each solution is done, a scalar *K* is updated to
> form a new right end term *b*, and the next linear equation system is
> solved until the convergence criterion is satisfied. The flow of the source
> iteration algorithm is as follows:
>
>
> 1: Φ(0) = arbitrary nonzero vector
> 2: K(0) = arbitrary nonzero constant
> 3: for n = 1, 2, 3, ... do
> 4:   b = 1/K(n-1) * F * Φ(n-1)
> 5:   solve M * Φ(n) = b
> 6:   K(n) = K(n-1) * (F*Φ(n), F*Φ(n)) / (F*Φ(n-1), F*Φ(n-1))
> 7:   check convergence of eigenvalue and eigenvector
> 8: end for
>
>
> (F , M are the coefficient matrix.)
>
>
> The difficulty is that I need to set up an extra loop to regulate KSP.
> Does PETSc have such an example? How do I implement this algorithm in
> PETsc? Please tell me the key functions if possible.
>
>
> Thanks,
>
> Yingjie
>
>
>


Re: [petsc-users] Error 404 for mhypre.c.html

2018-09-13 Thread Fande Kong
Please look at the page
http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MatHYPRESetPreallocation.html


and then click on

src/mat/impls/aij/hypre/mhypre.c

(right after ``Location")

Fande,

On Thu, Sep 13, 2018 at 9:37 AM Satish Balay  wrote:

> balay@asterix /home/balay/petsc (maint=)
> $ git ls-files |grep mhypre.c
> src/mat/impls/hypre/mhypre.c
>
> So URL should be:
>
>
> http://www.mcs.anl.gov/petsc/petsc-current/src/mat/impls/hypre/mhypre.c.html#MatHYPRESetPreallocation
>
> Satish
>
> On Thu, 13 Sep 2018, Fande Kong wrote:
>
> >
> http://www.mcs.anl.gov/petsc/petsc-current/src/mat/impls/aij/hypre/mhypre.c.html#MatHYPRESetPreallocation
> >
> >
> > Fande
> >
>
>


[petsc-users] Error 404 for mhypre.c.html

2018-09-13 Thread Fande Kong
http://www.mcs.anl.gov/petsc/petsc-current/src/mat/impls/aij/hypre/mhypre.c.html#MatHYPRESetPreallocation


Fande


Re: [petsc-users] FIELDSPLIT fields

2018-09-05 Thread Fande Kong
On Wed, Sep 5, 2018 at 9:54 AM Smith, Barry F.  wrote:

>
>   2 should belong to one of the subdomains, either one is fine.
>
>Barry
>
>
> > On Sep 5, 2018, at 10:46 AM, Rossi, Simone  wrote:
> >
> > I’m trying to setup GASM, but I’m probably misunderstanding something.
> >
> > If I have this mesh
> >
> > 0 —— 1 —— 2 —— 3 —— 4
> > subdomain 1  |   subdomain 2
> >
>

You may need to decide which subdomain ``2" belongs to. Most people just let
the shared node go to the lower MPI rank. If so, in this example, ``2"
belongs to subdomain one:

iis1 = {0, 1, 2}
ois1 = {0, 1, 2, 3}

iis2 = {3, 4}
ois2 = {2, 3, 4}
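
A minimal sketch of handing these index sets to GASM (my own code, not
tested; it assumes single-rank subdomains, the global numbering above, and a
pc obtained from KSPGetPC()):

  IS             iis, ois;
  PetscInt       inner[] = {0, 1, 2};          /* interior (non-overlapping) part */
  PetscInt       outer[] = {0, 1, 2, 3};       /* interior plus overlap           */
  PetscErrorCode ierr;

  ierr = ISCreateGeneral(PETSC_COMM_SELF, 3, inner, PETSC_COPY_VALUES, &iis);CHKERRQ(ierr);
  ierr = ISCreateGeneral(PETSC_COMM_SELF, 4, outer, PETSC_COPY_VALUES, &ois);CHKERRQ(ierr);
  ierr = PCSetType(pc, PCGASM);CHKERRQ(ierr);
  ierr = PCGASMSetSubdomains(pc, 1, &iis, &ois);CHKERRQ(ierr); /* one subdomain on this rank */
  /* destroy iis/ois when no longer needed; see the PCGASMSetSubdomains man
     page for the ownership details */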

If you seriously consider using GASM, I would suggest partitioning your
problem (using ``hierarch" partitioning) in such a way that each multi-rank
subdomain is actually connected; otherwise you may end up with poor
performance.


Thanks,

Fande,




> > I create an interior (no overlap) and an outer (with overlap) IS for
> both subdomains.
> >
> > In my naive understanding
> >
> > iis1  = {0, 1}
> > ois1 = {0, 1, 2}
> >
> > and
> >
> > iis2  = {3, 4}
> > ois2 = {2, 3, 4}
> >
> > but then the node at the interface (node 2) does not belong to any
> interior IS. Should node 2 belong to both interior IS? Or should it belong
> only to one of the domains?
> >
> > Thanks,
> > Simone
> >
> >
> > On Aug 15, 2018, at 22:11, Griffith, Boyce Eugene 
> wrote:
> >
> >>
> >>
> >>> On Aug 15, 2018, at 10:07 PM, Smith, Barry F. 
> wrote:
> >>>
> >>>
> >>> Yes you can have "overlapping fields" with FIELDSPLIT but I don't
> think you can use FIELDSPLIT for your case. You seem to have a geometric
> decomposition into regions. ASM and GASM are intended for such
> decompositions. Fieldsplit is for multiple fields that each live across the
> entire domain.
> >>
> >> Basically there is one field the lives on the entire domain, and
> another field that lives only on a subdomain.
> >>
> >> Perhaps we could do GASM for the geometric split and FIELDSPLIT within
> the subdomain with the two fields.
> >>
> >>> Barry
> >>>
> >>>
>  On Aug 15, 2018, at 7:42 PM, Griffith, Boyce Eugene <
> boy...@email.unc.edu> wrote:
> 
>  Is it permissible to have overlapping fields in FIELDSPLIT? We are
> specifically thinking about how to handle DOFs living on the interface
> between two regions.
> 
>  Thanks!
> 
>  — Boyce
> >>>
> >>
> >
>
>


Re: [petsc-users] Problem with SNES convergence

2018-08-30 Thread Fande Kong
Hi Barry,

I haven't had time to look into TS so far, but it is definitely
interesting. One simple question would be this: if I have a simple loop
over time steps, and SNES is called at each step, how hard is it to convert
my code to use TS?

Any suggestions? Where should I start?
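
For context, my rough understanding of the minimal skeleton is sketched
below -- not taken from the PETSc docs, just the shape of it; u, dt, t_final,
and the user context are assumed to exist:

  /* implicit form F(t, u, u_t) = 0; put the old SNES residual logic here */
  static PetscErrorCode IFunction(TS ts, PetscReal t, Vec u, Vec u_t, Vec F, void *ctx)
  {
    PetscFunctionBeginUser;
    /* evaluate F = u_t - rhs(u) */
    PetscFunctionReturn(0);
  }

  /* in the driver, replacing the hand-written time loop: */
  TS             ts;
  PetscErrorCode ierr;

  ierr = TSCreate(PETSC_COMM_WORLD, &ts);CHKERRQ(ierr);
  ierr = TSSetIFunction(ts, NULL, IFunction, user);CHKERRQ(ierr);
  /* also set TSSetIJacobian(), or rely on -snes_mf / coloring for the Jacobian */
  ierr = TSSetSolution(ts, u);CHKERRQ(ierr);
  ierr = TSSetTimeStep(ts, dt);CHKERRQ(ierr);
  ierr = TSSetMaxTime(ts, t_final);CHKERRQ(ierr);
  ierr = TSSetExactFinalTime(ts, TS_EXACTFINALTIME_MATCHSTEP);CHKERRQ(ierr);
  ierr = TSSetFromOptions(ts);CHKERRQ(ierr);   /* integrator, adaptivity, monitors from the command line */
  ierr = TSSolve(ts, u);CHKERRQ(ierr);
  ierr = TSDestroy(&ts);CHKERRQ(ierr);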


Fande,

On Thu, Aug 30, 2018 at 11:29 AM Smith, Barry F.  wrote:

>
>Note also that the PETSc TS component has a large variety of
> timesteppers with automatic adaptivity which adjust the time-step for
> accuracy and convergence. Depending on the exact needs of your time-stepper
> it might be easier in the long run to use PETSc's time-steppers rather than
> write your own.
>
>Barry
>
>
> > On Aug 30, 2018, at 10:46 AM, Ling Zou  wrote:
> >
> > Rahul, please see the logic I used to reduce time step size.
> > Hope this helps you.
> >
> > Ling
> >
> > for (timestep = 1; timestep <= N; timestep++) // loop over time steps
> > {
> >   // before trying to solve a time step, 1) it is still not successful;
> 2) we are not giving up; 3) haven't failed yet
> >   bool give_up = false; bool success = false; bool
> experienced_fail_this_time_step = false;
> >   // save solutions and old solutions into u_copy and u_old_copy, in
> case a time step fails and we restart from this saved solutions
> >   VecCopy(u, u_copy); VecCopy(u_old, u_old_copy);
> >
> >   while ((!give_up) && (!success)) // as long as not successful and not
> giving up, we solve again with smaller time step
> >   {
> > if (time_step_size < dt_min) { give_up = true; break; } // ok, bad
> luck, give up due to time step size smaller than a preset value
> > if (experienced_fail_this_time_step) { // get the vectors from
> backups if this is a re-try, i.e., already failed with a larger time step
> >   VecCopy(u_old_copy, u); VecCopy(u_old_copy, u_old);
> > }
> >
> > try {
> >   SNESSolve(snes, NULL, u);
> >   SNESGetConvergedReason(snes, &snes_converged_reason);
> >
> >   if (snes_converged_reason > 0)  success = true; // yes, snes
> converged
> >   else { // no, snes did not converge
> > cutTimeStepSize(); // e.g., dt / 2
> > experienced_fail_this_time_step = true;
> >   }
> > }
> > catch (int err) { // in case your own pieces of code throws an
> exception
> >   std::cout << "An exception occurred." << std::endl;
> >   success = false;
> >   cutTimeStepSize(); // e.g., dt / 2
> >   experienced_fail_this_time_step = true;
> > }
> >   }
> >
> >   if (success) {
> > // output, print, whatever
> > // duplicate current solution to old solution in preparing next time
> step
> > VecCopy(u, u_old);
> > // you can increase time step size here, e.g. * 2
> > increaseTimeStepSize();
> >   }
> >
> >   if (give_up) {
> > simulationFailed = true;
> > std::cerr << "Simulation failed.\n";
> > //exit(1);// dont exit(1) now, just break the for-loop, let PETSc
> clean its workspace.
> > break;
> >   }
> > }
> >
> > From: Rahul Samala 
> > Sent: Wednesday, August 29, 2018 10:37:30 PM
> > To: Ling Zou; Smith, Barry F.
> > Cc: PETSc Users List
> > Subject: Re: [petsc-users] Problem with SNES convergence
> >
> > Thank you Ling, I would definitely like to look at your code for
> reducing timestep size.
> > Thank you Barry for your inputs.
> >
> > --
> > Rahul.
> >
> > On Wednesday, August 29, 2018, 9:02:00 PM GMT+5:30, Smith, Barry F. <
> bsm...@mcs.anl.gov> wrote:
> >
> >
> >
> > Current time (before start of timestep) 52.5048,iter=5380
>  Timestep=864.00
> >   0 SNES Function norm 1.650467412595e+05
> > 0 KSP preconditioned resid norm 3.979123221160e+03 true resid norm
> 1.650467412595e+05 ||r(i)||/||b|| 1.e+00
> > 1 KSP preconditioned resid norm 9.178246525982e-11 true resid norm
> 7.006473307032e-09 ||r(i)||/||b|| 4.245144892632e-14
> >   Linear solve converged due to CONVERGED_RTOL iterations 1
> >   1 SNES Function norm 6.722712947273e+02
> >   Linear solve did not converge due to DIVERGED_NANORINF iterations 0
> > Nonlinear solve did not converge due to DIVERGED_LINEAR_SOLVE iterations
> 1
> >
> > This usually is an indicator that the LU (ILU) factorization has hit
> a zero pivot (hence the linear solve has a divide by zero so gives the
> DIVERGED_NANORINF flag).
> >
> > You can/should call SNESGetConvergedReason() immediately after each
> SNESSolve(), if the result is negative that means something has failed in
> the nonlinear solve and you can try cutting the time-step and trying again.
> >
> > Good luck,
> >
> > Barry
> >
> >
> > > On Aug 29, 2018, at 10:11 AM, Ling Zou  wrote:
> > >
> > > 1) My experience is that this kind of bug or sudden death (everything
> is fine till suddenly something is broken) is very difficult to debug/fix.
> I looked at your txt files and could not give any quick comments. Maybe
> PETSc developers have better idea on this.
> > > 2) I do have successful experience on reducing time step size when
> PETSc 

Re: [petsc-users] Whether Petsc will deleted zero entries during the solve

2018-06-14 Thread Fande Kong
This may help in this situation.
http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MatResetPreallocation.html
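
A rough sketch of the intended pattern (my own, not from the manual page; A
is the already-preallocated matrix and ierr the usual error code):

  /* bring back the original preallocated slots before refilling with a new pattern */
  ierr = MatResetPreallocation(A);CHKERRQ(ierr);
  /* optionally tolerate entries outside the preallocation instead of erroring */
  ierr = MatSetOption(A, MAT_NEW_NONZERO_ALLOCATION_ERR, PETSC_FALSE);CHKERRQ(ierr);
  /* ... MatSetValues() with the new values ... */
  ierr = MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
  ierr = MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);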


Fande,

On Thu, Jun 14, 2018 at 1:08 PM, Smith, Barry F.  wrote:

>
>I am guessing your matrix has an "envelope" of nonzero values but the
> first time you fill the matrix you do not fill up the entire envelope?
> Hence internally within the matrix we squeeze out those nonexistent
> locations so next time you fill the matrix the new location you need is not
> available.
>
>There is really only one way to deal with this. Initially build the
> entire envelope of values (putting zeros in certain locations is fine they
> won't get squuzed out) then for future calls the locations are already
> there and so you will have no problems with new nonzero.
>
>   On the other hand if the envelope is much much larger than any
> particular set of nonzero locations then it is better to create a new
> matrix each time because all the solves with the first approach treats all
> the matrix entries in the envelope as nonzero (even if they happen to be
> zero).
>
>Barry
>
>
> > On Jun 14, 2018, at 1:17 PM, Qicang Shen  wrote:
> >
> > Hi Guys,
> >
> > I'm now confronting a problem.
> >
> > I'm using PETSC to construct a SPARSE Matrix. And I'm sure that the
> matrix has been allocated correctly using  MatMPIAIJSetPreallocation with
> the upper limit of the size.
> >
> > The code works well when I just solve the system once.
> >
> > However, after the system has been solved, I want to keep the original
> > non-zero structure but change the elements inside the matrix. PETSc then
> > shows the error message:
> >
> > [14]PETSC ERROR: Argument out of range
> > [14]PETSC ERROR: New nonzero at (58,56) caused a malloc
> > Use MatSetOption(A, MAT_NEW_NONZERO_ALLOCATION_ERR, PETSC_FALSE) to
> turn off this check
> >
> > My current solution is to destroy the matrix and reallocate it with the
> > same size. But I believe there should be a more efficient way, i.e., just
> > reuse the original structure.
> >
> > I would like to confirm whether PETSc will compress/delete the zero
> > entries during MatAssemblyBegin/MatAssemblyEnd/KSPSolve, or in other
> > possible places.
> >
> > And how to avoid deleting zero elements during the process?
> >
> > Thanks very much.
> >
> >
> >
> >
> > Qicang SHEN
> > PhD Candidate
> > Nuclear Engineering and Radiological Sciences, University of Michigan,
> Ann Arbor
> > Email: qican...@umich.edu
>
>


[petsc-users] Fwd: Installing PETSC manually.

2018-06-06 Thread Fande Kong
Hi Satish,

A MOOSE user has trouble building METIS that is "downloaded" from a local
directory. Do you have any idea?

Vi,

Could you share "configure.log" with PETSc team?


Thanks,

Fande,

-- Forwarded message --
From: Vi Ha 
Date: Wed, Jun 6, 2018 at 11:00 AM
Subject: Re: Installing PETSC manually.
To: moose-users 


Hi Jason,

Thanks for the reply.
I am having the same exact error as you linked. Here's the full error
message from configure.log:


===============================================================================
Trying to download file:///home/vi/Documents/v-install-moose/files/petsc-config-backup/v5.1.0-p4.tar.gz for METIS
===============================================================================

  Downloading file:///home/vi/Documents/v-install-moose/files/petsc-config-backup/v5.1.0-p4.tar.gz to /home/vi/Documents/moose/packages/projects/src/petsc-3.7.6/arch-linux2-c-opt/externalpackages/_d_v5.1.0-p4.tar.gz

  Extracting /home/vi/Documents/moose/packages/projects/src/petsc-3.7.6/arch-linux2-c-opt/externalpackages/_d_v5.1.0-p4.tar.gz

Executing: cd /home/vi/Documents/moose/packages/projects/src/petsc-3.7.6/arch-linux2-c-opt/externalpackages; chmod -R a+r petsc-pkg-metis-0adf3ea7785d;find  petsc-pkg-metis-0adf3ea7785d -type d -name "*" -exec chmod a+rx {} \;

  Looking for METIS at git.metis, hg.metis or a directory starting with ['metis']

  Could not locate an existing copy of METIS:
    ['fblaslapack-3.4.2', 'petsc-pkg-metis-0adf3ea7785d', 'hypre-2.11.1', 'scalapack-2.0.2', 'pkg-metis']

ERROR: Failed to download METIS

And then the on screen error becomes:

Unable to download METIS
Failed to download METIS

Same as previous post. It "downloads" the file but it cannot detect it for
some reason.



On Tuesday, June 5, 2018 at 3:08:17 PM UTC-4, jason.miller wrote:
>
> We ran into an issue like this recently. I believe this thread will help
> you:  MOOSE-Users: Error while installing PETSc
> 
>
> On Tue, Jun 5, 2018 at 1:34 PM, Vi Ha  wrote:
>
>> I am trying to install petsc using these instructions:
>> http://mooseframework.org/wiki/BasicManualInstallation/Linux/
>> on a linux system that isn't allowing me to download things directly.
>>
>> I am trying to install the dependent packages such as hypre, metis etc.
>> does anyone know if there are instructions on how to download and install
>> them manually? I know there are parameters in PETSC's configure file that
>> allow me to do so but I don't know what they are.
>>
>> Thank you.
>>
>> --
>> You received this message because you are subscribed to the Google Groups
>> "moose-users" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to moose-users...@googlegroups.com.
>> Visit this group at https://groups.google.com/group/moose-users
>> 
>> .
>> To view this discussion on the web visit https://groups.google.com/d/ms
>> gid/moose-users/ae2cfd8e-ce5c-4525-990c-cdd524814c8c%40googlegroups.com
>> 
>> .
>> For more options, visit https://groups.google.com/d/optout
>> 
>> .
>>
>
> --
You received this message because you are subscribed to the Google Groups
"moose-users" group.
To unsubscribe from this group and stop receiving emails from it, send an
email to moose-users+unsubscr...@googlegroups.com.
Visit this group at https://groups.google.com/group/moose-users.
To view this discussion on the web visit https://groups.google.com/d/
msgid/moose-users/f2d4a0f1-3b81-4fe8-9acf-b1814ee6a290%40googlegroups.com

.

For more options, visit https://groups.google.com/d/optout.


Re: [petsc-users] SuperLU_dist bug with parallel symbolic factorisation

2018-05-22 Thread Fande Kong
Hi Eric,

I am curious whether the parallel symbolic factorization is faster than
the sequential version. Do you have timings?


Fande,

On Tue, May 22, 2018 at 12:18 PM, Eric Chamberland <
eric.chamberl...@giref.ulaval.ca> wrote:

>
>
> On 22/05/18 02:03 PM, Smith, Barry F. wrote:
>
>>
>> Hmm, why would
>>
>> the resolution with *sequential* symbolic factorisation gives an error
>>> around 1e-6 instead of 1e-16 for the parallel one (when it works).
>>>
>>
>>? One would think that doing a "sequential" symbolic factorization
>> won't affect the answer to this huge amount? Perhaps this is the problem
>> that needs to be addressed.
>>
>>
> I do agree that this is a huge amount of difference... and if we agree
> this is also a bug, than it means there are not one but two bugs that
> deserve to be fixed...
>
> Thanks,
>
> Eric
>
>


Re: [petsc-users] Could not determine how to create a shared library!

2018-05-03 Thread Fande Kong
'--with-blaslapack-lib=-mkl -L' + os.environ['MKLROOT'] + '/lib/intel64'

works.

Fande,

On Thu, May 3, 2018 at 10:09 AM, Satish Balay  wrote:

> Ok you are not 'building blaslapack' - but using mkl [as per
> configure.log].
>
> I'll have to check the issue. It might be something to do with using
> mkl as a static library..
>
> Hong might have some suggestions wrt theta builds.
>
> Satish
>
> On Thu, 3 May 2018, Satish Balay wrote:
>
> > Perhaps you should use MKL on theta? Again check
> config/examples/arch-cray-xc40-knl-opt.py
> >
> > Satish
> >
> > On Thu, 3 May 2018, Kong, Fande wrote:
> >
> > > Thanks,
> > >
> > > I got PETSc compiled, but theta does not like the shared lib, I think.
> > >
> > > I am switching back to a static lib. I have successfully built and run
> > > PETSc with static linking before.
> > >
> > > But I encountered a problem this time when building blaslapack.
> > >
> > >
> > > Thanks,
> > >
> > > Fande
> > >
> > > On Tue, May 1, 2018 at 2:22 PM, Satish Balay 
> wrote:
> > >
> > > > This is theta..
> > > >
> > > > Try: using --LDFLAGS=-dynamic option
> > > >
> > > > [as listed in config/examples/arch-cray-xc40-knl-opt.py]
> > > >
> > > > Satish
> > > >
> > > > On Tue, 1 May 2018, Kong, Fande wrote:
> > > >
> > > > > Hi All,
> > > > >
> > > > > I can build a static petsc library on a supercomputer, but could
> not do
> > > > the
> > > > > same thing with " --with-shared-libraries=1".
> > > > >
> > > > > The log file is attached.
> > > > >
> > > > >
> > > > > Fande,
> > > > >
> > > >
> > > >
> > >
> >
> >
>
>


Re: [petsc-users] Could not execute "['git', 'rev-parse', '--git-dir']"

2018-04-04 Thread Fande Kong
The default git gives me:

*Could not execute "['git', 'rev-parse', '--git-dir']"*

when I am configuring PETSc.

The manually loaded *git* modules work just fine.


Fande,

On Wed, Apr 4, 2018 at 5:04 PM, Garvey, Cormac T 
wrote:

> I though it was fixed, yes I will look into it again.
>
> Do you get an error just doing a git clone on falcon1 and falcon2?
>
> On Wed, Apr 4, 2018 at 4:48 PM, Kong, Fande  wrote:
>
>> module load git/2.16.2-GCCcore-5.4.0"  also works.
>>
>> Could you somehow make the default git work as well? Hence we do not need
>> to have this extra "module load for git"
>>
>> Fande,
>>
>> On Wed, Apr 4, 2018 at 4:43 PM, Kong, Fande  wrote:
>>
>>> Thanks, Cormac,
>>>
>>> *module load git/1.8.5.2-GCC-4.8.3 *
>>>
>>> works for me.
>>>
>>> Did not try "module load git/2.16.2-GCCcore-5.4.0" yet.
>>>
>>> I will try, and get it back here.
>>>
>>>
>>>
>>> Fande
>>>
>>> On Wed, Apr 4, 2018 at 4:39 PM, Garvey, Cormac T 
>>> wrote:
>>>

 We needed to rebuild git on the INL falcon cluster because the GitHub
 server changed such that it no longer accepted TLSv1.

 The default git on the falcon cluster /usr/bin/git is just a wrapper
 script, so users would not need to load any modules to
 use git.

 When you load git on falcon1 or falcon2, does it still fail?

 module load git/2.16.2-GCCcore-5.4.0

 Thanks,
 Cormac.



 On Wed, Apr 4, 2018 at 4:28 PM, Kong, Fande  wrote:

> Hi Cormac,
>
> Do you know anything about "git"? How did you guys build git on
> falcon1?  The default git on falcon1 no longer works with PETSc.
>
>
> Fande,
>
>
>
> On Wed, Apr 4, 2018 at 4:20 PM, Satish Balay 
> wrote:
>
>> Ok - I don't think I have access to this OS.
>>
>> And I see it's from 2009 [sure, it's an enterprise OS - with regular
>> backport updates].
>>
>> But it's weird that you have such a new version of git at /usr/bin/git.
>>
>> From what we know so far - the problem appears to be some bad
>> interaction of python-2.6 with this old OS [i.e. old glibc etc.] - and
>> this new git version [binary built locally, or on a different OS and
>> installed locally?].
>>
>> Satish
>>
>> On Wed, 4 Apr 2018, Kong, Fande wrote:
>>
>> >  moose]$ uname -a
>> > Linux falcon1 3.0.101-108.13-default #1 SMP Wed Oct 11 12:30:40 UTC 2017 (294ccfe) x86_64 x86_64 x86_64 GNU/Linux
>> >
>> >
>> > moose]$ lsb_release -a
>> > LSB Version:    core-2.0-noarch:core-3.2-noarch:core-4.0-noarch:core-2.0-x86_64:core-3.2-x86_64:core-4.0-x86_64:desktop-4.0-amd64:desktop-4.0-noarch:graphics-2.0-amd64:graphics-2.0-noarch:graphics-3.2-amd64:graphics-3.2-noarch:graphics-4.0-amd64:graphics-4.0-noarch
>> > Distributor ID: SUSE LINUX
>> > Description:    SUSE Linux Enterprise Server 11 (x86_64)
>> > Release:        11
>> > Codename:       n/a
>> >
>> >
>> >
>> > On Wed, Apr 4, 2018 at 4:08 PM, Satish Balay 
>> wrote:
>> >
>> > > On Wed, 4 Apr 2018, Satish Balay wrote:
>> > >
>> > > > On Wed, 4 Apr 2018, Satish Balay wrote:
>> > > >
>> > > > > was your '2.16.2' version installed from source?
>> > > >
>> > > > >>>
>> > > > Checking for program /usr/bin/git...found
>> > > > Defined make macro "GIT" to "git"
>> > > > Executing: git --version
>> > > > stdout: git version 2.16.2
>> > > > 
>> > > >
>> > > > I guess it's the OS default package.
>> > > >
>> > > > >
>> > > > Machine platform:
>> > > > ('Linux', 'falcon1', '3.0.101-108.13-default', '#1 SMP Wed Oct 11 12:30:40 UTC 2017 (294ccfe)', 'x86_64', 'x86_64')
>> > > > Python version:
>> > > > 2.6.9 (unknown, Aug  5 2016, 11:15:31)
>> > > > [GCC 4.3.4 [gcc-4_3-branch revision 152973]]
>> > > > 
>> > > >
>> > > > What OS/version is on this machine? I can try reproducing in a
>> VM
>> > >
>> > > It is strange that the kernel is old [3.0 - perhaps an LTS OS], python is
>> > > old [2.6] - but git is new [2.16]?
>> > >
>> > > https://mirrors.edge.kernel.org/pub/software/scm/git/
>> > > git-2.16.2.tar.gz  16-Feb-2018 17:48  7M
>> > >
>> > > Satish
>> > >
>> >
>>
>>
>


 --
 Cormac Garvey
 HPC Software Consultant
 Scientific Computing
 Idaho 

Re: [petsc-users] A bad commit affects MOOSE

2018-04-03 Thread Fande Kong
On Tue, Apr 3, 2018 at 9:12 AM, Stefano Zampini 
wrote:

>
> On Apr 3, 2018, at 4:58 PM, Satish Balay  wrote:
>
> On Tue, 3 Apr 2018, Kong, Fande wrote:
>
> On Tue, Apr 3, 2018 at 1:17 AM, Smith, Barry F. 
> wrote:
>
>
>   Each external package definitely needs its own duplicated communicator;
> cannot share between packages.
>
>   The only problem with the dups below is if they are in a loop and get
> called many times.
>
>
>
> The "standard test" that has this issue actually has 1K fields. MOOSE
> creates its own field-split preconditioner (not based on the PETSc
> fieldsplit), and each filed is associated with one PC HYPRE.  If PETSc
> duplicates communicators, we should easily reach the limit 2048.
>
> I also want to confirm what extra communicators are introduced in the bad
> commit.
>
>
> To me it looks like there is 1 extra comm created [for MATHYPRE] for each
> PCHYPRE that is created [which also creates one comm for this object].
>
>
> You're right; however, it was the same before the commit.
> I don't understand how this specific commit is related to this issue,
> since the error is not in the MPI_Comm_dup which is inside MatCreate_MATHYPRE.
> Actually, the error comes from MPI_Comm_create:
>
>
>
>
>
> frame #5: 0x0001068defd4 libmpi.12.dylib`MPI_Comm_create + 3492
> frame #6: 0x0001061345d9 libpetsc.3.07.dylib`hypre_GenerateSubComm(comm=-1006627852, participate=, new_comm_ptr=) + 409 at gen_redcs_mat.c:531 [opt]
> frame #7: 0x00010618f8ba libpetsc.3.07.dylib`hypre_GaussElimSetup(amg_data=0x7fe7ff857a00, level=, relax_type=9) + 74 at par_relax.c:4209 [opt]
> frame #8: 0x000106140e93 libpetsc.3.07.dylib`hypre_BoomerAMGSetup(amg_vdata=, A=0x7fe80842aff0, f=0x7fe80842a980, u=0x7fe80842a510) + 17699 at par_amg_setup.c:2108 [opt]
> frame #9: 0x000105ec773c libpetsc.3.07.dylib`PCSetUp_HYPRE(pc=) + 2540 at hypre.c:226 [opt]
>
> How did you perform the bisection? make clean + make all ? Which version
> of HYPRE are you using?
>

I did it more aggressively:

"rm -rf  arch-darwin-c-opt-bisect   "

"./configure  --optionsModule=config.compilerOptions -with-debugging=no
--with-shared-libraries=1 --with-mpi=1 --download-fblaslapack=1
--download-metis=1 --download-parmetis=1 --download-superlu_dist=1
--download-hypre=1 --download-mumps=1 --download-scalapack=1
PETSC_ARCH=arch-darwin-c-opt-bisect"
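
Roughly, each bisection step looked like this - a sketch with placeholder
good/bad revisions, reusing the configure line above:

  git bisect start
  git bisect bad  <commit-known-to-fail>
  git bisect good <commit-known-to-work>
  # then, for the commit git bisect checks out at each step:
  rm -rf arch-darwin-c-opt-bisect
  ./configure ...   # the full configure line above
  make PETSC_DIR=$PWD PETSC_ARCH=arch-darwin-c-opt-bisect all
  # run the failing MOOSE test, then mark the result:
  git bisect good   # or: git bisect bad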


HYPRE verison:


self.gitcommit = 'v2.11.1-55-g2ea0e43'
self.download  = ['git://https://github.com/LLNL/hypre','https://github.com/LLNL/hypre/archive/'+self.gitcommit+'.tar.gz']


I do not think this is caused by HYPRE.

Fande,
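
To see who keeps duplicating communicators - as Barry suggests further down -
a minimal lldb sketch, assuming a debug build and with "./your_moose_app -i
input.i" as a placeholder for the actual MOOSE executable and input file:

  lldb -- ./your_moose_app -i input.i
  (lldb) breakpoint set --name MPI_Comm_dup
  (lldb) run
  (lldb) bt          # shows which object creation called MPI_Comm_dup
  (lldb) continue    # repeat to count how often, and from where, it is called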



>
> But you might want to verify [by linking with mpi trace library?]
>
>
> There are some debugging hints at
> https://lists.mpich.org/pipermail/discuss/2012-December/000148.html
> [wrt mpich] - which I haven't checked.
>
> Satish
>
>
>
> Fande,
>
>
>
>
>    To debug the hypre/duplication issue in MOOSE I would run in the
> debugger with a break point in MPI_Comm_dup() and see
> who keeps calling it an unreasonable number of times. (My guess is this is
> a new "feature" in hypre that they will need to fix, but only debugging will
> tell.)
>
>   Barry
>
>
> On Apr 2, 2018, at 7:44 PM, Balay, Satish  wrote:
>
> We do an MPI_Comm_dup() for objects related to external packages.
>
> Looks like we added a new mat type MATHYPRE in 3.8 that PCHYPRE is
> using. Previously there was one MPI_Comm_dup() per PCHYPRE - now I think
> there is one more for MATHYPRE - so more calls to MPI_Comm_dup in 3.8 vs 3.7.
>
> src/dm/impls/da/hypre/mhyp.c:  ierr = MPI_Comm_dup(PetscObjectComm((PetscObject)B),&(ex->hcomm));CHKERRQ(ierr);
> src/dm/impls/da/hypre/mhyp.c:  ierr = MPI_Comm_dup(PetscObjectComm((PetscObject)B),&(ex->hcomm));CHKERRQ(ierr);
> src/dm/impls/swarm/data_ex.c:  ierr = MPI_Comm_dup(comm,>comm);CHKERRQ(ierr);
> src/ksp/pc/impls/hypre/hypre.c:  ierr = MPI_Comm_dup(PetscObjectComm((PetscObject)pc),&(jac->comm_hypre));CHKERRQ(ierr);
> src/ksp/pc/impls/hypre/hypre.c:  ierr = MPI_Comm_dup(PetscObjectComm((PetscObject)pc),&(ex->hcomm));CHKERRQ(ierr);
> src/ksp/pc/impls/hypre/hypre.c:  ierr = MPI_Comm_dup(PetscObjectComm((PetscObject)pc),&(ex->hcomm));CHKERRQ(ierr);
> src/ksp/pc/impls/spai/ispai.c:  ierr  = MPI_Comm_dup(PetscObjectComm((PetscObject)pc),&(ispai->comm_spai));CHKERRQ(ierr);
> src/mat/examples/tests/ex152.c:  ierr   = MPI_Comm_dup(MPI_COMM_WORLD, );CHKERRQ(ierr);
> src/mat/impls/aij/mpi/mkl_cpardiso/mkl_cpardiso.c:  ierr = MPI_Comm_dup(PetscObjectComm((PetscObject)A),&(mat_mkl_cpardiso->comm_mkl_cpardiso));CHKERRQ(ierr);
> src/mat/impls/aij/mpi/mumps/mumps.c:  ierr = MPI_Comm_dup(PetscObjectComm((PetscObject)A),&(mumps->comm_mumps));CHKERRQ(ierr);
> src/mat/impls/aij/mpi/pastix/pastix.c:ierr = MPI_Comm_dup(PetscObjectComm((PetscObject)A),&(lu->pastix_
> 

Re: [petsc-users] slepc-master does not configure correctly

2018-03-22 Thread Fande Kong
Thanks, Jose,

Works fine.

Thanks,

Fande,
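
For anyone else hitting the same AttributeError: picking up the fix is just a
pull and a reconfigure - a sketch, assuming slepc master, the commit referenced
below, and that SLEPC_DIR and PETSC_DIR are already set in the environment:

  cd ~/projects/slepc
  git pull          # brings in commit 464bcc9
  PETSC_ARCH=arch-darwin-c-debug-master ./configure
  make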

On Thu, Mar 22, 2018 at 2:43 AM, Jose E. Roman <jro...@dsic.upv.es> wrote:

> Fixed
> https://bitbucket.org/slepc/slepc/commits/464bcc967aa18470486aba71868e0a
> e158c3fe49
>
>
> > El 22 mar 2018, a las 3:23, Satish Balay <ba...@mcs.anl.gov> escribió:
> >
> > The primary change is - DESTDIR in petscvariables is replaced with
> > PREFIXDIR
> >
> > i.e:
> >
> > diff --git a/config/packages/petsc.py b/config/packages/petsc.py
> > index e89779e15..f2577e4c6 100644
> > --- a/config/packages/petsc.py
> > +++ b/config/packages/petsc.py
> > @@ -90,8 +90,8 @@ class PETSc(package.Package):
> >   self.precision = v
> > elif k == 'MAKE':
> >   self.make = v
> > -elif k == 'DESTDIR':
> > +elif k == 'PREFIXDIR':
> >   self.destdir = v
> > elif k == 'BFORT':
> >   self.bfort = v
> > elif k == 'TEST_RUNS':
> >
> > But then 'self.destdir' should be replaced by a more appropriate name,
> > 'self.prefixdir' [and its usage updated in the other source files].
> >
> > Satish
> >
> >
> > On Thu, 22 Mar 2018, Jed Brown wrote:
> >
> >> Yes, DESTDIR is something that is only used during "make install".  If
> >> you had a prefix install of PETSc, it should get PETSC_DIR (set to that
> >> prefix) and empty PETSC_ARCH.
> >>
> >> "Kong, Fande" <fande.k...@inl.gov> writes:
> >>
> >>> Hi All,
> >>>
> >>> ~/projects/slepc]> PETSC_ARCH=arch-darwin-c-debug-master ./configure
> >>>
> >>> Checking environment...
> >>> Traceback (most recent call last):
> >>>   File "./configure", line 10, in <module>
> >>>     execfile(os.path.join(os.path.dirname(__file__), 'config', 'configure.py'))
> >>>   File "./config/configure.py", line 206, in <module>
> >>>     log.write('PETSc install directory: '+petsc.destdir)
> >>> AttributeError: PETSc instance has no attribute 'destdir'
> >>>
> >>>
> >>>
> >>> SLEPc may need to be synchronized with the new changes in PETSc.
> >>>
> >>> Thanks,
> >>>
> >>> Fande Kong
> >>
> >
>
>

