Re: [petsc-users] "snes_type test" is gone?

2018-06-04 Thread Kong, Fande
Thanks, Hong,

I see. It would have been better if "-snes_type test" had not existed in the first place.


Fande,
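
P.S. For other MOOSE users migrating: below is a minimal sketch (my own, not from this thread) of turning the replacement test on from code when passing command-line options is awkward. It assumes the "-snes_test_jacobian" option name mentioned above (3.9-era PETSc); everything else is standard PETSc API.

#include <petscsnes.h>

int main(int argc, char **argv)
{
  SNES           snes;
  PetscErrorCode ierr;

  ierr = PetscInitialize(&argc, &argv, NULL, NULL); if (ierr) return ierr;

  /* Equivalent to passing -snes_test_jacobian on the command line: the
     hand-coded Jacobian is compared against a finite-difference Jacobian
     every time it is assembled. */
  ierr = PetscOptionsSetValue(NULL, "-snes_test_jacobian", "true");CHKERRQ(ierr);

  ierr = SNESCreate(PETSC_COMM_WORLD, &snes);CHKERRQ(ierr);
  /* ... set residual/Jacobian callbacks and the solution vector as usual ... */
  ierr = SNESSetFromOptions(snes);CHKERRQ(ierr);
  /* ... SNESSolve(snes, NULL, x); ... */

  ierr = SNESDestroy(&snes);CHKERRQ(ierr);
  ierr = PetscFinalize();
  return ierr;
}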

On Mon, Jun 4, 2018 at 4:01 PM, Zhang, Hong  wrote:

>
>
> > On Jun 4, 2018, at 4:59 PM, Zhang, Hong  wrote:
> >
> > -snes_type has been removed. We can just use -snes_test_jacobian
> instead. Note that the test is done every time the Jacobian is computed.
>
> It was meant to be "-snes_type test".
>
> > Hong (Mr.)
> >
> >> On Jun 4, 2018, at 3:27 PM, Kong, Fande  wrote:
> >>
> >> Hi PETSc Team,
> >>
> >> I was wondering if "snes_type test" has been gone? Quite a few MOOSE
> users use this option to test their Jacobian matrices.
> >>
> >> If it is gone, any reason?
> >>
> >> Fande,
> >
>
>


[petsc-users] "snes_type test" is gone?

2018-06-04 Thread Kong, Fande
Hi PETSc Team,

I was wondering if "snes_type test" has been gone? Quite a few MOOSE users
use this option to test their Jacobian matrices.

If it is gone, any reason?

Fande,


Re: [petsc-users] Could not determine how to create a shared library!

2018-05-03 Thread Kong, Fande
On Thu, May 3, 2018 at 11:50 AM, Zhang, Hong <hongzh...@anl.gov> wrote:

> Alternatively you can use --with-blaslapack-dir=/opt/
> intel/compilers_and_libraries/linux/mkl/lib/intel64 to let petsc pick the
> right libs for you.
>

 This used to work, but does not work any more.

Thanks,

Fande,


>
> Hong (Mr.)
>
>
> On May 3, 2018, at 11:32 AM, Fande Kong <fdkong...@gmail.com> wrote:
>
> --with-blaslapack-lib=-mkl -L' + os.environ['MKLROOT'] + '/lib/intel64
>
> works.
>
> Fande,
>
> On Thu, May 3, 2018 at 10:09 AM, Satish Balay <ba...@mcs.anl.gov> wrote:
>
>> Ok you are not 'building blaslapack' - but using mkl [as per
>> configure.log].
>>
>> I'll have to check the issue. It might be something to do with using
>> mkl as a static library..
>>
>> Hong might have some suggestions wrt theta builds.
>>
>> Satish
>>
>> On Thu, 3 May 2018, Satish Balay wrote:
>>
>> > Perhaps you should use MKL on theta? Again check
>> config/examples/arch-cray-xc40-knl-opt.py
>> >
>> > Satish
>> >
>> > On Thu, 3 May 2018, Kong, Fande wrote:
>> >
>> > > Thanks,
>> > >
>> > > I got PETSc compiled, but theta does not like the shared lib, I think.
>> > >
>> > > I am switching back to a static lib. I previously built and ran PETSc
>> > > successfully with static compilation.
>> > >
>> > > But I encountered a problem this time on building blaslapack.
>> > >
>> > >
>> > > Thanks,
>> > >
>> > > Fande
>> > >
>> > > On Tue, May 1, 2018 at 2:22 PM, Satish Balay <ba...@mcs.anl.gov>
>> wrote:
>> > >
>> > > > This is theta..
>> > > >
>> > > > Try: using --LDFLAGS=-dynamic option
>> > > >
>> > > > [as listed in config/examples/arch-cray-xc40-knl-opt.py]
>> > > >
>> > > > Satish
>> > > >
>> > > > On Tue, 1 May 2018, Kong, Fande wrote:
>> > > >
>> > > > > Hi All,
>> > > > >
>> > > > > I can build a static petsc library on a supercomputer, but could
>> not do
>> > > > the
>> > > > > same thing with " --with-shared-libraries=1".
>> > > > >
>> > > > > The log file is attached.
>> > > > >
>> > > > >
>> > > > > Fande,
>> > > > >
>> > > >
>> > > >
>> > >
>> >
>> >
>>
>>
>
>


Re: [petsc-users] Could not execute "['git', 'rev-parse', '--git-dir']"

2018-04-05 Thread Kong, Fande
Hi Cormac,

Thanks so much! It is working now.

Fande,

On Thu, Apr 5, 2018 at 11:21 AM, Garvey, Cormac T <cormac.gar...@inl.gov>
wrote:

> I made some changes to the falcon cluster environment. Please try the
> default git (i.e Without loading a module) and
> see if petsc will install correctly.
>
> Thanks,
> Cormac.
>
> On Wed, Apr 4, 2018 at 5:09 PM, Fande Kong <fdkong...@gmail.com> wrote:
>
>> The default git gives me:
>>
>> *Could not execute "['git', 'rev-parse', '--git-dir']"*
>>
>> when I am configuring PETSc.
>>
>> The manually loaded *gits*  work just fine.
>>
>>
>> Fande,
>>
>>
>> On Wed, Apr 4, 2018 at 5:04 PM, Garvey, Cormac T <cormac.gar...@inl.gov>
>> wrote:
>>
>>> I thought it was fixed, yes I will look into it again.
>>>
>>> Do you get an error just doing a git clone on falcon1 and falcon2?
>>>
>>> On Wed, Apr 4, 2018 at 4:48 PM, Kong, Fande <fande.k...@inl.gov> wrote:
>>>
>>>> module load git/2.16.2-GCCcore-5.4.0"  also works.
>>>>
>>>> Could you somehow make the default git work as well? Hence we do not
>>>> need to have this extra "module load for git"
>>>>
>>>> Fande,
>>>>
>>>> On Wed, Apr 4, 2018 at 4:43 PM, Kong, Fande <fande.k...@inl.gov> wrote:
>>>>
>>>>> Thanks, Cormac,
>>>>>
>>>>> *module load git/1.8.5.2-GCC-4.8.3 *
>>>>>
>>>>> works for me.
>>>>>
>>>>> Did not try "module load git/2.16.2-GCCcore-5.4.0" yet.
>>>>>
>>>>> I will try, and get it back here.
>>>>>
>>>>>
>>>>>
>>>>> Fande
>>>>>
>>>>> On Wed, Apr 4, 2018 at 4:39 PM, Garvey, Cormac T <
>>>>> cormac.gar...@inl.gov> wrote:
>>>>>
>>>>>>
>>>>>> We needed to rebuilt git on the INL falcon cluster because github
>>>>>> server changed such that it no longer accepted TLSv1.
>>>>>>
>>>>>> The default git on the falcon cluster /usr/bin/git is just a wrapper
>>>>>> script, so users would not need to load any modules to
>>>>>> use git.
>>>>>>
>>>>>> When you load load git on falcon1 or falcon2 does it still fail?
>>>>>>
>>>>>> module load git/2.16.2-GCCcore-5.4.0
>>>>>>
>>>>>> Thanks,
>>>>>> Cormac.
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Wed, Apr 4, 2018 at 4:28 PM, Kong, Fande <fande.k...@inl.gov>
>>>>>> wrote:
>>>>>>
>>>>>>> Hi Cormac,
>>>>>>>
>>>>>>> Do you know anything on "git"? How did you guys build git on the
>>>>>>> falcon1?  The default git on Falcon1 does not work with petsc any more.
>>>>>>>
>>>>>>>
>>>>>>> Fande,
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Wed, Apr 4, 2018 at 4:20 PM, Satish Balay <ba...@mcs.anl.gov>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Ok - I don't think I have access to this OS.
>>>>>>>>
>>>>>>>> And I see its from 2009 [sure its enterprise os - with regular
>>>>>>>> backport updates]
>>>>>>>>
>>>>>>>> But its wierd that you have such a new version of git at
>>>>>>>> /usr/bin/git
>>>>>>>>
>>>>>>>> From what we know so far - the problem appears to be some bad
>>>>>>>> interaction of python-2.6 with this old OS [i.e old glibc etc..] -
>>>>>>>> and
>>>>>>>> this new git version [binary built locally or on a different OS and
>>>>>>>> installed locally ?].
>>>>>>>>
>>>>>>>> Satish
>>>>>>>>
>>>>>>>> On Wed, 4 Apr 2018, Kong, Fande wrote:
>>>>>>>>
>>>>>>>> >  moose]$ uname -a
>>>>>>>> > Linux falcon1 3.0.101-108.13-default #1 SMP Wed Oct 11 12:30:40
>>>>>>>> UTC 2017
>>>>&

Re: [petsc-users] Could not execute "['git', 'rev-parse', '--git-dir']"

2018-04-04 Thread Kong, Fande
module load git/2.16.2-GCCcore-5.4.0"  also works.

Could you somehow make the default git work as well? Hence we do not need
to have this extra "module load for git"

Fande,

On Wed, Apr 4, 2018 at 4:43 PM, Kong, Fande <fande.k...@inl.gov> wrote:

> Thanks, Cormac,
>
> *module load git/1.8.5.2-GCC-4.8.3 *
>
> works for me.
>
> Did not try "module load git/2.16.2-GCCcore-5.4.0" yet.
>
> I will try, and get it back here.
>
>
>
> Fande
>
> On Wed, Apr 4, 2018 at 4:39 PM, Garvey, Cormac T <cormac.gar...@inl.gov>
> wrote:
>
>>
>> We needed to rebuilt git on the INL falcon cluster because github server
>> changed such that it no longer accepted TLSv1.
>>
>> The default git on the falcon cluster /usr/bin/git is just a wrapper
>> script, so users would not need to load any modules to
>> use git.
>>
>> When you load load git on falcon1 or falcon2 does it still fail?
>>
>> module load git/2.16.2-GCCcore-5.4.0
>>
>> Thanks,
>> Cormac.
>>
>>
>>
>> On Wed, Apr 4, 2018 at 4:28 PM, Kong, Fande <fande.k...@inl.gov> wrote:
>>
>>> Hi Cormac,
>>>
>>> Do you know anything on "git"? How did you guys build git on the
>>> falcon1?  The default git on Falcon1 does not work with petsc any more.
>>>
>>>
>>> Fande,
>>>
>>>
>>>
>>> On Wed, Apr 4, 2018 at 4:20 PM, Satish Balay <ba...@mcs.anl.gov> wrote:
>>>
>>>> Ok - I don't think I have access to this OS.
>>>>
>>>> And I see its from 2009 [sure its enterprise os - with regular backport
>>>> updates]
>>>>
>>>> But its wierd that you have such a new version of git at /usr/bin/git
>>>>
>>>> From what we know so far - the problem appears to be some bad
>>>> interaction of python-2.6 with this old OS [i.e old glibc etc..] - and
>>>> this new git version [binary built locally or on a different OS and
>>>> installed locally ?].
>>>>
>>>> Satish
>>>>
>>>> On Wed, 4 Apr 2018, Kong, Fande wrote:
>>>>
>>>> >  moose]$ uname -a
>>>> > Linux falcon1 3.0.101-108.13-default #1 SMP Wed Oct 11 12:30:40 UTC
>>>> 2017
>>>> > (294ccfe) x86_64 x86_64 x86_64 GNU/Linux
>>>> >
>>>> >
>>>> > moose]$ lsb_release -a
>>>> > LSB Version:
>>>> > core-2.0-noarch:core-3.2-noarch:core-4.0-noarch:core-2.0-x86
>>>> _64:core-3.2-x86_64:core-4.0-x86_64:desktop-4.0-amd64:deskto
>>>> p-4.0-noarch:graphics-2.0-amd64:graphics-2.0-noarch:graphics
>>>> -3.2-amd64:graphics-3.2-noarch:graphics-4.0-amd64:graphics-4.0-noarch
>>>> > Distributor ID:SUSE LINUX
>>>> > Description:SUSE Linux Enterprise Server 11 (x86_64)
>>>> > Release:11
>>>> > Codename:n/a
>>>> >
>>>> >
>>>> >
>>>> > On Wed, Apr 4, 2018 at 4:08 PM, Satish Balay <ba...@mcs.anl.gov>
>>>> wrote:
>>>> >
>>>> > > On Wed, 4 Apr 2018, Satish Balay wrote:
>>>> > >
>>>> > > > On Wed, 4 Apr 2018, Satish Balay wrote:
>>>> > > >
>>>> > > > > was your '2.16.2' version installed from source?
>>>> > > >
>>>> > > > >>>
>>>> > > > Checking for program /usr/bin/git...found
>>>> > > > Defined make macro "GIT" to "git"
>>>> > > > Executing: git --version
>>>> > > > stdout: git version 2.16.2
>>>> > > > <<<<
>>>> > > >
>>>> > > > I gues its the OS default package
>>>> > > >
>>>> > > > >>>>>
>>>> > > > Machine platform:
>>>> > > > ('Linux', 'falcon1', '3.0.101-108.13-default', '#1 SMP Wed Oct 11
>>>> > > 12:30:40 UTC 2017 (294ccfe)', 'x86_64', 'x86_64')
>>>> > > > Python version:
>>>> > > > 2.6.9 (unknown, Aug  5 2016, 11:15:31)
>>>> > > > [GCC 4.3.4 [gcc-4_3-branch revision 152973]]
>>>> > > > <<<<
>>>> > > >
>>>> > > > What OS/version is on this machine? I can try reproducing in a VM
>>>> > >
>>>> > > It is strange that the kernel is old [3.0 - perhaps LTS OS] ,
>>>> python is
>>>> > > old [2.6] - but git is new? [2.16?]
>>>> > >
>>>> > > https://urldefense.proofpoint.com/v2/url?u=https-3A__
>>>> > > mirrors.edge.kernel.org_pub_software_scm_git_=DwIBAg=
>>>> > > 54IZrppPQZKX9mLzcGdPfFD1hxrcB__aEkJFOKJFd00=DUUt3SRGI0_
>>>> > > JgtNaS3udV68GRkgV4ts7XKfj2opmiCY=WF6ZxIh9Z9Hh2kYY0w70aNrgfPicp6
>>>> > > kgIh5BvezPiEY=KWMsO7XC0-pQuQ_mD03tNJWEpxSTATlZW_DmX0QofGw=
>>>> > > git-2.16.2.tar.gz  16-Feb-2018 17:48
>>>> > > 7M
>>>> > >
>>>> > > Satish
>>>> > >
>>>> >
>>>>
>>>>
>>>
>>
>>
>> --
>> Cormac Garvey
>> HPC Software Consultant
>> Scientific Computing
>> Idaho National Laboratory
>> Ph: 208-526-6294
>>
>>
>


Re: [petsc-users] Could not execute "['git', 'rev-parse', '--git-dir']"

2018-04-04 Thread Kong, Fande
Thanks, Cormac,

*module load git/1.8.5.2-GCC-4.8.3 *

works for me.

Did not try "module load git/2.16.2-GCCcore-5.4.0" yet.

I will try it and report back here.



Fande

On Wed, Apr 4, 2018 at 4:39 PM, Garvey, Cormac T <cormac.gar...@inl.gov>
wrote:

>
> We needed to rebuild git on the INL falcon cluster because the github server
> changed such that it no longer accepted TLSv1.
>
> The default git on the falcon cluster /usr/bin/git is just a wrapper
> script, so users would not need to load any modules to
> use git.
>
> When you load git on falcon1 or falcon2 does it still fail?
>
> module load git/2.16.2-GCCcore-5.4.0
>
> Thanks,
> Cormac.
>
>
>
> On Wed, Apr 4, 2018 at 4:28 PM, Kong, Fande <fande.k...@inl.gov> wrote:
>
>> Hi Cormac,
>>
>> Do you know anything on "git"? How did you guys build git on the
>> falcon1?  The default git on Falcon1 does not work with petsc any more.
>>
>>
>> Fande,
>>
>>
>>
>> On Wed, Apr 4, 2018 at 4:20 PM, Satish Balay <ba...@mcs.anl.gov> wrote:
>>
>>> Ok - I don't think I have access to this OS.
>>>
>>> And I see its from 2009 [sure its enterprise os - with regular backport
>>> updates]
>>>
>>> But its wierd that you have such a new version of git at /usr/bin/git
>>>
>>> From what we know so far - the problem appears to be some bad
>>> interaction of python-2.6 with this old OS [i.e old glibc etc..] - and
>>> this new git version [binary built locally or on a different OS and
>>> installed locally ?].
>>>
>>> Satish
>>>
>>> On Wed, 4 Apr 2018, Kong, Fande wrote:
>>>
>>> >  moose]$ uname -a
>>> > Linux falcon1 3.0.101-108.13-default #1 SMP Wed Oct 11 12:30:40 UTC
>>> 2017
>>> > (294ccfe) x86_64 x86_64 x86_64 GNU/Linux
>>> >
>>> >
>>> > moose]$ lsb_release -a
>>> > LSB Version:
>>> > core-2.0-noarch:core-3.2-noarch:core-4.0-noarch:core-2.0-x86
>>> _64:core-3.2-x86_64:core-4.0-x86_64:desktop-4.0-amd64:deskt
>>> op-4.0-noarch:graphics-2.0-amd64:graphics-2.0-noarch:graphic
>>> s-3.2-amd64:graphics-3.2-noarch:graphics-4.0-amd64:graphics-4.0-noarch
>>> > Distributor ID:SUSE LINUX
>>> > Description:SUSE Linux Enterprise Server 11 (x86_64)
>>> > Release:11
>>> > Codename:n/a
>>> >
>>> >
>>> >
>>> > On Wed, Apr 4, 2018 at 4:08 PM, Satish Balay <ba...@mcs.anl.gov>
>>> wrote:
>>> >
>>> > > On Wed, 4 Apr 2018, Satish Balay wrote:
>>> > >
>>> > > > On Wed, 4 Apr 2018, Satish Balay wrote:
>>> > > >
>>> > > > > was your '2.16.2' version installed from source?
>>> > > >
>>> > > > >>>
>>> > > > Checking for program /usr/bin/git...found
>>> > > > Defined make macro "GIT" to "git"
>>> > > > Executing: git --version
>>> > > > stdout: git version 2.16.2
>>> > > > <<<<
>>> > > >
>>> > > > I gues its the OS default package
>>> > > >
>>> > > > >>>>>
>>> > > > Machine platform:
>>> > > > ('Linux', 'falcon1', '3.0.101-108.13-default', '#1 SMP Wed Oct 11
>>> > > 12:30:40 UTC 2017 (294ccfe)', 'x86_64', 'x86_64')
>>> > > > Python version:
>>> > > > 2.6.9 (unknown, Aug  5 2016, 11:15:31)
>>> > > > [GCC 4.3.4 [gcc-4_3-branch revision 152973]]
>>> > > > <<<<
>>> > > >
>>> > > > What OS/version is on this machine? I can try reproducing in a VM
>>> > >
>>> > > It is strange that the kernel is old [3.0 - perhaps LTS OS] , python
>>> is
>>> > > old [2.6] - but git is new? [2.16?]
>>> > >
>>> > > https://urldefense.proofpoint.com/v2/url?u=https-3A__
>>> > > mirrors.edge.kernel.org_pub_software_scm_git_=DwIBAg=
>>> > > 54IZrppPQZKX9mLzcGdPfFD1hxrcB__aEkJFOKJFd00=DUUt3SRGI0_
>>> > > JgtNaS3udV68GRkgV4ts7XKfj2opmiCY=WF6ZxIh9Z9Hh2kYY0w70aNrgfPicp6
>>> > > kgIh5BvezPiEY=KWMsO7XC0-pQuQ_mD03tNJWEpxSTATlZW_DmX0QofGw=
>>> > > git-2.16.2.tar.gz  16-Feb-2018 17:48
>>> > > 7M
>>> > >
>>> > > Satish
>>> > >
>>> >
>>>
>>>
>>
>
>
> --
> Cormac Garvey
> HPC Software Consultant
> Scientific Computing
> Idaho National Laboratory
> Ph: 208-526-6294
>
>


Re: [petsc-users] Could not execute "['git', 'rev-parse', '--git-dir']"

2018-04-04 Thread Kong, Fande
Hi Cormac,

Do you know anything on "git"? How did you guys build git on the falcon1?
The default git on Falcon1 does not work with petsc any more.


Fande,



On Wed, Apr 4, 2018 at 4:20 PM, Satish Balay <ba...@mcs.anl.gov> wrote:

> Ok - I don't think I have access to this OS.
>
> And I see its from 2009 [sure its enterprise os - with regular backport
> updates]
>
> But its wierd that you have such a new version of git at /usr/bin/git
>
> From what we know so far - the problem appears to be some bad
> interaction of python-2.6 with this old OS [i.e old glibc etc..] - and
> this new git version [binary built locally or on a different OS and
> installed locally ?].
>
> Satish
>
> On Wed, 4 Apr 2018, Kong, Fande wrote:
>
> >  moose]$ uname -a
> > Linux falcon1 3.0.101-108.13-default #1 SMP Wed Oct 11 12:30:40 UTC 2017
> > (294ccfe) x86_64 x86_64 x86_64 GNU/Linux
> >
> >
> > moose]$ lsb_release -a
> > LSB Version:
> > core-2.0-noarch:core-3.2-noarch:core-4.0-noarch:core-2.
> 0-x86_64:core-3.2-x86_64:core-4.0-x86_64:desktop-4.0-amd64:
> desktop-4.0-noarch:graphics-2.0-amd64:graphics-2.0-noarch:
> graphics-3.2-amd64:graphics-3.2-noarch:graphics-4.0-amd64:
> graphics-4.0-noarch
> > Distributor ID:SUSE LINUX
> > Description:SUSE Linux Enterprise Server 11 (x86_64)
> > Release:11
> > Codename:n/a
> >
> >
> >
> > On Wed, Apr 4, 2018 at 4:08 PM, Satish Balay <ba...@mcs.anl.gov> wrote:
> >
> > > On Wed, 4 Apr 2018, Satish Balay wrote:
> > >
> > > > On Wed, 4 Apr 2018, Satish Balay wrote:
> > > >
> > > > > was your '2.16.2' version installed from source?
> > > >
> > > > >>>
> > > > Checking for program /usr/bin/git...found
> > > > Defined make macro "GIT" to "git"
> > > > Executing: git --version
> > > > stdout: git version 2.16.2
> > > > <<<<
> > > >
> > > > I gues its the OS default package
> > > >
> > > > >>>>>
> > > > Machine platform:
> > > > ('Linux', 'falcon1', '3.0.101-108.13-default', '#1 SMP Wed Oct 11
> > > 12:30:40 UTC 2017 (294ccfe)', 'x86_64', 'x86_64')
> > > > Python version:
> > > > 2.6.9 (unknown, Aug  5 2016, 11:15:31)
> > > > [GCC 4.3.4 [gcc-4_3-branch revision 152973]]
> > > > <<<<
> > > >
> > > > What OS/version is on this machine? I can try reproducing in a VM
> > >
> > > It is strange that the kernel is old [3.0 - perhaps LTS OS] , python is
> > > old [2.6] - but git is new? [2.16?]
> > >
> > > https://urldefense.proofpoint.com/v2/url?u=https-3A__
> > > mirrors.edge.kernel.org_pub_software_scm_git_=DwIBAg=
> > > 54IZrppPQZKX9mLzcGdPfFD1hxrcB__aEkJFOKJFd00=DUUt3SRGI0_
> > > JgtNaS3udV68GRkgV4ts7XKfj2opmiCY=WF6ZxIh9Z9Hh2kYY0w70aNrgfPicp6
> > > kgIh5BvezPiEY=KWMsO7XC0-pQuQ_mD03tNJWEpxSTATlZW_DmX0QofGw=
> > > git-2.16.2.tar.gz  16-Feb-2018 17:48
> > > 7M
> > >
> > > Satish
> > >
> >
>
>


Re: [petsc-users] Could not execute "['git', 'rev-parse', '--git-dir']"

2018-04-04 Thread Kong, Fande
 moose]$ uname -a
Linux falcon1 3.0.101-108.13-default #1 SMP Wed Oct 11 12:30:40 UTC 2017
(294ccfe) x86_64 x86_64 x86_64 GNU/Linux


moose]$ lsb_release -a
LSB Version:
core-2.0-noarch:core-3.2-noarch:core-4.0-noarch:core-2.0-x86_64:core-3.2-x86_64:core-4.0-x86_64:desktop-4.0-amd64:desktop-4.0-noarch:graphics-2.0-amd64:graphics-2.0-noarch:graphics-3.2-amd64:graphics-3.2-noarch:graphics-4.0-amd64:graphics-4.0-noarch
Distributor ID:SUSE LINUX
Description:SUSE Linux Enterprise Server 11 (x86_64)
Release:11
Codename:n/a



On Wed, Apr 4, 2018 at 4:08 PM, Satish Balay  wrote:

> On Wed, 4 Apr 2018, Satish Balay wrote:
>
> > On Wed, 4 Apr 2018, Satish Balay wrote:
> >
> > > was your '2.16.2' version installed from source?
> >
> > >>>
> > Checking for program /usr/bin/git...found
> > Defined make macro "GIT" to "git"
> > Executing: git --version
> > stdout: git version 2.16.2
> > 
> >
> > I gues its the OS default package
> >
> > >
> > Machine platform:
> > ('Linux', 'falcon1', '3.0.101-108.13-default', '#1 SMP Wed Oct 11
> 12:30:40 UTC 2017 (294ccfe)', 'x86_64', 'x86_64')
> > Python version:
> > 2.6.9 (unknown, Aug  5 2016, 11:15:31)
> > [GCC 4.3.4 [gcc-4_3-branch revision 152973]]
> > 
> >
> > What OS/version is on this machine? I can try reproducing in a VM
>
> It is strange that the kernel is old [3.0 - perhaps LTS OS] , python is
> old [2.6] - but git is new? [2.16?]
>
> https://urldefense.proofpoint.com/v2/url?u=https-3A__
> mirrors.edge.kernel.org_pub_software_scm_git_=DwIBAg=
> 54IZrppPQZKX9mLzcGdPfFD1hxrcB__aEkJFOKJFd00=DUUt3SRGI0_
> JgtNaS3udV68GRkgV4ts7XKfj2opmiCY=WF6ZxIh9Z9Hh2kYY0w70aNrgfPicp6
> kgIh5BvezPiEY=KWMsO7XC0-pQuQ_mD03tNJWEpxSTATlZW_DmX0QofGw=
> git-2.16.2.tar.gz  16-Feb-2018 17:48
> 7M
>
> Satish
>


Re: [petsc-users] Could not execute "['git', 'rev-parse', '--git-dir']"

2018-04-04 Thread Kong, Fande
On Wed, Apr 4, 2018 at 3:55 PM, Jed Brown <j...@jedbrown.org> wrote:

> "Kong, Fande" <fande.k...@inl.gov> writes:
>
> > Updated:
> >
> > If switch to a different version of git. The configuration is going to
> > work.
> >
> >
> > This one works:
> >
> >
> > * petsc]$ git --version git version 1.8.5.2*
> >
> >
> > This new version does not work:
> >
> >  petsc]$ git --version
> > git version 2.16.2
>
> Can you reproduce the error in your terminal or does it only happen
> through configure?
>

It only happens using configure.

Fande,


Re: [petsc-users] Could not execute "['git', 'rev-parse', '--git-dir']"

2018-04-04 Thread Kong, Fande
Updated:

If I switch to a different version of git, the configuration works.


This one works:


petsc]$ git --version
git version 1.8.5.2


This new version does not work:

 petsc]$ git --version
git version 2.16.2



Fande.


On Wed, Mar 7, 2018 at 2:56 PM, Satish Balay <ba...@mcs.anl.gov> wrote:

> On Wed, 7 Mar 2018, Kong, Fande wrote:
>
> > > I meant just the 3 lines - not the whole function.
> >
> > I knew this. "3 lines" does not work at all.
> >
> > I forgot the error message.
>
> Then you are likely to use the wrong [git] snapshot - and not the
> snapshot listed by self.gitcommit - for that package.
>
> Satish
>


Re: [petsc-users] A bad commit affects MOOSE

2018-04-03 Thread Kong, Fande
It looks good to me.

Fande,
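
(A sketch for illustration only: PetscCommGetPkgComm() is just a proposal in the quoted message below and does not exist in PETSc. The names PkgCommGet and pkg_keyval are made up here. It shows one possible mechanism, standard MPI attribute caching: the first request on a communicator does a single MPI_Comm_dup(), and every later request for the same package reuses it, so thousands of hypre solvers would consume only one extra context id.)

#include <mpi.h>
#include <stdlib.h>
#include <stdio.h>

static int pkg_keyval = MPI_KEYVAL_INVALID;  /* one keyval per package, e.g. "hypre" */

static int PkgCommGet(MPI_Comm comm, MPI_Comm *pkgcomm)
{
  void *attr;
  int   flag, err;

  if (pkg_keyval == MPI_KEYVAL_INVALID) {
    err = MPI_Comm_create_keyval(MPI_COMM_NULL_COPY_FN, MPI_COMM_NULL_DELETE_FN, &pkg_keyval, NULL);
    if (err) return err;
  }
  err = MPI_Comm_get_attr(comm, pkg_keyval, &attr, &flag);
  if (err) return err;
  if (flag) { *pkgcomm = *(MPI_Comm*)attr; return MPI_SUCCESS; }  /* reuse cached dup */

  /* First request on this communicator: duplicate once and cache the result.
     A real implementation would free the cached comm in the keyval delete callback. */
  MPI_Comm *dup = (MPI_Comm*)malloc(sizeof(MPI_Comm));
  err = MPI_Comm_dup(comm, dup);
  if (err) return err;
  err = MPI_Comm_set_attr(comm, pkg_keyval, dup);
  if (err) return err;
  *pkgcomm = *dup;
  return MPI_SUCCESS;
}

int main(int argc, char **argv)
{
  MPI_Comm c1, c2;
  MPI_Init(&argc, &argv);
  PkgCommGet(MPI_COMM_WORLD, &c1);
  PkgCommGet(MPI_COMM_WORLD, &c2);   /* no second MPI_Comm_dup() happens here */
  if (c1 == c2) printf("package communicator reused\n");
  MPI_Finalize();
  return 0;
}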

On Tue, Apr 3, 2018 at 3:04 PM, Stefano Zampini <stefano.zamp...@gmail.com>
wrote:

> What about
>
> PetscCommGetPkgComm(MPI_Comm comm ,const char* package, MPI_Comm* pkgcomm)
>
> with a key for each of the external packages PETSc can use?
>
>
> On Apr 3, 2018, at 10:56 PM, Kong, Fande <fande.k...@inl.gov> wrote:
>
> I think we could add an inner comm for external package. If the same comm
> is passed in again, we just retrieve the same communicator, instead of
> MPI_Comm_dup(), for that external package (at least HYPRE team claimed
> this will be fine).   I did not see any issue with this idea so far.
>
> I might be missing something here
>
>
> Fande,
>
> On Tue, Apr 3, 2018 at 1:45 PM, Satish Balay <ba...@mcs.anl.gov> wrote:
>
>> On Tue, 3 Apr 2018, Smith, Barry F. wrote:
>>
>> >
>> >
>> > > On Apr 3, 2018, at 11:59 AM, Balay, Satish <ba...@mcs.anl.gov> wrote:
>> > >
>> > > On Tue, 3 Apr 2018, Smith, Barry F. wrote:
>> > >
>> > >>   Note that PETSc does one MPI_Comm_dup() for each hypre matrix.
>> Internally hypre does at least one MPI_Comm_create() per hypre boomerAMG
>> solver. So even if PETSc does not do the MPI_Comm_dup() you will still be
>> limited due to hypre's MPI_Comm_create.
>> > >>
>> > >>I will compose an email to hypre cc:ing everyone to get
>> information from them.
>> > >
>> > > Actually I don't see any calls to MPI_Comm_dup() in hypre sources
>> [there are stubs for it for non-mpi build]
>> > >
>> > > There was that call to MPI_Comm_create() in the stack trace [via
>> hypre_BoomerAMGSetup]
>> >
>> >This is what I said. The MPI_Comm_create() is called for each solver
>> and hence uses a slot for each solver.
>>
>> Ops sorry - misread the text..
>>
>> Satish
>>
>
>
>


Re: [petsc-users] A bad commit affects MOOSE

2018-04-03 Thread Kong, Fande
I think we could add an inner comm for each external package. If the same comm
is passed in again, we just retrieve the same communicator for that external
package instead of calling MPI_Comm_dup() (at least the HYPRE team claimed this
would be fine). I have not seen any issue with this idea so far.

I might be missing something here.


Fande,

On Tue, Apr 3, 2018 at 1:45 PM, Satish Balay  wrote:

> On Tue, 3 Apr 2018, Smith, Barry F. wrote:
>
> >
> >
> > > On Apr 3, 2018, at 11:59 AM, Balay, Satish  wrote:
> > >
> > > On Tue, 3 Apr 2018, Smith, Barry F. wrote:
> > >
> > >>   Note that PETSc does one MPI_Comm_dup() for each hypre matrix.
> Internally hypre does at least one MPI_Comm_create() per hypre boomerAMG
> solver. So even if PETSc does not do the MPI_Comm_dup() you will still be
> limited due to hypre's MPI_Comm_create.
> > >>
> > >>I will compose an email to hypre cc:ing everyone to get
> information from them.
> > >
> > > Actually I don't see any calls to MPI_Comm_dup() in hypre sources
> [there are stubs for it for non-mpi build]
> > >
> > > There was that call to MPI_Comm_create() in the stack trace [via
> hypre_BoomerAMGSetup]
> >
> >This is what I said. The MPI_Comm_create() is called for each solver
> and hence uses a slot for each solver.
>
> Ops sorry - misread the text..
>
> Satish
>


Re: [petsc-users] A bad commit affects MOOSE

2018-04-03 Thread Kong, Fande
On Tue, Apr 3, 2018 at 11:29 AM, Smith, Barry F. <bsm...@mcs.anl.gov> wrote:

>
>   Fande,
>
>  The reason for MPI_Comm_dup() and the inner communicator is that this
> communicator is used by hypre and so cannot "just" be a PETSc communicator.
> We cannot have PETSc and hypre using the same communicator since they may
> capture each others messages etc.
>
>   See my pull request that I think should resolve the issue in the
> short term,
>

Yes, that helps as well.

The question becomes: we cannot have more than about 2000 AMG solvers in one
application because each hypre instance owns its own communicator. Is there no
way to have all AMG solvers share the same HYPRE-side communicator, just like
what we are doing for PETSc objects?


Fande,



>
> Barry
>
>
> > On Apr 3, 2018, at 11:21 AM, Kong, Fande <fande.k...@inl.gov> wrote:
> >
> > Figured out:
> >
> > The reason is that  in  MatCreate_HYPRE(Mat B), we call MPI_Comm_dup
> instead of PetscCommDuplicate. The PetscCommDuplicate is better, and it
> does not actually create a communicator if the communicator is already
> known to PETSc.
> >
> > Furthermore, I do not think we should a comm in
> >
> > typedef struct {
> >   HYPRE_IJMatrix ij;
> >   HYPRE_IJVector x;
> >   HYPRE_IJVector b;
> >   MPI_Comm   comm;
> > } Mat_HYPRE;
> >
> > It is an inner data of Mat, and it should already the same comm as the
> Mat. I do not understand why the internal data has its own comm.
> >
> > The following patch fixed the issue (just deleted this extra comm).
> >
> > diff --git a/src/mat/impls/hypre/mhypre.c b/src/mat/impls/hypre/mhypre.c
> > index dc19892..d8cfe3d 100644
> > --- a/src/mat/impls/hypre/mhypre.c
> > +++ b/src/mat/impls/hypre/mhypre.c
> > @@ -74,7 +74,7 @@ static PetscErrorCode MatHYPRE_CreateFromMat(Mat A,
> Mat_HYPRE *hA)
> >rend   = A->rmap->rend;
> >cstart = A->cmap->rstart;
> >cend   = A->cmap->rend;
> > -  PetscStackCallStandard(HYPRE_IJMatrixCreate,(hA->comm,
> > rstart,rend-1,cstart,cend-1,&hA->ij));
> > +  PetscStackCallStandard(HYPRE_IJMatrixCreate,(
> > PetscObjectComm((PetscObject)A),rstart,rend-1,cstart,cend-1,&hA->ij));
> >PetscStackCallStandard(HYPRE_IJMatrixSetObjectType,(hA->ij,
> HYPRE_PARCSR));
> >{
> >  PetscBool  same;
> > @@ -434,7 +434,7 @@ PetscErrorCode MatDestroy_HYPRE(Mat A)
> >if (hA->x) PetscStackCallStandard(HYPRE_IJVectorDestroy,(hA->x));
> >if (hA->b) PetscStackCallStandard(HYPRE_IJVectorDestroy,(hA->b));
> >if (hA->ij) PetscStackCallStandard(HYPRE_IJMatrixDestroy,(hA->ij));
> > -  if (hA->comm) { ierr = MPI_Comm_free(&hA->comm);CHKERRQ(ierr);}
> > +  /*if (hA->comm) { ierr = MPI_Comm_free(&hA->comm);CHKERRQ(ierr);}*/
> >ierr = PetscObjectComposeFunction((PetscObject)A,"MatConvert_
> hypre_aij_C",NULL);CHKERRQ(ierr);
> >ierr = PetscFree(A->data);CHKERRQ(ierr);
> >PetscFunctionReturn(0);
> > @@ -500,7 +500,8 @@ PETSC_EXTERN PetscErrorCode MatCreate_HYPRE(Mat B)
> >B->ops->destroy   = MatDestroy_HYPRE;
> >B->ops->assemblyend   = MatAssemblyEnd_HYPRE;
> >
> > -  ierr = MPI_Comm_dup(PetscObjectComm((PetscObject)B),&hA->comm);
> > CHKERRQ(ierr);
> > +  /*ierr =
> > MPI_Comm_dup(PetscObjectComm((PetscObject)B),&hA->comm);CHKERRQ(ierr);
> > */
> > +  /*ierr = PetscCommDuplicate(PetscObjectComm((PetscObject)
> > B),&hA->comm,NULL);CHKERRQ(ierr);*/
> >ierr = PetscObjectChangeTypeName((PetscObject)B,MATHYPRE);
> CHKERRQ(ierr);
> >ierr = PetscObjectComposeFunction((PetscObject)B,"MatConvert_
> hypre_aij_C",MatConvert_HYPRE_AIJ);CHKERRQ(ierr);
> >PetscFunctionReturn(0);
> > diff --git a/src/mat/impls/hypre/mhypre.h b/src/mat/impls/hypre/mhypre.h
> > index 3d9ddd2..1189020 100644
> > --- a/src/mat/impls/hypre/mhypre.h
> > +++ b/src/mat/impls/hypre/mhypre.h
> > @@ -10,7 +10,7 @@ typedef struct {
> >HYPRE_IJMatrix ij;
> >HYPRE_IJVector x;
> >HYPRE_IJVector b;
> > -  MPI_Comm   comm;
> > +  /*MPI_Comm   comm;*/
> >  } Mat_HYPRE;
> >
> >
> >
> > Fande,
> >
> >
> >
> >
> > On Tue, Apr 3, 2018 at 10:35 AM, Satish Balay <ba...@mcs.anl.gov> wrote:
> > On Tue, 3 Apr 2018, Satish Balay wrote:
> >
> > > On Tue, 3 Apr 2018, Derek Gaston wrote:
> > >
> > > > One thing I want to be clear of here: is that we're not trying to
> solve
> > > > this particular problem (where we're creating 1000 instances o

Re: [petsc-users] A bad commit affects MOOSE

2018-04-03 Thread Kong, Fande
Figured out:

The reason is that in MatCreate_HYPRE(Mat B) we call MPI_Comm_dup instead of
PetscCommDuplicate. PetscCommDuplicate is better, and it does not actually
create a communicator if the communicator is already known to PETSc.
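
(A minimal sketch, for illustration only and not code from this patch, of the PetscCommDuplicate() behaviour just described: repeated calls on the same user communicator hand back the same cached inner communicator instead of consuming a new MPI context id each time.)

#include <petscsys.h>

int main(int argc, char **argv)
{
  MPI_Comm       inner1, inner2;
  PetscMPIInt    tag1, tag2;
  PetscErrorCode ierr;

  ierr = PetscInitialize(&argc, &argv, NULL, NULL); if (ierr) return ierr;

  /* The first PetscCommDuplicate() on a communicator does the one real
     MPI_Comm_dup() and caches the result as an attribute on the input comm. */
  ierr = PetscCommDuplicate(PETSC_COMM_WORLD, &inner1, &tag1);CHKERRQ(ierr);
  /* The second call finds the cached attribute and returns the same inner
     communicator, only handing out a new tag and bumping a reference count. */
  ierr = PetscCommDuplicate(PETSC_COMM_WORLD, &inner2, &tag2);CHKERRQ(ierr);

  if (inner1 == inner2) {
    ierr = PetscPrintf(PETSC_COMM_WORLD, "inner communicator reused\n");CHKERRQ(ierr);
  }

  ierr = PetscCommDestroy(&inner1);CHKERRQ(ierr);
  ierr = PetscCommDestroy(&inner2);CHKERRQ(ierr);
  ierr = PetscFinalize();
  return ierr;
}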

Furthermore, I do not think we should have a comm in






typedef struct {
  HYPRE_IJMatrix ij;
  HYPRE_IJVector x;
  HYPRE_IJVector b;
  MPI_Comm       comm;
} Mat_HYPRE;

It is internal data of the Mat, and it should already have the same comm as the
Mat. I do not understand why the internal data has its own comm.

The following patch fixed the issue (just deleted this extra comm).

diff --git a/src/mat/impls/hypre/mhypre.c b/src/mat/impls/hypre/mhypre.c
index dc19892..d8cfe3d 100644
--- a/src/mat/impls/hypre/mhypre.c
+++ b/src/mat/impls/hypre/mhypre.c
@@ -74,7 +74,7 @@ static PetscErrorCode MatHYPRE_CreateFromMat(Mat A, Mat_HYPRE *hA)
   rend   = A->rmap->rend;
   cstart = A->cmap->rstart;
   cend   = A->cmap->rend;
-  PetscStackCallStandard(HYPRE_IJMatrixCreate,(hA->comm,rstart,rend-1,cstart,cend-1,&hA->ij));
+  PetscStackCallStandard(HYPRE_IJMatrixCreate,(PetscObjectComm((PetscObject)A),rstart,rend-1,cstart,cend-1,&hA->ij));
   PetscStackCallStandard(HYPRE_IJMatrixSetObjectType,(hA->ij,HYPRE_PARCSR));
   {
     PetscBool  same;
@@ -434,7 +434,7 @@ PetscErrorCode MatDestroy_HYPRE(Mat A)
   if (hA->x) PetscStackCallStandard(HYPRE_IJVectorDestroy,(hA->x));
   if (hA->b) PetscStackCallStandard(HYPRE_IJVectorDestroy,(hA->b));
   if (hA->ij) PetscStackCallStandard(HYPRE_IJMatrixDestroy,(hA->ij));
-  if (hA->comm) { ierr = MPI_Comm_free(&hA->comm);CHKERRQ(ierr);}
+  /*if (hA->comm) { ierr = MPI_Comm_free(&hA->comm);CHKERRQ(ierr);}*/
   ierr = PetscObjectComposeFunction((PetscObject)A,"MatConvert_hypre_aij_C",NULL);CHKERRQ(ierr);
   ierr = PetscFree(A->data);CHKERRQ(ierr);
   PetscFunctionReturn(0);
@@ -500,7 +500,8 @@ PETSC_EXTERN PetscErrorCode MatCreate_HYPRE(Mat B)
   B->ops->destroy       = MatDestroy_HYPRE;
   B->ops->assemblyend   = MatAssemblyEnd_HYPRE;

-  ierr = MPI_Comm_dup(PetscObjectComm((PetscObject)B),&hA->comm);CHKERRQ(ierr);
+  /*ierr = MPI_Comm_dup(PetscObjectComm((PetscObject)B),&hA->comm);CHKERRQ(ierr); */
+  /*ierr = PetscCommDuplicate(PetscObjectComm((PetscObject)B),&hA->comm,NULL);CHKERRQ(ierr);*/
   ierr = PetscObjectChangeTypeName((PetscObject)B,MATHYPRE);CHKERRQ(ierr);
   ierr = PetscObjectComposeFunction((PetscObject)B,"MatConvert_hypre_aij_C",MatConvert_HYPRE_AIJ);CHKERRQ(ierr);
   PetscFunctionReturn(0);
diff --git a/src/mat/impls/hypre/mhypre.h b/src/mat/impls/hypre/mhypre.h
index 3d9ddd2..1189020 100644
--- a/src/mat/impls/hypre/mhypre.h
+++ b/src/mat/impls/hypre/mhypre.h
@@ -10,7 +10,7 @@ typedef struct {
   HYPRE_IJMatrix ij;
   HYPRE_IJVector x;
   HYPRE_IJVector b;
-  MPI_Comm       comm;
+  /*MPI_Comm       comm;*/
 } Mat_HYPRE;



Fande,




On Tue, Apr 3, 2018 at 10:35 AM, Satish Balay  wrote:

> On Tue, 3 Apr 2018, Satish Balay wrote:
>
> > On Tue, 3 Apr 2018, Derek Gaston wrote:
> >
> > > One thing I want to be clear of here: is that we're not trying to solve
> > > this particular problem (where we're creating 1000 instances of Hypre
> to
> > > precondition each variable independently)... this particular problem is
> > > just a test (that we've had in our test suite for a long time) to
> stress
> > > test some of this capability.
> > >
> > > We really do have needs for thousands (tens of thousands) of
> simultaneous
> > > solves (each with their own Hypre instances).  That's not what this
> > > particular problem is doing - but it is representative of a class of
> our
> > > problems we need to solve.
> > >
> > > Which does bring up a point: I have been able to do solves before with
> > > ~50,000 separate PETSc solves without issue.  Is it because I was
> working
> > > with MVAPICH on a cluster?  Does it just have a higher limit?
> >
> > Don't know - but thats easy to find out with a simple test code..
> >
> > >>
> > $ cat comm_dup_test.c
> > #include <mpi.h>
> > #include <stdio.h>
> >
> > int main(int argc, char** argv) {
> >   MPI_Comm newcomm;
> >   int i, err;
> >   MPI_Init(NULL, NULL);
> >   for (i=0; i<100000; i++) {
> >     err = MPI_Comm_dup(MPI_COMM_WORLD, &newcomm);
> >     if (err) {
> >       printf("%5d - fail\n",i);fflush(stdout);
> >       break;
> >     } else {
> >       printf("%5d - success\n",i);fflush(stdout);
> >     }
> >   }
> >   MPI_Finalize();
> > }
> > <<<
> >
> > OpenMPI fails after '65531' and mpich after '2044'. MVAPICH is derived
> > off MPICH - but its possible they have a different limit than MPICH.
>
> BTW: the above is  with: openmpi-2.1.2 and mpich-3.3b1
>
> mvapich2-1.9.5 - and I get error after '2044' comm dupes
>
> Satish
>


Re: [petsc-users] A bad commit affects MOOSE

2018-04-03 Thread Kong, Fande
The first bad commit:








commit 49a781f5cee36db85e8d5b951eec29f10ac13593
Author: Stefano Zampini <stefano.zamp...@gmail.com>
Date:   Sat Nov 5 20:15:19 2016 +0300

    PCHYPRE: use internal Mat of type MatHYPRE

    hpmat already stores two HYPRE vectors

Hypre version:

~/projects/petsc/arch-darwin-c-opt-bisect_bad/externalpackages/git.hypre]>
git branch
* (HEAD detached at 83b1f19)



The last good commit:








commit 63c07aad33d943fe85193412d077a1746a7c55aa
Author: Stefano Zampini <stefano.zamp...@gmail.com>
Date:   Sat Nov 5 19:30:12 2016 +0300

    MatHYPRE: create new matrix type

    The conversion from AIJ to HYPRE has been taken from src/dm/impls/da/hypre/mhyp.c
    HYPRE to AIJ is new

Hypre version:

/projects/petsc/arch-darwin-c-opt-bisect/externalpackages/git.hypre]> git
branch
* (HEAD detached at 83b1f19)





We are using the same HYPRE version.


I will narrow it down line by line.


Fande,


On Tue, Apr 3, 2018 at 9:50 AM, Stefano Zampini <stefano.zamp...@gmail.com>
wrote:

>
> On Apr 3, 2018, at 5:43 PM, Fande Kong <fdkong...@gmail.com> wrote:
>
>
>
> On Tue, Apr 3, 2018 at 9:12 AM, Stefano Zampini <stefano.zamp...@gmail.com
> > wrote:
>
>>
>> On Apr 3, 2018, at 4:58 PM, Satish Balay <ba...@mcs.anl.gov> wrote:
>>
>> On Tue, 3 Apr 2018, Kong, Fande wrote:
>>
>> On Tue, Apr 3, 2018 at 1:17 AM, Smith, Barry F. <bsm...@mcs.anl.gov>
>> wrote:
>>
>>
>>   Each external package definitely needs its own duplicated communicator;
>> cannot share between packages.
>>
>>   The only problem with the dups below is if they are in a loop and get
>> called many times.
>>
>>
>>
>> The "standard test" that has this issue actually has 1K fields. MOOSE
>> creates its own field-split preconditioner (not based on the PETSc
>> fieldsplit), and each field is associated with one PC HYPRE.  If PETSc
>> duplicates communicators, we should easily reach the limit 2048.
>>
>> I also want to confirm what extra communicators are introduced in the bad
>> commit.
>>
>>
>> To me it looks like there is 1 extra comm created [for MATHYPRE] for each
>> PCHYPRE that is created [which also creates one comm for this object].
>>
>>
>> You’re right; however, it was the same before the commit.
>> I don’t understand how this specific commit is related with this issue,
>> being the error not in the MPI_Comm_Dup which is inside MatCreate_MATHYPRE.
>> Actually, the error comes from MPI_Comm_create
>>
>>
>>
>>
>>
>> *frame #5: 0x0001068defd4 libmpi.12.dylib`MPI_Comm_create +
>> 3492frame #6: 0x0001061345d9
>> libpetsc.3.07.dylib`hypre_GenerateSubComm(comm=-1006627852,
>> participate=, new_comm_ptr=) + 409 at
>> gen_redcs_mat.c:531 [opt]frame #7: 0x00010618f8ba
>> libpetsc.3.07.dylib`hypre_GaussElimSetup(amg_data=0x7fe7ff857a00,
>> level=, relax_type=9) + 74 at par_relax.c:4209 [opt]frame
>> #8: 0x000106140e93
>> libpetsc.3.07.dylib`hypre_BoomerAMGSetup(amg_vdata=,
>> A=0x7fe80842aff0, f=0x7fe80842a980, u=0x7fe80842a510) + 17699
>> at par_amg_setup.c:2108 [opt]frame #9: 0x000105ec773c
>> libpetsc.3.07.dylib`PCSetUp_HYPRE(pc=) + 2540 at hypre.c:226
>> [opt*
>>
>> How did you perform the bisection? make clean + make all ? Which version
>> of HYPRE are you using?
>>
>
> I did it more aggressively.
>
> "rm -rf  arch-darwin-c-opt-bisect   "
>
> "./configure  --optionsModule=config.compilerOptions -with-debugging=no
> --with-shared-libraries=1 --with-mpi=1 --download-fblaslapack=1
> --download-metis=1 --download-parmetis=1 --download-superlu_dist=1
> --download-hypre=1 --download-mumps=1 --download-scalapack=1
> PETSC_ARCH=arch-darwin-c-opt-bisect"
>
>
> Good, so this removes some possible sources of errors
>
>
> HYPRE verison:
>
>
> self.gitcommit = 'v2.11.1-55-g2ea0e43'
> self.download  = ['git://https://github.com/LLNL/hypre
> <https://urldefense.proofpoint.com/v2/url?u=https-3A__github.com_LLNL_hypre=DwMFaQ=54IZrppPQZKX9mLzcGdPfFD1hxrcB__aEkJFOKJFd00=DUUt3SRGI0_JgtNaS3udV68GRkgV4ts7XKfj2opmiCY=LTXwlyqefohCW3djvHLnK_QFKia-PIJn5cgBbNxC91A=K0qCoSO2uYo06lAKeKuukkC7k9R16DVQyZJTF-m23l8=>
> ','https://github.com/LLNL/hypre/archive/'+self.gitcommit+'.tar.gz
> <https://urldefense.proofpoint.com/v2/url?u=https-3A__github.com_LLNL_hypre_archive_-27-2Bself.gitcommit-2B-27.tar.gz=DwMFaQ=54IZrppPQZKX9mLzcGdPfFD1hxrcB__aEkJFOKJFd00=DUUt3SRGI0_JgtNaS3udV68GRkgV4ts7XKfj2opmiCY=LTXwlyqefohCW3djvHLnK_QFKia-PIJn5cgBbNxC91A=Zirg

Re: [petsc-users] A bad commit affects MOOSE

2018-04-03 Thread Kong, Fande
On Tue, Apr 3, 2018 at 9:32 AM, Satish Balay <ba...@mcs.anl.gov> wrote:

> On Tue, 3 Apr 2018, Stefano Zampini wrote:
>
> >
> > > On Apr 3, 2018, at 4:58 PM, Satish Balay <ba...@mcs.anl.gov> wrote:
> > >
> > > On Tue, 3 Apr 2018, Kong, Fande wrote:
> > >
> > >> On Tue, Apr 3, 2018 at 1:17 AM, Smith, Barry F. <bsm...@mcs.anl.gov>
> wrote:
> > >>
> > >>>
> > >>>   Each external package definitely needs its own duplicated
> communicator;
> > >>> cannot share between packages.
> > >>>
> > >>>   The only problem with the dups below is if they are in a loop and
> get
> > >>> called many times.
> > >>>
> > >>
> > >>
> > >> The "standard test" that has this issue actually has 1K fields. MOOSE
> > >> creates its own field-split preconditioner (not based on the PETSc
> > >> fieldsplit), and each filed is associated with one PC HYPRE.  If PETSc
> > >> duplicates communicators, we should easily reach the limit 2048.
> > >>
> > >> I also want to confirm what extra communicators are introduced in the
> bad
> > >> commit.
> > >
> > > To me it looks like there is 1 extra comm created [for MATHYPRE] for
> each PCHYPRE that is created [which also creates one comm for this object].
> > >
> >
> > You’re right; however, it was the same before the commit.
>
> https://urldefense.proofpoint.com/v2/url?u=https-3A__
> bitbucket.org_petsc_petsc_commits_49a781f5cee36db85e8d5b951eec29
> f10ac13593=DwIDaQ=54IZrppPQZKX9mLzcGdPfFD1hxrcB_
> _aEkJFOKJFd00=DUUt3SRGI0_JgtNaS3udV68GRkgV4ts7XKfj2opmi
> CY=6_ukwovpDrK5BL_94S4ezasw2a3S15SM59R41rSY-Yw=
> r8xHYLKF9LtJHReR6Jmfeei3OfwkQNiGrKXAgeqPVQ8=
> Before the commit - PCHYPRE was not calling MatConvert(MATHYPRE) [this
> results in an additional call to MPI_Comm_dup() for hypre calls] PCHYPRE
> was calling MatHYPRE_IJMatrixCreate() directly [which I presume reusing the
> comm created by the call to MPI_Comm_dup() in PCHYPRE - for hypre calls]
>
>
>
> > I don’t understand how this specific commit is related with this issue,
> being the error not in the MPI_Comm_Dup which is inside MatCreate_MATHYPRE.
> Actually, the error comes from MPI_Comm_create
> >
> > frame #5: 0x0001068defd4 libmpi.12.dylib`MPI_Comm_create + 3492
> > frame #6: 0x0001061345d9 libpetsc.3.07.dylib`hypre_
> GenerateSubComm(comm=-1006627852, participate=,
> new_comm_ptr=) + 409 at gen_redcs_mat.c:531 [opt]
> > frame #7: 0x00010618f8ba libpetsc.3.07.dylib`hypre_
> GaussElimSetup(amg_data=0x7fe7ff857a00, level=,
> relax_type=9) + 74 at par_relax.c:4209 [opt]
> > frame #8: 0x000106140e93 libpetsc.3.07.dylib`hypre_
> BoomerAMGSetup(amg_vdata=, A=0x7fe80842aff0,
> f=0x7fe80842a980, u=0x7fe80842a510) + 17699 at par_amg_setup.c:2108
> [opt]
> > frame #9: 0x000105ec773c 
> > libpetsc.3.07.dylib`PCSetUp_HYPRE(pc=)
> + 2540 at hypre.c:226 [opt
>
> I thought this trace comes up after applying your patch
>

This trace comes from a Mac.



>
> -    ierr = MatDestroy(&jac->hpmat);CHKERRQ(ierr);
> -    ierr = MatConvert(pc->pmat,MATHYPRE,MAT_INITIAL_MATRIX,&jac->hpmat);CHKERRQ(ierr);
> +    ierr = MatConvert(pc->pmat,MATHYPRE,jac->hpmat ? MAT_REUSE_MATRIX :
> MAT_INITIAL_MATRIX,&jac->hpmat);CHKERRQ(ierr);
>
> The stack before this patch was: [its a different format - so it was
> obtained in a different way than the above method?]
>
> preconditioners/pbp.lots_of_variables: Other MPI error, error stack:
> preconditioners/pbp.lots_of_variables: PMPI_Comm_dup(177)..:
> MPI_Comm_dup(comm=0x8401, new_comm=0x97d1068) failed
> preconditioners/pbp.lots_of_variables: PMPI_Comm_dup(162)
> ..:
> preconditioners/pbp.lots_of_variables: MPIR_Comm_dup_impl(57)
> ..:
> preconditioners/pbp.lots_of_variables: MPIR_Comm_copy(739)...
> ..:
> preconditioners/pbp.lots_of_variables: MPIR_Get_contextid_sparse_group(614):
> Too many communicators (0/2048 free on this process; ignore_id=0)
>

This one comes from a Linux box (it is a test machine), and I do not have access to it.


Fande,



>
> Satish
>
> >
> > How did you perform the bisection? make clean + make all ? Which version
> of HYPRE are you using?
> >
> > > But you might want to verify [by linking with mpi trace library?]
> > >
> > >
> > > There are some debugging hints at https://urldefense.proofpoint.
> com/v2/url?u=https-3A__lists.mpich.org_pipermail_discuss_
> 2012-2DDecember_

Re: [petsc-users] A bad commit affects MOOSE

2018-04-03 Thread Kong, Fande
t;fried...@gmail.com> wrote:
> >>>
> >>> I’m working with Fande on this and I would like to add a bit more.
> There are many circumstances where we aren’t working on COMM_WORLD at all
> (e.g. working on a sub-communicator) but PETSc was initialized using
> MPI_COMM_WORLD (think multi-level solves)… and we need to create
> arbitrarily many PETSc vecs/mats/solvers/preconditioners and solve.  We
> definitely can’t rely on using PETSC_COMM_WORLD to avoid triggering
> duplication.
> >>>
> >>> Can you explain why PETSc needs to duplicate the communicator so much?
> >>>
> >>> Thanks for your help in tracking this down!
> >>>
> >>> Derek
> >>>
> >>> On Mon, Apr 2, 2018 at 5:44 PM Kong, Fande <fande.k...@inl.gov> wrote:
> >>> Why we do not use user-level MPI communicators directly? What are
> potential risks here?
> >>>
> >>>
> >>> Fande,
> >>>
> >>> On Mon, Apr 2, 2018 at 5:08 PM, Satish Balay <ba...@mcs.anl.gov>
> wrote:
> >>> PETSC_COMM_WORLD [via PetscCommDuplicate()] attempts to minimize calls
> to MPI_Comm_dup() - thus potentially avoiding such errors
> >>>
> >>> https://urldefense.proofpoint.com/v2/url?u=http-3A__www.mcs.
> anl.gov_petsc_petsc-2Dcurrent_docs_manualpages_Sys_
> PetscCommDuplicate.html=DwIBAg=54IZrppPQZKX9mLzcGdPfFD1hxrcB_
> _aEkJFOKJFd00=DUUt3SRGI0_JgtNaS3udV68GRkgV4ts7XKfj2opmi
> CY=jgv7gpZ3K52d_FWMgkK9yEScbLA7pkrWydFuJnYflsU=_
> zpWRcyk3kHuEHoq02NDqYExnXIohXpNnjyabYnnDjU=
> >>>
> >>>
> >>> Satish
> >>>
> >>> On Mon, 2 Apr 2018, Kong, Fande wrote:
> >>>
> >>>> On Mon, Apr 2, 2018 at 4:23 PM, Satish Balay <ba...@mcs.anl.gov>
> wrote:
> >>>>
> >>>>> Does this 'standard test' use MPI_COMM_WORLD' to crate PETSc objects?
> >>>>>
> >>>>> If so - you could try changing to PETSC_COMM_WORLD
> >>>>>
> >>>>
> >>>>
> >>>> I do not think we are using PETSC_COMM_WORLD when creating PETSc
> objects.
> >>>> Why we can not use MPI_COMM_WORLD?
> >>>>
> >>>>
> >>>> Fande,
> >>>>
> >>>>
> >>>>>
> >>>>> Satish
> >>>>>
> >>>>>
> >>>>> On Mon, 2 Apr 2018, Kong, Fande wrote:
> >>>>>
> >>>>>> Hi All,
> >>>>>>
> >>>>>> I am trying to upgrade PETSc from 3.7.6 to 3.8.3 for MOOSE and its
> >>>>>> applications. I have a error message for a standard test:
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> *preconditioners/pbp.lots_of_variables: MPI had an
> >>>>>> errorpreconditioners/pbp.lots_of_variables:
> >>>>>> 
> >>>>> preconditioners/pbp.lots_of_variables:
> >>>>>> Other MPI error, error stack:preconditioners/pbp.lots_of_variables:
> >>>>>> PMPI_Comm_dup(177)..: MPI_Comm_dup(comm=0x8401,
> >>>>>> new_comm=0x97d1068) failedpreconditioners/pbp.lots_of_variables:
> >>>>>> PMPI_Comm_dup(162)..:
> >>>>>> preconditioners/pbp.lots_of_variables:
> >>>>>> MPIR_Comm_dup_impl(57)..:
> >>>>>> preconditioners/pbp.lots_of_variables:
> >>>>>> MPIR_Comm_copy(739).:
> >>>>>> preconditioners/pbp.lots_of_variables:
> >>>>>> MPIR_Get_contextid_sparse_group(614): Too many communicators
> (0/2048
> >>>>> free
> >>>>>> on this process; ignore_id=0)*
> >>>>>>
> >>>>>>
> >>>>>> I did "git bisect', and the following commit introduces this issue:
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> *commit 49a781f5cee36db85e8d5b951eec29f10ac13593Author: Stefano
> Zampini
> >>>>>> <stefano.zamp...@gmail.com <stefano.zamp...@gmail.com>>Date:   Sat
> Nov 5
> >>>>>> 20:15:19 2016 +0300PCHYPRE: use internal Mat of type MatHYPRE
> >>>>>> hpmat already stores two HYPRE vectors*
> >>>>>>
> >>>>>> Before I debug line-by-line, anyone has a clue on this?
> >>>>>>
> >>>>>>
> >>>>>> Fande,
> >>>>>>
> >>>>>
> >>>>>
> >>>>
> >>>
> >>
>
>


Re: [petsc-users] A bad commit affects MOOSE

2018-04-02 Thread Kong, Fande
Why we do not use user-level MPI communicators directly? What are potential
risks here?


Fande,

On Mon, Apr 2, 2018 at 5:08 PM, Satish Balay <ba...@mcs.anl.gov> wrote:

> PETSC_COMM_WORLD [via PetscCommDuplicate()] attempts to minimize calls to
> MPI_Comm_dup() - thus potentially avoiding such errors
>
> https://urldefense.proofpoint.com/v2/url?u=http-3A__www.mcs.
> anl.gov_petsc_petsc-2Dcurrent_docs_manualpages_Sys_
> PetscCommDuplicate.html=DwIBAg=54IZrppPQZKX9mLzcGdPfFD1hxrcB_
> _aEkJFOKJFd00=DUUt3SRGI0_JgtNaS3udV68GRkgV4ts7XKfj2opmi
> CY=jgv7gpZ3K52d_FWMgkK9yEScbLA7pkrWydFuJnYflsU=_
> zpWRcyk3kHuEHoq02NDqYExnXIohXpNnjyabYnnDjU=
>
> Satish
>
> On Mon, 2 Apr 2018, Kong, Fande wrote:
>
> > On Mon, Apr 2, 2018 at 4:23 PM, Satish Balay <ba...@mcs.anl.gov> wrote:
> >
> > > Does this 'standard test' use MPI_COMM_WORLD' to crate PETSc objects?
> > >
> > > If so - you could try changing to PETSC_COMM_WORLD
> > >
> >
> >
> > I do not think we are using PETSC_COMM_WORLD when creating PETSc objects.
> > Why we can not use MPI_COMM_WORLD?
> >
> >
> > Fande,
> >
> >
> > >
> > > Satish
> > >
> > >
> > > On Mon, 2 Apr 2018, Kong, Fande wrote:
> > >
> > > > Hi All,
> > > >
> > > > I am trying to upgrade PETSc from 3.7.6 to 3.8.3 for MOOSE and its
> > > > applications. I have a error message for a standard test:
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > > *preconditioners/pbp.lots_of_variables: MPI had an
> > > > errorpreconditioners/pbp.lots_of_variables:
> > > > 
> > > preconditioners/pbp.lots_of_variables:
> > > > Other MPI error, error stack:preconditioners/pbp.lots_of_variables:
> > > > PMPI_Comm_dup(177)..: MPI_Comm_dup(comm=0x8401,
> > > > new_comm=0x97d1068) failedpreconditioners/pbp.lots_of_variables:
> > > > PMPI_Comm_dup(162)..:
> > > > preconditioners/pbp.lots_of_variables:
> > > > MPIR_Comm_dup_impl(57)..:
> > > > preconditioners/pbp.lots_of_variables:
> > > > MPIR_Comm_copy(739).:
> > > > preconditioners/pbp.lots_of_variables:
> > > > MPIR_Get_contextid_sparse_group(614): Too many communicators (0/2048
> > > free
> > > > on this process; ignore_id=0)*
> > > >
> > > >
> > > > I did "git bisect', and the following commit introduces this issue:
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > > *commit 49a781f5cee36db85e8d5b951eec29f10ac13593Author: Stefano
> Zampini
> > > > <stefano.zamp...@gmail.com <stefano.zamp...@gmail.com>>Date:   Sat
> Nov 5
> > > > 20:15:19 2016 +0300PCHYPRE: use internal Mat of type MatHYPRE
> > > > hpmat already stores two HYPRE vectors*
> > > >
> > > > Before I debug line-by-line, anyone has a clue on this?
> > > >
> > > >
> > > > Fande,
> > > >
> > >
> > >
> >
>
>


Re: [petsc-users] A bad commit affects MOOSE

2018-04-02 Thread Kong, Fande
On Mon, Apr 2, 2018 at 4:23 PM, Satish Balay <ba...@mcs.anl.gov> wrote:

> Does this 'standard test' use MPI_COMM_WORLD' to crate PETSc objects?
>
> If so - you could try changing to PETSC_COMM_WORLD
>


I do not think we are using PETSC_COMM_WORLD when creating PETSc objects.
Why can we not use MPI_COMM_WORLD?


Fande,


>
> Satish
>
>
> On Mon, 2 Apr 2018, Kong, Fande wrote:
>
> > Hi All,
> >
> > I am trying to upgrade PETSc from 3.7.6 to 3.8.3 for MOOSE and its
> > applications. I have a error message for a standard test:
> >
> >
> >
> >
> >
> >
> >
> >
> >
> > *preconditioners/pbp.lots_of_variables: MPI had an
> > errorpreconditioners/pbp.lots_of_variables:
> > 
> preconditioners/pbp.lots_of_variables:
> > Other MPI error, error stack:preconditioners/pbp.lots_of_variables:
> > PMPI_Comm_dup(177)..: MPI_Comm_dup(comm=0x8401,
> > new_comm=0x97d1068) failedpreconditioners/pbp.lots_of_variables:
> > PMPI_Comm_dup(162)..:
> > preconditioners/pbp.lots_of_variables:
> > MPIR_Comm_dup_impl(57)..:
> > preconditioners/pbp.lots_of_variables:
> > MPIR_Comm_copy(739).:
> > preconditioners/pbp.lots_of_variables:
> > MPIR_Get_contextid_sparse_group(614): Too many communicators (0/2048
> free
> > on this process; ignore_id=0)*
> >
> >
> > I did "git bisect', and the following commit introduces this issue:
> >
> >
> >
> >
> >
> >
> >
> >
> > commit 49a781f5cee36db85e8d5b951eec29f10ac13593
> > Author: Stefano Zampini <stefano.zamp...@gmail.com>
> > Date:   Sat Nov 5 20:15:19 2016 +0300
> >
> >     PCHYPRE: use internal Mat of type MatHYPRE
> >
> >     hpmat already stores two HYPRE vectors
> >
> > Before I debug line-by-line, anyone has a clue on this?
> >
> >
> > Fande,
> >
>
>


Re: [petsc-users] A bad commit affects MOOSE

2018-04-02 Thread Kong, Fande
Nope.

There is a back trace:

* thread #1: tid = 0x3b477b4, 0x7fffb306cd42 libsystem_kernel.dylib`__pthread_kill + 10, queue = 'com.apple.main-thread', stop reason = signal SIGABRT
  * frame #0: 0x7fffb306cd42 libsystem_kernel.dylib`__pthread_kill + 10
    frame #1: 0x7fffb315a457 libsystem_pthread.dylib`pthread_kill + 90
    frame #2: 0x7fffb2fd2420 libsystem_c.dylib`abort + 129
    frame #3: 0x0001057ff30a libpetsc.3.07.dylib`Petsc_MPI_AbortOnError(comm=, flag=) + 26 at init.c:185 [opt]
    frame #4: 0x000106bd3245 libpmpi.12.dylib`MPIR_Err_return_comm + 533
    frame #5: 0x0001068defd4 libmpi.12.dylib`MPI_Comm_create + 3492
    frame #6: 0x0001061345d9 libpetsc.3.07.dylib`hypre_GenerateSubComm(comm=-1006627852, participate=, new_comm_ptr=) + 409 at gen_redcs_mat.c:531 [opt]
    frame #7: 0x00010618f8ba libpetsc.3.07.dylib`hypre_GaussElimSetup(amg_data=0x7fe7ff857a00, level=, relax_type=9) + 74 at par_relax.c:4209 [opt]
    frame #8: 0x000106140e93 libpetsc.3.07.dylib`hypre_BoomerAMGSetup(amg_vdata=, A=0x7fe80842aff0, f=0x7fe80842a980, u=0x7fe80842a510) + 17699 at par_amg_setup.c:2108 [opt]
    frame #9: 0x000105ec773c libpetsc.3.07.dylib`PCSetUp_HYPRE(pc=) + 2540 at hypre.c:226 [opt]
    frame #10: 0x000105eea68d libpetsc.3.07.dylib`PCSetUp(pc=0x7fe805553f50) + 797 at precon.c:968 [opt]
    frame #11: 0x000105ee9fe5 libpetsc.3.07.dylib`PCApply(pc=0x7fe805553f50, x=0x7fe80052d420, y=0x7fe800522c20) + 181 at precon.c:478 [opt]
    frame #12: 0x0001015cf218 libmesh_opt.0.dylib`libMesh::PetscPreconditioner::apply(libMesh::NumericVector const&, libMesh::NumericVector&) + 24
    frame #13: 0x0001009c7998 libmoose-opt.0.dylib`PhysicsBasedPreconditioner::apply(libMesh::NumericVector const&, libMesh::NumericVector&) + 520
    frame #14: 0x0001016ad701 libmesh_opt.0.dylib`libmesh_petsc_preconditioner_apply + 129
    frame #15: 0x000105e7e715 libpetsc.3.07.dylib`PCApply_Shell(pc=0x7fe8052623f0, x=0x7fe806805a20, y=0x7fe806805420) + 117 at shellpc.c:123 [opt]
    frame #16: 0x000105eea079 libpetsc.3.07.dylib`PCApply(pc=0x7fe8052623f0, x=0x7fe806805a20, y=0x7fe806805420) + 329 at precon.c:482 [opt]
    frame #17: 0x000105eeb611 libpetsc.3.07.dylib`PCApplyBAorAB(pc=0x7fe8052623f0, side=PC_RIGHT, x=0x7fe806805a20, y=0x7fe806806020, work=0x7fe806805420) + 945 at precon.c:714 [opt]
    frame #18: 0x000105f31658 libpetsc.3.07.dylib`KSPGMRESCycle [inlined] KSP_PCApplyBAorAB(ksp=0x7fe80600, x=, y=0x7fe806806020, w=) + 191 at kspimpl.h:295 [opt]
    frame #19: 0x000105f31599 libpetsc.3.07.dylib`KSPGMRESCycle(itcount=, ksp=) + 553 at gmres.c:156 [opt]
    frame #20: 0x000105f326bd libpetsc.3.07.dylib`KSPSolve_GMRES(ksp=) + 221 at gmres.c:240 [opt]
    frame #21: 0x000105f5f671 libpetsc.3.07.dylib`KSPSolve(ksp=0x7fe80600, b=0x7fe7fd946220, x=) + 1345 at itfunc.c:677 [opt]
    frame #22: 0x000105fd0251 libpetsc.3.07.dylib`SNESSolve_NEWTONLS(snes=) + 1425 at ls.c:230 [opt]
    frame #23: 0x000105fa10ca libpetsc.3.07.dylib`SNESSolve(snes=, b=, x=0x7fe7fd865e20) + 858 at snes.c:4128 [opt]
    frame #24: 0x0001016b63c3 libmesh_opt.0.dylib`libMesh::PetscNonlinearSolver::solve(libMesh::SparseMatrix&, libMesh::NumericVector&, libMesh::NumericVector&, double, unsigned int) + 835
    frame #25: 0x0001016fc244 libmesh_opt.0.dylib`libMesh::NonlinearImplicitSystem::solve() + 324
    frame #26: 0x000100a71dc8 libmoose-opt.0.dylib`NonlinearSystem::solve() + 472
    frame #27: 0x0001009fe815 libmoose-opt.0.dylib`FEProblemBase::solve() + 117
    frame #28: 0x000100761fba libmoose-opt.0.dylib`Steady::execute() + 266
    frame #29: 0x000100b78ac3 libmoose-opt.0.dylib`MooseApp::run() + 259
    frame #30: 0x0001003843aa moose_test-opt`main + 122
    frame #31: 0x7fffb2f3e235 libdyld.dylib`start + 1
Fande,


On Mon, Apr 2, 2018 at 4:02 PM, Stefano Zampini <stefano.zamp...@gmail.com>
wrote:

> maybe this will fix ?
>
>
> *diff --git a/src/ksp/pc/impls/hypre/hypre.c
> b/src/ksp/pc/impls/hypre/hypre.c*
>
> *index 28addcf533..6a756d4c57 100644*
>
> *--- a/src/ksp/pc/impls/hypre/hypre.c*
>
> *+++ b/src/ksp/pc/impls/hypre/hypre.c*
>
> @@ -142,8 +142,7 @@ static PetscErrorCode PCSetUp_HYPRE(PC pc)
>
>
>
>ierr = PetscObjectTypeCompare((PetscObject)pc->pmat,MATHYPRE,
> );CHKERRQ(ierr);
>
>if (!ishypre) {
>
> -    ierr = MatDestroy(&jac->hpmat);CHKERRQ(ierr);
>
> -    ierr = MatConvert(pc->pmat,MATHYPRE,MAT_INITIAL_MATRIX,&jac->hpmat);CHKERRQ(ierr);
>
> +    ierr = MatConvert(pc->pmat,MATHYPRE,jac->hpmat ? MAT_REUSE_MATRIX :
> MAT_INITIAL_MATRIX,&jac->hpmat);CHKERRQ(ierr);
>
>} else {
>
>  ierr = PetscObjectReference((PetscObject)pc->pmat);CHKERRQ(ierr);
>
>  ie

[petsc-users] A bad commit affects MOOSE

2018-04-02 Thread Kong, Fande
Hi All,

I am trying to upgrade PETSc from 3.7.6 to 3.8.3 for MOOSE and its
applications. I have a error message for a standard test:









preconditioners/pbp.lots_of_variables: MPI had an error
preconditioners/pbp.lots_of_variables:
preconditioners/pbp.lots_of_variables: Other MPI error, error stack:
preconditioners/pbp.lots_of_variables: PMPI_Comm_dup(177)..: MPI_Comm_dup(comm=0x8401, new_comm=0x97d1068) failed
preconditioners/pbp.lots_of_variables: PMPI_Comm_dup(162)..:
preconditioners/pbp.lots_of_variables: MPIR_Comm_dup_impl(57)..:
preconditioners/pbp.lots_of_variables: MPIR_Comm_copy(739).:
preconditioners/pbp.lots_of_variables: MPIR_Get_contextid_sparse_group(614): Too many communicators (0/2048 free on this process; ignore_id=0)


I did "git bisect', and the following commit introduces this issue:








commit 49a781f5cee36db85e8d5b951eec29f10ac13593
Author: Stefano Zampini <stefano.zamp...@gmail.com>
Date:   Sat Nov 5 20:15:19 2016 +0300

    PCHYPRE: use internal Mat of type MatHYPRE

    hpmat already stores two HYPRE vectors

Before I debug line-by-line, anyone has a clue on this?


Fande,


[petsc-users] slepc-master does not configure correctly

2018-03-21 Thread Kong, Fande
Hi All,

~/projects/slepc]> PETSC_ARCH=arch-darwin-c-debug-master ./configure







Checking environment...
Traceback (most recent call last):
  File "./configure", line 10, in <module>
    execfile(os.path.join(os.path.dirname(__file__), 'config', 'configure.py'))
  File "./config/configure.py", line 206, in <module>
    log.write('PETSc install directory: '+petsc.destdir)
AttributeError: PETSc instance has no attribute 'destdir'



SLEPc may need to be synchronized with the recent changes in PETSc.

Thanks,

Fande Kong


Re: [petsc-users] Could not execute "['git', 'rev-parse', '--git-dir']"

2018-03-07 Thread Kong, Fande
On Wed, Mar 7, 2018 at 2:51 PM, Satish Balay <ba...@mcs.anl.gov> wrote:

> On Wed, 7 Mar 2018, Kong, Fande wrote:
>
> > > If you need to workarround this - you can comment out that test
> (3lines)..
> > >
> > >   File "/home/kongf/workhome/projects/petsc-3.7.7/config/
> > > BuildSystem/config/package.py", line 519, in updateGitDir
> > > gitdir,err,ret = config.base.Configure.executeShellCommand([self.
> sourceControl.git,
> > > 'rev-parse','--git-dir'], cwd=self.packageDir, log = self.log)
> > >
> > >
> >
> > "#self.updateGitDir()"  works.
>
> I meant just the 3 lines - not the whole function.
>

I knew this. "3 lines" does not work at all.

I forgot the error message.

Fande,


>
>
> diff --git a/config/BuildSystem/config/package.py
> b/config/BuildSystem/config/package.py
> index 85663247ce..439b2105c5 100644
> --- a/config/BuildSystem/config/package.py
> +++ b/config/BuildSystem/config/package.py
> @@ -516,9 +516,9 @@ class Package(config.base.Configure):
># verify that packageDir is actually a git clone
>if not os.path.isdir(os.path.join(self.packageDir,'.git')):
>  raise RuntimeError(self.packageDir +': is not a git repository!
> '+os.path.join(self.packageDir,'.git')+' not found!')
> -  gitdir,err,ret = 
> config.base.Configure.executeShellCommand([self.sourceControl.git,
> 'rev-parse','--git-dir'], cwd=self.packageDir, log = self.log)
> -  if gitdir != '.git':
> -raise RuntimeError(self.packageDir +': is not a git repository!
> "git rev-parse --gitdir" gives: '+gitdir)
> +  #gitdir,err,ret = 
> config.base.Configure.executeShellCommand([self.sourceControl.git,
> 'rev-parse','--git-dir'], cwd=self.packageDir, log = self.log)
> +  #if gitdir != '.git':
> +  #  raise RuntimeError(self.packageDir +': is not a git repository!
> "git rev-parse --gitdir" gives: '+gitdir)
>
>prefetch = 0
>if self.gitcommit.startswith('origin/'):
>
> Satish
>


Re: [petsc-users] Could not execute "['git', 'rev-parse', '--git-dir']"

2018-03-07 Thread Kong, Fande
On Wed, Mar 7, 2018 at 1:22 PM, Satish Balay <ba...@mcs.anl.gov> wrote:

> Its strange that you are getting this error in configure - but not command
> linke.
>
> Does the following option make a difference?
>
> --useThreads=0
> or
> --useThreads=1
>

no difference.


>
> If you need to workarround this - you can comment out that test (3lines)..
>
>   File "/home/kongf/workhome/projects/petsc-3.7.7/config/
> BuildSystem/config/package.py", line 519, in updateGitDir
> gitdir,err,ret = 
> config.base.Configure.executeShellCommand([self.sourceControl.git,
> 'rev-parse','--git-dir'], cwd=self.packageDir, log = self.log)
>
>

"#self.updateGitDir()"  works.

I still do not understand why.


Fande,


>
> Satish
>
> On Wed, 7 Mar 2018, Kong, Fande wrote:
>
> > [kongf@falcon1 git.hypre]$ git rev-parse  --git-dir
> > .git
> > [kongf@falcon1 git.hypre]$ echo $?
> > 0
> > [kongf@falcon1 git.hypre]$
> > [kongf@falcon1 git.hypre]$ git fsck
> > Checking object directories: 100% (256/256), done.
> > Checking objects: 100% (22710/22710), done.
> > [kongf@falcon1 git.hypre]$
> >
> >
> >
> > But the same  error still persists!
> >
> >
> > Fande,
> >
> > On Wed, Mar 7, 2018 at 12:33 PM, Satish Balay <ba...@mcs.anl.gov> wrote:
> >
> > > >>>>>>>>
> > > balay@asterix /home/balay
> > > $ git rev-parse --git-dir
> > > fatal: Not a git repository (or any parent up to mount point /)
> > > Stopping at filesystem boundary (GIT_DISCOVERY_ACROSS_FILESYSTEM not
> set).
> > > balay@asterix /home/balay
> > > $ echo $?
> > > 128
> > > balay@asterix /home/balay
> > > $ cd petsc
> > > balay@asterix /home/balay/petsc (next *=)
> > > $ git rev-parse --git-dir
> > > .git
> > > balay@asterix /home/balay/petsc (next *=)
> > > $ echo $?
> > > 0
> > > balay@asterix /home/balay/petsc (next *=)
> > > $
> > > <<<<<
> > >
> > > So - for some reason, git is an error in /home/kongf/workhome/projects/
> > > petsc-3.7.7/arch-linux2-c-opt-64bit/externalpackages/git.hypre
> > >
> > > You can try:
> > >
> > > cd /home/kongf/workhome/projects/petsc-3.7.7/arch-linux2-c-opt-
> > > 64bit/externalpackages/git.hypre
> > > git rev-parse --git-dir
> > > echo $?
> > > git fsck
> > >
> > > If this works - you can rerun configure and see if the error persists
> > >
> > > Satish
> > >
> > >
> > > On Wed, 7 Mar 2018, Kong, Fande wrote:
> > >
> > > > Hi PETSc team:
> > > >
> > > > What is the possible reason for this?
> > > >
> > > > The log file is attached.
> > > >
> > > >
> > > > Fande,
> > > >
> > >
> > >
> >
>
>


Re: [petsc-users] Could not execute "['git', 'rev-parse', '--git-dir']"

2018-03-07 Thread Kong, Fande
[kongf@falcon1 git.hypre]$ git rev-parse  --git-dir
.git
[kongf@falcon1 git.hypre]$ echo $?
0
[kongf@falcon1 git.hypre]$
[kongf@falcon1 git.hypre]$ git fsck
Checking object directories: 100% (256/256), done.
Checking objects: 100% (22710/22710), done.
[kongf@falcon1 git.hypre]$



But the same  error still persists!


Fande,

On Wed, Mar 7, 2018 at 12:33 PM, Satish Balay <ba...@mcs.anl.gov> wrote:

> >>>>>>>>
> balay@asterix /home/balay
> $ git rev-parse --git-dir
> fatal: Not a git repository (or any parent up to mount point /)
> Stopping at filesystem boundary (GIT_DISCOVERY_ACROSS_FILESYSTEM not set).
> balay@asterix /home/balay
> $ echo $?
> 128
> balay@asterix /home/balay
> $ cd petsc
> balay@asterix /home/balay/petsc (next *=)
> $ git rev-parse --git-dir
> .git
> balay@asterix /home/balay/petsc (next *=)
> $ echo $?
> 0
> balay@asterix /home/balay/petsc (next *=)
> $
> <<<<<
>
> So - for some reason, git is an error in /home/kongf/workhome/projects/
> petsc-3.7.7/arch-linux2-c-opt-64bit/externalpackages/git.hypre
>
> You can try:
>
> cd /home/kongf/workhome/projects/petsc-3.7.7/arch-linux2-c-opt-
> 64bit/externalpackages/git.hypre
> git rev-parse --git-dir
> echo $?
> git fsck
>
> If this works - you can rerun configure and see if the error persists
>
> Satish
>
>
> On Wed, 7 Mar 2018, Kong, Fande wrote:
>
> > Hi PETSc team:
> >
> > What is the possible reason for this?
> >
> > The log file is attached.
> >
> >
> > Fande,
> >
>
>


Re: [petsc-users] with-openmp error with hypre

2018-02-13 Thread Kong, Fande
I am curious about the comparison of 16 MPI processes x 4 threads vs. 64 flat MPI processes.

Fande,

On Tue, Feb 13, 2018 at 11:44 AM, Bakytzhan Kallemov 
wrote:

> Hi,
>
> I am not sure about 64 flat run,
>
> unfortunately I did not save logs since it's easy to run,  but for 16 -
> here is the plot I got for different number of threads for KSPSolve time
>
> Baky
>
> On 02/13/2018 10:28 AM, Matthew Knepley wrote:
>
> On Tue, Feb 13, 2018 at 11:30 AM, Smith, Barry F. 
> wrote:
>>
>> > On Feb 13, 2018, at 10:12 AM, Mark Adams  wrote:
>> >
>> > FYI, we were able to get hypre with threads working on KNL on Cori by
>> going down to -O1 optimization. We are getting about 2x speedup with 4
>> threads and 16 MPI processes per socket. Not bad.
>>
>>   In other works using 16 MPI processes with 4 threads per process is
>> twice as fast as running with 64 mpi processes?  Could you send the
>> -log_view output for these two cases?
>
>
> Is that what you mean? I took it to mean
>
>   We ran 16MPI processes and got time T.
>   We ran 16MPI processes with 4 threads each and got time T/2.
>
> I would likely eat my shirt if 16x4 was 2x faster than 64.
>
>   Matt
>
>
>>
>> >
>> > There error, flatlined or slightly diverging hypre solves, occurred
>> even in flat MPI runs with openmp=1.
>>
>>   But the answers are wrong as soon as you turn on OpenMP?
>>
>>Thanks
>>
>> Barry
>>
>>
>> >
>> > We are going to test the Haswell nodes next.
>> >
>> > On Thu, Jan 25, 2018 at 4:16 PM, Mark Adams  wrote:
>> > Baky (cc'ed) is getting a strange error on Cori/KNL at NERSC. Using
>> maint it runs fine with -with-openmp=0, it runs fine with -with-openmp=1
>> and gamg, but with hypre and -with-openmp=1, even running with flat MPI,
>> the solver seems flatline (see attached and notice that the residual starts
>> to creep after a few time steps).
>> >
>> > Maybe you can suggest a hypre test that I can run?
>> >
>>
>>
>
>
> --
> What most experimenters take for granted before they begin their
> experiments is infinitely more interesting than any results to which their
> experiments lead.
> -- Norbert Wiener
>
> https://www.cse.buffalo.edu/~knepley/
> 
>
>
>


Re: [petsc-users] How to generate manual.pdf from source files

2018-01-26 Thread Kong, Fande
I guess we install sowing automatically when we are on petsc/master

> cat configure.log | grep "sowing"

Bfort not found. Installing sowing for FortranStubs
TEST checkDependencies from config.packages.sowing(/Users/kongf/projects/petsc/config/BuildSystem/config/package.py:719)
TESTING: checkDependencies from config.packages.sowing(config/BuildSystem/config/package.py:719)
TEST configureLibrary from config.packages.sowing(/Users/kongf/projects/petsc/config/BuildSystem/config/package.py:744)
TESTING: configureLibrary from config.packages.sowing(config/BuildSystem/config/package.py:744)
Checking for a functional sowing
  Looking for SOWING at git.sowing, hg.sowing or a directory starting with ['petsc-pkg-sowing']
  Found a copy of SOWING in git.sowing
Removing sowing.petscconf
TEST checkSharedLibrary from config.packages.sowing(/Users/kongf/projects/petsc/config/BuildSystem/config/package.py:792)
TESTING: checkSharedLibrary from config.packages.sowing(config/BuildSystem/config/package.py:792)
sowing:

Thanks,

Fande

On Fri, Jan 26, 2018 at 4:56 PM, Matthew Knepley <knep...@gmail.com> wrote:

> On Sat, Jan 27, 2018 at 10:53 AM, Kong, Fande <fande.k...@inl.gov> wrote:
>
>> Hi,
>>
>> I want to generate manual.pdf from source files on my own desktop.
>>
>> In directory:  ./src/docs/tex/manual
>>
>> make ALL  LOC=/Users/kongf/projects/petsc
>>
>> make: *** No rule to make target `/Users/kongf/projects/petsc/docs/manualpages/htmlmap', needed by `listing_kspex1tmp.tex'.  Stop.
>>
>> What else I need to install?
>>
>
> Do you have sowing?
>
>   Matt
>
>
>>
>> Fande
>>
>>
>>
>
>
> --
> What most experimenters take for granted before they begin their
> experiments is infinitely more interesting than any results to which their
> experiments lead.
> -- Norbert Wiener
>
> https://www.cse.buffalo.edu/~knepley/
>


[petsc-users] How to generate manual.pdf from source files

2018-01-26 Thread Kong, Fande
Hi,

I want to generate manual.pdf from source files on my own desktop.

In directory:  ./src/docs/tex/manual

make ALL  LOC=/Users/kongf/projects/petsc

make: *** No rule to make target `/Users/kongf/projects/petsc/docs/manualpages/htmlmap', needed by `listing_kspex1tmp.tex'.  Stop.

What else do I need to install?

Fande


Re: [petsc-users] Context behind SNESComputeFunction call

2018-01-26 Thread Kong, Fande
Hi Barry,

I made minor changes on src/snes/examples/tutorials/ex2.c to demonstrate
this issue.  Please see the attachment.

./ex2 -snes_monitor -ksp_monitor -snes_mf_operator 1

atol=1e-50, rtol=1e-08, stol=1e-08, maxit=50, maxf=1
 FormFunction is called
  0 SNES Function norm 5.414682427127e+00
    0 KSP Residual norm 9.559082033938e-01
 FormFunction is called
 FormFunction is called
    1 KSP Residual norm 1.703870633386e-09
 FormFunction is called
 FormFunction is called
  1 SNES Function norm 2.952582481151e-01
    0 KSP Residual norm 2.672054855433e-02
 FormFunction is called
 FormFunction is called
    1 KSP Residual norm 1.519298012177e-10
 FormFunction is called
 FormFunction is called
  2 SNES Function norm 4.502289047587e-04
    0 KSP Residual norm 4.722075651268e-05
 FormFunction is called
 FormFunction is called
    1 KSP Residual norm 3.834927363659e-14
 FormFunction is called
 FormFunction is called
  3 SNES Function norm 1.390073376131e-09
number of SNES iterations = 3
Norm of error 1.49795e-10, Iterations 3

"FormFunction" is called TWICE at "0 KSP".

If we comment out MatMFFDSetFunction:

/* ierr = MatMFFDSetFunction(Jacobian,FormFunction_MFFD,(void*)snes);CHKERRQ(ierr); */


./ex2 -snes_monitor -ksp_monitor -snes_mf_operator 1

atol=1e-50, rtol=1e-08, stol=1e-08, maxit=50, maxf=1
 FormFunction is called
  0 SNES Function norm 5.414682427127e+00
    0 KSP Residual norm 9.559082033938e-01
 FormFunction is called
    1 KSP Residual norm 1.703870633386e-09
 FormFunction is called
 FormFunction is called
  1 SNES Function norm 2.952582481151e-01
    0 KSP Residual norm 2.672054855433e-02
 FormFunction is called
    1 KSP Residual norm 1.519298012177e-10
 FormFunction is called
 FormFunction is called
  2 SNES Function norm 4.502289047587e-04
    0 KSP Residual norm 4.722075651268e-05
 FormFunction is called
    1 KSP Residual norm 3.834927363659e-14
 FormFunction is called
 FormFunction is called
  3 SNES Function norm 1.390073376131e-09
number of SNES iterations = 3
Norm of error 1.49795e-10, Iterations 3

"FormFunction" is called ONCE at "0 KSP".

Hopefully, this example makes the point clear.
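
For readers without the attachment, the relevant part of the change is registering a separate residual callback on the matrix-free operator; a rough sketch (the exact attachment may differ, and the delegation to SNESComputeFunction is my own choice) is:

/* signature required by MatMFFDSetFunction(): PetscErrorCode (*)(void*,Vec,Vec) */
PetscErrorCode FormFunction_MFFD(void *ctx, Vec x, Vec f)
{
  SNES snes = (SNES)ctx;
  /* delegate to the residual registered with SNESSetFunction(), which is where
     the " FormFunction is called" message above is printed */
  return SNESComputeFunction(snes, x, f);
}

/* registration on the matrix-free operator 'Jacobian' (e.g. created with
   MatCreateSNESMF()); the operator's base function is now different from the
   SNESComputeFunction pointer that MatAssemblyEnd_SNESMF() checks for: */
ierr = MatMFFDSetFunction(Jacobian, FormFunction_MFFD, (void*)snes);CHKERRQ(ierr);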


Fande,

On Fri, Jan 26, 2018 at 3:50 PM, Smith, Barry F. <bsm...@mcs.anl.gov> wrote:

>
>
>   So you are doing something non-standard? Are you not just using -snes_mf
> or -snes_mf_operator? Can you send me a sample code that has the extra
> function evaluations? Because if you run through regular usage with the
> debugger you will see there is no extra evaluation.
>
>Barry
>
>
> > On Jan 26, 2018, at 4:32 PM, Kong, Fande <fande.k...@inl.gov> wrote:
> >
> >
> >
> > On Fri, Jan 26, 2018 at 3:10 PM, Smith, Barry F. <bsm...@mcs.anl.gov>
> wrote:
> >
> >
> > > On Jan 26, 2018, at 2:15 PM, Kong, Fande <fande.k...@inl.gov> wrote:
> > >
> > >
> > >
> > > On Mon, Jan 8, 2018 at 2:15 PM, Smith, Barry F. <bsm...@mcs.anl.gov>
> wrote:
> > >
> > >
> > > > On Jan 8, 2018, at 2:59 PM, Alexander Lindsay <
> alexlindsay...@gmail.com> wrote:
> > > >
> > > > Is there any elegant way to tell whether SNESComputeFunction is
> being called under different conceptual contexts?
> > > >
> > > > E.g. non-linear residual evaluation vs. Jacobian formation from
> finite differencing vs. Jacobian-vector products from finite differencing?
> > >
> > >   Under normal usage with the options database no.
> > >
> > > Hi Barry,
> > >
> > > How difficult to provide an API? Is it possible?
> > >
> > >
> > >
> > >   If you have some reason to know you could write three functions and
> provide them to SNESSetFunction(), MatMFFDSetFunction(), and
> MatFDColoringSetFunction(). Note that these functions have slightly
> different calling sequences but you can have all of them call the same
> underlying common function if you are only interested in, for example, how
> many times the function is used for each purpose.
> > >
> > > If we use this way for the Jacobian-free Newton, the function
> evaluation will be called twice at the first linear iteration because the
> computed residual vector at the nonlinear step  is not reused. Any way to
> reuse the function residual of the Newton step instead of recomputing a new
> residual at the first linear iteration?
> >
> >It does reuse the function evaluation. Why do you think it does not?
> If you look at MatMult_MFFD() you will see the lines of code
> >
> >   /* compute func(U) as base for differencing; only needed first time in
> and not when provided by user */
> >   if (c

Re: [petsc-users] Context behind SNESComputeFunction call

2018-01-26 Thread Kong, Fande
On Fri, Jan 26, 2018 at 3:10 PM, Smith, Barry F. <bsm...@mcs.anl.gov> wrote:

>
>
> > On Jan 26, 2018, at 2:15 PM, Kong, Fande <fande.k...@inl.gov> wrote:
> >
> >
> >
> > On Mon, Jan 8, 2018 at 2:15 PM, Smith, Barry F. <bsm...@mcs.anl.gov>
> wrote:
> >
> >
> > > On Jan 8, 2018, at 2:59 PM, Alexander Lindsay <
> alexlindsay...@gmail.com> wrote:
> > >
> > > Is there any elegant way to tell whether SNESComputeFunction is being
> called under different conceptual contexts?
> > >
> > > E.g. non-linear residual evaluation vs. Jacobian formation from finite
> differencing vs. Jacobian-vector products from finite differencing?
> >
> >   Under normal usage with the options database no.
> >
> > Hi Barry,
> >
> > How difficult to provide an API? Is it possible?
> >
> >
> >
> >   If you have some reason to know you could write three functions and
> provide them to SNESSetFunction(), MatMFFDSetFunction(), and
> MatFDColoringSetFunction(). Note that these functions have slightly
> different calling sequences but you can have all of them call the same
> underlying common function if you are only interested in, for example, how
> many times the function is used for each purpose.
> >
> > If we use this way for the Jacobian-free Newton, the function evaluation
> will be called twice at the first linear iteration because the computed
> residual vector at the nonlinear step  is not reused. Any way to reuse the
> function residual of the Newton step instead of recomputing a new residual
> at the first linear iteration?
>
>It does reuse the function evaluation. Why do you think it does not? If
> you look at MatMult_MFFD() you will see the lines of code
>
>   /* compute func(U) as base for differencing; only needed first time in
> and not when provided by user */
>   if (ctx->ncurrenth == 1 && ctx->current_f_allocated) {
> ierr = (*ctx->func)(ctx->funcctx,U,F);CHKERRQ(ierr);
>   }
>
> since the if is satisfied it does not compute the function at the base
> location.  To double check I ran src/snes/examples/tutorials/ex19 with
> -snes_mf in the debugger and verified that the "extra" function evaluations
> are not done.
>

In MatAssemblyEnd_SNESMF,

  if (j->func == (PetscErrorCode (*)(void*,Vec,Vec))SNESComputeFunction) {
    ierr = SNESGetFunction(snes,&f,NULL,NULL);CHKERRQ(ierr);
    ierr = MatMFFDSetBase_MFFD(J,u,f);CHKERRQ(ierr);
  } else {
    /* f value known by SNES is not correct for other differencing function */
    ierr = MatMFFDSetBase_MFFD(J,u,NULL);CHKERRQ(ierr);
  }


Will hit ierr = MatMFFDSetBase_MFFD(J,u,NULL);CHKERRQ(ierr), because SNES
and MAT have different function pointers.

In MatMFFDSetBase_MFFD(Mat J,Vec U,Vec F),

  if (F) {
    if (ctx->current_f_allocated) {ierr = VecDestroy(&ctx->current_f);CHKERRQ(ierr);}
    ctx->current_f           = F;
    ctx->current_f_allocated = PETSC_FALSE;
  } else if (!ctx->current_f_allocated) {
    ierr = MatCreateVecs(J,NULL,&ctx->current_f);CHKERRQ(ierr);
    ctx->current_f_allocated = PETSC_TRUE;
  }

Because F=NULL, we then have ctx->current_f_allocated = PETSC_TRUE.

Then, the following if statement is true:

  if (ctx->ncurrenth == 1 && ctx->current_f_allocated) {
ierr = (*ctx->func)(ctx->funcctx,U,F);CHKERRQ(ierr);
  }


Fande,



>
>   Barry
>
>
> >
> > Fande,
> >
> >
> >
> >Barry
> >
> >
> >
> > >
> > > Alex
> >
> >
>
>


Re: [petsc-users] segfault after recent scientific linux upgrade

2017-12-07 Thread Kong, Fande
On Thu, Dec 7, 2017 at 8:15 AM, Klaij, Christiaan  wrote:

> Satish,
>
>
>
> As a first try, I've kept petsc-3.7.5 and only replaced superlu
>
> by the new xsdk-0.2.0-rc1 version. Unfortunately, this doesn't
>
> fix the problem, see the backtrace below.
>
>
>
> Fande,
>
>
>
> Perhaps the problem is related to petsc, not superlu?
>
>
>
> What really puzzles me is that everything was working fine with
>
> petsc-3.7.5 and superlu_dist_5.3.1, it only broke after we
>
> updated Scientific Linux 7. So this bug (in petsc or in superlu)
>
> was already there but somehow not triggered before the SL7
>
> update?
>
>
>
> Chris
>
>
>
>
I do not know how you installed PETSc. It looks like you are still using the
old superlu_dist.  You have to delete the old package and start from scratch;
PETSc does not automatically clean the old one. For me, I simply
"rm -rf $PETSC_ARCH" every time before reinstalling PETSc.


Fande,


Re: [petsc-users] superlu_dist produces random results

2017-11-15 Thread Kong, Fande
Thanks, Barry,

On Wed, Nov 15, 2017 at 4:04 PM, Smith, Barry F. <bsm...@mcs.anl.gov> wrote:

>
>   Do the ASM runs for thousands of time-steps produce the same final
> "physical results" as the MUMPS run for thousands of timesteps?  While with
> SuperLU you get a very different "physical results"?
>

Let me update a little bit more. The simulation with SuperLU may fail at a
certain time step. Sometimes we can also run the simulation successfully
for the whole time range.  It is totally random.

We will try ASM and MUMPS.

Fande,




>
>   Barry
>
>
> > On Nov 15, 2017, at 4:52 PM, Kong, Fande <fande.k...@inl.gov> wrote:
> >
> >
> >
> > On Wed, Nov 15, 2017 at 3:35 PM, Smith, Barry F. <bsm...@mcs.anl.gov>
> wrote:
> >
> >   Since the convergence labeled linear does not converge to 14 digits in
> one iteration I am assuming you are using lagged preconditioning and or
> lagged  Jacobian?
> >
> > We are using Jacobian-free Newton. So Jacobian is different from the
> preconditioning matrix.
> >
> >
> >What happens if you do no lagging and solve each linear solve with a
> new LU factorization?
> >
> > We have the following results without using Jacobian-free Newton. Again,
> superlu_dist produces differences, while MUMPS gives the same results in
> terms of the residual norms.
> >
> >
> > Fande,
> >
> >
> > Superlu_dist run1:
> >
> >  0 Nonlinear |R| = 9.447423e+03
> >   0 Linear |R| = 9.447423e+03
> >   1 Linear |R| = 1.322285e-11
> >  1 Nonlinear |R| = 1.666987e-11
> >
> >
> > Superlu_dist run2:
> >
> >  0 Nonlinear |R| = 9.447423e+03
> >   0 Linear |R| = 9.447423e+03
> >   1 Linear |R| = 1.322171e-11
> >  1 Nonlinear |R| = 1.666977e-11
> >
> >
> > Superlu_dist run3:
> >
> >  0 Nonlinear |R| = 9.447423e+03
> >   0 Linear |R| = 9.447423e+03
> >   1 Linear |R| = 1.321964e-11
> >  1 Nonlinear |R| = 1.666959e-11
> >
> >
> > Superlu_dist run4:
> >
> >  0 Nonlinear |R| = 9.447423e+03
> >   0 Linear |R| = 9.447423e+03
> >   1 Linear |R| = 1.321978e-11
> >  1 Nonlinear |R| = 1.668688e-11
> >
> >
> > MUMPS run1:
> >
> >  0 Nonlinear |R| = 9.447423e+03
> >   0 Linear |R| = 9.447423e+03
> >   1 Linear |R| = 1.360637e-11
> >  1 Nonlinear |R| = 1.654334e-11
> >
> > MUMPS run 2:
> >
> >  0 Nonlinear |R| = 9.447423e+03
> >   0 Linear |R| = 9.447423e+03
> >   1 Linear |R| = 1.360637e-11
> >  1 Nonlinear |R| = 1.654334e-11
> >
> > MUMPS run 3:
> >
> >  0 Nonlinear |R| = 9.447423e+03
> >   0 Linear |R| = 9.447423e+03
> >   1 Linear |R| = 1.360637e-11
> >  1 Nonlinear |R| = 1.654334e-11
> >
> > MUMPS run4:
> >
> >  0 Nonlinear |R| = 9.447423e+03
> >   0 Linear |R| = 9.447423e+03
> >   1 Linear |R| = 1.360637e-11
> >  1 Nonlinear |R| = 1.654334e-11
> >
> >
> >
> >
> >
> >
> >
> >
> >
> >   Barry
> >
> >
> > > On Nov 15, 2017, at 4:24 PM, Kong, Fande <fande.k...@inl.gov> wrote:
> > >
> > >
> > >
> > > On Wed, Nov 15, 2017 at 2:52 PM, Smith, Barry F. <bsm...@mcs.anl.gov>
> wrote:
> > >
> > >
> > > > On Nov 15, 2017, at 3:36 PM, Kong, Fande <fande.k...@inl.gov> wrote:
> > > >
> > > > Hi Barry,
> > > >
> > > > Thanks for your reply. I was wondering why this happens only when we
> use superlu_dist. I am trying to understand the algorithm in superlu_dist.
> If we use ASM or MUMPS, we do not produce these differences.
> > > >
> > > > The differences actually are NOT meaningless.  In fact, we have a
> real transient application that presents this issue.   When we run the
> simulation with superlu_dist in parallel for thousands of time steps, the
> final physics  solution looks totally different from different runs. The
> differences are not acceptable any more.  For a steady problem, the
> difference may be meaningless. But it is significant for the transient
> problem.
> > >
> > >   I submit that the "physics solution" of all of these runs is equally
> right and equally wrong. If the solutions are very different due to a small
> perturbation than something is wrong with the model or the integrator, I
> don't think you can blame the linear solver (see below)
> > > >
> > > > This makes the solution not reprod

Re: [petsc-users] superlu_dist produces random results

2017-11-15 Thread Kong, Fande
On Wed, Nov 15, 2017 at 3:35 PM, Smith, Barry F. <bsm...@mcs.anl.gov> wrote:

>
>   Since the convergence labeled linear does not converge to 14 digits in
> one iteration I am assuming you are using lagged preconditioning and or
> lagged  Jacobian?
>

We are using Jacobian-free Newton. So Jacobian is different from the
preconditioning matrix.


>
>What happens if you do no lagging and solve each linear solve with a
> new LU factorization?
>

We have the following results without using Jacobian-free Newton. Again,
superlu_dist produces differences, while MUMPS gives the same results in
terms of the residual norms.


Fande,


Superlu_dist run1:

 0 Nonlinear |R| = 9.447423e+03
  0 Linear |R| = 9.447423e+03
  1 Linear |R| = 1.322285e-11
 1 Nonlinear |R| = 1.666987e-11


Superlu_dist run2:

 0 Nonlinear |R| = 9.447423e+03
  0 Linear |R| = 9.447423e+03
  1 Linear |R| = 1.322171e-11
 1 Nonlinear |R| = 1.666977e-11


Superlu_dist run3:

 0 Nonlinear |R| = 9.447423e+03
  0 Linear |R| = 9.447423e+03
  1 Linear |R| = 1.321964e-11
 1 Nonlinear |R| = 1.666959e-11


Superlu_dist run4:

 0 Nonlinear |R| = 9.447423e+03
  0 Linear |R| = 9.447423e+03
  1 Linear |R| = 1.321978e-11
 1 Nonlinear |R| = 1.668688e-11


MUMPS run1:

 0 Nonlinear |R| = 9.447423e+03
  0 Linear |R| = 9.447423e+03
  1 Linear |R| = 1.360637e-11
 1 Nonlinear |R| = 1.654334e-11

MUMPS run 2:

 0 Nonlinear |R| = 9.447423e+03
  0 Linear |R| = 9.447423e+03
  1 Linear |R| = 1.360637e-11
 1 Nonlinear |R| = 1.654334e-11

MUMPS run 3:

 0 Nonlinear |R| = 9.447423e+03
  0 Linear |R| = 9.447423e+03
  1 Linear |R| = 1.360637e-11
 1 Nonlinear |R| = 1.654334e-11

MUMPS run4:

 0 Nonlinear |R| = 9.447423e+03
  0 Linear |R| = 9.447423e+03
  1 Linear |R| = 1.360637e-11
 1 Nonlinear |R| = 1.654334e-11









>
>   Barry
>
>
> > On Nov 15, 2017, at 4:24 PM, Kong, Fande <fande.k...@inl.gov> wrote:
> >
> >
> >
> > On Wed, Nov 15, 2017 at 2:52 PM, Smith, Barry F. <bsm...@mcs.anl.gov>
> wrote:
> >
> >
> > > On Nov 15, 2017, at 3:36 PM, Kong, Fande <fande.k...@inl.gov> wrote:
> > >
> > > Hi Barry,
> > >
> > > Thanks for your reply. I was wondering why this happens only when we
> use superlu_dist. I am trying to understand the algorithm in superlu_dist.
> If we use ASM or MUMPS, we do not produce these differences.
> > >
> > > The differences actually are NOT meaningless.  In fact, we have a real
> transient application that presents this issue.   When we run the
> simulation with superlu_dist in parallel for thousands of time steps, the
> final physics  solution looks totally different from different runs. The
> differences are not acceptable any more.  For a steady problem, the
> difference may be meaningless. But it is significant for the transient
> problem.
> >
> >   I submit that the "physics solution" of all of these runs is equally
> right and equally wrong. If the solutions are very different due to a small
> perturbation than something is wrong with the model or the integrator, I
> don't think you can blame the linear solver (see below)
> > >
> > > This makes the solution not reproducible, and we can not even set a
> targeting solution in the test system because the solution is so different
> from one run to another.   I guess there might/may be a tiny bug in
> superlu_dist or the PETSc interface to superlu_dist.
> >
> >   This is possible but it is also possible this is due to normal round
> off inside of SuperLU dist.
> >
> >Since you have SuperLU_Dist inside a nonlinear iteration it shouldn't
> really matter exactly how well SuperLU_Dist does. The nonlinear iteration
> does essential defect correction for you; are you making sure that the
> nonlinear iteration always works for every timestep? For example confirm
> that SNESGetConvergedReason() is always positive.
> >
> > Definitely it could be something wrong on my side.  But let us focus on
> the simple question first.
> >
> > To make the discussion a little simpler, let us back to the simple
> problem (heat conduction).   Now I want to understand why this happens to
> superlu_dist only. When we are using ASM or MUMPS,  why we can not see the
> differences from one run to another?  I posted the residual histories for
> MUMPS and ASM.  We can not see any differences in terms of the residual
> norms when using MUMPS or ASM. Does superlu_dist have higher round off than
> other solvers?
> >
> >
> >
> > MUMPS run1:
> >
> >  0 Nonlinear |R| = 9.447423e+03
> >   0 Linear |R| = 9.447423e+03
> >   1 Linear |R| = 1.013384e-02
> >   2 Linear |R| = 4

Re: [petsc-users] superlu_dist produces random results

2017-11-15 Thread Kong, Fande
On Wed, Nov 15, 2017 at 2:52 PM, Smith, Barry F. <bsm...@mcs.anl.gov> wrote:

>
>
> > On Nov 15, 2017, at 3:36 PM, Kong, Fande <fande.k...@inl.gov> wrote:
> >
> > Hi Barry,
> >
> > Thanks for your reply. I was wondering why this happens only when we use
> superlu_dist. I am trying to understand the algorithm in superlu_dist. If
> we use ASM or MUMPS, we do not produce these differences.
> >
> > The differences actually are NOT meaningless.  In fact, we have a real
> transient application that presents this issue.   When we run the
> simulation with superlu_dist in parallel for thousands of time steps, the
> final physics  solution looks totally different from different runs. The
> differences are not acceptable any more.  For a steady problem, the
> difference may be meaningless. But it is significant for the transient
> problem.
>
>   I submit that the "physics solution" of all of these runs is equally
> right and equally wrong. If the solutions are very different due to a small
> perturbation than something is wrong with the model or the integrator, I
> don't think you can blame the linear solver (see below)
>
>
> > This makes the solution not reproducible, and we can not even set a
> targeting solution in the test system because the solution is so different
> from one run to another.   I guess there might/may be a tiny bug in
> superlu_dist or the PETSc interface to superlu_dist.
>
>   This is possible but it is also possible this is due to normal round off
> inside of SuperLU dist.
>
>Since you have SuperLU_Dist inside a nonlinear iteration it shouldn't
> really matter exactly how well SuperLU_Dist does. The nonlinear iteration
> does essential defect correction for you; are you making sure that the
> nonlinear iteration always works for every timestep? For example confirm
> that SNESGetConvergedReason() is always positive.
>

Definitely it could be something wrong on my side.  But let us focus on the
simple question first.

To make the discussion a little simpler, let us go back to the simple problem
(heat conduction).  Now I want to understand why this happens with
superlu_dist only. When we are using ASM or MUMPS, why do we not see any
differences from one run to another?  I posted the residual histories for
MUMPS and ASM; there are no differences in the residual norms with MUMPS or
ASM. Does superlu_dist have higher round-off than other solvers?



MUMPS run1:

 0 Nonlinear |R| = 9.447423e+03
  0 Linear |R| = 9.447423e+03
  1 Linear |R| = 1.013384e-02
  2 Linear |R| = 4.020993e-08
 1 Nonlinear |R| = 1.404678e-02
  0 Linear |R| = 1.404678e-02
  1 Linear |R| = 4.836162e-08
  2 Linear |R| = 7.055620e-14
 2 Nonlinear |R| = 4.836392e-08

MUMPS run2:

 0 Nonlinear |R| = 9.447423e+03
  0 Linear |R| = 9.447423e+03
  1 Linear |R| = 1.013384e-02
  2 Linear |R| = 4.020993e-08
 1 Nonlinear |R| = 1.404678e-02
  0 Linear |R| = 1.404678e-02
  1 Linear |R| = 4.836162e-08
  2 Linear |R| = 7.055620e-14
 2 Nonlinear |R| = 4.836392e-08

MUMPS run3:

 0 Nonlinear |R| = 9.447423e+03
  0 Linear |R| = 9.447423e+03
  1 Linear |R| = 1.013384e-02
  2 Linear |R| = 4.020993e-08
 1 Nonlinear |R| = 1.404678e-02
  0 Linear |R| = 1.404678e-02
  1 Linear |R| = 4.836162e-08
  2 Linear |R| = 7.055620e-14
 2 Nonlinear |R| = 4.836392e-08

MUMPS run4:

 0 Nonlinear |R| = 9.447423e+03
  0 Linear |R| = 9.447423e+03
  1 Linear |R| = 1.013384e-02
  2 Linear |R| = 4.020993e-08
 1 Nonlinear |R| = 1.404678e-02
  0 Linear |R| = 1.404678e-02
  1 Linear |R| = 4.836162e-08
  2 Linear |R| = 7.055620e-14
 2 Nonlinear |R| = 4.836392e-08



ASM run1:

 0 Nonlinear |R| = 9.447423e+03
  0 Linear |R| = 9.447423e+03
  1 Linear |R| = 6.189229e+03
  2 Linear |R| = 3.252487e+02
  3 Linear |R| = 3.485174e+01
  4 Linear |R| = 8.600695e+00
  5 Linear |R| = 3.333942e+00
  6 Linear |R| = 1.706112e+00
  7 Linear |R| = 5.047863e-01
  8 Linear |R| = 2.337297e-01
  9 Linear |R| = 1.071627e-01
 10 Linear |R| = 4.692177e-02
 11 Linear |R| = 1.340717e-02
 12 Linear |R| = 4.753951e-03
 1 Nonlinear |R| = 2.320271e-02
  0 Linear |R| = 2.320271e-02
  1 Linear |R| = 4.367880e-03
  2 Linear |R| = 1.407852e-03
  3 Linear |R| = 6.036360e-04
  4 Linear |R| = 1.867661e-04
  5 Linear |R| = 8.760076e-05
  6 Linear |R| = 3.260519e-05
  7 Linear |R| = 1.435418e-05
  8 Linear |R| = 4.532875e-06
  9 Linear |R| = 2.439053e-06
 10 Linear |R| = 7.998549e-07
 11 Linear |R| = 2.428064e-07
 12 Linear |R| = 4.766918e-08
 13 Linear |R| = 1.713748e-08
 2 Nonlinear |R| = 3.671573e-07


ASM run2:

 0 Nonlinear |R| = 9.447423e+03
  0 Linear |R| = 9.447423e+03
  1 Linear |R| = 6.189229e+03
  2 Linear |R

Re: [petsc-users] superlu_dist produces random results

2017-11-15 Thread Kong, Fande
Hi Barry,

Thanks for your reply. I was wondering why this happens only when we use
superlu_dist. I am trying to understand the algorithm in superlu_dist. If
we use ASM or MUMPS, we do not produce these differences.

The differences actually are NOT meaningless.  In fact, we have a real
transient application that presents this issue.   When we run the
simulation with superlu_dist in parallel for thousands of time steps, the
final physics  solution looks totally different from different runs. The
differences are not acceptable any more.  For a steady problem, the
difference may be meaningless. But it is significant for the transient
problem.

This makes the solution not reproducible, and we cannot even set a
target solution in the test system because the solution is so different
from one run to another.  I guess there may be a tiny bug in
superlu_dist or in the PETSc interface to superlu_dist.


Fande,




On Wed, Nov 15, 2017 at 1:59 PM, Smith, Barry F. <bsm...@mcs.anl.gov> wrote:

>
>   Meaningless differences
>
>
> > On Nov 15, 2017, at 2:26 PM, Kong, Fande <fande.k...@inl.gov> wrote:
> >
> > Hi,
> >
> > There is a heat conduction problem. When superlu_dist is used as a
> preconditioner, we have random results from different runs. Is there a
> random algorithm in superlu_dist? If we use ASM or MUMPS as the
> preconditioner, we then don't have this issue.
> >
> > run 1:
> >
> >  0 Nonlinear |R| = 9.447423e+03
> >   0 Linear |R| = 9.447423e+03
> >   1 Linear |R| = 1.013384e-02
> >   2 Linear |R| = 4.020995e-08
> >  1 Nonlinear |R| = 1.404678e-02
> >   0 Linear |R| = 1.404678e-02
> >   1 Linear |R| = 5.104757e-08
> >   2 Linear |R| = 7.699637e-14
> >  2 Nonlinear |R| = 5.106418e-08
> >
> >
> > run 2:
> >
> >  0 Nonlinear |R| = 9.447423e+03
> >   0 Linear |R| = 9.447423e+03
> >   1 Linear |R| = 1.013384e-02
> >   2 Linear |R| = 4.020995e-08
> >  1 Nonlinear |R| = 1.404678e-02
> >   0 Linear |R| = 1.404678e-02
> >   1 Linear |R| = 5.109913e-08
> >   2 Linear |R| = 7.189091e-14
> >  2 Nonlinear |R| = 5.111591e-08
> >
> > run 3:
> >
> >  0 Nonlinear |R| = 9.447423e+03
> >   0 Linear |R| = 9.447423e+03
> >   1 Linear |R| = 1.013384e-02
> >   2 Linear |R| = 4.020995e-08
> >  1 Nonlinear |R| = 1.404678e-02
> >   0 Linear |R| = 1.404678e-02
> >   1 Linear |R| = 5.104942e-08
> >   2 Linear |R| = 7.465572e-14
> >  2 Nonlinear |R| = 5.106642e-08
> >
> > run 4:
> >
> >  0 Nonlinear |R| = 9.447423e+03
> >   0 Linear |R| = 9.447423e+03
> >   1 Linear |R| = 1.013384e-02
> >   2 Linear |R| = 4.020995e-08
> >  1 Nonlinear |R| = 1.404678e-02
> >   0 Linear |R| = 1.404678e-02
> >   1 Linear |R| = 5.102730e-08
> >   2 Linear |R| = 7.132220e-14
> >  2 Nonlinear |R| = 5.104442e-08
> >
> > Solver details:
> >
> > SNES Object: 8 MPI processes
> >   type: newtonls
> >   maximum iterations=15, maximum function evaluations=1
> >   tolerances: relative=1e-08, absolute=1e-11, solution=1e-50
> >   total number of linear solver iterations=4
> >   total number of function evaluations=7
> >   norm schedule ALWAYS
> >   SNESLineSearch Object: 8 MPI processes
> > type: basic
> > maxstep=1.00e+08, minlambda=1.00e-12
> > tolerances: relative=1.00e-08, absolute=1.00e-15,
> lambda=1.00e-08
> > maximum iterations=40
> >   KSP Object: 8 MPI processes
> > type: gmres
> >   restart=30, using Classical (unmodified) Gram-Schmidt
> Orthogonalization with no iterative refinement
> >   happy breakdown tolerance 1e-30
> > maximum iterations=100, initial guess is zero
> > tolerances:  relative=1e-06, absolute=1e-50, divergence=1.
> > right preconditioning
> > using UNPRECONDITIONED norm type for convergence test
> >   PC Object: 8 MPI processes
> > type: lu
> >   out-of-place factorization
> >   tolerance for zero pivot 2.22045e-14
> >   matrix ordering: natural
> >   factor fill ratio given 0., needed 0.
> > Factored matrix follows:
> >   Mat Object: 8 MPI processes
> > type: superlu_dist
> > rows=7925, cols=7925
> > package used to perform factorization: superlu_dist
> > total: nonzeros=0, allocated nonzeros=0
> > total number of mallocs used during MatSetValues calls =0
> >   SuperLU_DIST run par

[petsc-users] superlu_dist produces random results

2017-11-15 Thread Kong, Fande
Hi,

There is a heat conduction problem. When superlu_dist is used as a
preconditioner, we have random results from different runs. Is there a
random algorithm in superlu_dist? If we use ASM or MUMPS as the
preconditioner, we then don't have this issue.
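
For reference, the three setups being compared differ only in the runtime options selecting the (sub)solver; roughly (exact option names depend on the PETSc version, so treat these as a sketch):

  -pc_type lu  -pc_factor_mat_solver_package superlu_dist   # the runs shown below
  -pc_type lu  -pc_factor_mat_solver_package mumps           # reproducible for us
  -pc_type asm -sub_pc_type ilu                              # reproducible for us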

run 1:

 0 Nonlinear |R| = 9.447423e+03
  0 Linear |R| = 9.447423e+03
  1 Linear |R| = 1.013384e-02
  2 Linear |R| = 4.020995e-08
 1 Nonlinear |R| = 1.404678e-02
  0 Linear |R| = 1.404678e-02
  1 Linear |R| = 5.104757e-08
  2 Linear |R| = 7.699637e-14
 2 Nonlinear |R| = 5.106418e-08


run 2:

 0 Nonlinear |R| = 9.447423e+03
  0 Linear |R| = 9.447423e+03
  1 Linear |R| = 1.013384e-02
  2 Linear |R| = 4.020995e-08
 1 Nonlinear |R| = 1.404678e-02
  0 Linear |R| = 1.404678e-02
  1 Linear |R| = 5.109913e-08
  2 Linear |R| = 7.189091e-14
 2 Nonlinear |R| = 5.111591e-08

run 3:

 0 Nonlinear |R| = 9.447423e+03
  0 Linear |R| = 9.447423e+03
  1 Linear |R| = 1.013384e-02
  2 Linear |R| = 4.020995e-08
 1 Nonlinear |R| = 1.404678e-02
  0 Linear |R| = 1.404678e-02
  1 Linear |R| = 5.104942e-08
  2 Linear |R| = 7.465572e-14
 2 Nonlinear |R| = 5.106642e-08

run 4:

 0 Nonlinear |R| = 9.447423e+03
  0 Linear |R| = 9.447423e+03
  1 Linear |R| = 1.013384e-02
  2 Linear |R| = 4.020995e-08
 1 Nonlinear |R| = 1.404678e-02
  0 Linear |R| = 1.404678e-02
  1 Linear |R| = 5.102730e-08
  2 Linear |R| = 7.132220e-14
 2 Nonlinear |R| = 5.104442e-08

Solver details:

SNES Object: 8 MPI processes
  type: newtonls
  maximum iterations=15, maximum function evaluations=1
  tolerances: relative=1e-08, absolute=1e-11, solution=1e-50
  total number of linear solver iterations=4
  total number of function evaluations=7
  norm schedule ALWAYS
  SNESLineSearch Object: 8 MPI processes
type: basic
maxstep=1.00e+08, minlambda=1.00e-12
tolerances: relative=1.00e-08, absolute=1.00e-15,
lambda=1.00e-08
maximum iterations=40
  KSP Object: 8 MPI processes
type: gmres
  restart=30, using Classical (unmodified) Gram-Schmidt
Orthogonalization with no iterative refinement
  happy breakdown tolerance 1e-30
maximum iterations=100, initial guess is zero
tolerances:  relative=1e-06, absolute=1e-50, divergence=1.
right preconditioning
using UNPRECONDITIONED norm type for convergence test
  PC Object: 8 MPI processes
type: lu
  out-of-place factorization
  tolerance for zero pivot 2.22045e-14
  matrix ordering: natural
  factor fill ratio given 0., needed 0.
Factored matrix follows:
  Mat Object: 8 MPI processes
type: superlu_dist
rows=7925, cols=7925
package used to perform factorization: superlu_dist
total: nonzeros=0, allocated nonzeros=0
total number of mallocs used during MatSetValues calls =0
  SuperLU_DIST run parameters:
Process grid nprow 4 x npcol 2
Equilibrate matrix TRUE
Matrix input mode 1
Replace tiny pivots FALSE
Use iterative refinement TRUE
Processors in row 4 col partition 2
Row permutation LargeDiag
Column permutation METIS_AT_PLUS_A
Parallel symbolic factorization FALSE
Repeated factorization SamePattern
linear system matrix followed by preconditioner matrix:
Mat Object: 8 MPI processes
  type: mffd
  rows=7925, cols=7925
Matrix-free approximation:
  err=1.49012e-08 (relative error in function evaluation)
  Using wp compute h routine
  Does not compute normU
Mat Object: () 8 MPI processes
  type: mpiaij
  rows=7925, cols=7925
  total: nonzeros=63587, allocated nonzeros=63865
  total number of mallocs used during MatSetValues calls =0
not using I-node (on process 0) routines


Fande,


Re: [petsc-users] configuration error

2017-10-30 Thread Kong, Fande
We had exactly the same issue when we upgraded compilers.  I guess this is
somehow related to gfortran.  A simple workaround for us is to change
"if with_rpath:" to "if False:" at line 54 of
config/BuildSystem/config/libraries.py.

Not sure if it works for you.
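
Roughly, the edit looks like this (the exact line number and surrounding context may differ between PETSc versions, so treat it as a sketch rather than an exact patch):

  --- a/config/BuildSystem/config/libraries.py
  +++ b/config/BuildSystem/config/libraries.py
  @@ (around line 54, where the rpath link flags are emitted)
  -    if with_rpath:
  +    if False: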

Fande,




On Mon, Oct 30, 2017 at 10:14 AM, Manav Bhatia 
wrote:

> Hi,
>
>   I am trying to install pets 3.8 on a new MacBook machine with OS 10.13.
> I have installed openmpi from macports and I am getting this error on
> configuration. Attached is also the configure.log file.
>
>   I am not sure how to proceed with this. Any advice will be greatly
> appreciated!
>
> Regards,
> Manav
>
> 
> ===
>  Configuring PETSc to compile on your system
>
> 
> ===
> ===
>
>
>   * WARNING: Using default optimization C flags -g -O3
>
>
> You might consider manually setting optimal optimization flags for your
> system with
>
>   COPTFLAGS="optimization flags" see config/examples/arch-*-opt.py for
> examples
>
>   
> ===
>
>
> ===
>
>
>   * WARNING: Using default C++ optimization flags -g -O3
>
>
> You might consider manually setting optimal optimization flags for your
> system with
>
>   CXXOPTFLAGS="optimization flags" see config/examples/arch-*-opt.py
> for examples
>
> 
> ===
>
>
> ===
>
>
>   * WARNING: Using default FORTRAN optimization flags -g -O
>
>
> You might consider manually setting optimal optimization flags for your
> system with
>
>   FOPTFLAGS="optimization flags" see config/examples/arch-*-opt.py for
> examples
>
>   
> ===
>
>
> ===
>
>
> WARNING! Compiling PETSc with no debugging, this should
>
>
> only be done for timing and production runs. All
> development should
>
> be done when configured using
> --with-debugging=1
>
>   ==
> =
>
>   TESTING: checkCLibraries
> from config.compilers(config/BuildSystem/config/compilers.py:171)
>
>
> 
> ***
>  UNABLE to CONFIGURE with GIVEN OPTIONS(see configure.log for
> details):
> 
> ---
> C libraries cannot directly be used from Fortran
> 
> ***
>
>
>
>
>
>


[petsc-users] "Must select a target sorting criterion if using shift-and-invert"

2017-10-20 Thread Kong, Fande
Hi All,

I am trying to solve a generalized eigenvalue problem (using SLEPc) with
"-eps_type krylovschur -st_type sinvert". I got an error message: "Must
select a target sorting criterion if using shift-and-invert".

Not sure how to proceed.  I do not quite understand this sentence.
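
In case it helps anyone hitting the same message: with shift-and-invert, SLEPc needs a target value to sort the eigenvalues against. As far as I understand (the shift value 0.0 below is just a placeholder), something like

  -eps_type krylovschur -st_type sinvert -eps_target 0.0 -eps_target_magnitude

or, in code,

  ierr = EPSSetWhichEigenpairs(eps, EPS_TARGET_MAGNITUDE);CHKERRQ(ierr);  /* sort by distance to the target */
  ierr = EPSSetTarget(eps, 0.0);CHKERRQ(ierr);                            /* shift sigma used by sinvert */

selects the eigenvalues closest to the shift instead of "smallest in magnitude".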

Fande,


Re: [petsc-users] Can not configure PETSc-master with clang-3.9

2017-10-16 Thread Kong, Fande
On Mon, Oct 16, 2017 at 12:07 PM, Satish Balay <ba...@mcs.anl.gov> wrote:

> BTW: Which clang are you using?
>
> mpicc -show
>



mpicc -show

clang -Wl,-commons,use_dylibs
-I/opt/moose/mpich/mpich-3.2/clang-opt/include
-L/opt/moose/mpich/mpich-3.2/clang-opt/lib -lmpi -lpmpi



> mpicc --version
>

mpicc -v

mpicc for MPICH version 3.2
clang version 3.9.0 (tags/RELEASE_390/final)
Target: x86_64-apple-darwin16.7.0
Thread model: posix
InstalledDir: /opt/moose/llvm-3.9.0/bin
clang-3.9: warning: argument unused during compilation: '-I
/opt/moose/mpich/mpich-3.2/clang-opt/include'


I guess it is because we are using a customized installation of clang.


Fande,


>
> Satish
>
> On Mon, 16 Oct 2017, Satish Balay wrote:
>
> > Thats weird.
> >
> > From what I can recall - some tools (like pgi compilers) need this -
> > but the xcode compilers do not.
> >
> > Basically xcode clang can pick up includes from the xcode specific
> > location - but other tools look for includes in /usr/incldue
> >
> > And 'xcode-select --install' adds the /usr/include etc links.
> >
> > Satish
> >
> >
> > On Mon, 16 Oct 2017, Kong, Fande wrote:
> >
> > > Now it is working. It turns out I need to do something like
> "xcode-select
> > > --install" after upgrading OS, and of course we need to agree the
> license.
> > >
> > >
> > > Fande,
> > >
> > > On Mon, Oct 16, 2017 at 10:58 AM, Richard Tran Mills <rtmi...@anl.gov>
> > > wrote:
> > >
> > > > Fande,
> > > >
> > > > Did you remember to agree to the XCode license after your upgrade,
> if you
> > > > did an XCode upgrade? You have to do the license agreement again,
> otherwise
> > > > the compilers don't work at all. Apologies if this seems like a
> silly thing
> > > > to ask, but this has caused me a few minutes of confusion before.
> > > >
> > > > --Richard
> > > >
> > > > On Mon, Oct 16, 2017 at 9:52 AM, Jed Brown <j...@jedbrown.org> wrote:
> > > >
> > > >> "Kong, Fande" <fande.k...@inl.gov> writes:
> > > >>
> > > >> > Hi All,
> > > >> >
> > > >> > I just upgraded  MAC OS, and also updated all other related
> packages.
> > > >> Now
> > > >> > I can not configure PETSc-master any more.
> > > >>
> > > >> Your compiler paths are broken.
> > > >>
> > > >> /var/folders/6q/y12qpzw12dg5qx5x96dd5_bhtzr4_y/T/petsc-
> > > >> mFgio7/config.setCompilers/conftest.c:3:10: fatal error:
> 'stdlib.h' file
> > > >> not found
> > > >> #include 
> > > >>  ^
> > > >> 1 error generated.
> > > >>
> > > >
> > > >
> > >
> >
> >
>
>


Re: [petsc-users] Can not configure PETSc-master with clang-3.9

2017-10-16 Thread Kong, Fande
Now it is working. It turns out I needed to do something like "xcode-select
--install" after upgrading the OS, and of course we need to agree to the license.


Fande,

On Mon, Oct 16, 2017 at 10:58 AM, Richard Tran Mills <rtmi...@anl.gov>
wrote:

> Fande,
>
> Did you remember to agree to the XCode license after your upgrade, if you
> did an XCode upgrade? You have to do the license agreement again, otherwise
> the compilers don't work at all. Apologies if this seems like a silly thing
> to ask, but this has caused me a few minutes of confusion before.
>
> --Richard
>
> On Mon, Oct 16, 2017 at 9:52 AM, Jed Brown <j...@jedbrown.org> wrote:
>
>> "Kong, Fande" <fande.k...@inl.gov> writes:
>>
>> > Hi All,
>> >
>> > I just upgraded  MAC OS, and also updated all other related packages.
>> Now
>> > I can not configure PETSc-master any more.
>>
>> Your compiler paths are broken.
>>
>> /var/folders/6q/y12qpzw12dg5qx5x96dd5_bhtzr4_y/T/petsc-
>> mFgio7/config.setCompilers/conftest.c:3:10: fatal error: 'stdlib.h' file
>> not found
>> #include 
>>  ^
>> 1 error generated.
>>
>
>


Re: [petsc-users] Using higher order ILU preconditioner

2017-09-28 Thread Kong, Fande
Calling PCFactorSetMatOrderingType() (or command line option:
-pc_factor_mat_ordering_type) to use RCM or 1WD usually helps me a lot. My
application is based on highly-unstructured meshes.
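
A minimal sketch of what I mean (the level and ordering below are examples to adapt, not a recommendation for this particular matrix):

  -pc_type ilu -pc_factor_levels 1 -pc_factor_mat_ordering_type rcm

or, equivalently in code,

  ierr = PCSetType(pc, PCILU);CHKERRQ(ierr);
  ierr = PCFactorSetLevels(pc, 1);CHKERRQ(ierr);
  ierr = PCFactorSetMatOrderingType(pc, MATORDERINGRCM);CHKERRQ(ierr);   /* or MATORDERING1WD */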

Fande,

On Thu, Sep 28, 2017 at 2:53 PM, Rachit Prasad  wrote:

> Hi,
>
> I am trying to solve a highly ill-conditioned matrix and am unable to get
> convergence when using ILU(0) or Jacobi and SOR preconditioners. I tried to
> implement a higher order ILU by increasing the level of fill-in to 1 by
> calling
>
> call PCFactorSetLevels(pc,1,ierr)
>
> However, when I do that I get the following memory error:
>
> [0]PETSC ERROR: - Error Message
> 
> [0]PETSC ERROR: Out of memory. This could be due to allocating
> [0]PETSC ERROR: too large an object or bleeding by not properly
> [0]PETSC ERROR: destroying unneeded objects.
> [0]PETSC ERROR: Memory allocated -2147483648 Memory used by process
> -2147483648
> [0]PETSC ERROR: Try running with -malloc_dump or -malloc_log for info.
> [0]PETSC ERROR: Memory requested 18446744071863943168!
>
> While I do understand that increasing the level of fill-in will require
> higher memory, the memory requested seems to be way too high. Am I
> implementing the higher order ILU correctly? Is there any other subroutine
> which I need to call?
>
> Regards,
> Rachit
>
>


Re: [petsc-users] SNES ex12 visualization

2017-09-14 Thread Kong, Fande
On Thu, Sep 14, 2017 at 11:26 AM, Matthew Knepley <knep...@gmail.com> wrote:

> On Thu, Sep 14, 2017 at 1:07 PM, Kong, Fande <fande.k...@inl.gov> wrote:
>
>>
>>
>> On Thu, Sep 14, 2017 at 10:35 AM, Barry Smith <bsm...@mcs.anl.gov> wrote:
>>
>>>
>>> > On Sep 14, 2017, at 11:10 AM, Kong, Fande <fande.k...@inl.gov> wrote:
>>> >
>>> >
>>> >
>>> > On Thu, Sep 14, 2017 at 9:47 AM, Matthew Knepley <knep...@gmail.com>
>>> wrote:
>>> > On Thu, Sep 14, 2017 at 11:43 AM, Adriano Côrtes <
>>> adrimacor...@gmail.com> wrote:
>>> > Dear Matthew,
>>> >
>>> > Thank you for your return. It worked, but this prompts another
>>> question. So why PetscViewer does not write both files (.h5 and .xmf)
>>> directly, instead of having to post-proc the .h5 file (in serial)?
>>> >
>>> > 1) Maintenance: Changing the Python is much easier than changing the C
>>> you would add to generate it
>>> >
>>> > 2) Performance: On big parallel system, writing files is expensive so
>>> I wanted to minimize what I had to do.
>>> >
>>> > 3) Robustness: Moving 1 file around is much easier than remembering 2.
>>> I just always regenerate the xdmf when needed.
>>> >
>>> > And what about big 3D simulations? PETSc always serialize the output
>>> of the distributed dmplex? Is there a way to output one .h5 per mesh
>>> partition?
>>> >
>>> > Given the way I/O is structured on big machines, we believe the
>>> multiple file route is a huge mistake. Also, all our measurements
>>> > say that sending some data on the network is not noticeable given the
>>> disk access costs.
>>> >
>>> > I have slightly different things here. We tried the serial output, it
>>> looks really slow for large-scale problems, and the first processor often
>>> runs out of memory because of gathering all data from other processor cores.
>>>
>>>   Where in PETSc is this?  What type of viewer? Is there an example that
>>> reproduces the problem? Even when we do not use MPI IO in PETSc we attempt
>>> to not "put the entire object on the first process" so memory should not be
>>> an issue. For example VecVew() should memory scale both with or without MPI
>>> IO
>>>
>>
>> We manually gather all data to the first processor core, and write it as
>> a single vtk file.
>>
>
> Of course I am not doing that. I reduce everything to an ISView or a
> VecView call. That way it uses MPI I/O if its turned on.
>

I meant Fande manually gathers  all data to the first processor core in his
in-house code.


>
>Matt
>
>
>>
>>>
>>> > The parallel IO runs smoothly and much faster than I excepted. We have
>>> done experiments with ten thousands  of cores for a problem with 1 billion
>>> of unknowns.
>>>
>>> Is this your own canned IO or something in PETSc?
>>>
>>
>> We implement the writer based on the ISView and VecView with HDF5 viewer
>>  in PETSc to output all data as a single HDF. ISView and VecView do the
>> magic job for me.
>>
>>
>>
>>>
>>> > I did not see any concern so far.
>>>
>>>Ten thousand files is possibly manageable but I question 2 million.
>>>
>>
>> Just one single HDF5 file.
>>
>> Fande,
>>
>>
>>>
>>> >
>>> >
>>> > Fande,
>>> >
>>> >
>>> >   Thanks,
>>> >
>>> > Matt
>>> >
>>> > Best regards,
>>> > Adriano.
>>> >
>>> >
>>> > 2017-09-14 12:00 GMT-03:00 Matthew Knepley <knep...@gmail.com>:
>>> > On Thu, Sep 14, 2017 at 10:30 AM, Adriano Côrtes <
>>> adrimacor...@gmail.com> wrote:
>>> > Dear all,
>>> >
>>> > I am running the SNES ex12  and I'm passing the options -dm_view
>>> hdf5:sol.h5 -vec_view hdf5:sol.h5::append to generate an output file. The
>>> .h5 file is generated, but I'm not being able to load it in Paraview
>>> (5.4.0-64bits). Paraview recognizes the file and offers severel options to
>>> read it, here is the complete list
>>> >
>>> > Chombo Files
>>> > GTC Files
>>> > M3DC1 Files
>>> > Multilevel 3D Plasma Files
>>> > PFLOTRAN Files
>>

Re: [petsc-users] SNES ex12 visualization

2017-09-14 Thread Kong, Fande
On Thu, Sep 14, 2017 at 10:35 AM, Barry Smith <bsm...@mcs.anl.gov> wrote:

>
> > On Sep 14, 2017, at 11:10 AM, Kong, Fande <fande.k...@inl.gov> wrote:
> >
> >
> >
> > On Thu, Sep 14, 2017 at 9:47 AM, Matthew Knepley <knep...@gmail.com>
> wrote:
> > On Thu, Sep 14, 2017 at 11:43 AM, Adriano Côrtes <adrimacor...@gmail.com>
> wrote:
> > Dear Matthew,
> >
> > Thank you for your return. It worked, but this prompts another question.
> So why PetscViewer does not write both files (.h5 and .xmf) directly,
> instead of having to post-proc the .h5 file (in serial)?
> >
> > 1) Maintenance: Changing the Python is much easier than changing the C
> you would add to generate it
> >
> > 2) Performance: On big parallel system, writing files is expensive so I
> wanted to minimize what I had to do.
> >
> > 3) Robustness: Moving 1 file around is much easier than remembering 2. I
> just always regenerate the xdmf when needed.
> >
> > And what about big 3D simulations? PETSc always serialize the output of
> the distributed dmplex? Is there a way to output one .h5 per mesh partition?
> >
> > Given the way I/O is structured on big machines, we believe the multiple
> file route is a huge mistake. Also, all our measurements
> > say that sending some data on the network is not noticeable given the
> disk access costs.
> >
> > I have slightly different things here. We tried the serial output, it
> looks really slow for large-scale problems, and the first processor often
> runs out of memory because of gathering all data from other processor cores.
>
>   Where in PETSc is this?  What type of viewer? Is there an example that
> reproduces the problem? Even when we do not use MPI IO in PETSc we attempt
> to not "put the entire object on the first process" so memory should not be
> an issue. For example VecVew() should memory scale both with or without MPI
> IO
>

We manually gather all data to the first processor core, and write it as a
single vtk file.


>
>
> > The parallel IO runs smoothly and much faster than I excepted. We have
> done experiments with ten thousands  of cores for a problem with 1 billion
> of unknowns.
>
> Is this your own canned IO or something in PETSc?
>

We implemented the writer based on ISView and VecView with the HDF5 viewer
in PETSc to output all data as a single HDF5 file. ISView and VecView do the
magic job for me.
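
The core of such a writer is small; a hedged sketch (the object names and file name are made up for illustration) looks like:

  PetscViewer viewer;

  ierr = PetscViewerHDF5Open(PETSC_COMM_WORLD, "solution.h5", FILE_MODE_WRITE, &viewer);CHKERRQ(ierr);
  ierr = PetscObjectSetName((PetscObject)u, "solution");CHKERRQ(ierr);   /* dataset name inside the HDF5 file */
  ierr = VecView(u, viewer);CHKERRQ(ierr);                               /* collective, parallel write of the distributed Vec */
  ierr = ISView(partition, viewer);CHKERRQ(ierr);                        /* index sets can go into the same file */
  ierr = PetscViewerDestroy(&viewer);CHKERRQ(ierr);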



>
> > I did not see any concern so far.
>
>Ten thousand files is possibly manageable but I question 2 million.
>

Just one single HDF5 file.

Fande,


>
> >
> >
> > Fande,
> >
> >
> >   Thanks,
> >
> > Matt
> >
> > Best regards,
> > Adriano.
> >
> >
> > 2017-09-14 12:00 GMT-03:00 Matthew Knepley <knep...@gmail.com>:
> > On Thu, Sep 14, 2017 at 10:30 AM, Adriano Côrtes <adrimacor...@gmail.com>
> wrote:
> > Dear all,
> >
> > I am running the SNES ex12  and I'm passing the options -dm_view
> hdf5:sol.h5 -vec_view hdf5:sol.h5::append to generate an output file. The
> .h5 file is generated, but I'm not being able to load it in Paraview
> (5.4.0-64bits). Paraview recognizes the file and offers severel options to
> read it, here is the complete list
> >
> > Chombo Files
> > GTC Files
> > M3DC1 Files
> > Multilevel 3D Plasma Files
> > PFLOTRAN Files
> > Pixie Files
> > Tetrad Files
> > UNIC Files
> > VizSchema Files
> >
> > The problem is none of the options above work :(
> > I'm using the configure option '-download-hdf5' and it installs hdf5
> version 1.8.18
> > Any hint of how to fix it and have the visualization working?
> >
> > Yes, Paraview does not directly read HDF5. It needs you to tell it what
> the data in the HDF5 file means. You do
> > this by creating a *.xdmf file, which is XML. We provide a tool
> >
> >   $PETSC_DIR/bin/petsc_gen_xdmf.py 
> >
> > which should automatically produce this file for you. Let us know if it
> does not work.
> >
> >   Thanks,
> >
> > Matt
> >
> >
> > Best regards,
> > Adriano.
> >
> > --
> > Adriano Côrtes
> > =
> > Campus Duque de Caxias and
> > High-performance Computing Center (NACAD/COPPE)
> > Federal University of Rio de Janeiro (UFRJ)
> >
> >
> >
> > --
> > What most experimenters take for granted before they begin their
> experiments is infinitely more interesting than any results to which their
> experiments lead.
> > -- Nor

Re: [petsc-users] SNES ex12 visualization

2017-09-14 Thread Kong, Fande
On Thu, Sep 14, 2017 at 9:47 AM, Matthew Knepley  wrote:

> On Thu, Sep 14, 2017 at 11:43 AM, Adriano Côrtes 
> wrote:
>
>> Dear Matthew,
>>
>> Thank you for your return. It worked, but this prompts another question.
>> So why PetscViewer does not write both files (.h5 and .xmf) directly,
>> instead of having to post-proc the .h5 file (in serial)?
>>
>
> 1) Maintenance: Changing the Python is much easier than changing the C you
> would add to generate it
>
> 2) Performance: On big parallel system, writing files is expensive so I
> wanted to minimize what I had to do.
>
> 3) Robustness: Moving 1 file around is much easier than remembering 2. I
> just always regenerate the xdmf when needed.
>
>
>> And what about big 3D simulations? PETSc always serialize the output of
>> the distributed dmplex? Is there a way to output one .h5 per mesh
>> partition?
>>
>
> Given the way I/O is structured on big machines, we believe the multiple
> file route is a huge mistake. Also, all our measurements
> say that sending some data on the network is not noticeable given the disk
> access costs.
>

My experience is slightly different. We tried the serial output; it is really
slow for large-scale problems, and the first processor often runs out of
memory because it gathers all data from the other processor cores. The
parallel IO runs smoothly and much faster than I expected. We have done
experiments with tens of thousands of cores for a problem with 1 billion
unknowns. I have not seen any concern so far.


Fande,


>
>   Thanks,
>
> Matt
>
>
>> Best regards,
>> Adriano.
>>
>>
>> 2017-09-14 12:00 GMT-03:00 Matthew Knepley :
>>
>>> On Thu, Sep 14, 2017 at 10:30 AM, Adriano Côrtes >> > wrote:
>>>
 Dear all,

 I am running the SNES ex12  and I'm passing the options -dm_view
 hdf5:sol.h5 -vec_view hdf5:sol.h5::append to generate an output file. The
 .h5 file is generated, but I'm not being able to load it in Paraview
 (5.4.0-64bits). Paraview recognizes the file and offers severel options to
 read it, here is the complete list

 Chombo Files
 GTC Files
 M3DC1 Files
 Multilevel 3D Plasma Files
 PFLOTRAN Files
 Pixie Files
 Tetrad Files
 UNIC Files
 VizSchema Files

 The problem is none of the options above work :(
 I'm using the configure option '-download-hdf5' and it installs hdf5
 version 1.8.18
 Any hint of how to fix it and have the visualization working?

>>>
>>> Yes, Paraview does not directly read HDF5. It needs you to tell it what
>>> the data in the HDF5 file means. You do
>>> this by creating a *.xdmf file, which is XML. We provide a tool
>>>
>>>   $PETSC_DIR/bin/petsc_gen_xdmf.py 
>>>
>>> which should automatically produce this file for you. Let us know if it
>>> does not work.
>>>
>>>   Thanks,
>>>
>>> Matt
>>>
>>>

 Best regards,
 Adriano.

 --
 Adriano Côrtes
 =
 *Campus Duque de Caxias and*
 *High-performance Computing Center (NACAD/COPPE)*
 Federal University of Rio de Janeiro (UFRJ)

>>>
>>>
>>>
>>> --
>>> What most experimenters take for granted before they begin their
>>> experiments is infinitely more interesting than any results to which their
>>> experiments lead.
>>> -- Norbert Wiener
>>>
>>> http://www.caam.rice.edu/~mk51/
>>> 
>>>
>>
>>
>>
>> --
>> Adriano Côrtes
>> =
>> *Campus Duque de Caxias and*
>> *High-performance Computing Center (NACAD/COPPE)*
>> Federal University of Rio de Janeiro (UFRJ)
>>
>
>
>
> --
> What most experimenters take for granted before they begin their
> experiments is infinitely more interesting than any results to which their
> experiments lead.
> -- Norbert Wiener
>
> http://www.caam.rice.edu/~mk51/
> 
>


Re: [petsc-users] EPS monitor

2017-08-16 Thread Kong, Fande
Solver details:

EPS Object: 1 MPI processes
  type: jd
search subspace is orthogonalized
block size=1
type of the initial subspace: non-Krylov
size of the subspace after restarting: 6
number of vectors after restarting from the previous iteration: 1
  problem type: generalized non-symmetric eigenvalue problem
  extraction type: harmonic Ritz
  selected portion of the spectrum: smallest eigenvalues in magnitude
  number of eigenvalues (nev): 1
  number of column vectors (ncv): 18
  maximum dimension of projected problem (mpd): 18
  maximum number of iterations: 1
  tolerance: 0.0001
  convergence test: relative to the eigenvalue
BV Object: 1 MPI processes
  type: svec
  18 columns of global length 4225
  vector orthogonalization method: classical Gram-Schmidt
  orthogonalization refinement: if needed (eta: 0.7071)
  block orthogonalization method: Gram-Schmidt
  doing matmult as a single matrix-matrix product
DS Object: 1 MPI processes
  type: gnhep
ST Object: 1 MPI processes
  type: precond
  shift: 0.
  number of matrices: 2
  all matrices have different nonzero pattern
  KSP Object: (st_) 1 MPI processes
type: bcgsl
  Ell = 2
  Delta = 0
maximum iterations=1, initial guess is zero
tolerances:  relative=1e-05, absolute=1e-50, divergence=1.
left preconditioning
using PRECONDITIONED norm type for convergence test
  PC Object: (st_) 1 MPI processes
type: jacobi
linear system matrix = precond matrix:
Mat Object: () 1 MPI processes
  type: seqaij
  rows=4225, cols=4225
  total: nonzeros=37249, allocated nonzeros=37249
  total number of mallocs used during MatSetValues calls =0
not using I-node routines

On Wed, Aug 16, 2017 at 2:12 PM, Jose E. Roman <jro...@dsic.upv.es> wrote:

>
> > El 16 ago 2017, a las 21:36, Kong, Fande <fande.k...@inl.gov> escribió:
> >
> > Hi All,
> >
> > How to understand the following messages:
> >
> >   1 EPS nconv=0 first unconverged value (error) 2.06312 (3.29164033e-01)
> >   2 EPS nconv=0 first unconverged value (error) 2.03951 (1.76223074e-01)
> >   3 EPS nconv=0 first unconverged value (error) 2.01177 (5.71109559e-02)
> >   4 EPS nconv=0 first unconverged value (error) 2.01042 (4.84609300e-02)
> >   5 EPS nconv=0 first unconverged value (error) 2.00708 (3.19917457e-02)
> >   6 EPS nconv=0 first unconverged value (error) 2.00595 (2.62792109e-02)
> >   7 EPS nconv=0 first unconverged value (error) 2.00504 (2.13766150e-02)
> >   8 EPS nconv=0 first unconverged value (error) 2.00441 (1.85066774e-02)
> >   9 EPS nconv=0 first unconverged value (error) 2.00397 (1.73188449e-02)
> >  10 EPS nconv=0 first unconverged value (error) 2.00366 (1.54528517e-02)
> >  11 EPS nconv=0 first unconverged value (error) 2.00339 (1.32215899e-02)
> >  12 EPS nconv=0 first unconverged value (error) 2.00316 (1.32215899e-02)
> >  13 EPS nconv=0 first unconverged value (error) 2.00316 (1.17928920e-02)
> >  14 EPS nconv=0 first unconverged value (error) 2.00297 (1.04964387e-02)
> >  15 EPS nconv=0 first unconverged value (error) 2.0028 (9.58244972e-03)
> >  16 EPS nconv=0 first unconverged value (error) 2.00268 (9.06634973e-03)
> >  17 EPS nconv=0 first unconverged value (error) 2.00198 (3.4341e-04)
> >  18 EPS nconv=1 first unconverged value (error) 2.25718 (1.79769313e+308)
> >  18 EPS converged value (error) #0 2.00197 (5.69451918e-09)
> >
> >
> > When the solver converged, the wrong eigenvalue and the wrong residual
> are printed out. Do we design like this way?
> >
> > Fande,
>
> Is this the POWER solver? Most solvers in EPS approximate several
> eigenvalues simultaneously, but this is not the case in POWER - when one
> eigenvalue converges there is no approximation available for the next one.
>
> I will think about a simple fix.
>
> Jose
>
>


[petsc-users] EPS monitor

2017-08-16 Thread Kong, Fande
Hi All,

How should I interpret the following messages:

  1 EPS nconv=0 first unconverged value (error) 2.06312 (3.29164033e-01)
  2 EPS nconv=0 first unconverged value (error) 2.03951 (1.76223074e-01)
  3 EPS nconv=0 first unconverged value (error) 2.01177 (5.71109559e-02)
  4 EPS nconv=0 first unconverged value (error) 2.01042 (4.84609300e-02)
  5 EPS nconv=0 first unconverged value (error) 2.00708 (3.19917457e-02)
  6 EPS nconv=0 first unconverged value (error) 2.00595 (2.62792109e-02)
  7 EPS nconv=0 first unconverged value (error) 2.00504 (2.13766150e-02)
  8 EPS nconv=0 first unconverged value (error) 2.00441 (1.85066774e-02)
  9 EPS nconv=0 first unconverged value (error) 2.00397 (1.73188449e-02)
 10 EPS nconv=0 first unconverged value (error) 2.00366 (1.54528517e-02)
 11 EPS nconv=0 first unconverged value (error) 2.00339 (1.32215899e-02)
 12 EPS nconv=0 first unconverged value (error) 2.00316 (1.32215899e-02)
 13 EPS nconv=0 first unconverged value (error) 2.00316 (1.17928920e-02)
 14 EPS nconv=0 first unconverged value (error) 2.00297 (1.04964387e-02)
 15 EPS nconv=0 first unconverged value (error) 2.0028 (9.58244972e-03)
 16 EPS nconv=0 first unconverged value (error) 2.00268 (9.06634973e-03)
 17 EPS nconv=0 first unconverged value (error) 2.00198 (3.4341e-04)
 18 EPS nconv=1 first unconverged value (error) 2.25718 (1.79769313e+308)
 18 EPS converged value (error) #0 2.00197 (5.69451918e-09)


When the solver converged, the wrong eigenvalue and the wrong residual are
printed out (the "first unconverged value" line at iteration 18). Is it
designed this way?

Fande,


Re: [petsc-users] slepc trap for large matrix

2017-06-07 Thread Kong, Fande
On Wed, Jun 7, 2017 at 8:37 AM, Kannan, Ramakrishnan 
wrote:

> Barry,
>
> Thanks for the kind response. I am building slepc 3.7.3 and when I
> configure –with-64-bit-indices=1, I am getting the following error.
>
> ./configure --with-64-bit-indices=1 --prefix=/lustre/atlas/proj-
> shared/csc209/ramki/slepc
> ERROR: Invalid arguments --with-64-bit-indices=1
> Use -h for help
>

I think you need to pass "--with-64-bit-indices=1" to the PETSc configure
(not the SLEPc one).
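
For example (just a sketch; the prefix and any other options are
placeholders), configure and build PETSc with 64-bit indices first, then
build SLEPc against that PETSc:

  cd $PETSC_DIR
  ./configure --with-64-bit-indices=1 [your other PETSc options]
  make all
  cd $SLEPC_DIR
  ./configure --prefix=/lustre/atlas/proj-shared/csc209/ramki/slepc
  make
  make install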

Fande,


> When I run ./configure –h, I am getting the following options. Let me know
> if I am missing something.
>
> SLEPc Configure Help
> 
> 
> SLEPc:
>   --with-clean=  : Delete prior build files including
> externalpackages
>   --with-cmake=  : Enable builds with CMake (disabled by
> default)
>   --prefix=   : Specify location to install SLEPc (e.g.,
> /usr/local)
>   --DATAFILESPATH=: Specify location of datafiles (for SLEPc
> developers)
> ARPACK:
>   --download-arpack[=]  : Download and install ARPACK in SLEPc
> directory
>   --with-arpack= : Indicate if you wish to test for ARPACK
>   --with-arpack-dir=  : Indicate the directory for ARPACK
> libraries
>   --with-arpack-flags=  : Indicate comma-separated flags for
> linking ARPACK
> BLOPEX:
>   --download-blopex[=]  : Download and install BLOPEX in SLEPc
> directory
> BLZPACK:
>   --with-blzpack=: Indicate if you wish to test for BLZPACK
>   --with-blzpack-dir= : Indicate the directory for BLZPACK
> libraries
>   --with-blzpack-flags= : Indicate comma-separated flags for
> linking BLZPACK
> FEAST:
>   --with-feast=  : Indicate if you wish to test for FEAST
>   --with-feast-dir=   : Indicate the directory for FEAST libraries
>   --with-feast-flags=   : Indicate comma-separated flags for
> linking FEAST
> PRIMME:
>   --download-primme[=]  : Download and install PRIMME in SLEPc
> directory
>   --with-primme= : Indicate if you wish to test for PRIMME
>   --with-primme-dir=  : Indicate the directory for PRIMME
> libraries
>   --with-primme-flags=  : Indicate comma-separated flags for
> linking PRIMME
> TRLAN:
>   --download-trlan[=]   : Download and install TRLAN in SLEPc
> directory
>   --with-trlan=  : Indicate if you wish to test for TRLAN
>   --with-trlan-dir=   : Indicate the directory for TRLAN libraries
>   --with-trlan-flags=   : Indicate comma-separated flags for
> linking TRLAN
> SOWING:
>   --download-sowing[=]  : Download and install SOWING in SLEPc
> directory
>
> --
> Regards,
> Ramki
>
>
> On 6/6/17, 9:06 PM, "Barry Smith"  wrote:
>
>
>   The resulting matrix has something like
>
> >>> 11808*11808*1.e-6
> 14,399,953,920.036863
>
>   nonzero entries. It is possible that some integer operations are
> overflowing since C int can only go up to about 4 billion before
> overflowing.
>
>   You can building with a different PETSC_ARCH value using the
> additional ./configure option for PETSc of --with-64-bit-indices and see if
> the problem is resolved.
>
>Barry
>
>
> > On Jun 5, 2017, at 12:37 PM, Kannan, Ramakrishnan 
> wrote:
> >
> > I am running EPS for NHEP on a matrix of size 11808x11808
> and I am experiencing the attached trapped. This is a 1D row distributed
> sparse uniform random matrix with 1e-6 sparsity  over 36 processors. It
> works fine for smaller matrices of sizes with 1.2 million x 1.2 million.
> Let me know if you are looking for more information.
> >
> > --
> > Regards,
> > Ramki
> >
> > 
>
>
>
>
>


[petsc-users] log_view for the master branch

2017-05-03 Thread Kong, Fande
Hi,

I am using the current master branch. -log_view gives me the summary below,
and the "WARNING" box is repeated three times. Is this intentional?

Thanks,

Fande,



*** WIDEN YOUR WINDOW TO 120 CHARACTERS.  Use 'enscript -r
-fCourier9' to print this document***


-- PETSc Performance Summary:
--



  ##
  ##
  #  WARNING!!!#
  ##
  #   This code was compiled with a debugging option,  #
  #   To get timing results run ./configure#
  #   using --with-debugging=no, the performance will  #
  #   be generally two or three times faster.  #
  ##
  ##


./ex29 on a arch-darwin-c-debug-master named FN604208 with 1 processor, by
kongf Wed May  3 12:28:23 2017
Using Petsc Development GIT revision: v3.7.6-3529-g76c7fe0  GIT Date:
2017-05-03 08:46:23 -0500

 Max   Max/MinAvg  Total
Time (sec):   1.350e-02  1.0   1.350e-02
Objects:  4.100e+01  1.0   4.100e+01
Flop: 3.040e+02  1.0   3.040e+02  3.040e+02
Flop/sec:2.251e+04  1.0   2.251e+04  2.251e+04
Memory:   1.576e+05  1.0  1.576e+05
MPI Messages: 0.000e+00  0.0   0.000e+00  0.000e+00
MPI Message Lengths:  0.000e+00  0.0   0.000e+00  0.000e+00
MPI Reductions:   0.000e+00  0.0

Flop counting convention: 1 flop = 1 real number operation of type
(multiply/divide/add/subtract)
e.g., VecAXPY() for real vectors of length N
--> 2N flop
and VecAXPY() for complex vectors of length N
--> 8N flop

Summary of Stages:   - Time --  - Flop -  --- Messages ---
-- Message Lengths --  -- Reductions --
Avg %Total Avg %Total   counts
%Total Avg %Total   counts   %Total
 0:  Main Stage: 1.3483e-02  99.8%  3.0400e+02 100.0%  0.000e+00
0.0%  0.000e+000.0%  0.000e+00   0.0%


See the 'Profiling' chapter of the users' manual for details on
interpreting output.
Phase summary info:
   Count: number of times phase was executed
   Time and Flop: Max - maximum over all processors
   Ratio - ratio of maximum to minimum over all processors
   Mess: number of messages sent
   Avg. len: average message length (bytes)
   Reduct: number of global reductions
   Global: entire computation
   Stage: stages of a computation. Set stages with PetscLogStagePush() and
PetscLogStagePop().
  %T - percent time in this phase %F - percent flop in this
phase
  %M - percent messages in this phase %L - percent message lengths
in this phase
  %R - percent reductions in this phase
   Total Mflop/s: 10e-6 * (sum of flop over all processors)/(max time over
all processors)



  ##
  ##
  #  WARNING!!!#
  ##
  #   This code was compiled with a debugging option,  #
  #   To get timing results run ./configure#
  #   using --with-debugging=no, the performance will  #
  #   be generally two or three times faster.  #
  ##
  ##


EventCount  Time (sec)
Flop --- Global ---  --- Stage ---   Total
   Max Ratio  Max Ratio   Max  Ratio  Mess   Avg len
Reduct  %T %F %M %L %R  %T %F %M %L %R Mflop/s


--- Event Stage 0: Main Stage

KSPGMRESOrthog 1 1.0 1.3617e-04 1.0 3.50e+01 1.0 0.0e+00 0.0e+00
0.0e+00  1 12  0  0  0   1 12  0  0  0 0
KSPSetUp   1 1.0 4.1097e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00

Re: [petsc-users] misleading "mpich" messages

2017-04-25 Thread Kong, Fande
Thanks, Barry and Satish,

It makes sense.

Fande,

On Tue, Apr 25, 2017 at 4:33 PM, Barry Smith <bsm...@mcs.anl.gov> wrote:

>
> > On Apr 25, 2017, at 5:08 PM, Kong, Fande <fande.k...@inl.gov> wrote:
> >
> > Thanks, Satish,
> >
> > One more question: will petsc complain different versions of other
> implementations such as intel MPI and IBM MPI? For example, configure with
> a version of intel MPI, and compile with another version of intel MPI. Do
> we have error messages on this?
>
>The compile time checking is in include/petscsys.h so you can easily
> see what we do do. As Satish says we can try to add more cases one at a
> time if we know unique macros used in particular mpi.h but with many cases
> the code will become messy unless there is a pattern we can organize around.
>
>
>
>
> >
> > Fande,
> >
> > On Tue, Apr 25, 2017 at 4:03 PM, Satish Balay <ba...@mcs.anl.gov> wrote:
> > Added this patch to balay/add-mvapich-version-check
> >
> > Satish
> >
> > On Tue, 25 Apr 2017, Satish Balay wrote:
> >
> > > You can try the attached [untested] patch. It replicates the
> > > MPICH_NUMVERSION code and replaces it with MVAPICH2_NUMVERSION
> > >
> > > Satish
> > >
> > > On Tue, 25 Apr 2017, Kong, Fande wrote:
> > >
> > > > On Tue, Apr 25, 2017 at 3:42 PM, Barry Smith <bsm...@mcs.anl.gov>
> wrote:
> > > >
> > > > >
> > > > >The error message is generated based on the macro
> MPICH_NUMVERSION
> > > > > contained in the mpi.h file.
> > > > >
> > > > > Apparently MVAPICH also provides this macro, hence PETSc has no
> way to
> > > > > know that it is not MPICH.
> > > > >
> > > > > If you can locate in the MVAPICH mpi.h include files macros related
> > > > > explicitly to MVAPICH then we could possibly use that macro to
> provide a
> > > > > more specific error message.
> > > > >
> > > >
> > > >
> > > > There is also a macro: MVAPICH2_NUMVERSION in mpi.h. We might use it
> to
> > > > have a right message.
> > > >
> > > > Looks possible for me.
> > > >
> > > > Fande,
> > > >
> > > >
> > > >
> > > > >
> > > > >   Barry
> > > > >
> > > > >
> > > > > > On Apr 25, 2017, at 4:35 PM, Kong, Fande <fande.k...@inl.gov>
> wrote:
> > > > > >
> > > > > > Hi,
> > > > > >
> > > > > > We configured PETSc with a version of MVAPICH, and complied with
> another
> > > > > version of MVAPICH.  Got the error messages:
> > > > > >
> > > > > > error "PETSc was configured with one MPICH mpi.h version but now
> appears
> > > > > to be compiling using a different MPICH mpi.h version"
> > > > > >
> > > > > >
> > > > > > Why we could not say something about "MVAPICH" (not "MPICH")?
> > > > > >
> > > > > > Do we just simply consider all MPI implementations (MVAPICH,
> maybe Intel
> > > > > MPI, IBM mpi?) based on MPICH as "MPICH"?
> > > > > >
> > > > > > Fande,
> > > > >
> > > > >
> > > >
> > >
> >
> >
>
>


Re: [petsc-users] misleading "mpich" messages

2017-04-25 Thread Kong, Fande
Thanks, Satish,

One more question: will PETSc complain about mismatched versions of other
implementations such as Intel MPI and IBM MPI? For example, configure with
one version of Intel MPI and compile with another version of Intel MPI. Do
we have error messages for that case?

Fande,

On Tue, Apr 25, 2017 at 4:03 PM, Satish Balay <ba...@mcs.anl.gov> wrote:

> Added this patch to balay/add-mvapich-version-check
>
> Satish
>
> On Tue, 25 Apr 2017, Satish Balay wrote:
>
> > You can try the attached [untested] patch. It replicates the
> > MPICH_NUMVERSION code and replaces it with MVAPICH2_NUMVERSION
> >
> > Satish
> >
> > On Tue, 25 Apr 2017, Kong, Fande wrote:
> >
> > > On Tue, Apr 25, 2017 at 3:42 PM, Barry Smith <bsm...@mcs.anl.gov>
> wrote:
> > >
> > > >
> > > >The error message is generated based on the macro MPICH_NUMVERSION
> > > > contained in the mpi.h file.
> > > >
> > > > Apparently MVAPICH also provides this macro, hence PETSc has no way
> to
> > > > know that it is not MPICH.
> > > >
> > > > If you can locate in the MVAPICH mpi.h include files macros related
> > > > explicitly to MVAPICH then we could possibly use that macro to
> provide a
> > > > more specific error message.
> > > >
> > >
> > >
> > > There is also a macro: MVAPICH2_NUMVERSION in mpi.h. We might use it to
> > > have a right message.
> > >
> > > Looks possible for me.
> > >
> > > Fande,
> > >
> > >
> > >
> > > >
> > > >   Barry
> > > >
> > > >
> > > > > On Apr 25, 2017, at 4:35 PM, Kong, Fande <fande.k...@inl.gov>
> wrote:
> > > > >
> > > > > Hi,
> > > > >
> > > > > We configured PETSc with a version of MVAPICH, and complied with
> another
> > > > version of MVAPICH.  Got the error messages:
> > > > >
> > > > > error "PETSc was configured with one MPICH mpi.h version but now
> appears
> > > > to be compiling using a different MPICH mpi.h version"
> > > > >
> > > > >
> > > > > Why we could not say something about "MVAPICH" (not "MPICH")?
> > > > >
> > > > > Do we just simply consider all MPI implementations (MVAPICH, maybe
> Intel
> > > > MPI, IBM mpi?) based on MPICH as "MPICH"?
> > > > >
> > > > > Fande,
> > > >
> > > >
> > >
> >
>
>


Re: [petsc-users] misleading "mpich" messages

2017-04-25 Thread Kong, Fande
On Tue, Apr 25, 2017 at 3:42 PM, Barry Smith <bsm...@mcs.anl.gov> wrote:

>
>The error message is generated based on the macro MPICH_NUMVERSION
> contained in the mpi.h file.
>
> Apparently MVAPICH also provides this macro, hence PETSc has no way to
> know that it is not MPICH.
>
> If you can locate in the MVAPICH mpi.h include files macros related
> explicitly to MVAPICH then we could possibly use that macro to provide a
> more specific error message.
>


There is also a macro, MVAPICH2_NUMVERSION, in mpi.h. We might use it to
produce the right message.

That looks possible to me.
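
Something along these lines, for instance (just a sketch of the idea, not
the actual petscsys.h code; PETSC_HAVE_MVAPICH2_NUMVERSION is a hypothetical
macro that configure would have to record):

  #if defined(PETSC_HAVE_MVAPICH2_NUMVERSION)
  #  if !defined(MVAPICH2_NUMVERSION)
  #    error "PETSc was configured with MVAPICH mpi.h but now appears to be compiling using a non-MVAPICH mpi.h"
  #  elif MVAPICH2_NUMVERSION != PETSC_HAVE_MVAPICH2_NUMVERSION
  #    error "PETSc was configured with one MVAPICH mpi.h version but now appears to be compiling using a different MVAPICH mpi.h version"
  #  endif
  #endif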

Fande,



>
>   Barry
>
>
> > On Apr 25, 2017, at 4:35 PM, Kong, Fande <fande.k...@inl.gov> wrote:
> >
> > Hi,
> >
> > We configured PETSc with a version of MVAPICH, and complied with another
> version of MVAPICH.  Got the error messages:
> >
> > error "PETSc was configured with one MPICH mpi.h version but now appears
> to be compiling using a different MPICH mpi.h version"
> >
> >
> > Why we could not say something about "MVAPICH" (not "MPICH")?
> >
> > Do we just simply consider all MPI implementations (MVAPICH, maybe Intel
> MPI, IBM mpi?) based on MPICH as "MPICH"?
> >
> > Fande,
>
>


[petsc-users] misleading "mpich" messages

2017-04-25 Thread Kong, Fande
Hi,

We configured PETSc with one version of MVAPICH and compiled with another
version of MVAPICH. We got the error message:

error "PETSc was configured with one MPICH mpi.h version but now appears
to be compiling using a different MPICH mpi.h version"


Why can we not say something about "MVAPICH" (rather than "MPICH")?

Do we simply treat all MPI implementations based on MPICH (MVAPICH, maybe
Intel MPI, IBM MPI?) as "MPICH"?

Fande,


Re: [petsc-users] GAMG for the unsymmetrical matrix

2017-04-19 Thread Kong, Fande
Thanks, Mark,

Now the total compute time using GAMG is competitive with ASM. It looks like
I cannot use something like "-mg_level_1_ksp_type gmres", because this option
makes the compute time much worse.

Fande,

On Thu, Apr 13, 2017 at 9:14 AM, Mark Adams <mfad...@lbl.gov> wrote:

>
>
> On Wed, Apr 12, 2017 at 7:04 PM, Kong, Fande <fande.k...@inl.gov> wrote:
>
>>
>>
>> On Sun, Apr 9, 2017 at 6:04 AM, Mark Adams <mfad...@lbl.gov> wrote:
>>
>>> You seem to have two levels here and 3M eqs on the fine grid and 37 on
>>> the coarse grid. I don't understand that.
>>>
>>> You are also calling the AMG setup a lot, but not spending much time
>>> in it. Try running with -info and grep on "GAMG".
>>>
>>
>> I got the following output:
>>
>> [0] PCSetUp_GAMG(): level 0) N=3020875, n data rows=1, n data cols=1,
>> nnz/row (ave)=71, np=384
>> [0] PCGAMGFilterGraph():  100.% nnz after filtering, with threshold
>> 0., 73.6364 nnz ave. (N=3020875)
>> [0] PCGAMGCoarsen_AGG(): Square Graph on level 1 of 1 to square
>> [0] PCGAMGProlongator_AGG(): New grid 18162 nodes
>> [0] PCGAMGOptProlongator_AGG(): Smooth P0: max eigen=1.978702e+00
>> min=2.559747e-02 PC=jacobi
>> [0] PCGAMGCreateLevel_GAMG(): Aggregate processors noop: new_size=384,
>> neq(loc)=40
>> [0] PCSetUp_GAMG(): 1) N=18162, n data cols=1, nnz/row (ave)=94, 384
>> active pes
>> [0] PCSetUp_GAMG(): 2 levels, grid complexity = 1.00795
>> [0] PCSetUp_GAMG(): level 0) N=3020875, n data rows=1, n data cols=1,
>> nnz/row (ave)=71, np=384
>> [0] PCGAMGFilterGraph():  100.% nnz after filtering, with threshold
>> 0., 73.6364 nnz ave. (N=3020875)
>> [0] PCGAMGCoarsen_AGG(): Square Graph on level 1 of 1 to square
>> [0] PCGAMGProlongator_AGG(): New grid 18145 nodes
>> [0] PCGAMGOptProlongator_AGG(): Smooth P0: max eigen=1.978584e+00
>> min=2.557887e-02 PC=jacobi
>> [0] PCGAMGCreateLevel_GAMG(): Aggregate processors noop: new_size=384,
>> neq(loc)=37
>> [0] PCSetUp_GAMG(): 1) N=18145, n data cols=1, nnz/row (ave)=94, 384
>> active pes
>>
>
> You are still doing two levels. Just use the parameters that I told you
> and you should see that 1) this coarsest (last) grid has "1 active pes" and
> 2) the overall solve time and overall convergence rate is much better.
>
>
>> [0] PCSetUp_GAMG(): 2 levels, grid complexity = 1.00792
>> GAMG specific options
>> PCGAMGGraph_AGG   40 1.0 8.0759e+00 1.0 3.56e+07 2.3 1.6e+06 1.9e+04
>> 7.6e+02  2  0  2  4  2   2  0  2  4  2  1170
>> PCGAMGCoarse_AGG  40 1.0 7.1698e+01 1.0 4.05e+09 2.3 4.0e+06 5.1e+04
>> 1.2e+03 18 37  5 27  3  18 37  5 27  3 14632
>> PCGAMGProl_AGG40 1.0 9.2650e-01 1.2 0.00e+00 0.0 9.8e+05 2.9e+03
>> 9.6e+02  0  0  1  0  2   0  0  1  0  2 0
>> PCGAMGPOpt_AGG40 1.0 2.4484e+00 1.0 4.72e+08 2.3 3.1e+06 2.3e+03
>> 1.9e+03  1  4  4  1  4   1  4  4  1  4 51328
>> GAMG: createProl  40 1.0 8.3786e+01 1.0 4.56e+09 2.3 9.6e+06 2.5e+04
>> 4.8e+03 21 42 12 32 10  21 42 12 32 10 14134
>> GAMG: partLevel   40 1.0 6.7755e+00 1.1 2.59e+08 2.3 2.9e+06 2.5e+03
>> 1.5e+03  2  2  4  1  3   2  2  4  1  3  9431
>>
>>
>>
>>
>>
>>
>>
>>
>>>
>>>
>>> On Fri, Apr 7, 2017 at 5:29 PM, Kong, Fande <fande.k...@inl.gov> wrote:
>>> > Thanks, Barry.
>>> >
>>> > It works.
>>> >
>>> > GAMG is three times better than ASM in terms of the number of linear
>>> > iterations, but it is five times slower than ASM. Any suggestions to
>>> improve
>>> > the performance of GAMG? Log files are attached.
>>> >
>>> > Fande,
>>> >
>>> > On Thu, Apr 6, 2017 at 3:39 PM, Barry Smith <bsm...@mcs.anl.gov>
>>> wrote:
>>> >>
>>> >>
>>> >> > On Apr 6, 2017, at 9:39 AM, Kong, Fande <fande.k...@inl.gov> wrote:
>>> >> >
>>> >> > Thanks, Mark and Barry,
>>> >> >
>>> >> > It works pretty wells in terms of the number of linear iterations
>>> (using
>>> >> > "-pc_gamg_sym_graph true"), but it is horrible in the compute time.
>>> I am
>>> >> > using the two-level method via "-pc_mg_levels 2". The reason why
>>> the compute
>>> >> > time is larger than other preconditioning options is that a matrix
>>> free
>>> >> > method is used i

Re: [petsc-users] GAMG for the unsymmetrical matrix

2017-04-12 Thread Kong, Fande
On Sun, Apr 9, 2017 at 6:04 AM, Mark Adams <mfad...@lbl.gov> wrote:

> You seem to have two levels here and 3M eqs on the fine grid and 37 on
> the coarse grid. I don't understand that.
>
> You are also calling the AMG setup a lot, but not spending much time
> in it. Try running with -info and grep on "GAMG".
>

I got the following output:

[0] PCSetUp_GAMG(): level 0) N=3020875, n data rows=1, n data cols=1,
nnz/row (ave)=71, np=384
[0] PCGAMGFilterGraph():  100.% nnz after filtering, with threshold 0.,
73.6364 nnz ave. (N=3020875)
[0] PCGAMGCoarsen_AGG(): Square Graph on level 1 of 1 to square
[0] PCGAMGProlongator_AGG(): New grid 18162 nodes
[0] PCGAMGOptProlongator_AGG(): Smooth P0: max eigen=1.978702e+00
min=2.559747e-02 PC=jacobi
[0] PCGAMGCreateLevel_GAMG(): Aggregate processors noop: new_size=384,
neq(loc)=40
[0] PCSetUp_GAMG(): 1) N=18162, n data cols=1, nnz/row (ave)=94, 384 active
pes
[0] PCSetUp_GAMG(): 2 levels, grid complexity = 1.00795
[0] PCSetUp_GAMG(): level 0) N=3020875, n data rows=1, n data cols=1,
nnz/row (ave)=71, np=384
[0] PCGAMGFilterGraph():  100.% nnz after filtering, with threshold 0.,
73.6364 nnz ave. (N=3020875)
[0] PCGAMGCoarsen_AGG(): Square Graph on level 1 of 1 to square
[0] PCGAMGProlongator_AGG(): New grid 18145 nodes
[0] PCGAMGOptProlongator_AGG(): Smooth P0: max eigen=1.978584e+00
min=2.557887e-02 PC=jacobi
[0] PCGAMGCreateLevel_GAMG(): Aggregate processors noop: new_size=384,
neq(loc)=37
[0] PCSetUp_GAMG(): 1) N=18145, n data cols=1, nnz/row (ave)=94, 384 active
pes
[0] PCSetUp_GAMG(): 2 levels, grid complexity = 1.00792
GAMG specific options
PCGAMGGraph_AGG   40 1.0 8.0759e+00 1.0 3.56e+07 2.3 1.6e+06 1.9e+04
7.6e+02  2  0  2  4  2   2  0  2  4  2  1170
PCGAMGCoarse_AGG  40 1.0 7.1698e+01 1.0 4.05e+09 2.3 4.0e+06 5.1e+04
1.2e+03 18 37  5 27  3  18 37  5 27  3 14632
PCGAMGProl_AGG40 1.0 9.2650e-01 1.2 0.00e+00 0.0 9.8e+05 2.9e+03
9.6e+02  0  0  1  0  2   0  0  1  0  2 0
PCGAMGPOpt_AGG40 1.0 2.4484e+00 1.0 4.72e+08 2.3 3.1e+06 2.3e+03
1.9e+03  1  4  4  1  4   1  4  4  1  4 51328
GAMG: createProl  40 1.0 8.3786e+01 1.0 4.56e+09 2.3 9.6e+06 2.5e+04
4.8e+03 21 42 12 32 10  21 42 12 32 10 14134
GAMG: partLevel   40 1.0 6.7755e+00 1.1 2.59e+08 2.3 2.9e+06 2.5e+03
1.5e+03  2  2  4  1  3   2  2  4  1  3  9431








>
>
> On Fri, Apr 7, 2017 at 5:29 PM, Kong, Fande <fande.k...@inl.gov> wrote:
> > Thanks, Barry.
> >
> > It works.
> >
> > GAMG is three times better than ASM in terms of the number of linear
> > iterations, but it is five times slower than ASM. Any suggestions to
> improve
> > the performance of GAMG? Log files are attached.
> >
> > Fande,
> >
> > On Thu, Apr 6, 2017 at 3:39 PM, Barry Smith <bsm...@mcs.anl.gov> wrote:
> >>
> >>
> >> > On Apr 6, 2017, at 9:39 AM, Kong, Fande <fande.k...@inl.gov> wrote:
> >> >
> >> > Thanks, Mark and Barry,
> >> >
> >> > It works pretty wells in terms of the number of linear iterations
> (using
> >> > "-pc_gamg_sym_graph true"), but it is horrible in the compute time. I
> am
> >> > using the two-level method via "-pc_mg_levels 2". The reason why the
> compute
> >> > time is larger than other preconditioning options is that a matrix
> free
> >> > method is used in the fine level and in my particular problem the
> function
> >> > evaluation is expensive.
> >> >
> >> > I am using "-snes_mf_operator 1" to turn on the Jacobian-free Newton,
> >> > but I do not think I want to make the preconditioning part
> matrix-free.  Do
> >> > you guys know how to turn off the matrix-free method for GAMG?
> >>
> >>-pc_use_amat false
> >>
> >> >
> >> > Here is the detailed solver:
> >> >
> >> > SNES Object: 384 MPI processes
> >> >   type: newtonls
> >> >   maximum iterations=200, maximum function evaluations=1
> >> >   tolerances: relative=1e-08, absolute=1e-08, solution=1e-50
> >> >   total number of linear solver iterations=20
> >> >   total number of function evaluations=166
> >> >   norm schedule ALWAYS
> >> >   SNESLineSearch Object:   384 MPI processes
> >> > type: bt
> >> >   interpolation: cubic
> >> >   alpha=1.00e-04
> >> > maxstep=1.00e+08, minlambda=1.00e-12
> >> > tolerances: relative=1.00e-08, absolute=1.00e-15,
> >> > lambda=1.00e-08
> >> > maximum iterations=40
> >> >   KSP Object:   384 MPI processes
> >> &g

Re: [petsc-users] GAMG for the unsymmetrical matrix

2017-04-12 Thread Kong, Fande
Hi Mark,

Thanks for your reply.

On Wed, Apr 12, 2017 at 9:16 AM, Mark Adams <mfad...@lbl.gov> wrote:

> The problem comes from setting the number of MG levels (-pc_mg_levels 2).
> Not your fault, it looks like the GAMG logic is faulty, in your version at
> least.
>

What I want is for GAMG to coarsen the fine matrix once and then stop. I did
not see any benefit to having more levels when the number of processor cores
is small.


>
> GAMG will force the coarsest grid to one processor by default, in newer
> versions. You can override the default with:
>
> -pc_gamg_use_parallel_coarse_grid_solver
>
> Your coarse grid solver is ASM with these 37 equation per process and 512
> processes. That is bad.
>

Why is this bad? Is the subdomain problem too small?


> Note, you could run this on one process to see the proper convergence
> rate.
>

Convergence rate for which part? The coarse solver or the subdomain solver?


> You can fix this with parameters:
>
> >   -pc_gamg_process_eq_limit <50>: Limit (goal) on number of equations
> per process on coarse grids (PCGAMGSetProcEqLim)
> >   -pc_gamg_coarse_eq_limit <50>: Limit on number of equations for the
> coarse grid (PCGAMGSetCoarseEqLim)
>
> If you really want two levels then set something like
> -pc_gamg_coarse_eq_limit 18145 (or higher) -pc_gamg_coarse_eq_limit 18145
> (or higher).
>


Maybe we could have something like: make the coarse problem 1/8 as large as
the original problem? Otherwise this number is just problem dependent.



> You can run with -info and grep on GAMG and you will meta-data for each
> level. you should see "npe=1" for the coarsest, last, grid. Or use a
> parallel direct solver.
>

I will try.
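
Concretely, something like the following run-time options (a sketch based on
this thread; the two limits are placeholders and problem dependent):

  -pc_type gamg -pc_gamg_sym_graph true -pc_use_amat false \
  -pc_gamg_process_eq_limit 50 -pc_gamg_coarse_eq_limit 3000 -info

and then grep the -info output for GAMG to check that the coarsest grid ends
up on one process.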


>
> Note, you should not see much degradation as you increase the number of
> levels. 18145 eqs on a 3D problem will probably be noticeable. I generally
> aim for about 3000.
>

It should be fine as long as the coarse problem is solved by a parallel
solver.


Fande,


>
>
> On Mon, Apr 10, 2017 at 12:17 PM, Kong, Fande <fande.k...@inl.gov> wrote:
>
>>
>>
>> On Sun, Apr 9, 2017 at 6:04 AM, Mark Adams <mfad...@lbl.gov> wrote:
>>
>>> You seem to have two levels here and 3M eqs on the fine grid and 37 on
>>> the coarse grid.
>>
>>
>> 37 is on the sub domain.
>>
>>  rows=18145, cols=18145 on the entire coarse grid.
>>
>>
>>
>>
>>
>>> I don't understand that.
>>>
>>> You are also calling the AMG setup a lot, but not spending much time
>>> in it. Try running with -info and grep on "GAMG".
>>>
>>>
>>> On Fri, Apr 7, 2017 at 5:29 PM, Kong, Fande <fande.k...@inl.gov> wrote:
>>> > Thanks, Barry.
>>> >
>>> > It works.
>>> >
>>> > GAMG is three times better than ASM in terms of the number of linear
>>> > iterations, but it is five times slower than ASM. Any suggestions to
>>> improve
>>> > the performance of GAMG? Log files are attached.
>>> >
>>> > Fande,
>>> >
>>> > On Thu, Apr 6, 2017 at 3:39 PM, Barry Smith <bsm...@mcs.anl.gov>
>>> wrote:
>>> >>
>>> >>
>>> >> > On Apr 6, 2017, at 9:39 AM, Kong, Fande <fande.k...@inl.gov> wrote:
>>> >> >
>>> >> > Thanks, Mark and Barry,
>>> >> >
>>> >> > It works pretty wells in terms of the number of linear iterations
>>> (using
>>> >> > "-pc_gamg_sym_graph true"), but it is horrible in the compute time.
>>> I am
>>> >> > using the two-level method via "-pc_mg_levels 2". The reason why
>>> the compute
>>> >> > time is larger than other preconditioning options is that a matrix
>>> free
>>> >> > method is used in the fine level and in my particular problem the
>>> function
>>> >> > evaluation is expensive.
>>> >> >
>>> >> > I am using "-snes_mf_operator 1" to turn on the Jacobian-free
>>> Newton,
>>> >> > but I do not think I want to make the preconditioning part
>>> matrix-free.  Do
>>> >> > you guys know how to turn off the matrix-free method for GAMG?
>>> >>
>>> >>-pc_use_amat false
>>> >>
>>> >> >
>>> >> > Here is the detailed solver:
>>> >> >
>>> >> > SNES Object: 384 MPI processes
>>> >> >   type: newtonls
>>> >> &

Re: [petsc-users] GAMG for the unsymmetrical matrix

2017-04-10 Thread Kong, Fande
On Sun, Apr 9, 2017 at 6:04 AM, Mark Adams <mfad...@lbl.gov> wrote:

> You seem to have two levels here and 3M eqs on the fine grid and 37 on
> the coarse grid.


37 is on the sub domain.

 rows=18145, cols=18145 on the entire coarse grid.





> I don't understand that.
>
> You are also calling the AMG setup a lot, but not spending much time
> in it. Try running with -info and grep on "GAMG".
>
>
> On Fri, Apr 7, 2017 at 5:29 PM, Kong, Fande <fande.k...@inl.gov> wrote:
> > Thanks, Barry.
> >
> > It works.
> >
> > GAMG is three times better than ASM in terms of the number of linear
> > iterations, but it is five times slower than ASM. Any suggestions to
> improve
> > the performance of GAMG? Log files are attached.
> >
> > Fande,
> >
> > On Thu, Apr 6, 2017 at 3:39 PM, Barry Smith <bsm...@mcs.anl.gov> wrote:
> >>
> >>
> >> > On Apr 6, 2017, at 9:39 AM, Kong, Fande <fande.k...@inl.gov> wrote:
> >> >
> >> > Thanks, Mark and Barry,
> >> >
> >> > It works pretty wells in terms of the number of linear iterations
> (using
> >> > "-pc_gamg_sym_graph true"), but it is horrible in the compute time. I
> am
> >> > using the two-level method via "-pc_mg_levels 2". The reason why the
> compute
> >> > time is larger than other preconditioning options is that a matrix
> free
> >> > method is used in the fine level and in my particular problem the
> function
> >> > evaluation is expensive.
> >> >
> >> > I am using "-snes_mf_operator 1" to turn on the Jacobian-free Newton,
> >> > but I do not think I want to make the preconditioning part
> matrix-free.  Do
> >> > you guys know how to turn off the matrix-free method for GAMG?
> >>
> >>-pc_use_amat false
> >>
> >> >
> >> > Here is the detailed solver:
> >> >
> >> > SNES Object: 384 MPI processes
> >> >   type: newtonls
> >> >   maximum iterations=200, maximum function evaluations=1
> >> >   tolerances: relative=1e-08, absolute=1e-08, solution=1e-50
> >> >   total number of linear solver iterations=20
> >> >   total number of function evaluations=166
> >> >   norm schedule ALWAYS
> >> >   SNESLineSearch Object:   384 MPI processes
> >> > type: bt
> >> >   interpolation: cubic
> >> >   alpha=1.00e-04
> >> > maxstep=1.00e+08, minlambda=1.00e-12
> >> > tolerances: relative=1.00e-08, absolute=1.00e-15,
> >> > lambda=1.00e-08
> >> > maximum iterations=40
> >> >   KSP Object:   384 MPI processes
> >> > type: gmres
> >> >   GMRES: restart=100, using Classical (unmodified) Gram-Schmidt
> >> > Orthogonalization with no iterative refinement
> >> >   GMRES: happy breakdown tolerance 1e-30
> >> > maximum iterations=100, initial guess is zero
> >> > tolerances:  relative=0.001, absolute=1e-50, divergence=1.
> >> > right preconditioning
> >> > using UNPRECONDITIONED norm type for convergence test
> >> >   PC Object:   384 MPI processes
> >> > type: gamg
> >> >   MG: type is MULTIPLICATIVE, levels=2 cycles=v
> >> > Cycles per PCApply=1
> >> > Using Galerkin computed coarse grid matrices
> >> > GAMG specific options
> >> >   Threshold for dropping small values from graph 0.
> >> >   AGG specific options
> >> > Symmetric graph true
> >> > Coarse grid solver -- level ---
> >> >   KSP Object:  (mg_coarse_)   384 MPI processes
> >> > type: preonly
> >> > maximum iterations=1, initial guess is zero
> >> > tolerances:  relative=1e-05, absolute=1e-50, divergence=1.
> >> > left preconditioning
> >> > using NONE norm type for convergence test
> >> >   PC Object:  (mg_coarse_)   384 MPI processes
> >> > type: bjacobi
> >> >   block Jacobi: number of blocks = 384
> >> >   Local solve is same for all blocks, in the following KSP and
> >> > PC objects:
> >> > KSP Object:(mg_coarse_sub_) 1 MPI processes
> >> >  

Re: [petsc-users] GAMG for the unsymmetrical matrix

2017-04-07 Thread Kong, Fande
On Fri, Apr 7, 2017 at 3:52 PM, Barry Smith <bsm...@mcs.anl.gov> wrote:

>
> > On Apr 7, 2017, at 4:46 PM, Kong, Fande <fande.k...@inl.gov> wrote:
> >
> >
> >
> > On Fri, Apr 7, 2017 at 3:39 PM, Barry Smith <bsm...@mcs.anl.gov> wrote:
> >
> >   Using Petsc Release Version 3.7.5, unknown
> >
> >So are you using the release or are you using master branch?
> >
> > I am working on the maint branch.
> >
> > I did something two months ago:
> >
> >  git clone -b maint https://urldefense.proofpoint.
> com/v2/url?u=https-3A__bitbucket.org_petsc_petsc=DwIFAg=
> 54IZrppPQZKX9mLzcGdPfFD1hxrcB__aEkJFOKJFd00=DUUt3SRGI0_
> JgtNaS3udV68GRkgV4ts7XKfj2opmiCY=c92UNplDTVgzFrXIn_
> 70buWa2rXPGUKN083_aJYI0FQ=yrulwZxJiduZc-703r7PJOUApPDehsFIkhS0BTrroXc=
> petsc.
> >
> >
> > I am interested to improve the GAMG performance.
>
>   Why, why not use the best solver for your problem?
>

I am just curious. I want to understand the potential of interesting
preconditioners.



>
> > Is it possible? It can not beat ASM at all? The multilevel method should
> be better than the one-level if the number of processor cores is large.
>
>The ASM is taking 30 iterations, this is fantastic, it is really going
> to be tough to get GAMG to be faster (set up time for GAMG is high).
>
>What happens to both with 10 times as many processes? 100 times as many?
>


I have not tried more processes yet.

Fande,



>
>
>    Barry
>
> >
> > Fande,
> >
> >
> >If you use master the ASM will be even faster.
> >
> > What's new in master?
> >
> >
> > Fande,
> >
> >
> >
> > > On Apr 7, 2017, at 4:29 PM, Kong, Fande <fande.k...@inl.gov> wrote:
> > >
> > > Thanks, Barry.
> > >
> > > It works.
> > >
> > > GAMG is three times better than ASM in terms of the number of linear
> iterations, but it is five times slower than ASM. Any suggestions to
> improve the performance of GAMG? Log files are attached.
> > >
> > > Fande,
> > >
> > > On Thu, Apr 6, 2017 at 3:39 PM, Barry Smith <bsm...@mcs.anl.gov>
> wrote:
> > >
> > > > On Apr 6, 2017, at 9:39 AM, Kong, Fande <fande.k...@inl.gov> wrote:
> > > >
> > > > Thanks, Mark and Barry,
> > > >
> > > > It works pretty wells in terms of the number of linear iterations
> (using "-pc_gamg_sym_graph true"), but it is horrible in the compute time.
> I am using the two-level method via "-pc_mg_levels 2". The reason why the
> compute time is larger than other preconditioning options is that a matrix
> free method is used in the fine level and in my particular problem the
> function evaluation is expensive.
> > > >
> > > > I am using "-snes_mf_operator 1" to turn on the Jacobian-free
> Newton, but I do not think I want to make the preconditioning part
> matrix-free.  Do you guys know how to turn off the matrix-free method for
> GAMG?
> > >
> > >-pc_use_amat false
> > >
> > > >
> > > > Here is the detailed solver:
> > > >
> > > > SNES Object: 384 MPI processes
> > > >   type: newtonls
> > > >   maximum iterations=200, maximum function evaluations=1
> > > >   tolerances: relative=1e-08, absolute=1e-08, solution=1e-50
> > > >   total number of linear solver iterations=20
> > > >   total number of function evaluations=166
> > > >   norm schedule ALWAYS
> > > >   SNESLineSearch Object:   384 MPI processes
> > > > type: bt
> > > >   interpolation: cubic
> > > >   alpha=1.00e-04
> > > > maxstep=1.00e+08, minlambda=1.00e-12
> > > > tolerances: relative=1.00e-08, absolute=1.00e-15,
> lambda=1.00e-08
> > > > maximum iterations=40
> > > >   KSP Object:   384 MPI processes
> > > > type: gmres
> > > >   GMRES: restart=100, using Classical (unmodified) Gram-Schmidt
> Orthogonalization with no iterative refinement
> > > >   GMRES: happy breakdown tolerance 1e-30
> > > > maximum iterations=100, initial guess is zero
> > > > tolerances:  relative=0.001, absolute=1e-50, divergence=1.
> > > > right preconditioning
> > > > using UNPRECONDITIONED norm type for convergence test
> > > >   PC Object:   384 MPI processes
> > > > type: gamg
> > > >   MG: ty

Re: [petsc-users] GAMG for the unsymmetrical matrix

2017-04-07 Thread Kong, Fande
Thanks, Barry.

It works.

GAMG is three times better than ASM in terms of the number of linear
iterations, but it is five times slower than ASM. Any suggestions to
improve the performance of GAMG? Log files are attached.

Fande,

On Thu, Apr 6, 2017 at 3:39 PM, Barry Smith <bsm...@mcs.anl.gov> wrote:

>
> > On Apr 6, 2017, at 9:39 AM, Kong, Fande <fande.k...@inl.gov> wrote:
> >
> > Thanks, Mark and Barry,
> >
> > It works pretty wells in terms of the number of linear iterations (using
> "-pc_gamg_sym_graph true"), but it is horrible in the compute time. I am
> using the two-level method via "-pc_mg_levels 2". The reason why the
> compute time is larger than other preconditioning options is that a matrix
> free method is used in the fine level and in my particular problem the
> function evaluation is expensive.
> >
> > I am using "-snes_mf_operator 1" to turn on the Jacobian-free Newton,
> but I do not think I want to make the preconditioning part matrix-free.  Do
> you guys know how to turn off the matrix-free method for GAMG?
>
>-pc_use_amat false
>
> >
> > Here is the detailed solver:
> >
> > SNES Object: 384 MPI processes
> >   type: newtonls
> >   maximum iterations=200, maximum function evaluations=1
> >   tolerances: relative=1e-08, absolute=1e-08, solution=1e-50
> >   total number of linear solver iterations=20
> >   total number of function evaluations=166
> >   norm schedule ALWAYS
> >   SNESLineSearch Object:   384 MPI processes
> > type: bt
> >   interpolation: cubic
> >   alpha=1.00e-04
> > maxstep=1.00e+08, minlambda=1.00e-12
> > tolerances: relative=1.00e-08, absolute=1.00e-15,
> lambda=1.00e-08
> > maximum iterations=40
> >   KSP Object:   384 MPI processes
> > type: gmres
> >   GMRES: restart=100, using Classical (unmodified) Gram-Schmidt
> Orthogonalization with no iterative refinement
> >   GMRES: happy breakdown tolerance 1e-30
> > maximum iterations=100, initial guess is zero
> > tolerances:  relative=0.001, absolute=1e-50, divergence=1.
> > right preconditioning
> > using UNPRECONDITIONED norm type for convergence test
> >   PC Object:   384 MPI processes
> > type: gamg
> >   MG: type is MULTIPLICATIVE, levels=2 cycles=v
> > Cycles per PCApply=1
> > Using Galerkin computed coarse grid matrices
> > GAMG specific options
> >   Threshold for dropping small values from graph 0.
> >   AGG specific options
> > Symmetric graph true
> > Coarse grid solver -- level ---
> >   KSP Object:  (mg_coarse_)   384 MPI processes
> > type: preonly
> > maximum iterations=1, initial guess is zero
> > tolerances:  relative=1e-05, absolute=1e-50, divergence=1.
> > left preconditioning
> > using NONE norm type for convergence test
> >   PC Object:  (mg_coarse_)   384 MPI processes
> > type: bjacobi
> >   block Jacobi: number of blocks = 384
> >   Local solve is same for all blocks, in the following KSP and
> PC objects:
> > KSP Object:(mg_coarse_sub_) 1 MPI processes
> >   type: preonly
> >   maximum iterations=1, initial guess is zero
> >   tolerances:  relative=1e-05, absolute=1e-50, divergence=1.
> >   left preconditioning
> >   using NONE norm type for convergence test
> > PC Object:(mg_coarse_sub_) 1 MPI processes
> >   type: lu
> > LU: out-of-place factorization
> > tolerance for zero pivot 2.22045e-14
> > using diagonal shift on blocks to prevent zero pivot
> [INBLOCKS]
> > matrix ordering: nd
> > factor fill ratio given 5., needed 1.31367
> >   Factored matrix follows:
> > Mat Object: 1 MPI processes
> >   type: seqaij
> >   rows=37, cols=37
> >   package used to perform factorization: petsc
> >   total: nonzeros=913, allocated nonzeros=913
> >   total number of mallocs used during MatSetValues calls
> =0
> > not using I-node routines
> >   linear system matrix = precond matrix:
> >   Mat Object:   1 MPI processes
> > type: seqaij
> > ro

Re: [petsc-users] EPSViewer in SLEPc

2017-04-07 Thread Kong, Fande
Thanks, Jose

Fande,

On Fri, Apr 7, 2017 at 4:41 AM, Jose E. Roman <jro...@dsic.upv.es> wrote:

> I have pushed a commit that should avoid this problem.
> Jose
>
> > El 6 abr 2017, a las 22:27, Kong, Fande <fande.k...@inl.gov> escribió:
> >
> > Hi All,
> >
> > The EPSViewer in SLEPc looks weird. I do not understand the viewer
> logic. For example there is a piece of code in SLEPc (at line 225 of
> epsview.c):
> >
> > if (!ispower) {
> >   if (!eps->ds) { ierr = EPSGetDS(eps,>ds);CHKERRQ(ierr); }
> >   ierr = DSView(eps->ds,viewer);CHKERRQ(ierr);
> > }
> >
> >
> > If eps->ds is NULL, why we are going to create a new one? I just want to
> view this object. If it is NULL, you should just show me that this object
> is empty. You could print out: ds: null.
> >
> > If a user wants to develop a new EPS solver, and then register the new
> EPS to SLEPc. But the user does not want to use DS, and DSView will show
> some error messages:
> >
> > [0]PETSC ERROR: - Error Message
> --
> > [0]PETSC ERROR: Object is in wrong state
> > [0]PETSC ERROR: Requested matrix was not created in this DS
> > [0]PETSC ERROR: See https://urldefense.proofpoint.
> com/v2/url?u=http-3A__www.mcs.anl.gov_petsc_documentation_
> faq.html=DwIFaQ=54IZrppPQZKX9mLzcGdPfFD1hxrcB_
> _aEkJFOKJFd00=DUUt3SRGI0_JgtNaS3udV68GRkgV4ts7XKfj2opmiCY=
> RUH2LlACLIVsE06Hdki8z27uIfsiU8hQJ2mN6Lxo628=
> T1QKhCMs9EnX64WJhlZd0wRvwQB0W6aeVSiC6R02Gag=  for trouble shooting.
> > [0]PETSC ERROR: Petsc Release Version 3.7.5, unknown
> > [0]PETSC ERROR: ../../../moose_test-opt on a arch-darwin-c-debug named
> FN604208 by kongf Thu Apr  6 14:22:14 2017
> > [0]PETSC ERROR: #1 DSViewMat() line 149 in /slepc/src/sys/classes/ds/
> interface/dspriv.c
> > [0]PETSC ERROR: #2 DSView_NHEP() line 47 in/slepc/src/sys/classes/ds/
> impls/nhep/dsnhep.c
> > [0]PETSC ERROR: #3 DSView() line 772 in/slepc/src/sys/classes/ds/
> interface/dsbasic.c
> > [0]PETSC ERROR: #4 EPSView() line 227 in /slepc/src/eps/interface/
> epsview.c
> > [0]PETSC ERROR: #5 PetscObjectView() line 106 in/petsc/src/sys/objects/
> destroy.c
> > [0]PETSC ERROR: #6 PetscObjectViewFromOptions() line 2808 in
> /petsc/src/sys/objects/options.c
> > [0]PETSC ERROR: #7 EPSSolve() line 159 in /slepc/src/eps/interface/
> epssolve.c
> >
> >
> >
> > Fande,
>
>


[petsc-users] EPSViewer in SLEPc

2017-04-06 Thread Kong, Fande
Hi All,

The EPSViewer in SLEPc looks weird. I do not understand the viewer logic.
For example there is a piece of code in SLEPc (at line 225 of epsview.c):

if (!ispower) {
  if (!eps->ds) { ierr = EPSGetDS(eps,&eps->ds);CHKERRQ(ierr); }
  ierr = DSView(eps->ds,viewer);CHKERRQ(ierr);
}

If eps->ds is NULL, why are we going to create a new one? I just want to view
this object. If it is NULL, you should just show that the object is empty;
you could print out: ds: null.
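
A sketch of what I have in mind (not a tested patch):

  if (!ispower) {
    if (eps->ds) {
      ierr = DSView(eps->ds,viewer);CHKERRQ(ierr);
    } else {
      ierr = PetscViewerASCIIPrintf(viewer,"  DS: null\n");CHKERRQ(ierr);
    }
  }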

If a user develops a new EPS solver and registers it with SLEPc, but the user
does not want to use DS, then DSView will show error messages like the
following:

[0]PETSC ERROR: --------------------- Error Message --------------------------------------------------
[0]PETSC ERROR: Object is in wrong state
[0]PETSC ERROR: Requested matrix was not created in this DS
[0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting.
[0]PETSC ERROR: Petsc Release Version 3.7.5, unknown
[0]PETSC ERROR: ../../../moose_test-opt on a arch-darwin-c-debug named FN604208 by kongf Thu Apr  6 14:22:14 2017
[0]PETSC ERROR: #1 DSViewMat() line 149 in /slepc/src/sys/classes/ds/interface/dspriv.c
[0]PETSC ERROR: #2 DSView_NHEP() line 47 in /slepc/src/sys/classes/ds/impls/nhep/dsnhep.c
[0]PETSC ERROR: #3 DSView() line 772 in /slepc/src/sys/classes/ds/interface/dsbasic.c
[0]PETSC ERROR: #4 EPSView() line 227 in /slepc/src/eps/interface/epsview.c
[0]PETSC ERROR: #5 PetscObjectView() line 106 in /petsc/src/sys/objects/destroy.c
[0]PETSC ERROR: #6 PetscObjectViewFromOptions() line 2808 in /petsc/src/sys/objects/options.c
[0]PETSC ERROR: #7 EPSSolve() line 159 in /slepc/src/eps/interface/epssolve.c



Fande,


Re: [petsc-users] GAMG for the unsymmetrical matrix

2017-04-06 Thread Kong, Fande
Thanks, Mark and Barry,

It works pretty well in terms of the number of linear iterations (using
"-pc_gamg_sym_graph true"), but it is horrible in terms of compute time. I am
using the two-level method via "-pc_mg_levels 2". The reason the compute time
is larger than with other preconditioning options is that a matrix-free
method is used on the fine level, and in my particular problem the function
evaluation is expensive.

I am using "-snes_mf_operator 1" to turn on the Jacobian-free Newton, but I
do not think I want to make the preconditioning part matrix-free. Do you guys
know how to turn off the matrix-free method for GAMG?

Here is the detailed solver:

SNES Object: 384 MPI processes
  type: newtonls
  maximum iterations=200, maximum function evaluations=1
  tolerances: relative=1e-08, absolute=1e-08, solution=1e-50
  total number of linear solver iterations=20
  total number of function evaluations=166
  norm schedule ALWAYS
  SNESLineSearch Object: 384 MPI processes
    type: bt
      interpolation: cubic
      alpha=1.00e-04
    maxstep=1.00e+08, minlambda=1.00e-12
    tolerances: relative=1.00e-08, absolute=1.00e-15, lambda=1.00e-08
    maximum iterations=40
  KSP Object: 384 MPI processes
    type: gmres
      GMRES: restart=100, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement
      GMRES: happy breakdown tolerance 1e-30
    maximum iterations=100, initial guess is zero
    tolerances:  relative=0.001, absolute=1e-50, divergence=1.
    right preconditioning
    using UNPRECONDITIONED norm type for convergence test
  PC Object: 384 MPI processes
    type: gamg
      MG: type is MULTIPLICATIVE, levels=2 cycles=v
        Cycles per PCApply=1
        Using Galerkin computed coarse grid matrices
        GAMG specific options
          Threshold for dropping small values from graph 0.
          AGG specific options
            Symmetric graph true
    Coarse grid solver -- level ---
      KSP Object: (mg_coarse_) 384 MPI processes
        type: preonly
        maximum iterations=1, initial guess is zero
        tolerances:  relative=1e-05, absolute=1e-50, divergence=1.
        left preconditioning
        using NONE norm type for convergence test
      PC Object: (mg_coarse_) 384 MPI processes
        type: bjacobi
          block Jacobi: number of blocks = 384
          Local solve is same for all blocks, in the following KSP and PC objects:
        KSP Object: (mg_coarse_sub_) 1 MPI processes
          type: preonly
          maximum iterations=1, initial guess is zero
          tolerances:  relative=1e-05, absolute=1e-50, divergence=1.
          left preconditioning
          using NONE norm type for convergence test
        PC Object: (mg_coarse_sub_) 1 MPI processes
          type: lu
            LU: out-of-place factorization
            tolerance for zero pivot 2.22045e-14
            using diagonal shift on blocks to prevent zero pivot [INBLOCKS]
            matrix ordering: nd
            factor fill ratio given 5., needed 1.31367
              Factored matrix follows:
                Mat Object: 1 MPI processes
                  type: seqaij
                  rows=37, cols=37
                  package used to perform factorization: petsc
                  total: nonzeros=913, allocated nonzeros=913
                  total number of mallocs used during MatSetValues calls =0
                    not using I-node routines
          linear system matrix = precond matrix:
          Mat Object: 1 MPI processes
            type: seqaij
            rows=37, cols=37
            total: nonzeros=695, allocated nonzeros=695
            total number of mallocs used during MatSetValues calls =0
              not using I-node routines
        linear system matrix = precond matrix:
        Mat Object: 384 MPI processes
          type: mpiaij
          rows=18145, cols=18145
          total: nonzeros=1709115, allocated nonzeros=1709115
          total number of mallocs used during MatSetValues calls =0
            not using I-node (on process 0) routines
    Down solver (pre-smoother) on level 1 ---
      KSP Object: (mg_levels_1_) 384 MPI processes
        type: chebyshev
          Chebyshev: eigenvalue estimates:  min = 0.19, max = 1.46673
          Chebyshev: eigenvalues estimated using gmres with translations  [0. 0.1; 0. 1.1]
          KSP Object: (mg_levels_1_esteig_) 384 MPI processes
            type: gmres
              GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement
              GMRES: happy breakdown tolerance 1e-30
            maximum iterations=10, initial guess is zero
            tolerances:  relative=1e-12, absolute=1e-50, divergence=1.
            left preconditioning
            using PRECONDITIONED norm type for convergence test

[petsc-users] GAMG for the unsymmetrical matrix

2017-04-04 Thread Kong, Fande
Hi All,

I am using GAMG to solve a group of coupled diffusion equations, but the
resulting matrix is not symmetric. I got the following error messages:

[0]PETSC ERROR: Petsc has generated inconsistent data
[0]PETSC ERROR: Have un-symmetric graph (apparently). Use '-pc_gamg_sym_graph true' to symetrize the graph or '-pc_gamg_threshold -1.0' if the matrix is structurally symmetric.
[0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting.
[0]PETSC ERROR: Petsc Release Version 3.7.5, unknown
[0]PETSC ERROR: /home/kongf/workhome/projects/yak/yak-opt on a arch-linux2-c-opt named r2i2n0 by kongf Mon Apr  3 16:19:59 2017
[0]PETSC ERROR: /home/kongf/workhome/projects/yak/yak-opt on a arch-linux2-c-opt named r2i2n0 by kongf Mon Apr  3 16:19:59 2017
[0]PETSC ERROR: #1 smoothAggs() line 462 in /home/kongf/workhome/projects/petsc/src/ksp/pc/impls/gamg/agg.c
[0]PETSC ERROR: #2 PCGAMGCoarsen_AGG() line 998 in /home/kongf/workhome/projects/petsc/src/ksp/pc/impls/gamg/agg.c
[0]PETSC ERROR: #3 PCSetUp_GAMG() line 571 in /home/kongf/workhome/projects/petsc/src/ksp/pc/impls/gamg/gamg.c
[0]PETSC ERROR: #3 PCSetUp_GAMG() line 571 in /home/kongf/workhome/projects/petsc/src/ksp/pc/impls/gamg/gamg.c
Does this mean that GAMG works only for symmetric matrices?

Fande,


Re: [petsc-users] coloring algorithms

2017-03-23 Thread Kong, Fande
Thanks, Barry,

On Thu, Mar 23, 2017 at 4:02 PM, Barry Smith <bsm...@mcs.anl.gov> wrote:

>
>   Please send the matrix as a binary file.
>
>Are you computing a distance one coloring or distance 2. 2 is needed
> for Jacobians.
>

The matrix does not come from a PDE; it comes from a grain-tracking problem.
Distance-1 coloring worked like magic: we now get 8 colors using JP, power,
etc.
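
A minimal sketch of this kind of setup (A is the grain adjacency matrix;
names and error handling are illustrative):

  MatColoring mc;
  ISColoring  iscoloring;
  ierr = MatColoringCreate(A,&mc);CHKERRQ(ierr);
  ierr = MatColoringSetType(mc,MATCOLORINGJP);CHKERRQ(ierr);  /* we also tried power */
  ierr = MatColoringSetDistance(mc,1);CHKERRQ(ierr);          /* distance 1; Jacobians need distance 2 */
  ierr = MatColoringSetFromOptions(mc);CHKERRQ(ierr);
  ierr = MatColoringApply(mc,&iscoloring);CHKERRQ(ierr);
  /* ... use the coloring ... */
  ierr = ISColoringDestroy(&iscoloring);CHKERRQ(ierr);
  ierr = MatColoringDestroy(&mc);CHKERRQ(ierr);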

Thanks.

Fande,



>
>
> > On Mar 23, 2017, at 4:57 PM, Kong, Fande <fande.k...@inl.gov> wrote:
> >
> > Thanks, Hong,
> >
> > I did some tests with a matrix (40x40):
> >
> > row 0: (0, 1.)  (2, 1.)  (3, 1.)  (11, 1.)  (14, 1.)  (15, 1.)  (19,
> 1.)  (22, 1.)  (23, 1.)  (24, 1.)  (27, 1.)  (28, 1.)
> > row 1: (1, 1.)  (2, 1.)  (3, 1.)  (6, 1.)  (16, 1.)  (17, 1.)  (18, 1.)
> (21, 1.)  (33, 1.)
> > row 2: (0, 1.)  (1, 1.)  (2, 1.)  (3, 1.)  (6, 1.)  (7, 1.)  (9, 1.)
> (10, 1.)  (16, 1.)  (19, 1.)  (20, 1.)
> > row 3: (0, 1.)  (1, 1.)  (2, 1.)  (3, 1.)  (5, 1.)  (11, 1.)  (18, 1.)
> (19, 1.)  (21, 1.)  (22, 1.)  (31, 1.)  (33, 1.)
> > row 4: (4, 1.)  (14, 1.)  (15, 1.)  (19, 1.)  (24, 1.)  (25, 1.)  (28,
> 1.)  (30, 1.)  (37, 1.)  (38, 1.)
> > row 5: (3, 1.)  (5, 1.)  (11, 1.)  (17, 1.)  (22, 1.)  (26, 1.)  (31,
> 1.)  (32, 1.)  (33, 1.)  (34, 1.)
> > row 6: (1, 1.)  (2, 1.)  (6, 1.)  (7, 1.)  (9, 1.)  (10, 1.)  (16, 1.)
> (20, 1.)  (25, 1.)  (30, 1.)
> > row 7: (2, 1.)  (6, 1.)  (7, 1.)  (9, 1.)  (10, 1.)  (13, 1.)  (17, 1.)
> (20, 1.)  (32, 1.)  (34, 1.)
> > row 8: (8, 1.)  (9, 1.)  (12, 1.)  (13, 1.)  (26, 1.)  (29, 1.)  (30,
> 1.)  (36, 1.)  (38, 1.)  (39, 1.)
> > row 9: (2, 1.)  (6, 1.)  (7, 1.)  (8, 1.)  (9, 1.)  (10, 1.)  (13, 1.)
> (16, 1.)  (17, 1.)  (20, 1.)  (25, 1.)  (30, 1.)  (34, 1.)
> > row 10: (2, 1.)  (6, 1.)  (7, 1.)  (9, 1.)  (10, 1.)  (19, 1.)  (20,
> 1.)  (29, 1.)  (32, 1.)  (34, 1.)
> > row 11: (0, 1.)  (3, 1.)  (5, 1.)  (11, 1.)  (12, 1.)  (14, 1.)  (15,
> 1.)  (19, 1.)  (22, 1.)  (23, 1.)  (26, 1.)  (27, 1.)  (31, 1.)
> > row 12: (8, 1.)  (11, 1.)  (12, 1.)  (13, 1.)  (15, 1.)  (22, 1.)  (23,
> 1.)  (26, 1.)  (27, 1.)  (35, 1.)  (36, 1.)  (39, 1.)
> > row 13: (7, 1.)  (8, 1.)  (9, 1.)  (12, 1.)  (13, 1.)  (17, 1.)  (23,
> 1.)  (26, 1.)  (30, 1.)  (34, 1.)  (35, 1.)  (36, 1.)
> > row 14: (0, 1.)  (4, 1.)  (11, 1.)  (14, 1.)  (15, 1.)  (19, 1.)  (21,
> 1.)  (23, 1.)  (24, 1.)  (25, 1.)  (28, 1.)  (38, 1.)
> > row 15: (0, 1.)  (4, 1.)  (11, 1.)  (12, 1.)  (14, 1.)  (15, 1.)  (18,
> 1.)  (21, 1.)  (23, 1.)  (25, 1.)  (27, 1.)  (28, 1.)  (35, 1.)  (36, 1.)
> > row 16: (1, 1.)  (2, 1.)  (6, 1.)  (9, 1.)  (16, 1.)  (18, 1.)  (21,
> 1.)  (25, 1.)  (30, 1.)
> > row 17: (1, 1.)  (5, 1.)  (7, 1.)  (9, 1.)  (13, 1.)  (17, 1.)  (18,
> 1.)  (21, 1.)  (31, 1.)  (33, 1.)  (34, 1.)  (35, 1.)  (36, 1.)
> > row 18: (1, 1.)  (3, 1.)  (15, 1.)  (16, 1.)  (17, 1.)  (18, 1.)  (21,
> 1.)  (23, 1.)  (31, 1.)  (33, 1.)  (35, 1.)  (36, 1.)
> > row 19: (0, 1.)  (2, 1.)  (3, 1.)  (4, 1.)  (10, 1.)  (11, 1.)  (14,
> 1.)  (19, 1.)  (20, 1.)  (24, 1.)  (29, 1.)  (32, 1.)  (38, 1.)
> > row 20: (2, 1.)  (6, 1.)  (7, 1.)  (9, 1.)  (10, 1.)  (19, 1.)  (20, 1.)
> > row 21: (1, 1.)  (3, 1.)  (14, 1.)  (15, 1.)  (16, 1.)  (17, 1.)  (18,
> 1.)  (21, 1.)  (23, 1.)  (25, 1.)  (28, 1.)  (30, 1.)  (33, 1.)  (35, 1.)
> > row 22: (0, 1.)  (3, 1.)  (5, 1.)  (11, 1.)  (12, 1.)  (22, 1.)  (26,
> 1.)  (27, 1.)  (31, 1.)  (32, 1.)  (33, 1.)  (34, 1.)
> > row 23: (0, 1.)  (11, 1.)  (12, 1.)  (13, 1.)  (14, 1.)  (15, 1.)  (18,
> 1.)  (21, 1.)  (23, 1.)  (27, 1.)  (35, 1.)  (36, 1.)
> > row 24: (0, 1.)  (4, 1.)  (14, 1.)  (19, 1.)  (24, 1.)  (25, 1.)  (28,
> 1.)  (29, 1.)  (30, 1.)  (37, 1.)  (38, 1.)
> > row 25: (4, 1.)  (6, 1.)  (9, 1.)  (14, 1.)  (15, 1.)  (16, 1.)  (21,
> 1.)  (24, 1.)  (25, 1.)  (28, 1.)  (30, 1.)  (37, 1.)
> > row 26: (5, 1.)  (8, 1.)  (11, 1.)  (12, 1.)  (13, 1.)  (22, 1.)  (26,
> 1.)  (27, 1.)  (29, 1.)  (32, 1.)  (39, 1.)
> > row 27: (0, 1.)  (11, 1.)  (12, 1.)  (15, 1.)  (22, 1.)  (23, 1.)  (26,
> 1.)  (27, 1.)  (35, 1.)  (36, 1.)
> > row 28: (0, 1.)  (4, 1.)  (14, 1.)  (15, 1.)  (21, 1.)  (24, 1.)  (25,
> 1.)  (28, 1.)  (30, 1.)  (37, 1.)
> > row 29: (8, 1.)  (10, 1.)  (19, 1.)  (24, 1.)  (26, 1.)  (29, 1.)  (32,
> 1.)  (34, 1.)  (38, 1.)  (39, 1.)
> > row 30: (4, 1.)  (6, 1.)  (8, 1.)  (9, 1.)  (13, 1.)  (16, 1.)  (21,
> 1.)  (24, 1.)  (25, 1.)  (28, 1.)  (30, 1.)  (37, 1.)  (38, 1.)
> > row 31: (3, 1.)  (5, 1.)  (11, 1.)  (17, 1.)  (18, 1.)  (22, 1.)  (31,
> 1.)  (33, 1.)  (34, 1.)
> > row 32: (5, 1.)  (7, 1.)  (10, 1.)  (19, 1.)  (22, 1.)  (26, 1.)  (29,
> 1.)  (32, 1.)  (34, 1.)  (39, 1.)
> > row 33: (1, 1.)  (3, 1.)  (5, 1.)  (17, 1.)  (18, 1.)  (21, 1.)  (22,
> 1

Re: [petsc-users] coloring algorithms

2017-03-23 Thread Kong, Fande
Thanks, Hong,

I did some tests with a matrix (40x40):








































*row 0: (0, 1.)  (2, 1.)  (3, 1.)  (11, 1.)  (14, 1.)  (15, 1.)  (19, 1.)
(22, 1.)  (23, 1.)  (24, 1.)  (27, 1.)  (28, 1.) row 1: (1, 1.)  (2, 1.)
(3, 1.)  (6, 1.)  (16, 1.)  (17, 1.)  (18, 1.)  (21, 1.)  (33, 1.) row 2:
(0, 1.)  (1, 1.)  (2, 1.)  (3, 1.)  (6, 1.)  (7, 1.)  (9, 1.)  (10, 1.)
(16, 1.)  (19, 1.)  (20, 1.) row 3: (0, 1.)  (1, 1.)  (2, 1.)  (3, 1.)  (5,
1.)  (11, 1.)  (18, 1.)  (19, 1.)  (21, 1.)  (22, 1.)  (31, 1.)  (33, 1.)
row 4: (4, 1.)  (14, 1.)  (15, 1.)  (19, 1.)  (24, 1.)  (25, 1.)  (28, 1.)
(30, 1.)  (37, 1.)  (38, 1.) row 5: (3, 1.)  (5, 1.)  (11, 1.)  (17, 1.)
(22, 1.)  (26, 1.)  (31, 1.)  (32, 1.)  (33, 1.)  (34, 1.) row 6: (1, 1.)
(2, 1.)  (6, 1.)  (7, 1.)  (9, 1.)  (10, 1.)  (16, 1.)  (20, 1.)  (25, 1.)
(30, 1.) row 7: (2, 1.)  (6, 1.)  (7, 1.)  (9, 1.)  (10, 1.)  (13, 1.)
(17, 1.)  (20, 1.)  (32, 1.)  (34, 1.) row 8: (8, 1.)  (9, 1.)  (12, 1.)
(13, 1.)  (26, 1.)  (29, 1.)  (30, 1.)  (36, 1.)  (38, 1.)  (39, 1.) row 9:
(2, 1.)  (6, 1.)  (7, 1.)  (8, 1.)  (9, 1.)  (10, 1.)  (13, 1.)  (16, 1.)
(17, 1.)  (20, 1.)  (25, 1.)  (30, 1.)  (34, 1.) row 10: (2, 1.)  (6, 1.)
(7, 1.)  (9, 1.)  (10, 1.)  (19, 1.)  (20, 1.)  (29, 1.)  (32, 1.)  (34,
1.) row 11: (0, 1.)  (3, 1.)  (5, 1.)  (11, 1.)  (12, 1.)  (14, 1.)  (15,
1.)  (19, 1.)  (22, 1.)  (23, 1.)  (26, 1.)  (27, 1.)  (31, 1.) row 12: (8,
1.)  (11, 1.)  (12, 1.)  (13, 1.)  (15, 1.)  (22, 1.)  (23, 1.)  (26, 1.)
(27, 1.)  (35, 1.)  (36, 1.)  (39, 1.) row 13: (7, 1.)  (8, 1.)  (9, 1.)
(12, 1.)  (13, 1.)  (17, 1.)  (23, 1.)  (26, 1.)  (30, 1.)  (34, 1.)  (35,
1.)  (36, 1.) row 14: (0, 1.)  (4, 1.)  (11, 1.)  (14, 1.)  (15, 1.)  (19,
1.)  (21, 1.)  (23, 1.)  (24, 1.)  (25, 1.)  (28, 1.)  (38, 1.) row 15: (0,
1.)  (4, 1.)  (11, 1.)  (12, 1.)  (14, 1.)  (15, 1.)  (18, 1.)  (21, 1.)
(23, 1.)  (25, 1.)  (27, 1.)  (28, 1.)  (35, 1.)  (36, 1.) row 16: (1, 1.)
(2, 1.)  (6, 1.)  (9, 1.)  (16, 1.)  (18, 1.)  (21, 1.)  (25, 1.)  (30, 1.)
row 17: (1, 1.)  (5, 1.)  (7, 1.)  (9, 1.)  (13, 1.)  (17, 1.)  (18, 1.)
(21, 1.)  (31, 1.)  (33, 1.)  (34, 1.)  (35, 1.)  (36, 1.) row 18: (1, 1.)
(3, 1.)  (15, 1.)  (16, 1.)  (17, 1.)  (18, 1.)  (21, 1.)  (23, 1.)  (31,
1.)  (33, 1.)  (35, 1.)  (36, 1.) row 19: (0, 1.)  (2, 1.)  (3, 1.)  (4,
1.)  (10, 1.)  (11, 1.)  (14, 1.)  (19, 1.)  (20, 1.)  (24, 1.)  (29, 1.)
(32, 1.)  (38, 1.) row 20: (2, 1.)  (6, 1.)  (7, 1.)  (9, 1.)  (10, 1.)
(19, 1.)  (20, 1.) row 21: (1, 1.)  (3, 1.)  (14, 1.)  (15, 1.)  (16, 1.)
(17, 1.)  (18, 1.)  (21, 1.)  (23, 1.)  (25, 1.)  (28, 1.)  (30, 1.)  (33,
1.)  (35, 1.) row 22: (0, 1.)  (3, 1.)  (5, 1.)  (11, 1.)  (12, 1.)  (22,
1.)  (26, 1.)  (27, 1.)  (31, 1.)  (32, 1.)  (33, 1.)  (34, 1.) row 23: (0,
1.)  (11, 1.)  (12, 1.)  (13, 1.)  (14, 1.)  (15, 1.)  (18, 1.)  (21, 1.)
(23, 1.)  (27, 1.)  (35, 1.)  (36, 1.) row 24: (0, 1.)  (4, 1.)  (14, 1.)
(19, 1.)  (24, 1.)  (25, 1.)  (28, 1.)  (29, 1.)  (30, 1.)  (37, 1.)  (38,
1.) row 25: (4, 1.)  (6, 1.)  (9, 1.)  (14, 1.)  (15, 1.)  (16, 1.)  (21,
1.)  (24, 1.)  (25, 1.)  (28, 1.)  (30, 1.)  (37, 1.) row 26: (5, 1.)  (8,
1.)  (11, 1.)  (12, 1.)  (13, 1.)  (22, 1.)  (26, 1.)  (27, 1.)  (29, 1.)
(32, 1.)  (39, 1.) row 27: (0, 1.)  (11, 1.)  (12, 1.)  (15, 1.)  (22, 1.)
(23, 1.)  (26, 1.)  (27, 1.)  (35, 1.)  (36, 1.) row 28: (0, 1.)  (4, 1.)
(14, 1.)  (15, 1.)  (21, 1.)  (24, 1.)  (25, 1.)  (28, 1.)  (30, 1.)  (37,
1.) row 29: (8, 1.)  (10, 1.)  (19, 1.)  (24, 1.)  (26, 1.)  (29, 1.)  (32,
1.)  (34, 1.)  (38, 1.)  (39, 1.) row 30: (4, 1.)  (6, 1.)  (8, 1.)  (9,
1.)  (13, 1.)  (16, 1.)  (21, 1.)  (24, 1.)  (25, 1.)  (28, 1.)  (30, 1.)
(37, 1.)  (38, 1.) row 31: (3, 1.)  (5, 1.)  (11, 1.)  (17, 1.)  (18, 1.)
(22, 1.)  (31, 1.)  (33, 1.)  (34, 1.) row 32: (5, 1.)  (7, 1.)  (10, 1.)
(19, 1.)  (22, 1.)  (26, 1.)  (29, 1.)  (32, 1.)  (34, 1.)  (39, 1.) row
33: (1, 1.)  (3, 1.)  (5, 1.)  (17, 1.)  (18, 1.)  (21, 1.)  (22, 1.)  (31,
1.)  (33, 1.)  (34, 1.)  (35, 1.) row 34: (5, 1.)  (7, 1.)  (9, 1.)  (10,
1.)  (13, 1.)  (17, 1.)  (22, 1.)  (29, 1.)  (31, 1.)  (32, 1.)  (33, 1.)
(34, 1.) row 35: (12, 1.)  (13, 1.)  (15, 1.)  (17, 1.)  (18, 1.)  (21,
1.)  (23, 1.)  (27, 1.)  (33, 1.)  (35, 1.)  (36, 1.) row 36: (8, 1.)  (12,
1.)  (13, 1.)  (15, 1.)  (17, 1.)  (18, 1.)  (23, 1.)  (27, 1.)  (35, 1.)
(36, 1.) row 37: (4, 1.)  (24, 1.)  (25, 1.)  (28, 1.)  (30, 1.)  (37, 1.)
(38, 1.) row 38: (4, 1.)  (8, 1.)  (14, 1.)  (19, 1.)  (24, 1.)  (29, 1.)
(30, 1.)  (37, 1.)  (38, 1.)  (39, 1.) row 39: (8, 1.)  (12, 1.)  (26, 1.)
(29, 1.)  (32, 1.)  (38, 1.)  (39, 1.) *


A naive back-tracking gives 8 colors, but all the algorithms in PETSc give
20 colors. Is it supposed to be like this?

Fande,


On Thu, Mar 23, 2017 at 10:50 AM, Hong  wrote:

> Fande,
>>
>
>
>> I was wondering if the coloring approaches listed online are working?
>> Which ones are in parallel, and which ones are in sequential?
>>
>> 

[petsc-users] coloring algorithms

2017-03-23 Thread Kong, Fande
Hi All,

I was wondering if the coloring approaches listed online are working? Which
ones run in parallel, and which ones are sequential?

http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MatColoringType.html#MatColoringType

If the coloring is in parallel, can it be used with the finite difference
to compute the Jacobian? Any limitations?

Fande,
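
For reference, a minimal sketch of the pattern being asked about, using the MatColoring
API together with a finite-difference Jacobian. The names snes, J, FormFunction and ierr
are assumptions for the sketch (they are not from this thread), and JP is used here only
because it runs in parallel:

  MatColoring   mc;
  ISColoring    iscoloring;
  MatFDColoring fdcoloring;

  ierr = MatColoringCreate(J,&mc);CHKERRQ(ierr);
  ierr = MatColoringSetType(mc,MATCOLORINGJP);CHKERRQ(ierr);   /* a parallel coloring */
  ierr = MatColoringSetDistance(mc,2);CHKERRQ(ierr);           /* distance 2 is what FD Jacobians need */
  ierr = MatColoringSetFromOptions(mc);CHKERRQ(ierr);
  ierr = MatColoringApply(mc,&iscoloring);CHKERRQ(ierr);
  ierr = MatColoringDestroy(&mc);CHKERRQ(ierr);

  ierr = MatFDColoringCreate(J,iscoloring,&fdcoloring);CHKERRQ(ierr);
  ierr = MatFDColoringSetFunction(fdcoloring,(PetscErrorCode (*)(void))FormFunction,NULL);CHKERRQ(ierr);
  ierr = MatFDColoringSetFromOptions(fdcoloring);CHKERRQ(ierr);
  ierr = MatFDColoringSetUp(J,iscoloring,fdcoloring);CHKERRQ(ierr);
  ierr = SNESSetJacobian(snes,J,J,SNESComputeJacobianDefaultColor,fdcoloring);CHKERRQ(ierr);
  ierr = ISColoringDestroy(&iscoloring);CHKERRQ(ierr);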


[petsc-users] KSPNormType natural

2017-03-13 Thread Kong, Fande
Hi All,

What is the definition of KSPNormType natural? It is easy to understand
none, preconditioned, and unpreconditioned, but not natural.

Fande Kong,
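
For reference, and hedged since the eventual reply is not part of this archive: the usual
reading is that, for a preconditioner B (an approximation to A^-1), the natural norm is
the true residual measured in the inner product induced by B,

  ||r||_natural = sqrt(r^T B r),

which is a quantity CG-type methods already compute inside their recurrence, so it comes
essentially for free there.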


Re: [petsc-users] CG with right preconditioning supports NONE norm type only

2017-03-08 Thread Kong, Fande
Thanks, Barry.

Fande,

On Wed, Mar 8, 2017 at 3:55 PM, Barry Smith <bsm...@mcs.anl.gov> wrote:

>
>A proposed fix https://urldefense.proofpoint.com/v2/url?u=https-3A__
> bitbucket.org_petsc_petsc_pull-2Drequests_645_do-2Dnot-
> 2Dassume-2Dthat-2Dall-2Dksp-2Dmethods-2Dsupport=DQIFAg=
> 54IZrppPQZKX9mLzcGdPfFD1hxrcB__aEkJFOKJFd00=DUUt3SRGI0_
> JgtNaS3udV68GRkgV4ts7XKfj2opmiCY=RbF_pG6G05IcrxiELCCV36C6Cb_
> GqQZ7H84RH1hRQik=p1nuatzGn2KrF98argO7-qTt4U64Rzny3KoN-IJLOv4=
>
>Needs Jed's approval.
>
>Barry
>
>
>
> > On Mar 8, 2017, at 10:33 AM, Barry Smith <bsm...@mcs.anl.gov> wrote:
> >
> >
> >  Please tell us how you got this output.
> >
> >  PETSc CG doesn't even implement right preconditioning. If you ask for
> it it should error out. CG supports no norm computation with left
> preconditioning.
> >
> >   Barry
> >
> >> On Mar 8, 2017, at 10:26 AM, Kong, Fande <fande.k...@inl.gov> wrote:
> >>
> >> Hi All,
> >>
> >> The NONE norm type is supported only when CG is used with a right
> preconditioner. Any reason for this?
> >>
> >>
> >>
> >> 0 Nonlinear |R| = 1.732051e+00
> >>  0 Linear |R| = 0.00e+00
> >>  1 Linear |R| = 0.00e+00
> >>  2 Linear |R| = 0.00e+00
> >>  3 Linear |R| = 0.00e+00
> >>  4 Linear |R| = 0.00e+00
> >>  5 Linear |R| = 0.00e+00
> >>  6 Linear |R| = 0.00e+00
> >> 1 Nonlinear |R| = 1.769225e-08
> >>  0 Linear |R| = 0.00e+00
> >>  1 Linear |R| = 0.00e+00
> >>  2 Linear |R| = 0.00e+00
> >>  3 Linear |R| = 0.00e+00
> >>  4 Linear |R| = 0.00e+00
> >>  5 Linear |R| = 0.00e+00
> >>  6 Linear |R| = 0.00e+00
> >>  7 Linear |R| = 0.00e+00
> >>  8 Linear |R| = 0.00e+00
> >>  9 Linear |R| = 0.00e+00
> >> 10 Linear |R| = 0.00e+00
> >> 2 Nonlinear |R| = 0.00e+00
> >> SNES Object: 1 MPI processes
> >>  type: newtonls
> >>  maximum iterations=50, maximum function evaluations=1
> >>  tolerances: relative=1e-08, absolute=1e-50, solution=1e-50
> >>  total number of linear solver iterations=18
> >>  total number of function evaluations=23
> >>  norm schedule ALWAYS
> >>  SNESLineSearch Object:   1 MPI processes
> >>type: bt
> >>  interpolation: cubic
> >>  alpha=1.00e-04
> >>maxstep=1.00e+08, minlambda=1.00e-12
> >>tolerances: relative=1.00e-08, absolute=1.00e-15,
> lambda=1.00e-08
> >>maximum iterations=40
> >>  KSP Object:   1 MPI processes
> >>type: cg
> >>maximum iterations=1, initial guess is zero
> >>tolerances:  relative=1e-05, absolute=1e-50, divergence=1.
> >>right preconditioning
> >>using NONE norm type for convergence test
> >>  PC Object:   1 MPI processes
> >>type: hypre
> >>  HYPRE BoomerAMG preconditioning
> >>  HYPRE BoomerAMG: Cycle type V
> >>  HYPRE BoomerAMG: Maximum number of levels 25
> >>  HYPRE BoomerAMG: Maximum number of iterations PER hypre call 1
> >>  HYPRE BoomerAMG: Convergence tolerance PER hypre call 0.
> >>  HYPRE BoomerAMG: Threshold for strong coupling 0.25
> >>  HYPRE BoomerAMG: Interpolation truncation factor 0.
> >>  HYPRE BoomerAMG: Interpolation: max elements per row 0
> >>  HYPRE BoomerAMG: Number of levels of aggressive coarsening 0
> >>  HYPRE BoomerAMG: Number of paths for aggressive coarsening 1
> >>  HYPRE BoomerAMG: Maximum row sums 0.9
> >>  HYPRE BoomerAMG: Sweeps down 1
> >>  HYPRE BoomerAMG: Sweeps up   1
> >>  HYPRE BoomerAMG: Sweeps on coarse1
> >>  HYPRE BoomerAMG: Relax down  symmetric-SOR/Jacobi
> >>  HYPRE BoomerAMG: Relax upsymmetric-SOR/Jacobi
> >>  HYPRE BoomerAMG: Relax on coarse Gaussian-elimination
> >>  HYPRE BoomerAMG: Relax weight  (all)  1.
> >>  HYPRE BoomerAMG: Outer relax weight (all) 1.
> >>  HYPRE BoomerAMG: Using CF-relaxation
> >>  HYPRE BoomerAMG: Not using more complex smoothers.
> >>  HYPRE BoomerAMG: Measure typelocal
> >>  HYPRE BoomerAMG: Coarsen typeFalgout
> >>  HYPRE BoomerAMG: Interpolation type  classical
> >>linear system matrix followed by preconditioner matrix:
> >>Mat Object: 1 MPI processes
> >>  type: mffd
> >>  rows=9, cols=9
> >>Matrix-free approximation:
> >>  err=1.49012e-08 (relative error in function evaluation)
> >>  Using wp compute h routine
> >>  Does not compute normU
> >>Mat Object:() 1 MPI processes
> >>  type: seqaij
> >>  rows=9, cols=9
> >>  total: nonzeros=49, allocated nonzeros=49
> >>  total number of mallocs used during MatSetValues calls =0
> >>not using I-node routines
> >>
> >> Fande,
> >>
> >
>
>


Re: [petsc-users] CG with right preconditioning supports NONE norm type only

2017-03-08 Thread Kong, Fande
Thanks Barry,

We are using "KSPSetPCSide(ksp, pcside)" in the code.  I just tried
"-ksp_pc_side right", and petsc did not error out.

I would like to understand why CG does not work with right preconditioning.
Mathematically, does right preconditioning not make sense for CG?

Fande,
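
For what it is worth, a hedged sketch of the usual argument (this is not taken from
Barry's reply): CG needs its operator to be self-adjoint and positive definite in the
inner product it uses. With an SPD preconditioner B, the left-preconditioned operator BA
is self-adjoint in the inner product defined by B^-1 (equivalently by A), and that is the
formulation KSPCG implements, with the natural norm sqrt(r^T B r) falling out of the
recurrence. A right-preconditioned variant can be written down as well, since AB is
self-adjoint in the B inner product, but it is simply not what the PETSc implementation
provides.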

On Wed, Mar 8, 2017 at 9:33 AM, Barry Smith <bsm...@mcs.anl.gov> wrote:

>
>   Please tell us how you got this output.
>
>   PETSc CG doesn't even implement right preconditioning. If you ask for it
> it should error out. CG supports no norm computation with left
> preconditioning.
>
>    Barry
>
> > On Mar 8, 2017, at 10:26 AM, Kong, Fande <fande.k...@inl.gov> wrote:
> >
> > Hi All,
> >
> > The NONE norm type is supported only when CG is used with a right
> preconditioner. Any reason for this?
> >
> >
> >
> > 0 Nonlinear |R| = 1.732051e+00
> >   0 Linear |R| = 0.00e+00
> >   1 Linear |R| = 0.00e+00
> >   2 Linear |R| = 0.00e+00
> >   3 Linear |R| = 0.00e+00
> >   4 Linear |R| = 0.00e+00
> >   5 Linear |R| = 0.00e+00
> >   6 Linear |R| = 0.00e+00
> >  1 Nonlinear |R| = 1.769225e-08
> >   0 Linear |R| = 0.00e+00
> >   1 Linear |R| = 0.00e+00
> >   2 Linear |R| = 0.00e+00
> >   3 Linear |R| = 0.00e+00
> >   4 Linear |R| = 0.00e+00
> >   5 Linear |R| = 0.00e+00
> >   6 Linear |R| = 0.00e+00
> >   7 Linear |R| = 0.00e+00
> >   8 Linear |R| = 0.00e+00
> >   9 Linear |R| = 0.00e+00
> >  10 Linear |R| = 0.00e+00
> >  2 Nonlinear |R| = 0.00e+00
> > SNES Object: 1 MPI processes
> >   type: newtonls
> >   maximum iterations=50, maximum function evaluations=1
> >   tolerances: relative=1e-08, absolute=1e-50, solution=1e-50
> >   total number of linear solver iterations=18
> >   total number of function evaluations=23
> >   norm schedule ALWAYS
> >   SNESLineSearch Object:   1 MPI processes
> > type: bt
> >   interpolation: cubic
> >   alpha=1.00e-04
> > maxstep=1.00e+08, minlambda=1.00e-12
> > tolerances: relative=1.00e-08, absolute=1.00e-15,
> lambda=1.00e-08
> > maximum iterations=40
> >   KSP Object:   1 MPI processes
> > type: cg
> > maximum iterations=1, initial guess is zero
> > tolerances:  relative=1e-05, absolute=1e-50, divergence=1.
> > right preconditioning
> > using NONE norm type for convergence test
> >   PC Object:   1 MPI processes
> > type: hypre
> >   HYPRE BoomerAMG preconditioning
> >   HYPRE BoomerAMG: Cycle type V
> >   HYPRE BoomerAMG: Maximum number of levels 25
> >   HYPRE BoomerAMG: Maximum number of iterations PER hypre call 1
> >   HYPRE BoomerAMG: Convergence tolerance PER hypre call 0.
> >   HYPRE BoomerAMG: Threshold for strong coupling 0.25
> >   HYPRE BoomerAMG: Interpolation truncation factor 0.
> >   HYPRE BoomerAMG: Interpolation: max elements per row 0
> >   HYPRE BoomerAMG: Number of levels of aggressive coarsening 0
> >   HYPRE BoomerAMG: Number of paths for aggressive coarsening 1
> >   HYPRE BoomerAMG: Maximum row sums 0.9
> >   HYPRE BoomerAMG: Sweeps down 1
> >   HYPRE BoomerAMG: Sweeps up   1
> >   HYPRE BoomerAMG: Sweeps on coarse1
> >   HYPRE BoomerAMG: Relax down  symmetric-SOR/Jacobi
> >   HYPRE BoomerAMG: Relax upsymmetric-SOR/Jacobi
> >   HYPRE BoomerAMG: Relax on coarse Gaussian-elimination
> >   HYPRE BoomerAMG: Relax weight  (all)  1.
> >   HYPRE BoomerAMG: Outer relax weight (all) 1.
> >   HYPRE BoomerAMG: Using CF-relaxation
> >   HYPRE BoomerAMG: Not using more complex smoothers.
> >   HYPRE BoomerAMG: Measure typelocal
> >   HYPRE BoomerAMG: Coarsen typeFalgout
> >   HYPRE BoomerAMG: Interpolation type  classical
> > linear system matrix followed by preconditioner matrix:
> > Mat Object: 1 MPI processes
> >   type: mffd
> >   rows=9, cols=9
> > Matrix-free approximation:
> >   err=1.49012e-08 (relative error in function evaluation)
> >   Using wp compute h routine
> >   Does not compute normU
> > Mat Object:() 1 MPI processes
> >   type: seqaij
> >   rows=9, cols=9
> >   total: nonzeros=49, allocated nonzeros=49
> >   total number of mallocs used during MatSetValues calls =0
> > not using I-node routines
> >
> > Fande,
> >
>
>


[petsc-users] CG with right preconditioning supports NONE norm type only

2017-03-08 Thread Kong, Fande
Hi All,

The NONE norm type is supported only when CG is used with a right
preconditioner. Any reason for this?

 0 Nonlinear |R| = 1.732051e+00
      0 Linear |R| = 0.00e+00
      1 Linear |R| = 0.00e+00
      2 Linear |R| = 0.00e+00
      3 Linear |R| = 0.00e+00
      4 Linear |R| = 0.00e+00
      5 Linear |R| = 0.00e+00
      6 Linear |R| = 0.00e+00
 1 Nonlinear |R| = 1.769225e-08
      0 Linear |R| = 0.00e+00
      1 Linear |R| = 0.00e+00
      2 Linear |R| = 0.00e+00
      3 Linear |R| = 0.00e+00
      4 Linear |R| = 0.00e+00
      5 Linear |R| = 0.00e+00
      6 Linear |R| = 0.00e+00
      7 Linear |R| = 0.00e+00
      8 Linear |R| = 0.00e+00
      9 Linear |R| = 0.00e+00
     10 Linear |R| = 0.00e+00
 2 Nonlinear |R| = 0.00e+00
SNES Object: 1 MPI processes
  type: newtonls
  maximum iterations=50, maximum function evaluations=1
  tolerances: relative=1e-08, absolute=1e-50, solution=1e-50
  total number of linear solver iterations=18
  total number of function evaluations=23
  norm schedule ALWAYS
  SNESLineSearch Object:   1 MPI processes
    type: bt
      interpolation: cubic
      alpha=1.00e-04
    maxstep=1.00e+08, minlambda=1.00e-12
    tolerances: relative=1.00e-08, absolute=1.00e-15, lambda=1.00e-08
    maximum iterations=40
  KSP Object:   1 MPI processes
    type: cg
    maximum iterations=1, initial guess is zero
    tolerances:  relative=1e-05, absolute=1e-50, divergence=1.
    right preconditioning
    using NONE norm type for convergence test
  PC Object:   1 MPI processes
    type: hypre
      HYPRE BoomerAMG preconditioning
      HYPRE BoomerAMG: Cycle type V
      HYPRE BoomerAMG: Maximum number of levels 25
      HYPRE BoomerAMG: Maximum number of iterations PER hypre call 1
      HYPRE BoomerAMG: Convergence tolerance PER hypre call 0.
      HYPRE BoomerAMG: Threshold for strong coupling 0.25
      HYPRE BoomerAMG: Interpolation truncation factor 0.
      HYPRE BoomerAMG: Interpolation: max elements per row 0
      HYPRE BoomerAMG: Number of levels of aggressive coarsening 0
      HYPRE BoomerAMG: Number of paths for aggressive coarsening 1
      HYPRE BoomerAMG: Maximum row sums 0.9
      HYPRE BoomerAMG: Sweeps down         1
      HYPRE BoomerAMG: Sweeps up           1
      HYPRE BoomerAMG: Sweeps on coarse    1
      HYPRE BoomerAMG: Relax down          symmetric-SOR/Jacobi
      HYPRE BoomerAMG: Relax up            symmetric-SOR/Jacobi
      HYPRE BoomerAMG: Relax on coarse     Gaussian-elimination
      HYPRE BoomerAMG: Relax weight  (all)      1.
      HYPRE BoomerAMG: Outer relax weight (all) 1.
      HYPRE BoomerAMG: Using CF-relaxation
      HYPRE BoomerAMG: Not using more complex smoothers.
      HYPRE BoomerAMG: Measure type        local
      HYPRE BoomerAMG: Coarsen type        Falgout
      HYPRE BoomerAMG: Interpolation type  classical
    linear system matrix followed by preconditioner matrix:
    Mat Object: 1 MPI processes
      type: mffd
      rows=9, cols=9
        Matrix-free approximation:
          err=1.49012e-08 (relative error in function evaluation)
          Using wp compute h routine
              Does not compute normU
    Mat Object:    () 1 MPI processes
      type: seqaij
      rows=9, cols=9
      total: nonzeros=49, allocated nonzeros=49
      total number of mallocs used during MatSetValues calls =0
        not using I-node routines

Fande,


Re: [petsc-users] block ILU(K) is slower than the point-wise version?

2017-03-07 Thread Kong, Fande
I found one issue on my side. The preallocation is not right for the BAIJ
matrix.  Will this slow down MatLUFactor and MatSolve?

How to convert AIJ to BAIJ using a command-line option?

Fande,
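
For reference, one hedged way to do the conversion in code rather than on the command
line, assuming A is the assembled AIJ matrix:

  Mat Ab;
  ierr = MatConvert(A,MATBAIJ,MAT_INITIAL_MATRIX,&Ab);CHKERRQ(ierr);  /* copy A into block (BAIJ) storage */

For MatLoad-based tests like ex10 above, I believe the command-line route is -mat_type
baij together with -matload_block_size 11, though I have not double-checked the option
name.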

On Tue, Mar 7, 2017 at 3:26 PM, Jed Brown <j...@jedbrown.org> wrote:

> "Kong, Fande" <fande.k...@inl.gov> writes:
>
> > On Tue, Mar 7, 2017 at 3:16 PM, Jed Brown <j...@jedbrown.org> wrote:
> >
> >> Hong <hzh...@mcs.anl.gov> writes:
> >>
> >> > Fande,
> >> > Got it. Below are what I get:
> >>
> >> Is Fande using ILU(0) or ILU(k)?  (And I think it should be possible to
> >> get a somewhat larger benefit.)
> >>
> >
> >
> > I am using ILU(0). Will it be much better to use ILU(k>0)?
>
> It'll be slower, but might converge faster.  You asked about ILU(k) so I
> assumed you were interested in k>0.
>


Re: [petsc-users] block ILU(K) is slower than the point-wise version?

2017-03-07 Thread Kong, Fande
On Tue, Mar 7, 2017 at 3:16 PM, Jed Brown <j...@jedbrown.org> wrote:

> Hong <hzh...@mcs.anl.gov> writes:
>
> > Fande,
> > Got it. Below are what I get:
>
> Is Fande using ILU(0) or ILU(k)?  (And I think it should be possible to
> get a somewhat larger benefit.)
>


I am using ILU(0). Will it be much better to use ILU(k>0)?

Fande,



>
> > petsc/src/ksp/ksp/examples/tutorials (master)
> > $ ./ex10 -f0 binaryoutput -rhs 0 -mat_view ascii::ascii_info
> > Mat Object: 1 MPI processes
> >   type: seqaij
> >   rows=8019, cols=8019, bs=11
> >   total: nonzeros=1890625, allocated nonzeros=1890625
> >   total number of mallocs used during MatSetValues calls =0
> > using I-node routines: found 2187 nodes, limit used is 5
> > Number of iterations =   3
> > Residual norm 0.00200589
> >
> > -mat_type aij
> > MatMult4 1.0 8.3621e-03 1.0 1.51e+07 1.0 0.0e+00 0.0e+00
> > 0.0e+00  6  7  0  0  0   7  7  0  0  0  1805
> > MatSolve   4 1.0 8.3971e-03 1.0 1.51e+07 1.0 0.0e+00 0.0e+00
> > 0.0e+00  6  7  0  0  0   7  7  0  0  0  1797
> > MatLUFactorNum 1 1.0 8.6171e-02 1.0 1.80e+08 1.0 0.0e+00 0.0e+00
> > 0.0e+00 57 85  0  0  0  70 85  0  0  0  2086
> > MatILUFactorSym1 1.0 1.4951e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00
> > 0.0e+00 10  0  0  0  0  12  0  0  0  0 0
> >
> > -mat_type baij
> > MatMult4 1.0 5.5540e-03 1.0 1.51e+07 1.0 0.0e+00 0.0e+00
> > 0.0e+00  4  5  0  0  0   7  5  0  0  0  2718
> > MatSolve   4 1.0 7.0803e-03 1.0 1.48e+07 1.0 0.0e+00 0.0e+00
> > 0.0e+00  5  5  0  0  0   8  5  0  0  0  2086
> > MatLUFactorNum 1 1.0 6.0118e-02 1.0 2.55e+08 1.0 0.0e+00 0.0e+00
> > 0.0e+00 42 89  0  0  0  72 89  0  0  0  4241
> > MatILUFactorSym1 1.0 6.7251e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00
> > 0.0e+00  5  0  0  0  0   8  0  0  0  0 0
> >
> > I ran it on my macpro. baij is faster than aij in all routines.
> >
> > Hong
> >
> > On Tue, Mar 7, 2017 at 2:26 PM, Kong, Fande <fande.k...@inl.gov> wrote:
> >
> >> Uploaded to google drive, and sent you links in another email. Not sure
> if
> >> it works or not.
> >>
> >> Fande,
> >>
> >> On Tue, Mar 7, 2017 at 12:29 PM, Barry Smith <bsm...@mcs.anl.gov>
> wrote:
> >>
> >>>
> >>>It is too big for email you can post it somewhere so we can download
> >>> it.
> >>>
> >>>
> >>>
> >>> > On Mar 7, 2017, at 12:01 PM, Kong, Fande <fande.k...@inl.gov> wrote:
> >>> >
> >>> >
> >>> >
> >>> > On Tue, Mar 7, 2017 at 10:23 AM, Hong <hzh...@mcs.anl.gov> wrote:
> >>> > I checked
> >>> > MatILUFactorSymbolic_SeqBAIJ() and MatILUFactorSymbolic_SeqAIJ(),
> >>> > they are virtually same. Why the version for BAIJ is so much slower?
> >>> > I'll investigate it.
> >>> >
> >>> > Fande,
> >>> > How large is your matrix? Is it possible to send us your matrix so I
> >>> can test it?
> >>> >
> >>> > Thanks, Hong,
> >>> >
> >>> > It is a 3020875x3020875 matrix, and it is large. I can make a small
> one
> >>> if you like, but not sure it will reproduce this issue or not.
> >>> >
> >>> > Fande,
> >>> >
> >>> >
> >>> >
> >>> > Hong
> >>> >
> >>> >
> >>> > On Mon, Mar 6, 2017 at 9:08 PM, Barry Smith <bsm...@mcs.anl.gov>
> wrote:
> >>> >
> >>> >   Thanks. Even the symbolic is slower for BAIJ. I don't like that, it
> >>> definitely should not be since it is (at least should be) doing a
> symbolic
> >>> factorization on a symbolic matrix 1/11th the size!
> >>> >
> >>> >Keep us informed.
> >>> >
> >>> >
> >>> >
> >>> > > On Mar 6, 2017, at 5:44 PM, Kong, Fande <fande.k...@inl.gov>
> wrote:
> >>> > >
> >>> > > Thanks, Barry,
> >>> > >
> >>> > > Log info:
> >>> > >
> >>> > > AIJ:
> >>> > >
> >>> > > MatSolve 850 1.0 8.6543e+00 4.2 3.04e+09 1.8 0.0e+00
> >>> 0.0e+00 0.0e+00  0 41  0  0  0   0 41  0  0  0 49594
> >>> > > MatLUFactorNum25 1.0 1.7622e+

Re: [petsc-users] block ILU(K) is slower than the point-wise version?

2017-03-07 Thread Kong, Fande
On Tue, Mar 7, 2017 at 2:07 PM, Barry Smith <bsm...@mcs.anl.gov> wrote:

>
>The matrix is too small. Please post ONE big matrix
>

I am using "-ksp_view_pmat  binary" to save the matrix. How can I save the
latest one only for a time-dependent problem?


Fande,



>
> > On Mar 7, 2017, at 2:26 PM, Kong, Fande <fande.k...@inl.gov> wrote:
> >
> > Uploaded to google drive, and sent you links in another email. Not sure
> if it works or not.
> >
> > Fande,
> >
> > On Tue, Mar 7, 2017 at 12:29 PM, Barry Smith <bsm...@mcs.anl.gov> wrote:
> >
> >It is too big for email you can post it somewhere so we can download
> it.
> >
> >
> > > On Mar 7, 2017, at 12:01 PM, Kong, Fande <fande.k...@inl.gov> wrote:
> > >
> > >
> > >
> > > On Tue, Mar 7, 2017 at 10:23 AM, Hong <hzh...@mcs.anl.gov> wrote:
> > > I checked
> > > MatILUFactorSymbolic_SeqBAIJ() and MatILUFactorSymbolic_SeqAIJ(),
> > > they are virtually same. Why the version for BAIJ is so much slower?
> > > I'll investigate it.
> > >
> > > Fande,
> > > How large is your matrix? Is it possible to send us your matrix so I
> can test it?
> > >
> > > Thanks, Hong,
> > >
> > > It is a 3020875x3020875 matrix, and it is large. I can make a small
> one if you like, but not sure it will reproduce this issue or not.
> > >
> > > Fande,
> > >
> > >
> > >
> > > Hong
> > >
> > >
> > > On Mon, Mar 6, 2017 at 9:08 PM, Barry Smith <bsm...@mcs.anl.gov>
> wrote:
> > >
> > >   Thanks. Even the symbolic is slower for BAIJ. I don't like that, it
> definitely should not be since it is (at least should be) doing a symbolic
> factorization on a symbolic matrix 1/11th the size!
> > >
> > >Keep us informed.
> > >
> > >
> > >
> > > > On Mar 6, 2017, at 5:44 PM, Kong, Fande <fande.k...@inl.gov> wrote:
> > > >
> > > > Thanks, Barry,
> > > >
> > > > Log info:
> > > >
> > > > AIJ:
> > > >
> > > > MatSolve 850 1.0 8.6543e+00 4.2 3.04e+09 1.8 0.0e+00
> 0.0e+00 0.0e+00  0 41  0  0  0   0 41  0  0  0 49594
> > > > MatLUFactorNum25 1.0 1.7622e+00 2.0 2.04e+09 2.1 0.0e+00
> 0.0e+00 0.0e+00  0 26  0  0  0   0 26  0  0  0 153394
> > > > MatILUFactorSym   13 1.0 2.8002e-01 2.9 0.00e+00 0.0 0.0e+00
> 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0 0
> > > >
> > > > BAIJ:
> > > >
> > > > MatSolve 826 1.0 1.3016e+01 1.7 1.42e+10 1.8 0.0e+00
> 0.0e+00 0.0e+00  1 29  0  0  0   1 29  0  0  0 154617
> > > > MatLUFactorNum25 1.0 1.5503e+01 2.0 3.55e+10 2.1 0.0e+00
> 0.0e+00 0.0e+00  1 67  0  0  0   1 67  0  0  0 303190
> > > > MatILUFactorSym   13 1.0 5.7561e-01 1.8 0.00e+00 0.0 0.0e+00
> 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0 0
> > > >
> > > > It looks like both MatSolve and MatLUFactorNum are slower.
> > > >
> > > > I will try your suggestions.
> > > >
> > > > Fande
> > > >
> > > > On Mon, Mar 6, 2017 at 4:14 PM, Barry Smith <bsm...@mcs.anl.gov>
> wrote:
> > > >
> > > >   Note also that if the 11 by 11 blocks are actually sparse (and you
> don't store all the zeros in the blocks in the AIJ format) then then AIJ
> non-block factorization involves less floating point operations and less
> memory access so can be faster than the BAIJ format, depending on "how
> sparse" the blocks are. If you actually "fill in" the 11 by 11 blocks with
> AIJ (with zeros maybe in certain locations) then the above is not true.
> > > >
> > > >
> > > > > On Mar 6, 2017, at 5:10 PM, Barry Smith <bsm...@mcs.anl.gov>
> wrote:
> > > > >
> > > > >
> > > > >   This is because for block size 11 it is using calls to
> LAPACK/BLAS for the block operations instead of custom routines for that
> block size.
> > > > >
> > > > >   Here is what you need to do. For a good sized case run both with
> -log_view and check the time spent in
> > > > > MatLUFactorNumeric, MatLUFactorSymbolic and in MatSolve for AIJ
> and BAIJ. If they have a different number of function calls then divide by
> the function call count to determine the time per function call.
> > > > >
> > > > >   This will tell you 

Re: [petsc-users] block ILU(K) is slower than the point-wise version?

2017-03-07 Thread Kong, Fande
Uploaded to google drive, and sent you links in another email. Not sure if
it works or not.

Fande,

On Tue, Mar 7, 2017 at 12:29 PM, Barry Smith <bsm...@mcs.anl.gov> wrote:

>
>It is too big for email you can post it somewhere so we can download it.
>
>
> > On Mar 7, 2017, at 12:01 PM, Kong, Fande <fande.k...@inl.gov> wrote:
> >
> >
> >
> > On Tue, Mar 7, 2017 at 10:23 AM, Hong <hzh...@mcs.anl.gov> wrote:
> > I checked
> > MatILUFactorSymbolic_SeqBAIJ() and MatILUFactorSymbolic_SeqAIJ(),
> > they are virtually same. Why the version for BAIJ is so much slower?
> > I'll investigate it.
> >
> > Fande,
> > How large is your matrix? Is it possible to send us your matrix so I can
> test it?
> >
> > Thanks, Hong,
> >
> > It is a 3020875x3020875 matrix, and it is large. I can make a small one
> if you like, but not sure it will reproduce this issue or not.
> >
> > Fande,
> >
> >
> >
> > Hong
> >
> >
> > On Mon, Mar 6, 2017 at 9:08 PM, Barry Smith <bsm...@mcs.anl.gov> wrote:
> >
> >   Thanks. Even the symbolic is slower for BAIJ. I don't like that, it
> definitely should not be since it is (at least should be) doing a symbolic
> factorization on a symbolic matrix 1/11th the size!
> >
> >Keep us informed.
> >
> >
> >
> > > On Mar 6, 2017, at 5:44 PM, Kong, Fande <fande.k...@inl.gov> wrote:
> > >
> > > Thanks, Barry,
> > >
> > > Log info:
> > >
> > > AIJ:
> > >
> > > MatSolve 850 1.0 8.6543e+00 4.2 3.04e+09 1.8 0.0e+00
> 0.0e+00 0.0e+00  0 41  0  0  0   0 41  0  0  0 49594
> > > MatLUFactorNum25 1.0 1.7622e+00 2.0 2.04e+09 2.1 0.0e+00
> 0.0e+00 0.0e+00  0 26  0  0  0   0 26  0  0  0 153394
> > > MatILUFactorSym   13 1.0 2.8002e-01 2.9 0.00e+00 0.0 0.0e+00
> 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0 0
> > >
> > > BAIJ:
> > >
> > > MatSolve 826 1.0 1.3016e+01 1.7 1.42e+10 1.8 0.0e+00
> 0.0e+00 0.0e+00  1 29  0  0  0   1 29  0  0  0 154617
> > > MatLUFactorNum25 1.0 1.5503e+01 2.0 3.55e+10 2.1 0.0e+00
> 0.0e+00 0.0e+00  1 67  0  0  0   1 67  0  0  0 303190
> > > MatILUFactorSym   13 1.0 5.7561e-01 1.8 0.00e+00 0.0 0.0e+00
> 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0 0
> > >
> > > It looks like both MatSolve and MatLUFactorNum are slower.
> > >
> > > I will try your suggestions.
> > >
> > > Fande
> > >
> > > On Mon, Mar 6, 2017 at 4:14 PM, Barry Smith <bsm...@mcs.anl.gov>
> wrote:
> > >
> > >   Note also that if the 11 by 11 blocks are actually sparse (and you
> don't store all the zeros in the blocks in the AIJ format) then then AIJ
> non-block factorization involves less floating point operations and less
> memory access so can be faster than the BAIJ format, depending on "how
> sparse" the blocks are. If you actually "fill in" the 11 by 11 blocks with
> AIJ (with zeros maybe in certain locations) then the above is not true.
> > >
> > >
> > > > On Mar 6, 2017, at 5:10 PM, Barry Smith <bsm...@mcs.anl.gov> wrote:
> > > >
> > > >
> > > >   This is because for block size 11 it is using calls to LAPACK/BLAS
> for the block operations instead of custom routines for that block size.
> > > >
> > > >   Here is what you need to do. For a good sized case run both with
> -log_view and check the time spent in
> > > > MatLUFactorNumeric, MatLUFactorSymbolic and in MatSolve for AIJ and
> BAIJ. If they have a different number of function calls then divide by the
> function call count to determine the time per function call.
> > > >
> > > >   This will tell you which routine needs to be optimized first
> either MatLUFactorNumeric or MatSolve. My guess is MatSolve.
> > > >
> > > >   So edit src/mat/impls/baij/seq/baijsolvnat.c and copy the
> function MatSolve_SeqBAIJ_15_NaturalOrdering_ver1() to a new function
> MatSolve_SeqBAIJ_11_NaturalOrdering_ver1. Edit the new function for the
> block size of 11.
> > > >
> > > >   Now edit MatLUFactorNumeric_SeqBAIJ_N() so that if block size is
> 11 it uses the new routine something like.
> > > >
> > > > if (both_identity) {
> > > >   if (b->bs == 11)
> > > >C->ops->solve = MatSolve_SeqBAIJ_11_NaturalOrdering_ver1;
> > > >   } else {
> > > >C->ops->solve = MatSolve_SeqBAIJ_N_

Re: [petsc-users] block ILU(K) is slower than the point-wise version?

2017-03-07 Thread Kong, Fande
On Tue, Mar 7, 2017 at 10:23 AM, Hong <hzh...@mcs.anl.gov> wrote:

> I checked
> MatILUFactorSymbolic_SeqBAIJ() and MatILUFactorSymbolic_SeqAIJ(),
> they are virtually same. Why the version for BAIJ is so much slower?
> I'll investigate it.
>

> Fande,
> How large is your matrix? Is it possible to send us your matrix so I can
> test it?
>

Thanks, Hong,

It is a 3020875x3020875 matrix, and it is large. I can make a small one if
you like, but not sure it will reproduce this issue or not.

Fande,



>
> Hong
>
>
> On Mon, Mar 6, 2017 at 9:08 PM, Barry Smith <bsm...@mcs.anl.gov> wrote:
>
>>
>>   Thanks. Even the symbolic is slower for BAIJ. I don't like that, it
>> definitely should not be since it is (at least should be) doing a symbolic
>> factorization on a symbolic matrix 1/11th the size!
>>
>>Keep us informed.
>>
>>
>>
>> > On Mar 6, 2017, at 5:44 PM, Kong, Fande <fande.k...@inl.gov> wrote:
>> >
>> > Thanks, Barry,
>> >
>> > Log info:
>> >
>> > AIJ:
>> >
>> > MatSolve 850 1.0 8.6543e+00 4.2 3.04e+09 1.8 0.0e+00
>> 0.0e+00 0.0e+00  0 41  0  0  0   0 41  0  0  0 49594
>> > MatLUFactorNum25 1.0 1.7622e+00 2.0 2.04e+09 2.1 0.0e+00
>> 0.0e+00 0.0e+00  0 26  0  0  0   0 26  0  0  0 153394
>> > MatILUFactorSym   13 1.0 2.8002e-01 2.9 0.00e+00 0.0 0.0e+00
>> 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0 0
>> >
>> > BAIJ:
>> >
>> > MatSolve 826 1.0 1.3016e+01 1.7 1.42e+10 1.8 0.0e+00
>> 0.0e+00 0.0e+00  1 29  0  0  0   1 29  0  0  0 154617
>> > MatLUFactorNum25 1.0 1.5503e+01 2.0 3.55e+10 2.1 0.0e+00
>> 0.0e+00 0.0e+00  1 67  0  0  0   1 67  0  0  0 303190
>> > MatILUFactorSym   13 1.0 5.7561e-01 1.8 0.00e+00 0.0 0.0e+00
>> 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0 0
>> >
>> > It looks like both MatSolve and MatLUFactorNum are slower.
>> >
>> > I will try your suggestions.
>> >
>> > Fande
>> >
>> > On Mon, Mar 6, 2017 at 4:14 PM, Barry Smith <bsm...@mcs.anl.gov> wrote:
>> >
>> >   Note also that if the 11 by 11 blocks are actually sparse (and you
>> don't store all the zeros in the blocks in the AIJ format) then then AIJ
>> non-block factorization involves less floating point operations and less
>> memory access so can be faster than the BAIJ format, depending on "how
>> sparse" the blocks are. If you actually "fill in" the 11 by 11 blocks with
>> AIJ (with zeros maybe in certain locations) then the above is not true.
>> >
>> >
>> > > On Mar 6, 2017, at 5:10 PM, Barry Smith <bsm...@mcs.anl.gov> wrote:
>> > >
>> > >
>> > >   This is because for block size 11 it is using calls to LAPACK/BLAS
>> for the block operations instead of custom routines for that block size.
>> > >
>> > >   Here is what you need to do. For a good sized case run both with
>> -log_view and check the time spent in
>> > > MatLUFactorNumeric, MatLUFactorSymbolic and in MatSolve for AIJ and
>> BAIJ. If they have a different number of function calls then divide by the
>> function call count to determine the time per function call.
>> > >
>> > >   This will tell you which routine needs to be optimized first either
>> MatLUFactorNumeric or MatSolve. My guess is MatSolve.
>> > >
>> > >   So edit src/mat/impls/baij/seq/baijsolvnat.c and copy the function
>> MatSolve_SeqBAIJ_15_NaturalOrdering_ver1() to a new function
>> MatSolve_SeqBAIJ_11_NaturalOrdering_ver1. Edit the new function for the
>> block size of 11.
>> > >
>> > >   Now edit MatLUFactorNumeric_SeqBAIJ_N() so that if block size is 11
>> it uses the new routine something like.
>> > >
>> > > if (both_identity) {
>> > >   if (b->bs == 11)
>> > >C->ops->solve = MatSolve_SeqBAIJ_11_NaturalOrdering_ver1;
>> > >   } else {
>> > >C->ops->solve = MatSolve_SeqBAIJ_N_NaturalOrdering;
>> > >   }
>> > >
>> > >   Rerun and look at the new -log_view. Send all three -log_view to
>> use at this point.  If this optimization helps and now
>> > > MatLUFactorNumeric is the time sink you can do the process to
>> MatLUFactorNumeric_SeqBAIJ_15_NaturalOrdering() to make an 11 size block
>> custom version.
>> > >
>> > >  Barry
>> > >
>> > >> On Mar 6, 2017, at 4:3

Re: [petsc-users] block ILU(K) is slower than the point-wise version?

2017-03-06 Thread Kong, Fande
Thanks, Barry,

Log info:

AIJ:

MatSolve 850 1.0 8.6543e+00 4.2 3.04e+09 1.8 0.0e+00 0.0e+00
0.0e+00  0 41  0  0  0   0 41  0  0  0 49594
MatLUFactorNum25 1.0 1.7622e+00 2.0 2.04e+09 2.1 0.0e+00 0.0e+00
0.0e+00  0 26  0  0  0   0 26  0  0  0 153394
MatILUFactorSym   13 1.0 2.8002e-01 2.9 0.00e+00 0.0 0.0e+00 0.0e+00
0.0e+00  0  0  0  0  0   0  0  0  0  0 0

BAIJ:

MatSolve 826 1.0 1.3016e+01 1.7 1.42e+10 1.8 0.0e+00 0.0e+00
0.0e+00  1 29  0  0  0   1 29  0  0  0 154617
MatLUFactorNum25 1.0 1.5503e+01 2.0 3.55e+10 2.1 0.0e+00 0.0e+00
0.0e+00  1 67  0  0  0   1 67  0  0  0 303190
MatILUFactorSym   13 1.0 5.7561e-01 1.8 0.00e+00 0.0 0.0e+00 0.0e+00
0.0e+00  0  0  0  0  0   0  0  0  0  0 0

It looks like both MatSolve and MatLUFactorNum are slower.

I will try your suggestions.

Fande

On Mon, Mar 6, 2017 at 4:14 PM, Barry Smith <bsm...@mcs.anl.gov> wrote:

>
>   Note also that if the 11 by 11 blocks are actually sparse (and you don't
> store all the zeros in the blocks in the AIJ format) then then AIJ
> non-block factorization involves less floating point operations and less
> memory access so can be faster than the BAIJ format, depending on "how
> sparse" the blocks are. If you actually "fill in" the 11 by 11 blocks with
> AIJ (with zeros maybe in certain locations) then the above is not true.
>
>
> > On Mar 6, 2017, at 5:10 PM, Barry Smith <bsm...@mcs.anl.gov> wrote:
> >
> >
> >   This is because for block size 11 it is using calls to LAPACK/BLAS for
> the block operations instead of custom routines for that block size.
> >
> >   Here is what you need to do. For a good sized case run both with
> -log_view and check the time spent in
> > MatLUFactorNumeric, MatLUFactorSymbolic and in MatSolve for AIJ and
> BAIJ. If they have a different number of function calls then divide by the
> function call count to determine the time per function call.
> >
> >   This will tell you which routine needs to be optimized first either
> MatLUFactorNumeric or MatSolve. My guess is MatSolve.
> >
> >   So edit src/mat/impls/baij/seq/baijsolvnat.c and copy the function
> MatSolve_SeqBAIJ_15_NaturalOrdering_ver1() to a new function
> MatSolve_SeqBAIJ_11_NaturalOrdering_ver1. Edit the new function for the
> block size of 11.
> >
> >   Now edit MatLUFactorNumeric_SeqBAIJ_N() so that if block size is 11 it
> uses the new routine something like.
> >
> > if (both_identity) {
> >   if (b->bs == 11) {
> >     C->ops->solve = MatSolve_SeqBAIJ_11_NaturalOrdering_ver1;
> >   } else {
> >     C->ops->solve = MatSolve_SeqBAIJ_N_NaturalOrdering;
> >   }
> > }
> >
> >   Rerun and look at the new -log_view. Send all three -log_view to use
> at this point.  If this optimization helps and now
> > MatLUFactorNumeric is the time sink you can do the process to
> MatLUFactorNumeric_SeqBAIJ_15_NaturalOrdering() to make an 11 size block
> custom version.
> >
> >  Barry
> >
> >> On Mar 6, 2017, at 4:32 PM, Kong, Fande <fande.k...@inl.gov> wrote:
> >>
> >>
> >>
> >> On Mon, Mar 6, 2017 at 3:27 PM, Patrick Sanan <patrick.sa...@gmail.com>
> wrote:
> >> On Mon, Mar 6, 2017 at 1:48 PM, Kong, Fande <fande.k...@inl.gov> wrote:
> >>> Hi All,
> >>>
> >>> I am solving a nonlinear system whose Jacobian matrix has a block
> structure.
> >>> More precisely, there is a mesh, and for each vertex there are 11
> variables
> >>> associated with it. I am using BAIJ.
> >>>
> >>> I thought block ILU(k) should be more efficient than the point-wise
> ILU(k).
> >>> After some numerical experiments, I found that the block ILU(K) is much
> >>> slower than the point-wise version.
> >> Do you mean that it takes more iterations to converge, or that the
> >> time per iteration is greater, or both?
> >>
> >> The number of iterations is very similar, but the timer per iteration
> is greater.
> >>
> >>
> >>>
> >>> Any thoughts?
> >>>
> >>> Fande,
> >>
> >
>
>


Re: [petsc-users] block ILU(K) is slower than the point-wise version?

2017-03-06 Thread Kong, Fande
On Mon, Mar 6, 2017 at 3:27 PM, Patrick Sanan <patrick.sa...@gmail.com>
wrote:

> On Mon, Mar 6, 2017 at 1:48 PM, Kong, Fande <fande.k...@inl.gov> wrote:
> > Hi All,
> >
> > I am solving a nonlinear system whose Jacobian matrix has a block
> structure.
> > More precisely, there is a mesh, and for each vertex there are 11
> variables
> > associated with it. I am using BAIJ.
> >
> > I thought block ILU(k) should be more efficient than the point-wise
> ILU(k).
> > After some numerical experiments, I found that the block ILU(K) is much
> > slower than the point-wise version.
> Do you mean that it takes more iterations to converge, or that the
> time per iteration is greater, or both?
>

The number of iterations is very similar, but the time per iteration is
greater.



> >
> > Any thoughts?
> >
> > Fande,
>


[petsc-users] block ILU(K) is slower than the point-wise version?

2017-03-06 Thread Kong, Fande
Hi All,

I am solving a nonlinear system whose Jacobian matrix has a block
structure. More precisely, there is a mesh, and for each vertex there are
11 variables associated with it. I am using BAIJ.

I thought block ILU(k) should be more efficient than the point-wise ILU(k).
After some numerical experiments, I found that the block ILU(K) is much
slower than the point-wise version.

Any thoughts?

Fande,
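
For reference, a minimal sketch of the two setups being compared, hedged since this is
not the actual application code; A, pc, and the block size 11 are taken only from the
description above:

  ierr = MatSetType(A,MATBAIJ);CHKERRQ(ierr);      /* block storage; MATAIJ for the point-wise case */
  ierr = MatSetBlockSize(A,11);CHKERRQ(ierr);      /* 11 unknowns per mesh vertex */
  /* ... preallocation and assembly as usual ... */
  ierr = PCSetType(pc,PCILU);CHKERRQ(ierr);
  ierr = PCFactorSetLevels(pc,0);CHKERRQ(ierr);    /* ILU(0); raise the level for ILU(k) */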


Re: [petsc-users] consider an error message as a petscinfo??

2017-02-24 Thread Kong, Fande
Thanks a lot for your explanation, Barry,

This makes sense!

Fande,

On Fri, Feb 24, 2017 at 1:56 PM, Barry Smith <bsm...@mcs.anl.gov> wrote:

>
> Fande,
>
> Yes. Say one is doing a timestepping and using a direct solver. With
> time stepping we do not want to necessarily stop (in fact we almost never)
> on a failed solve (due to for example a failed factorization). We want the
> code to continue up the stack until it gets to the time stepper with the
> information that the linear solver failed; then the timestepper can decide
> what to do, decrease the time step and try again (common) or switch to
> another method etc. The same could be true for linear solvers called from
> within optimization routines.
>
> The reason we do the funny business with putting infinity into the
> vector is so that errors that occur on any process get propagated to all
> processes during the next inner product or norm computation, otherwise we
> would need to have some "side channel" communication which MPI doesn't
> really provide to propagate errors to all processes, this would be a lot of
> performance crippling communication that would have to take place during
> the entire solution process. Plus our approach since it doesn't change the
> flow of control cannot result in lost memory etc. Note that C (and MPI)
> don't have any exception mechanism that could be used to handle these
> "exceptional cases" directly.
>
>Note that you can use options like -snes_error_if_not_converged or
> -ksp_error_if_not_converged to force the error to be set as soon as any
> problem with convergence or zero factorization is found (good for
> debugging, but not usually for production unless production is solving a
> linear system only).
>
>Barry
>
> > On Feb 24, 2017, at 2:40 PM, Kong, Fande <fande.k...@inl.gov> wrote:
> >
> > Hi All,
> >
> > In MatSolve(), there is a piece of code:
> >
> >   if (mat->errortype) {
> > ierr = PetscInfo1(mat,"MatFactorError %D\n",mat->errortype);CHKERRQ(
> ierr);
> > ierr = VecSetInf(x);CHKERRQ(ierr);
> >   } else {
> > ierr = (*mat->ops->solve)(mat,b,x);CHKERRQ(ierr);
> >   }
> >
> > If a direct solver such as LU or superlu fails to create a
> factorization, why we do not stop here and throw out an error message here
> or earlier?  Now we just let solver keep doing garbage computation, and
> finally we have a solution like this:
> >
> > Vec Object: 1 MPI processes
> >   type: mpi
> > Process [0]
> > inf.
> > inf.
> > inf.
> > inf.
> > inf.
> > inf.
> > inf.
> > inf.
> > inf.
> > inf.
> > inf.
> > inf.
> > inf.
> > inf.
> > inf.
> > inf.
> >
> > Any particular reason to handle the thing in this way?
> >
> > Fande,
> >
> >
>
>


[petsc-users] consider an error message as a petscinfo??

2017-02-24 Thread Kong, Fande
Hi All,

In MatSolve(), there is a piece of code:

  if (mat->errortype) {
    ierr = PetscInfo1(mat,"MatFactorError %D\n",mat->errortype);CHKERRQ(ierr);
    ierr = VecSetInf(x);CHKERRQ(ierr);
  } else {
    ierr = (*mat->ops->solve)(mat,b,x);CHKERRQ(ierr);
  }

If a direct solver such as LU or superlu fails to create a factorization,
why do we not stop and throw an error message here or earlier? Now we just
let the solver keep doing garbage computation, and finally we get a solution
like this:

Vec Object: 1 MPI processes
  type: mpi
Process [0]
inf.
inf.
inf.
inf.
inf.
inf.
inf.
inf.
inf.
inf.
inf.
inf.
inf.
inf.
inf.
inf.

Any particular reason to handle the thing in this way?

Fande,
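
For reference, and as noted in the reply above, the behaviour can be changed at run time:
-ksp_error_if_not_converged and -snes_error_if_not_converged make the solve stop with an
error as soon as a failed factorization or failed convergence is detected, instead of
propagating inf through the solution vector.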


[petsc-users] lock matrices in EPS

2017-02-13 Thread Kong, Fande
Hi ALL,

I am solving a generalized eigenvalue problem Ax=\lambda Bx. I want to
retrieve matrices A and B from EPS, but got the following errors:

[0]PETSC ERROR: No support for this operation for this object type
[0]PETSC ERROR: Cannot retrieve original matrices (have been modified)
[0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for
trouble shooting.
[0]PETSC ERROR: Petsc Release Version 3.7.5, unknown
[0]PETSC ERROR: ./ex14 on a arch-darwin-c-debug named FN604208 by kongf Mon
Feb 13 15:29:33 2017
[0]PETSC ERROR: Configure options --with-clanguage=cxx
--with-shared-libraries=1 --with-blaslapack=1 --with-mpi=1
--download-parmetis=1 --download-metis=1 --with-debugging=yes
--with-c2html=0 --download-hypre=1 --download-superlu_dist=1
PETSC_ARCH=arch-darwin-c-debug
[0]PETSC ERROR: #1 STGetOperators() line 308 in
/Users/kongf/projects/slepc/src/sys/classes/st/interface/stfunc.c
[0]PETSC ERROR: #2 EPSGetOperators() line 319 in
/Users/kongf/projects/slepc/src/eps/interface/epssetup.c


My question is how to lock the original matrices I passed to EPS.
"-st_matmode copy" does not help.

Fande Kong,
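
One hedged workaround sketch, not taken from a reply in this thread: keep independent
handles to A and B on the application side instead of asking EPS/ST to hand them back,
e.g.

  ierr = PetscObjectReference((PetscObject)A);CHKERRQ(ierr);  /* keep A alive independently of EPS */
  ierr = PetscObjectReference((PetscObject)B);CHKERRQ(ierr);
  ierr = EPSSetOperators(eps,A,B);CHKERRQ(ierr);
  /* ... solve, and use A and B directly afterwards ... */
  ierr = MatDestroy(&A);CHKERRQ(ierr);                        /* drop the extra references when done */
  ierr = MatDestroy(&B);CHKERRQ(ierr);

This sidesteps EPSGetOperators() rather than unlocking it, so it may not be what is
wanted here.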


Re: [petsc-users] DMPlexCreateExodusFromFile() does not work on macOS Sierra

2017-01-24 Thread Kong, Fande
Thanks, Blaise and Matt,

I possibly messed up the brew installation somehow.  I will clean up
everything and start over, and report back.

Fande,

On Tue, Jan 24, 2017 at 9:26 AM, Matthew Knepley  wrote:

> Thanks!
>
>Matt
>
> On Tue, Jan 24, 2017 at 8:57 AM, Blaise A Bourdin  wrote:
>
>> Hi,
>>
>> I was able to build petsc master with home-brew gcc 6.3 and intel 17.0
>> under macOS sierra. As Mark said, it is important to reinstall the entire
>> home-brew compiler + libs after upgrading macOS (and often Xcode).
>> I am able to read / write exodus files. I am attaching my configure
>> command lines:
>>
>> with intel 17.0:
>> ./configure ./configure   \
>> --download-chaco=1\
>> --download-exodusii=1 \
>> --download-hdf5=1 \
>> --download-hypre=1\
>> --download-metis=1\
>> --download-ml=1   \
>> --download-netcdf=1   \
>> --download-parmetis=1 \
>> --download-triangle=1 \
>> --with-blas-lapack-dir=$MKLROOT   \
>> --with-debugging=1\
>> --with-mpi-dir=$MPI_HOME  \
>> --with-pic\
>> --with-shared-libraries=1 \
>> --with-vendor-compilers=intel \
>> --with-x11=1
>>
>> with gcc:
>> ./configure \
>> --download-exodusii=1 \
>> --download-chaco=1\
>> --download-ctetgen=1  \
>> --download-hdf5=1 \
>> --download-hypre=1\
>> --download-metis=1\
>> --download-ml=1   \
>> --download-netcdf=1   \
>> --download-parmetis=1 \
>> --download-triangle=1 \
>> --download-yaml=1 \
>> --with-debugging=1\
>> --with-shared-libraries=1 \
>> --with-x11=1
>>
>> Blaise
>>
>> On Jan 23, 2017, at 7:55 AM, Mark Adams  wrote:
>>
>> And I trust you updated all system software (like gcc, NetCDF and
>> ExodusII). OSX upgrades are hell.
>>
>> On Mon, Jan 23, 2017 at 12:36 AM, Matthew Knepley 
>> wrote:
>>
>>> On Sun, Jan 22, 2017 at 11:18 PM, Fande Kong 
>>> wrote:
>>>
 Thanks, Matt,

 Clang does not have this issue. The code runs fine with clang.

>>>
>>> Okay, it sounds like a gcc bug on Mac 10.6, or at least in the version
>>> you have.
>>>
>>>Matt
>>>
>>>
 Fande,

 On Sun, Jan 22, 2017 at 8:03 PM, Matthew Knepley 
 wrote:

> On Sun, Jan 22, 2017 at 8:40 PM, Fande Kong 
> wrote:
>
>> Thanks, Matt.
>>
>> It is a weird bug.
>>
>> Do we have an alternative solution to this? I was wondering whether
>> it is possible to read the ".exo" files without using the ExodusII. For
>> example, can we read the ".exo" files using the netcdf only?
>>
>
> Well, ExodusII is only a think layer on NetCDF, just like other
> wrappers are thin layers on HDF5. It is
> really NetCDF that is failing. Can you switch compilers and see if
> that helps?
>
>   Matt
>
>
>> Fande Kong,
>>
>>
>>
>> On Sun, Jan 22, 2017 at 6:50 PM, Matthew Knepley 
>> wrote:
>>
>>> On Sun, Jan 22, 2017 at 5:28 PM, Fande Kong 
>>> wrote:
>>>
 On Sun, Jan 22, 2017 at 12:35 PM, Matthew Knepley <
 knep...@gmail.com> wrote:

> On Sun, Jan 22, 2017 at 12:41 PM, Fande Kong 
> wrote:
>
>> On Sat, Jan 21, 2017 at 10:47 PM, Matthew Knepley <
>> knep...@gmail.com> wrote:
>>
>>> On Sat, Jan 21, 2017 at 10:38 PM, Fande Kong <
>>> fdkong...@gmail.com> wrote:
>>>
 Hi All,

 I upgraded the OS system to macOS Sierra, and observed that
 PETSc can not read the exodus file any more. The same code runs 
 fine
 on macOS Capitan. I also tested the function 
 DMPlexCreateExodusFromFile()
 against different versions of the GCC compiler such as GCC-5.4 and 
 GCC-6,
 and neither of them work. I guess this issue is related to the 
 external
 package *exodus*, and PETSc might not pick up the right
 enveriment variables for the *exodus.*

 This issue can be reproduced using the following simple code:

>>>
>>> 1) This is just a standard check. Have you reconfigured so that
>>> you know ExodusII was built with the same compilers and system 
>>> libraries?
>>>
>>> 2) If so, can you get a stack trace with gdb or lldb?
>>>
>>
>> 0   

Re: [petsc-users] PetscTableCreateHashSize

2017-01-10 Thread Kong, Fande
BTW, one more question:

There are some pieces of code guarded by #if defined(PETSC_USE_CTABLE) ... #endif.
How do I disable ctable? That is, how do I make PETSC_USE_CTABLE undefined during
configuration?

Fande,

On Tue, Jan 10, 2017 at 12:33 AM, Jed Brown  wrote:

> Satish Balay  writes:
> > I tried looking at it - but it was easier for me to fixup current ctable
> code.
>
> I mentioned it more as a long-term thing.  I don't think we need two
> different hashtable implementations in PETSc.  I think khash is better
> and extensible, so we may as well use it in all new code and migrate
> petsctable to it when convenient.  I don't think it's urgent unless
> someone has a use case that needs it.
>


Re: [petsc-users] PetscTableCreateHashSize

2017-01-09 Thread Kong, Fande
Thanks a lot Satish!

Like Jed said, it would be better if we could come up with an algorithm for
automatically computing a hash size for a given n. Otherwise, we may need
to add more entries to the lookup table again in the future.

Fande,
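
To make that wish concrete, a minimal sketch of such an automatic sizing by plain trial
division; the function name is made up, and the 1.4 growth factor is just the one the
existing lookup table appears to use:

static PetscInt PetscTableHashSizeFor(PetscInt sz)
{
  PetscInt n = (PetscInt)(1.4*sz),d;

  if (n < 3) n = 3;
  if (!(n % 2)) n++;                       /* only odd candidates (>2) can be prime */
  while (PETSC_TRUE) {
    PetscBool isprime = PETSC_TRUE;
    for (d = 3; d*d <= n; d += 2) {
      if (!(n % d)) {isprime = PETSC_FALSE; break;}
    }
    if (isprime) return n;
    n += 2;                                /* try the next odd candidate */
  }
}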

On Mon, Jan 9, 2017 at 2:14 PM, Satish Balay  wrote:

> On Mon, 9 Jan 2017, Jed Brown wrote:
>
> > Satish Balay  writes:
> > > Sure - I'm using a crappy algorithm [look-up table] to get
> > > "prime_number_close_to(1.4*sz)" - as I don't know how to generate
> > > these numbers automatically.
> >
> > FWIW, it only needs to be coprime with PETSC_HASH_FACT.
>
> Not sure I understand - are you saying coprime requirement is easier
> satisfy than a single prime?
>
> I had switched this code to use double-hasing algorithm - and the
> requirement here is - the table size be a prime number. [so I'm
> attempting to estimate a prime number suitable for the table size]
>
> I pushed the following
> https://urldefense.proofpoint.com/v2/url?u=https-3A__
> bitbucket.org_petsc_petsc_commits_d742c75fd0d514f7fa1873d5b10984
> bc3f363031=DQIBAg=54IZrppPQZKX9mLzcGdPfFD1hxrcB_
> _aEkJFOKJFd00=DUUt3SRGI0_JgtNaS3udV68GRkgV4ts7XKfj2opmi
> CY=nkPXHuaxZeHPzOteY25j_Dptk5XyWiqwzaJbEwI5uWY=
> eOjfGCXP3g18VLYhXY5xrlOr7AFn7o3G_YrYVo8Rw8Y=
>
> Satish
>


Re: [petsc-users] PetscTableCreateHashSize

2017-01-09 Thread Kong, Fande
Thanks, Satish,


On Mon, Jan 9, 2017 at 12:36 PM, Satish Balay <ba...@mcs.anl.gov> wrote:

> We can add more entries to the lookup. The stack below looks
> incomplete. Which routine is calling PetscTableCreateHashSize() with
> this big size?
>

call trace:

[4]PETSC ERROR: #3 MatSetUpMultiply_MPIAIJ() line 36 in
/home/schuseba/projects/64_bit_builds/petsc/src/mat/impls/aij/mpi/mmaij.c

[9]PETSC ERROR: #4 MatAssemblyEnd_MPIAIJ() line 747 in
/home/schuseba/projects/64_bit_builds/petsc/src/mat/impls/aij/mpi/mpiaij.c

[9]PETSC ERROR: #4 MatAssemblyEnd_MPIAIJ() line 747 in
/home/schuseba/projects/64_bit_builds/petsc/src/mat/impls/aij/mpi/mpiaij.c


>
> Satish
>
> ---
> $ git diff
> diff --git a/src/sys/utils/ctable.c b/src/sys/utils/ctable.c
> index cd64284..761a2c6 100644
> --- a/src/sys/utils/ctable.c
> +++ b/src/sys/utils/ctable.c
> @@ -25,6 +25,7 @@ static PetscErrorCode PetscTableCreateHashSize(PetscInt
> sz, PetscInt *hsz)
>else if (sz < 819200)  *hsz = 1193557;
>else if (sz < 1638400) *hsz = 2297059;
>else if (sz < 3276800) *hsz = 4902383;
> +  else if (sz < 6553600) *hsz = 9179113;
>else SETERRQ1(PETSC_COMM_SELF,PETSC_ERR_ARG_OUTOFRANGE,"A really huge
> hash is being requested.. cannot process: %D",sz);
>PetscFunctionReturn(0);
>  }
>
> On Mon, 9 Jan 2017, Kong, Fande wrote:
>
> > Hi All,
> >
> > Hash size is set manually according to the number of expected keys in the
> > function PetscTableCreateHashSize(). Any reason to restrict the
> > ``n"<3276800?
> >
> > One user here encountered an issue because of this restriction. The
> > messages are as follows:
> >
> > [3]PETSC ERROR: - Error Message
> > --
> >
> > [3]PETSC ERROR: Argument out of range
> >
> > [3]PETSC ERROR: A really huge hash is being requested.. cannot process:
> > 3497472
> >
> > [3]PETSC ERROR: See https://urldefense.proofpoint.
> com/v2/url?u=http-3A__www.mcs.anl.gov_petsc_documentation_
> faq.html=DQIBAg=54IZrppPQZKX9mLzcGdPfFD1hxrcB_
> _aEkJFOKJFd00=DUUt3SRGI0_JgtNaS3udV68GRkgV4ts7XKfj2opmiCY=
> fvlOBYaS6Bzg7U320hXOmDVca3d6OkyJnp56sjG6pG8=
> Rp5eqZDYZPxEHWb7SoQwATm41rJPVIolrCKuUGdM72U=  for
> > trouble shooting.
> >
> > [3]PETSC ERROR: Petsc Release Version 3.7.4, unknown
> >
> > [3]PETSC ERROR: /home/schuseba/projects/64_bit_builds/yak/yak-opt on a
> > linux-gnu-c-opt named r3i3n0 by schuseba Fri Jan  6 23:15:37 2017
> >
> > [3]PETSC ERROR: Configure options --download-hypre=1 --with-ssl=0
> > --with-debugging=no --with-pic=1 --with-shared-libraries=1
> > --with-64-bit-indices=1 --with-cc=mpicc --with-cxx=mpicxx
> --with-fc=mpif90
> > --download-metis=1 --download-parmetis=1 --download-fblaslapack=1
> > --download-superlu_dist=1 -CC=mpicc -CXX=mpicxx -FC=mpif90 -F77=mpif77
> > -F90=mpif90 -CFLAGS="-fPIC -fopenmp" -CXXFLAGS="-fPIC -fopenmp"
> > -FFLAGS="-fPIC -fopenmp" -FCFLAGS="-fPIC -fopenmp" -F90FLAGS="-fPIC
> > -fopenmp" -F77FLAGS="-fPIC -fopenmp"
> >
> > [3]PETSC ERROR: #1 PetscTableCreateHashSize() line 28 in
> > /home/schuseba/projects/64_bit_builds/petsc/src/sys/utils/ctable.c
> >
> >
> >
> >
> >
> > Fande,
> >
>
>


[petsc-users] PetscTableCreateHashSize

2017-01-09 Thread Kong, Fande
Hi All,

Hash size is set manually according to the number of expected keys in the
function PetscTableCreateHashSize(). Any reason to restrict "n" to be < 3276800?

One user here encountered an issue because of this restriction. The
messages are as follows:

[3]PETSC ERROR: - Error Message
--

[3]PETSC ERROR: Argument out of range

[3]PETSC ERROR: A really huge hash is being requested.. cannot process:
3497472

[3]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for
trouble shooting.

[3]PETSC ERROR: Petsc Release Version 3.7.4, unknown

[3]PETSC ERROR: /home/schuseba/projects/64_bit_builds/yak/yak-opt on a
linux-gnu-c-opt named r3i3n0 by schuseba Fri Jan  6 23:15:37 2017

[3]PETSC ERROR: Configure options --download-hypre=1 --with-ssl=0
--with-debugging=no --with-pic=1 --with-shared-libraries=1
--with-64-bit-indices=1 --with-cc=mpicc --with-cxx=mpicxx --with-fc=mpif90
--download-metis=1 --download-parmetis=1 --download-fblaslapack=1
--download-superlu_dist=1 -CC=mpicc -CXX=mpicxx -FC=mpif90 -F77=mpif77
-F90=mpif90 -CFLAGS="-fPIC -fopenmp" -CXXFLAGS="-fPIC -fopenmp"
-FFLAGS="-fPIC -fopenmp" -FCFLAGS="-fPIC -fopenmp" -F90FLAGS="-fPIC
-fopenmp" -F77FLAGS="-fPIC -fopenmp"

[3]PETSC ERROR: #1 PetscTableCreateHashSize() line 28 in
/home/schuseba/projects/64_bit_builds/petsc/src/sys/utils/ctable.c





Fande,


[petsc-users] superlu_dist issue

2016-11-29 Thread Kong, Fande
Hi All,

I think we have been discussing this topic for a while in other threads.
But I still did not get it yet. PETSc uses 'SamePattern' as the default
FactPattern. Some test cases in MOOSE fail with this default option, but I
can make these tests pass if I set the FactPattern as
'SamePattern_SameRowPerm' by using -mat_superlu_dist_fact
SamePattern_SameRowPerm.

Does this make sense mathematically? I cannot understand it. 'SamePattern'
should be more general than 'SamePattern_SameRowPerm'. In other words, if
something works with 'SamePattern_SameRowPerm', it definitely should work
with 'SamePattern' too.

Thanks as always.

Fande,
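
For reference, the same choice can be made from code by putting the option into
the options database before the factorization is set up. A minimal sketch
(assuming PETSc 3.7+, where PetscOptionsSetValue() takes the options object,
NULL meaning the global database, as its first argument):

ierr = PetscOptionsSetValue(NULL,"-mat_superlu_dist_fact","SamePattern_SameRowPerm");CHKERRQ(ierr);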


Re: [petsc-users] GAMG

2016-10-31 Thread Kong, Fande
On Mon, Oct 31, 2016 at 8:44 AM, Jed Brown  wrote:

> Jeremy Theler  writes:
>
> > Hi again
> >
> > I have been working on these issues. Long story short: it is about the
> > ordering of the unknown fields in the vector.
> >
> > Long story:
> > The physics is linear elastic problem, you can see it does work with LU
> > over a simple cube (warp the displacements to see it does represent an
> > elastic problem, E=200e3, nu=0.3):
> >
> > https://caeplex.com/demo/results.php?id=5817146bdb561
> >
> >
> > Say my three displacements (unknowns) are u,v,w. I can define the
> > unknown vector as (is this called node-based ordering?)
> >
> > [u1 v1 w1 u2 v2 w2 ... un vn wn]^T
> >
> > Another option is (is this called unknown-based ordering?)
> >
> > [u1 u2 ... un v1 v2 ... vn w1 w2 ... wn]^T
> >
> >
> > With lu/preonly the results are the same, although the stiffness matrices
> > for each case are attached as PNGs. And of course, the near-nullspace
> > vectors are different. So PCSetCoordinates() should work with one
> > ordering and not with another one, an issue I did not take into
> > consideration.
> >
> > After understanding Matt's point about the near nullspace (and reading
> > some interesting comments from Jed on scicomp stackexchange) I did built
> > my own vectors (I had to take a look at MatNullSpaceCreateRigidBody()
> > because I found out by running the code the nullspace should be an
> > orthonormal basis, it should say so in the docs).
>
> Where?
>
> "vecs   - the vectors that span the null space (excluding the constant
> vector); these vectors must be orthonormal."
>
> https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/
> MatNullSpaceCreate.html
>
> And if you run in debug mode (default), as you always should until you
> are confident that your code is correct, MatNullSpaceCreate tests that
> your vectors are orthonormal.
>
> > Now, there are some results I do not understand. I tried these six
> > combinations:
> >
> > order  near-nullspace   iterationsnorm
> > -  --   --
> > unknownexplicit 101.6e-6
> > unknownPCSetCoordinates 151.7e-7
> > unknownnone 152.4e-7
> > node   explicit fails with error -11
> > node   PCSetCoordinates fails with error -11
> > node   none 133.8e-7
>
> Did you set a block size for the "node-based" orderings?  Are you sure
> the above is labeled correctly?  Anyway, PCSetCoordinates uses
> "node-based" ordering.  Implementation performance will generally be
> better with node-based ordering -- it has better memory streaming and
> cache behavior.
>
> The AIJ matrix format will also automatically do an "inode" optimization
> to reduce memory bandwidth and enable block smoothing (default
> configuration uses SOR smoothing).  You can use -mat_no_inode to try
> turning that off.
>
> > Error -11 is
> > PETSc's linear solver did not converge with reason
> > 'DIVERGED_PCSETUP_FAILED' (-11)
>
> Isn't there an actual error message?
>
> > Any explanation (for dumbs)?
> > Another thing to take into account: I am setting the dirichlet BCs with
> > MatZeroRows(), but I am not updating the columns to keep symmetry. Can
> > this pose a problem for GAMG?
>
> Usually minor, but it is better to maintain symmetry.
>

If the boundary values are not zero, there is no way to maintain symmetry unless
we reduce the extra part of the matrix (i.e., move it to the right-hand side).
Not updating the columns is better in this situation.

Fande,
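
For what it is worth, a sketch of the "reduce the extra part" alternative,
assuming bcrows lists the nbc constrained rows, the vector bc carries the
(possibly nonzero) Dirichlet values in those rows, and b is the right-hand
side: MatZeroRowsColumns() zeros both the rows and the columns, puts 1.0 on the
diagonal, and moves the eliminated column entries times the boundary values
into b, so symmetry is kept even for nonzero boundary values.

ierr = MatZeroRowsColumns(A,nbc,bcrows,1.0,bc,b);CHKERRQ(ierr);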


Re: [petsc-users] question

2016-10-24 Thread Kong, Fande
Use -snes_linesearch_type basic to turn off the line search; you will then
see that the number of function evaluations is the same as the number
of Newton iterations.

Fande,
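
The same thing can be done from code, assuming an already configured SNES
object snes (a minimal sketch):

SNESLineSearch ls;
ierr = SNESGetLineSearch(snes,&ls);CHKERRQ(ierr);
ierr = SNESLineSearchSetType(ls,SNESLINESEARCHBASIC);CHKERRQ(ierr);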

On Mon, Oct 24, 2016 at 2:13 PM, Gideon Simpson 
wrote:

> I just mean that if I were working a Newton iteration by hand, i.e.,
>
> x_{n+1} = x_n - J^{-1} F(x_n),
>
> I’d be able to count the number of Newton iterations.  I’m trying to see
> how that count would relate to the numbers reported by snes_view.  I’m
> guessing that -snes_monitor is giving a more consistent count of this?
>
>
> -gideon
>
> On Oct 24, 2016, at 4:11 PM, Jed Brown  wrote:
>
> Gideon Simpson  writes:
>
> Ok, so if I’m doing the default Newton Line Search, how would I interpret
> the 5 and the 20, vis a vis what I would be doing with pencil and paper?
>
>
> I don't know what you're doing with pencil and paper.  It's just
> counting the number of residual evaluations and solver iterations
> (Jacobian and preconditioner application).  Use -snes_monitor
> -snes_linesearch_monitor -ksp_monitor for the details.
>
>
>


Re: [petsc-users] question

2016-10-24 Thread Kong, Fande
If you are using the matrix-free method, the number of  function
evaluations is way more  than the number of Newton iterations.

Fande,

On Mon, Oct 24, 2016 at 2:01 PM, Justin Chang  wrote:

> Sorry forgot to hit reply all
>
> On Monday, October 24, 2016, Justin Chang  wrote:
>
>> It depends on your SNES solver. A SNES iteration could involve more than
>> one function evaluation (e.g., line searching). Also, -snes_monitor may say
>> 3 iterations whereas -snes_view might indicate 4 function evaluations which
>> could suggest that the first call was for computing the initial residual.
>>
>> On Mon, Oct 24, 2016 at 2:22 PM, Gideon Simpson > > wrote:
>>
>>> I notice that if I use -snes_view,
>>>
>>> I see lines like:
>>>   total number of linear solver iterations=20
>>>   total number of function evaluations=5
>>> Just to clarify, the number of "function evaluations" corresponds to the
>>> number of Newton (or Newton like) steps, and the total "number of linear
>>> solver iterations” is the total number of iterations needed to solve the
>>> linear problem at each Newton iteration.  Is that correct?  So in the
>>> above, there are 5 steps of Newton and a total of 20 iterations of the
>>> linear solver across all 5 Newton steps.
>>>
>>> -gideon
>>>
>>>
>>


Re: [petsc-users] SuperLU_dist issue in 3.7.4 failure of repeated calls to MatLoad() or MatMPIAIJSetPreallocation() with the same matrix

2016-10-24 Thread Kong, Fande
On Mon, Oct 24, 2016 at 8:07 AM, Kong, Fande <fande.k...@inl.gov> wrote:

>
>
> On Sun, Oct 23, 2016 at 3:56 PM, Barry Smith <bsm...@mcs.anl.gov> wrote:
>
>>
>>Thanks Satish,
>>
>>   I have fixed this in barry/fix-matmpixxxsetpreallocation-reentrant
>> (in next for testing)
>>
>> Fande,
>>
>> This will also make MatMPIAIJSetPreallocation() work properly
>> with multiple calls (you will not need a MatReset()).
>>
>

Does this work for MPIAIJ only? There are also other functions:
MatSeqAIJSetPreallocation(), MatMPIAIJSetPreallocation(),
MatSeqBAIJSetPreallocation(), MatMPIBAIJSetPreallocation(),
MatSeqSBAIJSetPreallocation(), MatMPISBAIJSetPreallocation(), and
MatXAIJSetPreallocation.

We have to use a different function for each matrix type. Could we have a
unified interface for all of them?

Fande,
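
For what it is worth, MatXAIJSetPreallocation() already acts as a
type-independent front end: it dispatches to the appropriate
Mat*SetPreallocation() for whatever type the Mat ends up with. A minimal
sketch, assuming nlocal is the local size and dnnz/onnz hold the per-row
diagonal/off-diagonal counts (the last two arguments are only used for the
SBAIJ upper-triangular storage and are assumed NULL here):

ierr = MatCreate(PETSC_COMM_WORLD,&A);CHKERRQ(ierr);
ierr = MatSetSizes(A,nlocal,nlocal,PETSC_DETERMINE,PETSC_DETERMINE);CHKERRQ(ierr);
ierr = MatSetFromOptions(A);CHKERRQ(ierr);   /* -mat_type aij, baij, sbaij, ... */
ierr = MatXAIJSetPreallocation(A,1,dnnz,onnz,NULL,NULL);CHKERRQ(ierr);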


>
>>Barry
>>
>
> Thanks, Barry.
>
> Fande,
>
>
>>
>>
>> > On Oct 21, 2016, at 6:48 PM, Satish Balay <ba...@mcs.anl.gov> wrote:
>> >
>> > On Fri, 21 Oct 2016, Barry Smith wrote:
>> >
>> >>
>> >>  valgrind first
>> >
>> > balay@asterix /home/balay/download-pine/x/superlu_dist_test
>> > $ mpiexec -n 2 $VG ./ex16 -f ~/datafiles/matrices/small
>> > First MatLoad!
>> > Mat Object: 2 MPI processes
>> >  type: mpiaij
>> > row 0: (0, 4.)  (1, -1.)  (6, -1.)
>> > row 1: (0, -1.)  (1, 4.)  (2, -1.)  (7, -1.)
>> > row 2: (1, -1.)  (2, 4.)  (3, -1.)  (8, -1.)
>> > row 3: (2, -1.)  (3, 4.)  (4, -1.)  (9, -1.)
>> > row 4: (3, -1.)  (4, 4.)  (5, -1.)  (10, -1.)
>> > row 5: (4, -1.)  (5, 4.)  (11, -1.)
>> > row 6: (0, -1.)  (6, 4.)  (7, -1.)  (12, -1.)
>> > row 7: (1, -1.)  (6, -1.)  (7, 4.)  (8, -1.)  (13, -1.)
>> > row 8: (2, -1.)  (7, -1.)  (8, 4.)  (9, -1.)  (14, -1.)
>> > row 9: (3, -1.)  (8, -1.)  (9, 4.)  (10, -1.)  (15, -1.)
>> > row 10: (4, -1.)  (9, -1.)  (10, 4.)  (11, -1.)  (16, -1.)
>> > row 11: (5, -1.)  (10, -1.)  (11, 4.)  (17, -1.)
>> > row 12: (6, -1.)  (12, 4.)  (13, -1.)  (18, -1.)
>> > row 13: (7, -1.)  (12, -1.)  (13, 4.)  (14, -1.)  (19, -1.)
>> > row 14: (8, -1.)  (13, -1.)  (14, 4.)  (15, -1.)  (20, -1.)
>> > row 15: (9, -1.)  (14, -1.)  (15, 4.)  (16, -1.)  (21, -1.)
>> > row 16: (10, -1.)  (15, -1.)  (16, 4.)  (17, -1.)  (22, -1.)
>> > row 17: (11, -1.)  (16, -1.)  (17, 4.)  (23, -1.)
>> > row 18: (12, -1.)  (18, 4.)  (19, -1.)  (24, -1.)
>> > row 19: (13, -1.)  (18, -1.)  (19, 4.)  (20, -1.)  (25, -1.)
>> > row 20: (14, -1.)  (19, -1.)  (20, 4.)  (21, -1.)  (26, -1.)
>> > row 21: (15, -1.)  (20, -1.)  (21, 4.)  (22, -1.)  (27, -1.)
>> > row 22: (16, -1.)  (21, -1.)  (22, 4.)  (23, -1.)  (28, -1.)
>> > row 23: (17, -1.)  (22, -1.)  (23, 4.)  (29, -1.)
>> > row 24: (18, -1.)  (24, 4.)  (25, -1.)  (30, -1.)
>> > row 25: (19, -1.)  (24, -1.)  (25, 4.)  (26, -1.)  (31, -1.)
>> > row 26: (20, -1.)  (25, -1.)  (26, 4.)  (27, -1.)  (32, -1.)
>> > row 27: (21, -1.)  (26, -1.)  (27, 4.)  (28, -1.)  (33, -1.)
>> > row 28: (22, -1.)  (27, -1.)  (28, 4.)  (29, -1.)  (34, -1.)
>> > row 29: (23, -1.)  (28, -1.)  (29, 4.)  (35, -1.)
>> > row 30: (24, -1.)  (30, 4.)  (31, -1.)
>> > row 31: (25, -1.)  (30, -1.)  (31, 4.)  (32, -1.)
>> > row 32: (26, -1.)  (31, -1.)  (32, 4.)  (33, -1.)
>> > row 33: (27, -1.)  (32, -1.)  (33, 4.)  (34, -1.)
>> > row 34: (28, -1.)  (33, -1.)  (34, 4.)  (35, -1.)
>> > row 35: (29, -1.)  (34, -1.)  (35, 4.)
>> > Second MatLoad!
>> > Mat Object: 2 MPI processes
>> >  type: mpiaij
>> > ==4592== Invalid read of size 4
>> > ==4592==at 0x5814014: MatView_MPIAIJ_ASCIIorDraworSocket
>> (mpiaij.c:1402)
>> > ==4592==by 0x5814A75: MatView_MPIAIJ (mpiaij.c:1440)
>> > ==4592==by 0x53373D7: MatView (matrix.c:989)
>> > ==4592==by 0x40107E: main (ex16.c:30)
>> > ==4592==  Address 0xa47b460 is 20 bytes after a block of size 28 alloc'd
>> > ==4592==at 0x4C2FF83: memalign (vg_replace_malloc.c:858)
>> > ==4592==by 0x4FD121A: PetscMallocAlign (mal.c:28)
>> > ==4592==by 0x5842C70: MatSetUpMultiply_MPIAIJ (mmaij.c:41)
>> > ==4592==by 0x5809943: MatAssemblyEnd_MPIAIJ (mpiaij.c:747)
>> > ==4592==by 0x536B299: MatAssemblyEnd (matrix.c:5298)
>> > ==4592==by 0x5829C05: MatLoad_MPIAIJ (mpiaij.c:3032)
>> > ==4592==by 0x5337FEA: MatLoad (matrix.c:1101)
>> > ==4592==by 0x400D9F: main (ex16.c:22)
>> > ==4592==

Re: [petsc-users] SuperLU_dist issue in 3.7.4 failure of repeated calls to MatLoad() or MatMPIAIJSetPreallocation() with the same matrix

2016-10-24 Thread Kong, Fande
On Sun, Oct 23, 2016 at 3:56 PM, Barry Smith  wrote:

>
>Thanks Satish,
>
>   I have fixed this in barry/fix-matmpixxxsetpreallocation-reentrant
> (in next for testing)
>
> Fande,
>
> This will also make MatMPIAIJSetPreallocation() work properly with
> multiple calls (you will not need a MatReset()).
>
>Barry
>

Thanks, Barry.

Fande,


>
>
> > On Oct 21, 2016, at 6:48 PM, Satish Balay  wrote:
> >
> > On Fri, 21 Oct 2016, Barry Smith wrote:
> >
> >>
> >>  valgrind first
> >
> > balay@asterix /home/balay/download-pine/x/superlu_dist_test
> > $ mpiexec -n 2 $VG ./ex16 -f ~/datafiles/matrices/small
> > First MatLoad!
> > Mat Object: 2 MPI processes
> >  type: mpiaij
> > row 0: (0, 4.)  (1, -1.)  (6, -1.)
> > row 1: (0, -1.)  (1, 4.)  (2, -1.)  (7, -1.)
> > row 2: (1, -1.)  (2, 4.)  (3, -1.)  (8, -1.)
> > row 3: (2, -1.)  (3, 4.)  (4, -1.)  (9, -1.)
> > row 4: (3, -1.)  (4, 4.)  (5, -1.)  (10, -1.)
> > row 5: (4, -1.)  (5, 4.)  (11, -1.)
> > row 6: (0, -1.)  (6, 4.)  (7, -1.)  (12, -1.)
> > row 7: (1, -1.)  (6, -1.)  (7, 4.)  (8, -1.)  (13, -1.)
> > row 8: (2, -1.)  (7, -1.)  (8, 4.)  (9, -1.)  (14, -1.)
> > row 9: (3, -1.)  (8, -1.)  (9, 4.)  (10, -1.)  (15, -1.)
> > row 10: (4, -1.)  (9, -1.)  (10, 4.)  (11, -1.)  (16, -1.)
> > row 11: (5, -1.)  (10, -1.)  (11, 4.)  (17, -1.)
> > row 12: (6, -1.)  (12, 4.)  (13, -1.)  (18, -1.)
> > row 13: (7, -1.)  (12, -1.)  (13, 4.)  (14, -1.)  (19, -1.)
> > row 14: (8, -1.)  (13, -1.)  (14, 4.)  (15, -1.)  (20, -1.)
> > row 15: (9, -1.)  (14, -1.)  (15, 4.)  (16, -1.)  (21, -1.)
> > row 16: (10, -1.)  (15, -1.)  (16, 4.)  (17, -1.)  (22, -1.)
> > row 17: (11, -1.)  (16, -1.)  (17, 4.)  (23, -1.)
> > row 18: (12, -1.)  (18, 4.)  (19, -1.)  (24, -1.)
> > row 19: (13, -1.)  (18, -1.)  (19, 4.)  (20, -1.)  (25, -1.)
> > row 20: (14, -1.)  (19, -1.)  (20, 4.)  (21, -1.)  (26, -1.)
> > row 21: (15, -1.)  (20, -1.)  (21, 4.)  (22, -1.)  (27, -1.)
> > row 22: (16, -1.)  (21, -1.)  (22, 4.)  (23, -1.)  (28, -1.)
> > row 23: (17, -1.)  (22, -1.)  (23, 4.)  (29, -1.)
> > row 24: (18, -1.)  (24, 4.)  (25, -1.)  (30, -1.)
> > row 25: (19, -1.)  (24, -1.)  (25, 4.)  (26, -1.)  (31, -1.)
> > row 26: (20, -1.)  (25, -1.)  (26, 4.)  (27, -1.)  (32, -1.)
> > row 27: (21, -1.)  (26, -1.)  (27, 4.)  (28, -1.)  (33, -1.)
> > row 28: (22, -1.)  (27, -1.)  (28, 4.)  (29, -1.)  (34, -1.)
> > row 29: (23, -1.)  (28, -1.)  (29, 4.)  (35, -1.)
> > row 30: (24, -1.)  (30, 4.)  (31, -1.)
> > row 31: (25, -1.)  (30, -1.)  (31, 4.)  (32, -1.)
> > row 32: (26, -1.)  (31, -1.)  (32, 4.)  (33, -1.)
> > row 33: (27, -1.)  (32, -1.)  (33, 4.)  (34, -1.)
> > row 34: (28, -1.)  (33, -1.)  (34, 4.)  (35, -1.)
> > row 35: (29, -1.)  (34, -1.)  (35, 4.)
> > Second MatLoad!
> > Mat Object: 2 MPI processes
> >  type: mpiaij
> > ==4592== Invalid read of size 4
> > ==4592==at 0x5814014: MatView_MPIAIJ_ASCIIorDraworSocket
> (mpiaij.c:1402)
> > ==4592==by 0x5814A75: MatView_MPIAIJ (mpiaij.c:1440)
> > ==4592==by 0x53373D7: MatView (matrix.c:989)
> > ==4592==by 0x40107E: main (ex16.c:30)
> > ==4592==  Address 0xa47b460 is 20 bytes after a block of size 28 alloc'd
> > ==4592==at 0x4C2FF83: memalign (vg_replace_malloc.c:858)
> > ==4592==by 0x4FD121A: PetscMallocAlign (mal.c:28)
> > ==4592==by 0x5842C70: MatSetUpMultiply_MPIAIJ (mmaij.c:41)
> > ==4592==by 0x5809943: MatAssemblyEnd_MPIAIJ (mpiaij.c:747)
> > ==4592==by 0x536B299: MatAssemblyEnd (matrix.c:5298)
> > ==4592==by 0x5829C05: MatLoad_MPIAIJ (mpiaij.c:3032)
> > ==4592==by 0x5337FEA: MatLoad (matrix.c:1101)
> > ==4592==by 0x400D9F: main (ex16.c:22)
> > ==4592==
> > ==4591== Invalid read of size 4
> > ==4591==at 0x5814014: MatView_MPIAIJ_ASCIIorDraworSocket
> (mpiaij.c:1402)
> > ==4591==by 0x5814A75: MatView_MPIAIJ (mpiaij.c:1440)
> > ==4591==by 0x53373D7: MatView (matrix.c:989)
> > ==4591==by 0x40107E: main (ex16.c:30)
> > ==4591==  Address 0xa482958 is 24 bytes before a block of size 7 alloc'd
> > ==4591==at 0x4C2FF83: memalign (vg_replace_malloc.c:858)
> > ==4591==by 0x4FD121A: PetscMallocAlign (mal.c:28)
> > ==4591==by 0x4F31FB5: PetscStrallocpy (str.c:197)
> > ==4591==by 0x4F0D3F5: PetscClassRegLogRegister (classlog.c:253)
> > ==4591==by 0x4EF96E2: PetscClassIdRegister (plog.c:2053)
> > ==4591==by 0x51FA018: VecInitializePackage (dlregisvec.c:165)
> > ==4591==by 0x51F6DE9: VecCreate (veccreate.c:35)
> > ==4591==by 0x51C49F0: VecCreateSeq (vseqcr.c:37)
> > ==4591==by 0x5843191: MatSetUpMultiply_MPIAIJ (mmaij.c:104)
> > ==4591==by 0x5809943: MatAssemblyEnd_MPIAIJ (mpiaij.c:747)
> > ==4591==by 0x536B299: MatAssemblyEnd (matrix.c:5298)
> > ==4591==by 0x5829C05: MatLoad_MPIAIJ (mpiaij.c:3032)
> > ==4591==by 0x5337FEA: MatLoad (matrix.c:1101)
> > ==4591==by 0x400D9F: main (ex16.c:22)
> > ==4591==
> > [0]PETSC ERROR: - Error Message
> 

[petsc-users] matrix preallocation

2016-10-21 Thread Kong, Fande
Hi,

For mechanics problems, the contact surface changes during each nonlinear
iteration. Therefore, the sparsity pattern of the matrix also changes during each
nonlinear iteration. We know that preallocation is important for performance.

My question is: is it possible to re-allocate memory during each nonlinear
iteration?

Fande
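
One possible (sketched) way to handle this, assuming the updated contact search
produces new per-row counts new_dnnz/new_onnz before the next Jacobian
assembly: throw the old operator away and create a freshly preallocated one
whenever the sparsity pattern changes.

ierr = MatDestroy(&J);CHKERRQ(ierr);
ierr = MatCreateAIJ(PETSC_COMM_WORLD,nlocal,nlocal,PETSC_DETERMINE,PETSC_DETERMINE,
                    0,new_dnnz,0,new_onnz,&J);CHKERRQ(ierr);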


Re: [petsc-users] Matrix is missing diagonal entry

2016-10-18 Thread Kong, Fande
Thanks, Hong and Jed.

I am going to explicitly add a few zeros into the matrix.


Regards,

Fande,
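
A minimal sketch of what "explicitly add a few zeros" means here, assuming the
rest of the matrix is assembled with ADD_VALUES (as is typical for finite
elements) and the diagonal positions are included in the preallocation: every
locally owned row then ends up with a stored diagonal entry even where the
physics contributes nothing.

PetscInt rstart,rend,i;
ierr = MatGetOwnershipRange(A,&rstart,&rend);CHKERRQ(ierr);
for (i=rstart; i<rend; i++) {
  ierr = MatSetValue(A,i,i,0.0,ADD_VALUES);CHKERRQ(ierr);
}
ierr = MatAssemblyBegin(A,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
ierr = MatAssemblyEnd(A,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);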

On Tue, Oct 18, 2016 at 9:46 AM, Hong <hzh...@mcs.anl.gov> wrote:

> You need set 0.0 to the diagonals.
> Diagonal storage is used in PETSc library.
>
> Hong
>
>
> On Tue, Oct 18, 2016 at 10:11 AM, Kong, Fande <fande.k...@inl.gov> wrote:
>
>> Hi Developers,
>>
>> Any reason to force users to provide a matrix which does not miss any
>> diagonal entries when using an LU-type solver?
>>
>> Sometimes, it is impossible to have all diagonal entries in a matrix, that
>> is, the matrix has to miss some diagonal entries. For example, there is a
>> saddle-point matrix from the discretization of the incompressible equations,
>> and the lower part of the matrix is a zero block. The matrix usually looks
>> like:
>>
>> | A   B^T  |
>> | B0 |
>>
>>
>>
>>
>>
>> [56]PETSC ERROR: Object is in wrong state
>> [56]PETSC ERROR: Matrix is missing diagonal entry 33
>> [56]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html
>> for trouble shooting.
>> [56]PETSC ERROR: Petsc Release Version 3.6.2, unknown
>> [56]PETSC ERROR: ./fluid on a arch-linux2-cxx-opt named ys0755 by fandek
>> Mon Oct 17 17:06:08 2016
>> [56]PETSC ERROR: Configure options --with-clanguage=cxx
>> --with-shared-libraries=1 --download-fblaslapack=1 --with-mpi=1
>> --download-parmetis=1 --download-metis=1 --with-netcdf=1
>> --download-exodusii=1 --with-hdf5=1 --with-debugging=no --with-c2html=0
>> --with-64-bit-indices=1 --download-hypre=1 --download-superlu_dist=1
>> [56]PETSC ERROR: #1 MatILUFactorSymbolic_SeqAIJ() line 1729 in
>> /petsc_installed/petsc/src/mat/impls/aij/seq/aijfact.c
>> [56]PETSC ERROR: #2 MatILUFactorSymbolic() line 6457 in
>> /petsc_installed/petsc/src/mat/interface/matrix.c
>> [56]PETSC ERROR: #3 PCSetUp_ILU() line 204 in
>> /petsc_installed/petsc/src/ksp/pc/impls/factor/ilu/ilu.c
>> [56]PETSC ERROR: #4 PCSetUp() line 983 in /petsc_installed/petsc/src/ksp
>> /pc/interface/precon.c
>> [56]PETSC ERROR: #5 KSPSetUp() line 332 in /petsc_installed/petsc/src/ksp
>> /ksp/interface/itfunc.c
>> [56]PETSC ERROR: #6 PCSetUpOnBlocks_ASM() line 405 in
>> /petsc_installed/petsc/src/ksp/pc/impls/asm/asm.c
>> [56]PETSC ERROR: #7 PCSetUpOnBlocks() line 1016 in
>> /petsc_installed/petsc/src/ksp/pc/interface/precon.c
>> [56]PETSC ERROR: #8 KSPSetUpOnBlocks() line 167 in
>> /petsc_installed/petsc/src/ksp/ksp/interface/itfunc.c
>> [56]PETSC ERROR: #9 KSPSolve() line 552 in /petsc_installed/petsc/src/ksp
>> /ksp/interface/itfunc.c
>> [56]PETSC ERROR: #10 PCApply_LSC() line 83 in
>> /petsc_installed/petsc/src/ksp/pc/impls/lsc/lsc.c
>> [56]PETSC ERROR: #11 PCApply() line 483 in /petsc_installed/petsc/src/ksp
>> /pc/interface/precon.c
>> [56]PETSC ERROR: #12 KSP_PCApply() line 242 in
>> /petsc_installed/petsc/include/petsc/private/kspimpl.h
>> [56]PETSC ERROR: #13 KSPSolve_PREONLY() line 26 in
>> /petsc_installed/petsc/src/ksp/ksp/impls/preonly/preonly.c
>> [56]PETSC ERROR: #14 KSPSolve() line 604 in /petsc_installed/petsc/src/ksp
>> /ksp/interface/itfunc.c
>> [56]PETSC ERROR: #15 PCApply_FieldSplit_Schur() line 904 in
>> /petsc_installed/petsc/src/ksp/pc/impls/fieldsplit/fieldsplit.c
>> [56]PETSC ERROR: #16 PCApply() line 483 in /petsc_installed/petsc/src/ksp
>> /pc/interface/precon.c
>> [56]PETSC ERROR: #17 KSP_PCApply() line 242 in
>> /petsc_installed/petsc/include/petsc/private/kspimpl.h
>> [56]PETSC ERROR: #18 KSPInitialResidual() line 63 in
>> /petsc_installed/petsc/src/ksp/ksp/interface/itres.c
>> [56]PETSC ERROR: #19 KSPSolve_GMRES() line 235 in
>> /petsc_installed/petsc/src/ksp/ksp/impls/gmres/gmres.c
>> [56]PETSC ERROR: #20 KSPSolve() line 604 in /petsc_installed/petsc/src/ksp
>> /ksp/interface/itfunc.c
>> [56]PETSC ERROR: #21 SNESSolve_NEWTONLS() line 233 in
>> /petsc_installed/petsc/src/snes/impls/ls/ls.c
>> [56]PETSC ERROR: #22 SNESSolve() line 3906 in
>> /petsc_installed/petsc/src/snes/interface/snes.c
>>
>>
>> Thanks,
>>
>> Fande Kong,
>>
>
>


[petsc-users] Matrix is missing diagonal entry

2016-10-18 Thread Kong, Fande
Hi Developers,

Any reason to force users to provide a matrix which does not miss any diagonal
entries when using an LU-type solver?

Sometimes, it is impossible to have all diagonal entries in a matrix, that
is, the matrix has to miss some diagonal entries. For example, there is a
saddle-point matrix from the discretization of the incompressible equations,
and the lower part of the matrix is a zero block. The matrix usually looks
like:

| A   B^T  |
| B0 |





[56]PETSC ERROR: Object is in wrong state
[56]PETSC ERROR: Matrix is missing diagonal entry 33
[56]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html
for trouble shooting.
[56]PETSC ERROR: Petsc Release Version 3.6.2, unknown
[56]PETSC ERROR: ./fluid on a arch-linux2-cxx-opt named ys0755 by fandek
Mon Oct 17 17:06:08 2016
[56]PETSC ERROR: Configure options --with-clanguage=cxx
--with-shared-libraries=1 --download-fblaslapack=1 --with-mpi=1
--download-parmetis=1 --download-metis=1 --with-netcdf=1
--download-exodusii=1 --with-hdf5=1 --with-debugging=no --with-c2html=0
--with-64-bit-indices=1 --download-hypre=1 --download-superlu_dist=1
[56]PETSC ERROR: #1 MatILUFactorSymbolic_SeqAIJ() line 1729 in
/petsc_installed/petsc/src/mat/impls/aij/seq/aijfact.c
[56]PETSC ERROR: #2 MatILUFactorSymbolic() line 6457 in
/petsc_installed/petsc/src/mat/interface/matrix.c
[56]PETSC ERROR: #3 PCSetUp_ILU() line 204 in
/petsc_installed/petsc/src/ksp/pc/impls/factor/ilu/ilu.c
[56]PETSC ERROR: #4 PCSetUp() line 983 in
/petsc_installed/petsc/src/ksp/pc/interface/precon.c
[56]PETSC ERROR: #5 KSPSetUp() line 332 in
/petsc_installed/petsc/src/ksp/ksp/interface/itfunc.c
[56]PETSC ERROR: #6 PCSetUpOnBlocks_ASM() line 405 in
/petsc_installed/petsc/src/ksp/pc/impls/asm/asm.c
[56]PETSC ERROR: #7 PCSetUpOnBlocks() line 1016 in
/petsc_installed/petsc/src/ksp/pc/interface/precon.c
[56]PETSC ERROR: #8 KSPSetUpOnBlocks() line 167 in
/petsc_installed/petsc/src/ksp/ksp/interface/itfunc.c
[56]PETSC ERROR: #9 KSPSolve() line 552 in
/petsc_installed/petsc/src/ksp/ksp/interface/itfunc.c
[56]PETSC ERROR: #10 PCApply_LSC() line 83 in
/petsc_installed/petsc/src/ksp/pc/impls/lsc/lsc.c
[56]PETSC ERROR: #11 PCApply() line 483 in
/petsc_installed/petsc/src/ksp/pc/interface/precon.c
[56]PETSC ERROR: #12 KSP_PCApply() line 242 in
/petsc_installed/petsc/include/petsc/private/kspimpl.h
[56]PETSC ERROR: #13 KSPSolve_PREONLY() line 26 in
/petsc_installed/petsc/src/ksp/ksp/impls/preonly/preonly.c
[56]PETSC ERROR: #14 KSPSolve() line 604 in
/petsc_installed/petsc/src/ksp/ksp/interface/itfunc.c
[56]PETSC ERROR: #15 PCApply_FieldSplit_Schur() line 904 in
/petsc_installed/petsc/src/ksp/pc/impls/fieldsplit/fieldsplit.c
[56]PETSC ERROR: #16 PCApply() line 483 in
/petsc_installed/petsc/src/ksp/pc/interface/precon.c
[56]PETSC ERROR: #17 KSP_PCApply() line 242 in
/petsc_installed/petsc/include/petsc/private/kspimpl.h
[56]PETSC ERROR: #18 KSPInitialResidual() line 63 in
/petsc_installed/petsc/src/ksp/ksp/interface/itres.c
[56]PETSC ERROR: #19 KSPSolve_GMRES() line 235 in
/petsc_installed/petsc/src/ksp/ksp/impls/gmres/gmres.c
[56]PETSC ERROR: #20 KSPSolve() line 604 in
/petsc_installed/petsc/src/ksp/ksp/interface/itfunc.c
[56]PETSC ERROR: #21 SNESSolve_NEWTONLS() line 233 in
/petsc_installed/petsc/src/snes/impls/ls/ls.c
[56]PETSC ERROR: #22 SNESSolve() line 3906 in
/petsc_installed/petsc/src/snes/interface/snes.c


Thanks,

Fande Kong,


Re: [petsc-users] Algorithms to remove null spaces in a singular system

2016-10-17 Thread Kong, Fande
On Thu, Oct 13, 2016 at 8:21 PM, Barry Smith <bsm...@mcs.anl.gov> wrote:

>
>   Fande,
>
>   What SNES method are you using?  If you use SNESKSPONLY I think it is
> ok, it will solve for the norm minimizing least square solution during the
> one KSPSolve() and then return.
>

The problem we are currently working on is a linear problem, but it could
be extended to be nonlinear.  Yes, you are right. "ksponly" indeed works,
and returns the right solution. But the norm of the residual could still
confuse users because it is not close to zero.



>
>   Yes, if you use SNESNEWTONLS or others though the SNES solver will, as
> you say, think that progress has not been made.
>
>I do not like what you propose to do, changing the right hand side of
> the system the user provides is a nasty and surprising side effect.
>

I do not like this way either. The reason I posted this code here is that I
want to let you know what is inconsistent between the nonlinear solvers
and the linear solvers.



>
> What is your goal? To make it look like the SNES system has had a
> residual norm reduction?
>

Yes, I would like to make SNES show a residual reduction.  Possibly, we
could add something in the converged_test function? For example, the
null-space component of the residual vector could be temporarily subtracted
when evaluating the residual norm if the system has a null space?
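
A rough sketch of that idea (the helper name below is made up, not an existing
PETSc routine): before measuring ||F||, project out the component of F that
lies in nullspace(A^T), since that part can never be reduced by the linear
solve.

static PetscErrorCode RemoveUnreachableResidual(Mat A,Vec F)
{
  MatNullSpace   tnullsp;
  PetscErrorCode ierr;

  PetscFunctionBegin;
  ierr = MatGetTransposeNullSpace(A,&tnullsp);CHKERRQ(ierr);
  if (tnullsp) {ierr = MatNullSpaceRemove(tnullsp,F);CHKERRQ(ierr);}
  PetscFunctionReturn(0);
}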



>
>We could generalize you question and ask what about solving for
> nonlinear problems: find the minimal norm solution of min_x || F(x) - b||.
> This may or may not belong in Tao, currently SNES doesn't do any kind of
> nonlinear least squares.
>


It would be great if we could add this kind of solver. Tao does have one,
I think.  I would like to contribute something like this later (of
course, if you are ok with this algorithm), when we are moving to
nonlinear problems in our applications.

Fande Kong,


>
>   Barry
>
>
> > On Oct 13, 2016, at 5:20 PM, Kong, Fande <fande.k...@inl.gov> wrote:
> >
> > One more question.
> >
> > Suppose that we are solving the singular linear system Ax = b. N(A) is
> the null space of A, and N(A^T) is the null space of the transpose of A.
> >
> > The linear system is solved using SNES, that is, F(x) = Ax-b = Ax -b_r -
> b_n. Here b_n  in N(A^T),  and b_r in R(A).  During each nonlinear
> iteration, a linear system A \delta x = F(x) is solved. N(A) is applied to
> Krylov space  during the linear iterating. Before the actual solve
> "(*ksp->ops->solve)(ksp)" for \delta x,  a temporary copy of F(x) is made,
> F_tmp. N(A^T) is applied to F_tmp. We will get a \delta x.  F(x+\delta x )
> = A(x+\delta x)-b_r - b_n.
> >
> > F(x+\delta x ) always contains the vector b_n, and then the algorithm
> never converges because the norm of F is at least 1.
> >
> > Should we apply N(A^T) to F instead of F_tmp so that b_n can be removed
> from F?
> >
> > MatGetTransposeNullSpace(pmat,);
> > if (nullsp) {
> >VecDuplicate(ksp->vec_rhs,);
> >VecCopy(ksp->vec_rhs,btmp);
> >MatNullSpaceRemove(nullsp,btmp);
> >vec_rhs  = ksp->vec_rhs;
> >ksp->vec_rhs = btmp;
> > }
> >
> > should  be changed to
> >
> > MatGetTransposeNullSpace(pmat,);
> > if (nullsp) {
> >MatNullSpaceRemove(nullsp,ksp->vec_rhs);
> > }
> > ???
> >
> > Or other solutions to this issue?
> >
> >
> > Fande Kong,
> >
> >
> >
> >
> >
> > On Thu, Oct 13, 2016 at 8:23 AM, Matthew Knepley <knep...@gmail.com>
> wrote:
> > On Thu, Oct 13, 2016 at 9:06 AM, Kong, Fande <fande.k...@inl.gov> wrote:
> >
> >
> > On Wed, Oct 12, 2016 at 10:21 PM, Jed Brown <j...@jedbrown.org> wrote:
> > Barry Smith <bsm...@mcs.anl.gov> writes:
> > >   I would make that a separate routine that the users would call first.
> >
> > We have VecMDot and VecMAXPY.  I would propose adding
> >
> >   VecQR(PetscInt nvecs,Vec *vecs,PetscScalar *R);
> >
> > (where R can be NULL).
> >
> > What does R mean here?
> >
> > It means the coefficients of the old basis vectors in the new basis.
> >
> >   Matt
> >
> > If nobody working on this, I will be going to take a try.
> >
> > Fande,
> >
> >
> > Does anyone use the "Vecs" type?
> >
> >
> >
> >
> > --
> > What most experimenters take for granted before they begin their
> experiments is infinitely more interesting than any results to which their
> experiments lead.
> > -- Norbert Wiener
> >
>
>


Re: [petsc-users] Algorithms to remove null spaces in a singular system

2016-10-17 Thread Kong, Fande
Hi Barry,

Thanks so much for this work. I will check out your branch and take a look.

Thanks again!

Fande Kong,
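
A minimal sketch of the usage the quoted manual text below describes, assuming
A, b, x, and ksp are already set up and nvec/tvec are normalized vectors
spanning nullspace(A) and nullspace(A') respectively; with both attached,
KSPSolve() returns the norm-minimizing least-squares solution:

MatNullSpace nsp,tnsp;
ierr = MatNullSpaceCreate(PETSC_COMM_WORLD,PETSC_FALSE,1,&nvec,&nsp);CHKERRQ(ierr);
ierr = MatNullSpaceCreate(PETSC_COMM_WORLD,PETSC_FALSE,1,&tvec,&tnsp);CHKERRQ(ierr);
ierr = MatSetNullSpace(A,nsp);CHKERRQ(ierr);
ierr = MatSetTransposeNullSpace(A,tnsp);CHKERRQ(ierr);
ierr = KSPSetOperators(ksp,A,A);CHKERRQ(ierr);
ierr = KSPSolve(ksp,b,x);CHKERRQ(ierr);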

On Thu, Oct 13, 2016 at 8:10 PM, Barry Smith <bsm...@mcs.anl.gov> wrote:

>
>   Fande,
>
>I have done some work, mostly understanding and documentation, on
> handling singular systems with KSP in the branch 
> barry/improve-matnullspace-usage.
> This also includes a new example that solves both a symmetric example and
> an example where nullspace(A) != nullspace(A') src/ksp/ksp/examples/
> tutorials/ex67.c
>
>My understanding is now documented in the manual page for KSPSolve(),
> part of this is quoted below:
>
> ---
>If you provide a matrix that has a MatSetNullSpace() and
> MatSetTransposeNullSpace() this will use that information to solve singular
> systems
>in the least squares sense with a norm minimizing solution.
> $
> $   A x = b   where b = b_p + b_t where b_t is not in the
> range of A (and hence by the fundamental theorem of linear algebra is in
> the nullspace(A') see MatSetNullSpace()
> $
> $KSP first removes b_t producing the linear system  A x = b_p (which
> has multiple solutions) and solves this to find the ||x|| minimizing
> solution (and hence
> $it finds the solution x orthogonal to the nullspace(A). The algorithm
> is simply in each iteration of the Krylov method we remove the nullspace(A)
> from the search
> $direction thus the solution which is a linear combination of the
> search directions has no component in the nullspace(A).
> $
> $We recommend always using GMRES for such singular systems.
> $If nullspace(A) = nullspace(A') (note symmetric matrices always
> satisfy this property) then both left and right preconditioning will work
> $If nullspace(A) != nullspace(A') then left preconditioning will work
> but right preconditioning may not work (or it may).
>
>Developer Note: The reason we cannot always solve  nullspace(A) !=
> nullspace(A') systems with right preconditioning is because we need to
> remove at each iteration
>the nullspace(AB) from the search direction. While we know the
> nullspace(A) the nullspace(AB) equals B^-1 times the nullspace(A) but
> except for trivial preconditioners
>such as diagonal scaling we cannot apply the inverse of the
> preconditioner to a vector and thus cannot compute the nullspace(AB).
> --
>
> Any feed back on the correctness or clarity of the material is
> appreciated. The punch line is that right preconditioning cannot be trusted
> with nullspace(A) != nullspace(A') I don't see any fix for this.
>
>   Barry
>
>
>
> > On Oct 11, 2016, at 3:04 PM, Kong, Fande <fande.k...@inl.gov> wrote:
> >
> >
> >
> > On Tue, Oct 11, 2016 at 12:18 PM, Barry Smith <bsm...@mcs.anl.gov>
> wrote:
> >
> > > On Oct 11, 2016, at 12:01 PM, Kong, Fande <fande.k...@inl.gov> wrote:
> > >
> > >
> > >
> > > On Tue, Oct 11, 2016 at 10:39 AM, Barry Smith <bsm...@mcs.anl.gov>
> wrote:
> > >
> > > > On Oct 11, 2016, at 9:33 AM, Kong, Fande <fande.k...@inl.gov> wrote:
> > > >
> > > > Barry, Thanks so much for your explanation. It helps me a lot.
> > > >
> > > > On Mon, Oct 10, 2016 at 4:00 PM, Barry Smith <bsm...@mcs.anl.gov>
> wrote:
> > > >
> > > > > On Oct 10, 2016, at 4:01 PM, Kong, Fande <fande.k...@inl.gov>
> wrote:
> > > > >
> > > > > Hi All,
> > > > >
> > > > > I know how to remove the null spaces from a singular system using
> creating a MatNullSpace and attaching it to Mat.
> > > > >
> > > > > I was really wondering what is the philosophy behind this? The
> exact algorithms we are using in PETSc right now?  Where we are dealing
> with this, preconditioner, linear solver, or nonlinear solver?
> > > >
> > > >It is in the Krylov solver.
> > > >
> > > >The idea is very simple. Say you have   a singular A with null
> space N (that all values Ny are in the null space of A. So N is tall and
> skinny) and you want to solve A x = b where b is in the range of A. This
> problem has an infinite number of solutions Ny + x*  since A (Ny + x*)
> = ANy + Ax* = Ax* = b where x* is the "minimum norm solution; that is Ax* =
> b and x* has the smallest norm of all solutions.
> > > >
> > > >   With left preconditioning   B A x = B b GMRES, for example,
> normally computes the solution in the as alpha_1 Bb   + alpha_2 BABb +
> alpha_3 BABABAb +   but the B operator will lik

Re: [petsc-users] Algorithms to remove null spaces in a singular system

2016-10-13 Thread Kong, Fande
One more question.

Suppose that we are solving the singular linear system Ax = b. N(A) is the
null space of A, and N(A^T) is the null space of the transpose of A.

The linear system is solved using SNES, that is, F(x) = Ax-b = Ax -b_r -
b_n. Here b_n  in N(A^T),  and b_r in R(A).  During each nonlinear
iteration, a linear system A \delta x = F(x) is solved. N(A) is applied to
Krylov space  during the linear iterating. Before the actual solve
"(*ksp->ops->solve)(ksp)" for \delta x,  a temporary copy of F(x) is made,
F_tmp. N(A^T) is applied to F_tmp. We will get a \delta x.  F(x+\delta x )
= A(x+\delta x)-b_r - b_n.

F(x+\delta x ) always contains the vector b_n, and then the algorithm never
converges because the norm of F is at least 1.

Should we apply N(A^T) to F instead of F_tmp so that b_n can be removed
from F?

MatGetTransposeNullSpace(pmat,);
if (nullsp) {
   VecDuplicate(ksp->vec_rhs,);
   VecCopy(ksp->vec_rhs,btmp);
   MatNullSpaceRemove(nullsp,btmp);
   vec_rhs  = ksp->vec_rhs;
   ksp->vec_rhs = btmp;
}

should  be changed to

MatGetTransposeNullSpace(pmat,);
if (nullsp) {
   MatNullSpaceRemove(nullsp,ksp->vec_rhs);
}
???

Or other solutions to this issue?


Fande Kong,





On Thu, Oct 13, 2016 at 8:23 AM, Matthew Knepley <knep...@gmail.com> wrote:

> On Thu, Oct 13, 2016 at 9:06 AM, Kong, Fande <fande.k...@inl.gov> wrote:
>
>>
>>
>> On Wed, Oct 12, 2016 at 10:21 PM, Jed Brown <j...@jedbrown.org> wrote:
>>
>>> Barry Smith <bsm...@mcs.anl.gov> writes:
>>> >   I would make that a separate routine that the users would call first.
>>>
>>> We have VecMDot and VecMAXPY.  I would propose adding
>>>
>>>   VecQR(PetscInt nvecs,Vec *vecs,PetscScalar *R);
>>>
>>> (where R can be NULL).
>>>
>>
>> What does R mean here?
>>
>
> It means the coefficients of the old basis vectors in the new basis.
>
>   Matt
>
>
>> If nobody working on this, I will be going to take a try.
>>
>> Fande,
>>
>>
>>>
>>> Does anyone use the "Vecs" type?
>>>
>>
>>
>
>
> --
> What most experimenters take for granted before they begin their
> experiments is infinitely more interesting than any results to which their
> experiments lead.
> -- Norbert Wiener
>


Re: [petsc-users] Algorithms to remove null spaces in a singular system

2016-10-13 Thread Kong, Fande
On Wed, Oct 12, 2016 at 10:21 PM, Jed Brown  wrote:

> Barry Smith  writes:
> >   I would make that a separate routine that the users would call first.
>
> We have VecMDot and VecMAXPY.  I would propose adding
>
>   VecQR(PetscInt nvecs,Vec *vecs,PetscScalar *R);
>
> (where R can be NULL).
>

What does R mean here?

If nobody is working on this, I am going to take a try.

Fande,


>
> Does anyone use the "Vecs" type?
>


Re: [petsc-users] Algorithms to remove null spaces in a singular system

2016-10-11 Thread Kong, Fande
Barry,

I am trying to reproduce this issue using a pure PETSc code. VecLoad does
not work for me. I do not know why. Anyway, I can reproduce this using a
very small system.  Here is some info:

Mat, A
Mat Object:() 2 MPI processes
  type: mpiaij
row 0: (0, 1.)
row 1: (0, -0.820827)  (1, 1.51669)  (2, -0.820827)
row 2: (1, -0.820827)  (2, 1.51669)  (3, -0.820827)
row 3: (2, -0.820827)  (3, 1.51669)  (4, -0.820827)
row 4: (3, -0.820827)  (4, 1.51669)  (5, -0.820827)
row 5: (4, -0.820827)  (5, 1.51669)  (6, -0.820827)
row 6: (5, -0.820827)  (6, 1.51669)  (7, -0.820827)
row 7: (6, -0.820827)  (7, 1.51669)  (8, -0.820827)
row 8: (8, 1.)


Right hand side b:

Vec Object: 2 MPI processes
  type: mpi
Process [0]
0.
-0.356693
-0.50444
-0.356693
-5.55112e-17
Process [1]
0.356693
0.50444
0.356693
0.


Mat Null space N(A):

Vec Object: 2 MPI processes
  type: mpi
Process [0]
0.
0.191342
0.353553
0.46194
0.5
Process [1]
0.46194
0.353553
0.191342
6.12323e-17


Please run with two MPI processes using -ksp_pc_side right -pc_type bjacobi
and -ksp_pc_side left -pc_type bjacobi. They will produce different solutions.
The one obtained using "left" is correct (we have an analytical
solution).

I also attached data for matrix, rhs and nullspace, but I am not sure if
you can read them or not. I can load mat.dat, but I could not read rhs.dat
and nullspace.dat.

Fande,
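
In case it helps, a minimal sketch of the binary load path for these files
(assuming rhs.dat was written with PETSc's binary viewer; the Vec has to be
created, and given a type, before VecLoad):

PetscViewer viewer;
Vec         rhs;
ierr = PetscViewerBinaryOpen(PETSC_COMM_WORLD,"rhs.dat",FILE_MODE_READ,&viewer);CHKERRQ(ierr);
ierr = VecCreate(PETSC_COMM_WORLD,&rhs);CHKERRQ(ierr);
ierr = VecSetFromOptions(rhs);CHKERRQ(ierr);
ierr = VecLoad(rhs,viewer);CHKERRQ(ierr);
ierr = PetscViewerDestroy(&viewer);CHKERRQ(ierr);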



On Tue, Oct 11, 2016 at 3:44 PM, Barry Smith <bsm...@mcs.anl.gov> wrote:

>
>   Fande,
>
>  Could you send me (petsc-ma...@mcs.anl.gov) a non symmetric matrix
> you have that has a different null space for A and A'. This would be one
> that is failing with right preconditioning. Smaller the better but whatever
> size you have. Run the code with -ksp_view_mat binary and send the
> resulting file called binaryoutput.
>
>I need a test matrix to update the PETSc code for this case.
>
>
>    Barry
>
> > On Oct 11, 2016, at 3:04 PM, Kong, Fande <fande.k...@inl.gov> wrote:
> >
> >
> >
> > On Tue, Oct 11, 2016 at 12:18 PM, Barry Smith <bsm...@mcs.anl.gov>
> wrote:
> >
> > > On Oct 11, 2016, at 12:01 PM, Kong, Fande <fande.k...@inl.gov> wrote:
> > >
> > >
> > >
> > > On Tue, Oct 11, 2016 at 10:39 AM, Barry Smith <bsm...@mcs.anl.gov>
> wrote:
> > >
> > > > On Oct 11, 2016, at 9:33 AM, Kong, Fande <fande.k...@inl.gov> wrote:
> > > >
> > > > Barry, Thanks so much for your explanation. It helps me a lot.
> > > >
> > > > On Mon, Oct 10, 2016 at 4:00 PM, Barry Smith <bsm...@mcs.anl.gov>
> wrote:
> > > >
> > > > > On Oct 10, 2016, at 4:01 PM, Kong, Fande <fande.k...@inl.gov>
> wrote:
> > > > >
> > > > > Hi All,
> > > > >
> > > > > I know how to remove the null spaces from a singular system using
> creating a MatNullSpace and attaching it to Mat.
> > > > >
> > > > > I was really wondering what is the philosophy behind this? The
> exact algorithms we are using in PETSc right now?  Where we are dealing
> with this, preconditioner, linear solver, or nonlinear solver?
> > > >
> > > >It is in the Krylov solver.
> > > >
> > > >The idea is very simple. Say you have   a singular A with null
> space N (that all values Ny are in the null space of A. So N is tall and
> skinny) and you want to solve A x = b where b is in the range of A. This
> problem has an infinite number of solutions Ny + x*  since A (Ny + x*)
> = ANy + Ax* = Ax* = b where x* is the "minimum norm solution; that is Ax* =
> b and x* has the smallest norm of all solutions.
> > > >
> > > >   With left preconditioning   B A x = B b GMRES, for example,
> normally computes the solution in the as alpha_1 Bb   + alpha_2 BABb +
> alpha_3 BABABAb +   but the B operator will likely introduce some
> component into the direction of the null space so as GMRES continues the
> "solution" computed will grow larger and larger with a large component in
> the null space of A. Hence we simply modify GMRES a tiny bit by building
> the solution from alpha_1 (I-N)Bb   + alpha_2 (I-N)BABb + alpha_3
> > > >
> > > >  Does "I" mean an identity matrix? Could you possibly send me a link
> for this GMRES implementation, that is, how PETSc does this in the actual
> code?
> > >
> > >Yes.
> > >
> > > It is in the helper routine KSP_PCApplyBAorAB()
> > > #undef __FUNCT__
> > > #define __FUNCT__ "KSP_PCApplyBAorAB"
> > > PETSC_STATIC_INLINE PetscErrorCode KSP_PCApplyBAorAB(KSP ksp,Vec x,Vec
> y,Vec w)
>

Re: [petsc-users] Algorithms to remove null spaces in a singular system

2016-10-11 Thread Kong, Fande
On Tue, Oct 11, 2016 at 12:18 PM, Barry Smith <bsm...@mcs.anl.gov> wrote:

>
> > On Oct 11, 2016, at 12:01 PM, Kong, Fande <fande.k...@inl.gov> wrote:
> >
> >
> >
> > On Tue, Oct 11, 2016 at 10:39 AM, Barry Smith <bsm...@mcs.anl.gov>
> wrote:
> >
> > > On Oct 11, 2016, at 9:33 AM, Kong, Fande <fande.k...@inl.gov> wrote:
> > >
> > > Barry, Thanks so much for your explanation. It helps me a lot.
> > >
> > > On Mon, Oct 10, 2016 at 4:00 PM, Barry Smith <bsm...@mcs.anl.gov>
> wrote:
> > >
> > > > On Oct 10, 2016, at 4:01 PM, Kong, Fande <fande.k...@inl.gov> wrote:
> > > >
> > > > Hi All,
> > > >
> > > > I know how to remove the null spaces from a singular system using
> creating a MatNullSpace and attaching it to Mat.
> > > >
> > > > I was really wondering what is the philosophy behind this? The exact
> algorithms we are using in PETSc right now?  Where we are dealing with
> this, preconditioner, linear solver, or nonlinear solver?
> > >
> > >It is in the Krylov solver.
> > >
> > >The idea is very simple. Say you have   a singular A with null
> space N (that all values Ny are in the null space of A. So N is tall and
> skinny) and you want to solve A x = b where b is in the range of A. This
> problem has an infinite number of solutions Ny + x*  since A (Ny + x*)
> = ANy + Ax* = Ax* = b where x* is the "minimum norm solution; that is Ax* =
> b and x* has the smallest norm of all solutions.
> > >
> > >   With left preconditioning   B A x = B b GMRES, for example,
> normally computes the solution in the as alpha_1 Bb   + alpha_2 BABb +
> alpha_3 BABABAb +   but the B operator will likely introduce some
> component into the direction of the null space so as GMRES continues the
> "solution" computed will grow larger and larger with a large component in
> the null space of A. Hence we simply modify GMRES a tiny bit by building
> the solution from alpha_1 (I-N)Bb   + alpha_2 (I-N)BABb + alpha_3
> > >
> > >  Does "I" mean an identity matrix? Could you possibly send me a link
> for this GMRES implementation, that is, how PETSc does this in the actual
> code?
> >
> >Yes.
> >
> > It is in the helper routine KSP_PCApplyBAorAB()
> > #undef __FUNCT__
> > #define __FUNCT__ "KSP_PCApplyBAorAB"
> > PETSC_STATIC_INLINE PetscErrorCode KSP_PCApplyBAorAB(KSP ksp,Vec x,Vec
> y,Vec w)
> > {
> >   PetscErrorCode ierr;
> >   PetscFunctionBegin;
> >   if (!ksp->transpose_solve) {
> > ierr = PCApplyBAorAB(ksp->pc,ksp->pc_side,x,y,w);CHKERRQ(ierr);
> > ierr = KSP_RemoveNullSpace(ksp,y);CHKERRQ(ierr);
> >   } else {
> > ierr = PCApplyBAorABTranspose(ksp->pc,ksp->pc_side,x,y,w);
> CHKERRQ(ierr);
> >   }
> >   PetscFunctionReturn(0);
> > }
> >
> >
> > PETSC_STATIC_INLINE PetscErrorCode KSP_RemoveNullSpace(KSP ksp,Vec y)
> > {
> >   PetscErrorCode ierr;
> >   PetscFunctionBegin;
> >   if (ksp->pc_side == PC_LEFT) {
> > Mat  A;
> > MatNullSpace nullsp;
> > ierr = PCGetOperators(ksp->pc,,NULL);CHKERRQ(ierr);
> > ierr = MatGetNullSpace(A,);CHKERRQ(ierr);
> > if (nullsp) {
> >   ierr = MatNullSpaceRemove(nullsp,y);CHKERRQ(ierr);
> > }
> >   }
> >   PetscFunctionReturn(0);
> > }
> >
> > "ksp->pc_side == PC_LEFT" deals with the left preconditioning Krylov
> methods only? How about the right preconditioning ones? Are  they just
> magically right for the right preconditioning Krylov methods?
>
>This is a good question. I am working on a branch now where I will add
> some more comprehensive testing of the various cases and fix anything that
> comes up.
>
>Were you having trouble with ASM and bjacobi only for right
> preconditioning?
>
>
Yes. ASM and bjacobi work fine for left preconditioning but NOT for RIGHT
preconditioning. bjacobi converges, but produces a wrong solution. ASM
needs more iterations; however, the solution is right.




> Note that when A is symmetric the range of A is orthogonal to null
> space of A so yes I think in that case it is just "magically right" but if
> A is not symmetric then I don't think it is "magically right". I'll work on
> it.
>
>
>Barry
>
> >
> > Fande Kong,
> >
> >
> > There is no code directly in the GMRES or other methods.
> >
> > >
> > > (I-N)BABABAb +   that 
