Hi All,
I think we have been discussing this topic for a while in other threads.
But I still do not get it yet. PETSc uses 'SamePattern' as the default
Fact option. Some test cases in MOOSE fail with this default option, but I
can make these tests pass if I set the Fact option to 'SamePattern_SameRowPerm'.
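For reference, selecting it from the command line looks something like this
(the executable name below is a placeholder; -pc_factor_mat_solver_package and
-mat_superlu_dist_fact should be the relevant option names in PETSc 3.7, as far
as I know):

  mpiexec -n 2 ./my_app -pc_type lu \
      -pc_factor_mat_solver_package superlu_dist \
      -mat_superlu_dist_fact SamePattern_SameRowPerm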
Hong,
I checked out & compiled your new branch:
hzhang/fix-superlu_dist-reuse-factornumeric. Unfortunately it did not
solve the problem.
Sorry.
On 11/21/2016 04:43 AM, Hong wrote:
Anton,
I pushed a fix
https://bitbucket.org/petsc/petsc/commits/28865de08051eb99557d70672c208e14da23c8b1
in
Thanks, Hong.
I will try as soon as possible and let you know.
Anton
On 11/21/2016 04:43 AM, Hong wrote:
Anton,
I pushed a fix
https://bitbucket.org/petsc/petsc/commits/28865de08051eb99557d70672c208e14da23c8b1
in branch hzhang/fix-superlu_dist-reuse-factornumeric.
Can you give it a try to see
Anton,
I pushed a fix
https://bitbucket.org/petsc/petsc/commits/28865de08051eb99557d70672c208e14da23c8b1
in branch hzhang/fix-superlu_dist-reuse-factornumeric.
Can you give it a try to see if it works?
I do not have an example which produces your problem.
In your email, you asked "Setting
Anton:
I am planning to work on this as soon as I get time. I assume that your
code is working with the option '-mat_superlu_dist_fact
SamePattern_SameRowPerm'. If not, let me know.
What I'm planning to do is to detect the existence of Pc and Pr in the PETSc
interface, then set the reuse option, so users do not have to set it manually.
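Roughly, the idea would be something like the sketch below. This is only an
illustration, not the actual petsc/superlu_dist interface code: the function
and the 'first_call'/'have_perms' flags are hypothetical, and the options type
is named superlu_dist_options_t in recent SuperLU_DIST versions (older versions
call it superlu_options_t).

  #include <superlu_ddefs.h>   /* superlu_dist_options_t, fact_t values */

  /* Pick options.Fact when the same matrix (same nonzero pattern) is
     factored repeatedly.  'first_call' means no factorization has been
     done yet; 'have_perms' means a previous factorization already
     computed perm_r / perm_c and they can be reused. */
  static void choose_fact_option(superlu_dist_options_t *options,
                                 int first_call, int have_perms)
  {
    if (first_call)      options->Fact = DOFACT;                  /* full symbolic + numeric */
    else if (have_perms) options->Fact = SamePattern_SameRowPerm; /* reuse Pr, Pc, etree */
    else                 options->Fact = SamePattern;             /* reuse only the column perm */
  }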
On 10/27/2016 04:51 PM, Hong wrote:
Sherry,
Thanks for detailed explanation.
We use options.Fact = DOFACT as the default for the first factorization.
When the user reuses the matrix factor, we must provide a default,
either 'options.Fact = SamePattern' or 'SamePattern_SameRowPerm'.
We previously
Sherry,
Thanks for detailed explanation.
We use options.Fact = DOFACT as the default for the first factorization. When
the user reuses the matrix factor, we must provide a default,
either 'options.Fact = SamePattern' or 'SamePattern_SameRowPerm'.
We previously set 'SamePattern_SameRowPerm'. After a user
Some graph preprocessing steps can be skipped ONLY IF a previous
factorization was done, and the information can be reused (AS INPUT) in the
new factorization.
In general, the driver routine pdgssvx() (in SRC/pdgssvx.c) performs the LU
factorization of the following (preprocessed) matrix:
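As far as I recall from the comments at the top of SRC/pdgssvx.c, the factored
system is

  Pc * Pr * diag(R) * A * diag(C) * Pc^T = L * U

where Pr is the row permutation (e.g. chosen for a large diagonal), Pc is the
column permutation chosen for sparsity, and diag(R), diag(C) are the row and
column equilibration scalings.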
Sherry,
We set '-mat_superlu_dist_fact SamePattern' as default in
petsc/superlu_dist on 12/6/15 (see attached email below).
However, Anton must set 'SamePattern_SameRowPerm' to avoid a crash in his
code. Checking
Anton,
I guess that when you reuse a matrix and its symbolic factor with updated
numerical values, superlu_dist requires this option. I'm cc'ing Sherry to
confirm it.
I'll check the petsc/superlu_dist interface to set this flag for this case.
Hong
On Tue, Oct 25, 2016 at 8:20 AM, Anton Popov
Hong,
All the problems are gone and I get valgrind-clean output if I specify this:
-mat_superlu_dist_fact SamePattern_SameRowPerm
What does SamePattern_SameRowPerm actually mean?
Row permutations are for large diagonal, column permutations are for
sparsity, right?
Will it skip subsequent matrix
On 10/25/2016 01:58 PM, Anton Popov wrote:
On 10/24/2016 10:32 PM, Barry Smith wrote:
Valgrind doesn't report any problems?
Valgrind hangs and never returns (waited hours for a 5 sec run) after
entering factorization for the second time.
Before it happens it prints this (attached)
On 10/24/2016 10:32 PM, Barry Smith wrote:
Valgrind doesn't report any problems?
Valgrind hangs and never returns (waited hours for a 5 sec run) after
entering factorization for the second time.
On Oct 24, 2016, at 12:09 PM, Anton Popov wrote:
On 10/24/2016
Valgrind doesn't report any problems?
> On Oct 24, 2016, at 12:09 PM, Anton Popov wrote:
>
>
>
> On 10/24/2016 05:47 PM, Hong wrote:
>> Barry,
>> Your change indeed fixed the error in his test code.
>> As Satish tested, on your branch, ex16 runs smoothly.
>>
>> I do
Anton:
>
> If replacing superlu_dist with mumps, does your code work?
>
> yes
>
You may use mumps in your code, or test different options for superlu_dist:
-mat_superlu_dist_equil: Equilibrate matrix (None)
-mat_superlu_dist_rowperm: Row permutation (choose one of) LargeDiag NATURAL (None)
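For example, to try a different row permutation (the executable and matrix
file names here are placeholders):

  mpiexec -n 2 ./ex16 -f matrix.bin -pc_type lu \
      -pc_factor_mat_solver_package superlu_dist \
      -mat_superlu_dist_rowperm NATURAL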
On 10/24/16 8:21 PM, Hong wrote:
Anton :
If replacing superlu_dist with mumps, does your code work?
yes
Hong
On 10/24/2016 05:47 PM, Hong wrote:
Barry,
Your change indeed fixed the error in his test code.
As Satish tested, on your branch, ex16 runs smoothly.
I do not
Anton :
If replacing superlu_dist with mumps, does your code work?
Hong
>
> On 10/24/2016 05:47 PM, Hong wrote:
>
> Barry,
> Your change indeed fixed the error in his test code.
> As Satish tested, on your branch, ex16 runs smoothly.
>
> I do not understand why on the maint or master branch, ex16
On 10/24/2016 05:47 PM, Hong wrote:
Barry,
Your change indeed fixed the error in his test code.
As Satish tested, on your branch, ex16 runs smoothly.
I do not understand why on the maint or master branch, ex16 crashes inside
superlu_dist, but not with mumps.
I also confirm that ex16 runs
Barry,
Your change indeed fixed the error in his test code.
As Satish tested, on your branch, ex16 runs smoothly.
I do not understand why on the maint or master branch, ex16 crashes inside
superlu_dist, but not with mumps.
Hong
On Mon, Oct 24, 2016 at 9:34 AM, Satish Balay
On Mon, 24 Oct 2016, Barry Smith wrote:
>
> > [Or perhaps Hong is using a different test code and is observing bugs
> > with superlu_dist interface..]
>
>She states that her test does a NEW MatCreate() for each matrix load (I
> cut and pasted it in the email I just sent). The bug I fixed
> On Oct 24, 2016, at 9:24 AM, Kong, Fande wrote:
>
>
>
> On Mon, Oct 24, 2016 at 8:07 AM, Kong, Fande wrote:
>
>
> On Sun, Oct 23, 2016 at 3:56 PM, Barry Smith wrote:
>
>Thanks Satish,
>
> I have fixed this in
The error occurs when PCSetUp_LU() is called with SAME_NONZERO_PATTERN.
>> I'll further look at it later.
>>
>> Hong
>>
>> From: Zhang, Hong
>> Sent: Friday, October 21, 2016 8:18 PM
>> To: Barry Smith; petsc-users
>> Subject: RE: [petsc-users] SuperLU_dist issue in 3.7.4
> The error occurs when PCSetUp_LU() is called with SAME_NONZERO_PATTERN.
> I'll further look at it later.
>
> Hong
>
> From: Zhang, Hong
> Sent: Friday, October 21, 2016 8:18 PM
> To: Barry Smith; petsc-users
> Subject: RE: [petsc-users] SuperLU_dist issue in 3.7.4
On Mon, Oct 24, 2016 at 8:07 AM, Kong, Fande wrote:
>
>
> On Sun, Oct 23, 2016 at 3:56 PM, Barry Smith wrote:
>
>>
>>Thanks Satish,
>>
>> I have fixed this in barry/fix-matmpixxxsetpreallocation-reentrant
>> (in next for testing)
>>
>>
The error occurs when PCSetUp_LU() is called with SAME_NONZERO_PATTERN.
I'll further look at it later.
Hong
From: Zhang, Hong
Sent: Friday, October 21, 2016 8:18 PM
To: Barry Smith; petsc-users
Subject: RE: [petsc-users] SuperLU_dist issue in 3.7.4
I am investigating it.
On Sun, Oct 23, 2016 at 3:56 PM, Barry Smith wrote:
>
>Thanks Satish,
>
> I have fixed this in barry/fix-matmpixxxsetpreallocation-reentrant
> (in next for testing)
>
> Fande,
>
> This will also make MatMPIAIJSetPreallocation() work properly with
>
Since the provided test code doesn't crash [and is valgrind clean] -
with this fix - I'm not sure what bug Hong is chasing..
Satish
On Mon, 24 Oct 2016, Barry Smith wrote:
>
> Anton,
>
>Sorry for any confusion. This doesn't resolve the SuperLU_DIST issue which
> I think Hong is working
Anton,
Sorry for any confusion. This doesn't resolve the SuperLU_DIST issue which I
think Hong is working on, this only resolves multiple loads of matrices into
the same Mat.
Barry
> On Oct 24, 2016, at 5:07 AM, Anton Popov wrote:
>
> Thank you Barry, Satish,
Thank you Barry, Satish, Fande!
Is there a chance to get this fix into the maintenance release 3.7.5
together with the latest SuperLU_DIST? Or is the next release a more
realistic option?
Anton
On 10/24/2016 01:58 AM, Satish Balay wrote:
The original test code from Anton also works [i.e. is
The original test code from Anton also works [i.e. is valgrind clean] with this
change..
Satish
On Sun, 23 Oct 2016, Barry Smith wrote:
>
>Thanks Satish,
>
> I have fixed this in barry/fix-matmpixxxsetpreallocation-reentrant (in
> next for testing)
>
> Fande,
>
> This
Thanks Satish,
I have fixed this in barry/fix-matmpixxxsetpreallocation-reentrant (in
next for testing)
Fande,
This will also make MatMPIAIJSetPreallocation() work properly with
multiple calls (you will not need a MatReset()).
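A minimal sketch of the usage this enables (the sizes and nonzero counts below
are made up for illustration, and error checking is omitted):

  #include <petscmat.h>
  int main(int argc, char **argv)
  {
    Mat A;
    PetscInitialize(&argc, &argv, NULL, NULL);
    MatCreate(PETSC_COMM_WORLD, &A);
    MatSetSizes(A, PETSC_DECIDE, PETSC_DECIDE, 100, 100);
    MatSetType(A, MATMPIAIJ);
    MatMPIAIJSetPreallocation(A, 5, NULL, 2, NULL);  /* first assembly cycle */
    /* ... MatSetValues(), MatAssemblyBegin/End(), use A ... */
    MatMPIAIJSetPreallocation(A, 7, NULL, 3, NULL);  /* second cycle: no MatReset() needed */
    /* ... fill and assemble again, reuse A ... */
    MatDestroy(&A);
    PetscFinalize();
    return 0;
  }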
Barry
> On Oct 21, 2016, at 6:48 PM,
Sent: Friday, October 21, 2016 8:18 PM
To: Barry Smith; petsc-users
Subject: RE: [petsc-users] SuperLU_dist issue in 3.7.4
I am investigating it. The file has two matrices. The code takes the following
steps:
PCCreate(PETSC_COMM_WORLD, &pc);
MatCreate(PETSC_COMM_WORLD, &A);
MatLoad(A, fd);
PCSetOperators(pc, A, A);
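Filled out as a minimal, self-contained sketch (the LU/superlu_dist selection,
the PCSetUp() calls, the option reading, and the cleanup are my additions based
on the discussion; the actual test may differ in details, and error checking is
omitted):

  #include <petscpc.h>
  int main(int argc, char **argv)
  {
    Mat         A;
    PC          pc;
    PetscViewer fd;
    char        file[PETSC_MAX_PATH_LEN];

    PetscInitialize(&argc, &argv, NULL, NULL);
    PetscOptionsGetString(NULL, NULL, "-f", file, sizeof(file), NULL);
    PetscViewerBinaryOpen(PETSC_COMM_WORLD, file, FILE_MODE_READ, &fd);

    PCCreate(PETSC_COMM_WORLD, &pc);
    MatCreate(PETSC_COMM_WORLD, &A);

    MatLoad(A, fd);                /* first matrix in the file */
    PCSetOperators(pc, A, A);
    PCSetType(pc, PCLU);
    PCFactorSetMatSolverPackage(pc, MATSOLVERSUPERLU_DIST);
    PCSetUp(pc);                   /* first factorization */

    MatLoad(A, fd);                /* second matrix, loaded into the same Mat */
    PCSetOperators(pc, A, A);
    PCSetUp(pc);                   /* second factorization - where the crash shows up */

    PCDestroy(&pc);
    MatDestroy(&A);
    PetscViewerDestroy(&fd);
    PetscFinalize();
    return 0;
  }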
The crash happens with np=2 and superlu_dist, not with mumps/superlu or with
superlu_dist at np=1.
Hong
From: Barry Smith [bsm...@mcs.anl.gov]
Sent: Friday, October 21, 2016 5:59 PM
To: petsc-users
Cc: Zhang, Hong
Subject: Re: [petsc-users] SuperLU_dist issue in 3.7.4
> On Oct 21, 2
On Fri, 21 Oct 2016, Barry Smith wrote:
>
> valgrind first
balay@asterix /home/balay/download-pine/x/superlu_dist_test
$ mpiexec -n 2 $VG ./ex16 -f ~/datafiles/matrices/small
First MatLoad!
Mat Object: 2 MPI processes
type: mpiaij
row 0: (0, 4.) (1, -1.) (6, -1.)
row 1: (0, -1.) (1,
valgrind first
> On Oct 21, 2016, at 6:33 PM, Satish Balay wrote:
>
> On Fri, 21 Oct 2016, Barry Smith wrote:
>
>>
>>> On Oct 21, 2016, at 5:16 PM, Satish Balay wrote:
>>>
>>> The issue with this test code is - using MatLoad() twice [with the
>>>
On Fri, 21 Oct 2016, Barry Smith wrote:
>
> > On Oct 21, 2016, at 5:16 PM, Satish Balay wrote:
> >
> > The issue with this test code is - using MatLoad() twice [with the
> > same object - without destroying it]. Not sure if that's supposed to
> > work..
>
>If the file
> On Oct 21, 2016, at 5:16 PM, Satish Balay wrote:
>
> The issue with this test code is - using MatLoad() twice [with the
> same object - without destroying it]. Not sure if that's supposed to
> work..
If the file has two matrices in it, then yes, a second call to MatLoad() with the same Mat should load the second matrix.
The issue with this test code is - using MatLoad() twice [with the
same object - without destroying it]. Not sure if that's supposed to
work..
Satish
On Fri, 21 Oct 2016, Hong wrote:
> I can reproduce the error on a linux machine with petsc-maint. It crashes
> at 2nd solve, on both processors:
I can reproduce the error on a linux machine with petsc-maint. It crashes
at 2nd solve, on both processors:
Program received signal SIGSEGV, Segmentation fault.
0x7f051dc835bd in pdgsequ (A=0x1563910, r=0x176dfe0, c=0x178f7f0,
rowcnd=0x7fffcb8dab30, colcnd=0x7fffcb8dab38,
Thank you, Sherry, for your efforts,
but before I can set up an example that reproduces the problem, I have to
ask a PETSc-related question.
When I dump a matrix via MatView and read it back with MatLoad, it ignores its
original partitioning. Say originally I have 100 and 110 equations on two
processors; after MatLoad I get a different distribution.
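One way to keep the original row distribution, as far as I know, is to set the
local sizes on the Mat before calling MatLoad(); a sketch for a square system,
assuming 'fd' is the already-opened binary viewer (error checking omitted):

  Mat         A;
  PetscMPIInt rank;
  PetscInt    mloc;

  MPI_Comm_rank(PETSC_COMM_WORLD, &rank);
  mloc = (rank == 0) ? 100 : 110;              /* the original partitioning */
  MatCreate(PETSC_COMM_WORLD, &A);
  MatSetSizes(A, mloc, mloc, PETSC_DETERMINE, PETSC_DETERMINE);
  MatSetType(A, MATMPIAIJ);
  MatLoad(A, fd);                              /* honors the preset local sizes */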
I looked at each item valgrind complained about in your email dated Oct. 11.
Those reports are really superficial; I don't see anything wrong at the lines
singled out (mostly uninitialized-variable warnings). I did a few tests with
the latest version on GitHub; all went fine.
Perhaps you can print your
On 10/11/16 7:19 PM, Satish Balay wrote:
This log looks truncated. Are there any valgrind messages before this?
[like from your application code - or from MPI]
Yes it is indeed truncated. I only included relevant messages.
Perhaps you can send the complete log - with:
valgrind -q
On Tue, 11 Oct 2016, Anton wrote:
>
>
> On 10/11/16 7:44 PM, Barry Smith wrote:
> > You can run your code with -ksp_view_mat binary -ksp_view_rhs binary;
> > this will cause it to save the matrices and right-hand sides of the
> > linear systems in a file called binaryoutput, then
On 10/11/16 7:44 PM, Barry Smith wrote:
You can run your code with -ksp_view_mat binary -ksp_view_rhs binary; this
will cause it to save the matrices and right-hand sides of the linear systems
in a file called binaryoutput. Then email the file to petsc-ma...@mcs.anl.gov
(don't worry this
You can run your code with -ksp_view_mat binary -ksp_view_rhs binary; this
will cause it to save the matrices and right-hand sides of the linear systems
in a file called binaryoutput. Then email the file to petsc-ma...@mcs.anl.gov
(don't worry, this email address accepts large attachments).
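For example (the executable name is a placeholder):

  mpiexec -n 2 ./your_code -ksp_view_mat binary -ksp_view_rhs binary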
This log looks truncated. Are there any valgrind messages before this?
[like from your application code - or from MPI]
Perhaps you can send the complete log - with:
valgrind -q --tool=memcheck --leak-check=yes --num-callers=20
--track-origins=yes
[and if there were more valgrind messages from
Valgrind immediately detects interesting stuff:
==25673== Use of uninitialised value of size 8
==25673==    at 0x178272C: static_schedule (static_schedule.c:960)
==25674== Use of uninitialised value of size 8
==25674==    at 0x178272C: static_schedule (static_schedule.c:960)
==25674==    by
On 10/10/2016 07:11 PM, Satish Balay wrote:
Thats from petsc-3.5
Anton - please post the stack trace you get with
--download-superlu_dist-commit=origin/maint
I guess this is it:
[0]PETSC ERROR: [0] SuperLU_DIST:pdgssvx line 421
Thats from petsc-3.5
Anton - please post the stack trace you get with
--download-superlu_dist-commit=origin/maint
Satish
On Mon, 10 Oct 2016, Xiaoye S. Li wrote:
> Which version of superlu_dist does this capture? I looked at the original
> error log, it pointed to pdgssvx: line 161. But
Which version of superlu_dist does this capture? I looked at the original
error log; it pointed to pdgssvx: line 161. But that line is in a comment
block, not in the program.
Sherry
On Mon, Oct 10, 2016 at 7:27 AM, Anton Popov wrote:
>
>
> On 10/07/2016 05:23 PM, Satish
On 10/07/2016 05:23 PM, Satish Balay wrote:
On Fri, 7 Oct 2016, Kong, Fande wrote:
On Fri, Oct 7, 2016 at 9:04 AM, Satish Balay wrote:
On Fri, 7 Oct 2016, Anton Popov wrote:
Hi guys,
are there any news about fixing buggy behavior of SuperLU_DIST, exactly what is
Fande,
If you can reproduce the problem with PETSc 3.7.4 please send us sample
code that produces it so we can work with Sherry to get it fixed ASAP.
Barry
> On Oct 7, 2016, at 10:23 AM, Satish Balay wrote:
>
> On Fri, 7 Oct 2016, Kong, Fande wrote:
>
>> On
On Fri, 7 Oct 2016, Kong, Fande wrote:
> On Fri, Oct 7, 2016 at 9:04 AM, Satish Balay wrote:
>
> > On Fri, 7 Oct 2016, Anton Popov wrote:
> >
> > > Hi guys,
> > >
> > > are there any news about fixing buggy behavior of SuperLU_DIST, exactly
> > what
> > > is described here:
>
On Fri, Oct 7, 2016 at 10:16 AM, Kong, Fande wrote:
> On Fri, Oct 7, 2016 at 9:04 AM, Satish Balay wrote:
>
>> On Fri, 7 Oct 2016, Anton Popov wrote:
>>
>> > Hi guys,
>> >
>> > are there any news about fixing buggy behavior of SuperLU_DIST, exactly
>> what
On Fri, Oct 7, 2016 at 9:04 AM, Satish Balay wrote:
> On Fri, 7 Oct 2016, Anton Popov wrote:
>
> > Hi guys,
> >
> > are there any news about fixing buggy behavior of SuperLU_DIST, exactly
> what
> > is described here:
> >
> >
On Fri, 7 Oct 2016, Anton Popov wrote:
> Hi guys,
>
> are there any news about fixing buggy behavior of SuperLU_DIST, exactly what
> is described here:
>
> http://lists.mcs.anl.gov/pipermail/petsc-users/2015-August/026802.html ?
>
> I'm using 3.7.4 and still get SEGV in pdgssvx routine.
Hi guys,
is there any news about fixing the buggy behavior of SuperLU_DIST, exactly
what is described here:
http://lists.mcs.anl.gov/pipermail/petsc-users/2015-August/026802.html ?
I'm using 3.7.4 and still get a SEGV in the pdgssvx routine. Everything works
fine with 3.5.4.
Do I still have to