Re: [petsc-users] SuperLU_dist issue in 3.7.4

2016-11-26 Thread Anton Popov
Hong, I checked out & compiled your new branch: hzhang/fix-superlu_dist-reuse-factornumeric. Unfortunately it did not solve the problem. Sorry. On 11/21/2016 04:43 AM, Hong wrote: Anton, I pushed a fix https://bitbucket.org/petsc/petsc/commits/28865de08051eb99557d70672c208e14da23c8b1 in

Re: [petsc-users] SuperLU_dist issue in 3.7.4

2016-11-21 Thread Anton Popov
Thanks, Hong. I will try as soon as possible and let you know. Anton On 11/21/2016 04:43 AM, Hong wrote: Anton, I pushed a fix https://bitbucket.org/petsc/petsc/commits/28865de08051eb99557d70672c208e14da23c8b1 in branch hzhang/fix-superlu_dist-reuse-factornumeric. Can you give it a try to see

Re: [petsc-users] SuperLU_dist issue in 3.7.4

2016-11-20 Thread Hong
Anton, I pushed a fix https://bitbucket.org/petsc/petsc/commits/28865de08051eb99557d70672c208e14da23c8b1 in branch hzhang/fix-superlu_dist-reuse-factornumeric. Can you give it a try to see if it works? I do not have an example which produces your problem. In your email, you asked "Setting

Re: [petsc-users] SuperLU_dist issue in 3.7.4

2016-11-07 Thread Hong
Anton: I am planning to work on this as soon as I get time. I assume that your code is working with the option '-mat_superlu_dist_fact SamePattern_SameRowPerm'. If not, let me know. What I'm planning to do is to detect the existence of Pc and Pr in the petsc interface, then set the reuse option, so users

Re: [petsc-users] SuperLU_dist issue in 3.7.4

2016-11-07 Thread Anton Popov
On 10/27/2016 04:51 PM, Hong wrote: Sherry, Thanks for detailed explanation. We use options.Fact = DOFACT as default for the first factorization. When user reuses matrix factor, then we must provide a default, either 'options.Fact = SamePattern' or 'SamePattern_SameRowPerm'. We previously

Re: [petsc-users] SuperLU_dist issue in 3.7.4

2016-10-27 Thread Hong
Sherry, Thanks for detailed explanation. We use options.Fact = DOFACT as default for the first factorization. When user reuses matrix factor, then we must provide a default, either 'options.Fact = SamePattern' or 'SamePattern_SameRowPerm'. We previously set 'SamePattern_SameRowPerm'. After a user
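The default described in this message can be sketched in a few lines of C. This is only an illustration of the decision, not the PETSc interface code: the enum below is a local stand-in that mirrors the names of the SuperLU_DIST `Fact` values, and `choose_fact` and its parameters are hypothetical names introduced here.

```c
#include <assert.h>

/* Local stand-in mirroring the names of the SuperLU_DIST Fact values.
   This sketch does not include or link against SuperLU_DIST. */
typedef enum { DOFACT, SamePattern, SamePattern_SameRowPerm } fact_mode;

/* Hypothetical helper: pick the Fact value for the next numeric factorization.
   have_prev_factor: nonzero once a factorization of a matrix with the same
                     sparsity pattern has already been completed.
   reuse_default:    the mode to fall back on when the factor is reused;
                     must be SamePattern or SamePattern_SameRowPerm. */
fact_mode choose_fact(int have_prev_factor, fact_mode reuse_default)
{
    assert(reuse_default == SamePattern ||
           reuse_default == SamePattern_SameRowPerm);

    if (!have_prev_factor) {
        return DOFACT;        /* first factorization: do everything from scratch */
    }
    return reuse_default;     /* later factorizations: reuse what the mode allows */
}
```

With this sketch, the first call for a matrix always yields `DOFACT`, and the thread's question reduces to which value is passed as `reuse_default`.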

Re: [petsc-users] SuperLU_dist issue in 3.7.4

2016-10-26 Thread Xiaoye S. Li
Some graph preprocessing steps can be skipped ONLY IF a previous factorization was done, and the information can be reused (AS INPUT) to the new factorization. In general, the driver routine SRC/pdgssvx.c() performs the LU factorization of the following (preprocessed) matrix:
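Which inputs each mode takes over from a previous factorization can be written down as a small lookup. The sketch below is an assumption based on a reading of the SuperLU_DIST driver documentation, not a quote from this message; the enum is a local stand-in for the library's `Fact` values, and `reuse_set` and `reused_inputs` are hypothetical names.

```c
#include <stdbool.h>

/* Local stand-in mirroring the names of the SuperLU_DIST Fact values. */
typedef enum { DOFACT, SamePattern, SamePattern_SameRowPerm } fact_mode;

/* Hypothetical record of what is taken AS INPUT from the previous factorization. */
typedef struct {
    bool col_perm;   /* column permutation (sparsity ordering) */
    bool row_perm;   /* row permutation (large diagonal) */
    bool scaling;    /* row/column scaling factors from equilibration */
    bool lu_struct;  /* L and U data structures from the symbolic step */
} reuse_set;

/* Assumed mapping from mode to reused inputs; see the hedge in the lead-in. */
reuse_set reused_inputs(fact_mode f)
{
    reuse_set r = { false, false, false, false };

    switch (f) {
    case DOFACT:
        /* nothing is reused; every preprocessing step is redone */
        break;
    case SamePattern:
        /* only the column permutation is carried over */
        r.col_perm = true;
        break;
    case SamePattern_SameRowPerm:
        /* permutations, scaling and the L/U structures are all carried over */
        r.col_perm = true;
        r.row_perm = true;
        r.scaling = true;
        r.lu_struct = true;
        break;
    }
    return r;
}
```

Under this reading, `SamePattern` still recomputes the row permutation for the new numerical values, while `SamePattern_SameRowPerm` skips that step as well.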

Re: [petsc-users] SuperLU_dist issue in 3.7.4

2016-10-25 Thread Hong
Sherry, We set '-mat_superlu_dist_fact SamePattern' as default in petsc/superlu_dist on 12/6/15 (see attached email below). However, Anton must set 'SamePattern_SameRowPerm' to avoid crash in his code. Checking

Re: [petsc-users] SuperLU_dist issue in 3.7.4

2016-10-25 Thread Hong
Anton, I guess, when you reuse matrix and its symbolic factor with updated numerical values, superlu_dist requires this option. I'm cc'ing Sherry to confirm it. I'll check petsc/superlu-dist interface to set this flag for this case. Hong On Tue, Oct 25, 2016 at 8:20 AM, Anton Popov

Re: [petsc-users] SuperLU_dist issue in 3.7.4

2016-10-25 Thread Anton Popov
Hong, all the problems are gone and the valgrind output is clean if I specify this: -mat_superlu_dist_fact SamePattern_SameRowPerm What does SamePattern_SameRowPerm actually mean? Row permutations are for large diagonal, column permutations are for sparsity, right? Will it skip subsequent matrix

Re: [petsc-users] SuperLU_dist issue in 3.7.4 failure of repeated calls to MatLoad() or MatMPIAIJSetPreallocation() with the same matrix

2016-10-25 Thread Anton Popov
On 10/25/2016 01:58 PM, Anton Popov wrote: On 10/24/2016 10:32 PM, Barry Smith wrote: Valgrind doesn't report any problems? Valgrind hangs and never returns (waited hours for a 5 sec run) after entering factorization for the second time. Before it happens it prints this (attached)

Re: [petsc-users] SuperLU_dist issue in 3.7.4 failure of repeated calls to MatLoad() or MatMPIAIJSetPreallocation() with the same matrix

2016-10-25 Thread Anton Popov
On 10/24/2016 10:32 PM, Barry Smith wrote: Valgrind doesn't report any problems? Valgrind hangs and never returns (waited hours for a 5 sec run) after entering factorization for the second time. On Oct 24, 2016, at 12:09 PM, Anton Popov wrote: On 10/24/2016

Re: [petsc-users] SuperLU_dist issue in 3.7.4 failure of repeated calls to MatLoad() or MatMPIAIJSetPreallocation() with the same matrix

2016-10-24 Thread Barry Smith
Valgrind doesn't report any problems? > On Oct 24, 2016, at 12:09 PM, Anton Popov wrote: > > > > On 10/24/2016 05:47 PM, Hong wrote: >> Barry, >> Your change indeed fixed the error of his testing code. >> As Satish tested, on your branch, ex16 runs smooth. >> >> I do

Re: [petsc-users] SuperLU_dist issue in 3.7.4 failure of repeated calls to MatLoad() or MatMPIAIJSetPreallocation() with the same matrix

2016-10-24 Thread Hong
Anton: > > If replacing superlu_dist with mumps, does your code work? > > yes > You may use mumps in your code, or test different options for superlu_dist:
  -mat_superlu_dist_equil: Equilibrate matrix (None)
  -mat_superlu_dist_rowperm Row permutation (choose one of) LargeDiag NATURAL (None)

Re: [petsc-users] SuperLU_dist issue in 3.7.4 failure of repeated calls to MatLoad() or MatMPIAIJSetPreallocation() with the same matrix

2016-10-24 Thread Anton
On 10/24/16 8:21 PM, Hong wrote: Anton : If replacing superlu_dist with mumps, does your code work? yes Hong On 10/24/2016 05:47 PM, Hong wrote: Barry, Your change indeed fixed the error of his testing code. As Satish tested, on your branch, ex16 runs smooth. I do not

Re: [petsc-users] SuperLU_dist issue in 3.7.4 failure of repeated calls to MatLoad() or MatMPIAIJSetPreallocation() with the same matrix

2016-10-24 Thread Hong
Anton : If replacing superlu_dist with mumps, does your code work? Hong > > On 10/24/2016 05:47 PM, Hong wrote: > > Barry, > Your change indeed fixed the error of his testing code. > As Satish tested, on your branch, ex16 runs smooth. > > I do not understand why on maint or master branch, ex16

Re: [petsc-users] SuperLU_dist issue in 3.7.4 failure of repeated calls to MatLoad() or MatMPIAIJSetPreallocation() with the same matrix

2016-10-24 Thread Anton Popov
On 10/24/2016 05:47 PM, Hong wrote: Barry, Your change indeed fixed the error of his testing code. As Satish tested, on your branch, ex16 runs smooth. I do not understand why on maint or master branch, ex16 crashes inside superlu_dist, but not with mumps. I also confirm that ex16 runs

Re: [petsc-users] SuperLU_dist issue in 3.7.4 failure of repeated calls to MatLoad() or MatMPIAIJSetPreallocation() with the same matrix

2016-10-24 Thread Hong
Barry, Your change indeed fixed the error of his testing code. As Satish tested, on your branch, ex16 runs smooth. I do not understand why on maint or master branch, ex16 crashes inside superlu_dist, but not with mumps. Hong On Mon, Oct 24, 2016 at 9:34 AM, Satish Balay

Re: [petsc-users] SuperLU_dist issue in 3.7.4 failure of repeated calls to MatLoad() or MatMPIAIJSetPreallocation() with the same matrix

2016-10-24 Thread Satish Balay
On Mon, 24 Oct 2016, Barry Smith wrote: > > > [Or perhaps Hong is using a different test code and is observing bugs > > with superlu_dist interface..] > >She states that her test does a NEW MatCreate() for each matrix load (I > cut and pasted it in the email I just sent). The bug I fixed

Re: [petsc-users] SuperLU_dist issue in 3.7.4 failure of repeated calls to MatLoad() or MatMPIAIJSetPreallocation() with the same matrix

2016-10-24 Thread Barry Smith
> On Oct 24, 2016, at 9:24 AM, Kong, Fande wrote: > > > > On Mon, Oct 24, 2016 at 8:07 AM, Kong, Fande wrote: > > > On Sun, Oct 23, 2016 at 3:56 PM, Barry Smith wrote: > >Thanks Satish, > > I have fixed this in

Re: [petsc-users] SuperLU_dist issue in 3.7.4 failure of repeated calls to MatLoad() or MatMPIAIJSetPreallocation() with the same matrix

2016-10-24 Thread Barry Smith
occurs when PCSetUp_LU() is called with SAME_NONZERO_PATTERN. >> I'll further look at it later. >> >> Hong >> >> From: Zhang, Hong >> Sent: Friday, October 21, 2016 8:18 PM >> To: Barry Smith; petsc-users >>

Re: [petsc-users] SuperLU_dist issue in 3.7.4 failure of repeated calls to MatLoad() or MatMPIAIJSetPreallocation() with the same matrix

2016-10-24 Thread Satish Balay
> The error occurs when PCSetUp_LU() is called with SAME_NONZERO_PATTERN. > I'll further look at it later. > > Hong > > From: Zhang, Hong > Sent: Friday, October 21, 2016 8:18 PM > To: Barry Smith; petsc-users > Subject: RE: [petsc-users

Re: [petsc-users] SuperLU_dist issue in 3.7.4 failure of repeated calls to MatLoad() or MatMPIAIJSetPreallocation() with the same matrix

2016-10-24 Thread Kong, Fande
On Mon, Oct 24, 2016 at 8:07 AM, Kong, Fande wrote: > > > On Sun, Oct 23, 2016 at 3:56 PM, Barry Smith wrote: > >> >>Thanks Satish, >> >> I have fixed this in barry/fix-matmpixxxsetpreallocation-reentrant >> (in next for testing) >> >>

Re: [petsc-users] SuperLU_dist issue in 3.7.4 failure of repeated calls to MatLoad() or MatMPIAIJSetPreallocation() with the same matrix

2016-10-24 Thread Barry Smith
The error occurs when PCSetUp_LU() is called with SAME_NONZERO_PATTERN. I'll further look at it later. Hong From: Zhang, Hong Sent: Friday, October 21, 2016 8:18 PM To: Barry Smith; petsc-users Subject: RE: [petsc-users] SuperLU_dist issue in 3.7.4 I am inv

Re: [petsc-users] SuperLU_dist issue in 3.7.4 failure of repeated calls to MatLoad() or MatMPIAIJSetPreallocation() with the same matrix

2016-10-24 Thread Kong, Fande
On Sun, Oct 23, 2016 at 3:56 PM, Barry Smith wrote: > >Thanks Satish, > > I have fixed this in barry/fix-matmpixxxsetpreallocation-reentrant > (in next for testing) > > Fande, > > This will also make MatMPIAIJSetPreallocation() work properly with >

Re: [petsc-users] SuperLU_dist issue in 3.7.4 failure of repeated calls to MatLoad() or MatMPIAIJSetPreallocation() with the same matrix

2016-10-24 Thread Satish Balay
Since the provided test code doesn't crash [and is valgrind clean] - with this fix - I'm not sure what bug Hong is chasing.. Satish On Mon, 24 Oct 2016, Barry Smith wrote: > > Anton, > >Sorry for any confusion. This doesn't resolve the SuperLU_DIST issue which > I think Hong is working

Re: [petsc-users] SuperLU_dist issue in 3.7.4 failure of repeated calls to MatLoad() or MatMPIAIJSetPreallocation() with the same matrix

2016-10-24 Thread Barry Smith
Anton, Sorry for any confusion. This doesn't resolve the SuperLU_DIST issue which I think Hong is working on, this only resolves multiple loads of matrices into the same Mat. Barry > On Oct 24, 2016, at 5:07 AM, Anton Popov wrote: > > Thank you Barry, Satish,

Re: [petsc-users] SuperLU_dist issue in 3.7.4 failure of repeated calls to MatLoad() or MatMPIAIJSetPreallocation() with the same matrix

2016-10-24 Thread Anton Popov
Thank you Barry, Satish, Fande! Is there a chance to get this fix in the maintenance release 3.7.5 together with the latest SuperLU_DIST? Or is the next release a more realistic option? Anton On 10/24/2016 01:58 AM, Satish Balay wrote: The original testcode from Anton also works [i.e. is

Re: [petsc-users] SuperLU_dist issue in 3.7.4 failure of repeated calls to MatLoad() or MatMPIAIJSetPreallocation() with the same matrix

2016-10-23 Thread Satish Balay
The original testcode from Anton also works [i.e. is valgrind clean] with this change.. Satish On Sun, 23 Oct 2016, Barry Smith wrote: > >Thanks Satish, > > I have fixed this in barry/fix-matmpixxxsetpreallocation-reentrant (in > next for testing) > > Fande, > > This

Re: [petsc-users] SuperLU_dist issue in 3.7.4 failure of repeated calls to MatLoad() or MatMPIAIJSetPreallocation() with the same matrix

2016-10-23 Thread Barry Smith
Thanks Satish, I have fixed this in barry/fix-matmpixxxsetpreallocation-reentrant (in next for testing) Fande, This will also make MatMPIAIJSetPreallocation() work properly with multiple calls (you will not need a MatReset()). Barry > On Oct 21, 2016, at 6:48 PM,

Re: [petsc-users] SuperLU_dist issue in 3.7.4

2016-10-21 Thread Zhang, Hong
Sent: Friday, October 21, 2016 8:18 PM To: Barry Smith; petsc-users Subject: RE: [petsc-users] SuperLU_dist issue in 3.7.4 I am investigating it. The file has two matrices. The code takes the following steps:
  PCCreate(PETSC_COMM_WORLD, );
  MatCreate(PETSC_COMM_WORLD,);
  MatLoad(A,fd);
  PCSetOperators

Re: [petsc-users] SuperLU_dist issue in 3.7.4

2016-10-21 Thread Zhang, Hong
with np=2, superlu_dist, not with mumps/superlu or superlu_dist np=1 Hong From: Barry Smith [bsm...@mcs.anl.gov] Sent: Friday, October 21, 2016 5:59 PM To: petsc-users Cc: Zhang, Hong Subject: Re: [petsc-users] SuperLU_dist issue in 3.7.4 > On Oct 21, 2

Re: [petsc-users] SuperLU_dist issue in 3.7.4

2016-10-21 Thread Satish Balay
On Fri, 21 Oct 2016, Barry Smith wrote: > > valgrind first balay@asterix /home/balay/download-pine/x/superlu_dist_test $ mpiexec -n 2 $VG ./ex16 -f ~/datafiles/matrices/small First MatLoad! Mat Object: 2 MPI processes type: mpiaij row 0: (0, 4.) (1, -1.) (6, -1.) row 1: (0, -1.) (1,

Re: [petsc-users] SuperLU_dist issue in 3.7.4

2016-10-21 Thread Barry Smith
valgrind first > On Oct 21, 2016, at 6:33 PM, Satish Balay wrote: > > On Fri, 21 Oct 2016, Barry Smith wrote: > >> >>> On Oct 21, 2016, at 5:16 PM, Satish Balay wrote: >>> >>> The issue with this test code is - using MatLoad() twice [with the >>>

Re: [petsc-users] SuperLU_dist issue in 3.7.4

2016-10-21 Thread Satish Balay
On Fri, 21 Oct 2016, Barry Smith wrote: > > > On Oct 21, 2016, at 5:16 PM, Satish Balay wrote: > > > > The issue with this test code is - using MatLoad() twice [with the > > same object - without destroying it]. Not sure if that's supposed to > > work.. > >If the file

Re: [petsc-users] SuperLU_dist issue in 3.7.4

2016-10-21 Thread Barry Smith
> On Oct 21, 2016, at 5:16 PM, Satish Balay wrote: > > The issue with this test code is - using MatLoad() twice [with the > same object - without destroying it]. Not sure if that's supposed to > work.. If the file has two matrices in it then yes a second call to MatLoad()

Re: [petsc-users] SuperLU_dist issue in 3.7.4

2016-10-21 Thread Satish Balay
The issue with this test code is - using MatLoad() twice [with the same object - without destroying it]. Not sure if that's supposed to work.. Satish On Fri, 21 Oct 2016, Hong wrote: > I can reproduce the error on a linux machine with petsc-maint. It crashes > at 2nd solve, on both processors:

Re: [petsc-users] SuperLU_dist issue in 3.7.4

2016-10-21 Thread Hong
I can reproduce the error on a linux machine with petsc-maint. It crashes at 2nd solve, on both processors: Program received signal SIGSEGV, Segmentation fault. 0x7f051dc835bd in pdgsequ (A=0x1563910, r=0x176dfe0, c=0x178f7f0, rowcnd=0x7fffcb8dab30, colcnd=0x7fffcb8dab38,

Re: [petsc-users] SuperLU_dist issue in 3.7.4

2016-10-19 Thread Anton Popov
Thank you Sherry for your efforts, but before I can set up an example that reproduces the problem, I have to ask a PETSc-related question. When I pump matrix via MatView MatLoad it ignores its original partitioning. Say originally I have 100 and 110 equations on two processors, after MatLoad I

Re: [petsc-users] SuperLU_dist issue in 3.7.4

2016-10-19 Thread Xiaoye S. Li
I looked at each valgrind-complained item in your email dated Oct. 11. Those reports are really superficial; I don't see anything wrong with those lines (mostly uninitialized variables) singled out. I did a few tests with the latest version in github, all went fine. Perhaps you can print your

Re: [petsc-users] SuperLU_dist issue in 3.7.4

2016-10-11 Thread Anton
On 10/11/16 7:19 PM, Satish Balay wrote: This log looks truncated. Are there any valgrind messages before this? [like from your application code - or from MPI] Yes it is indeed truncated. I only included relevant messages. Perhaps you can send the complete log - with: valgrind -q

Re: [petsc-users] SuperLU_dist issue in 3.7.4

2016-10-11 Thread Satish Balay
On Tue, 11 Oct 2016, Anton wrote: > > > On 10/11/16 7:44 PM, Barry Smith wrote: > > You can run your code with -ksp_view_mat binary -ksp_view_rhs binary > > this will cause it to save the matrices and right hand sides to the > > linear systems in a file called binaryoutput, then

Re: [petsc-users] SuperLU_dist issue in 3.7.4

2016-10-11 Thread Anton
On 10/11/16 7:44 PM, Barry Smith wrote: You can run your code with -ksp_view_mat binary -ksp_view_rhs binary this will cause it to save the matrices and right hand sides to the linear systems in a file called binaryoutput, then email the file to petsc-ma...@mcs.anl.gov (don't worry this

Re: [petsc-users] SuperLU_dist issue in 3.7.4

2016-10-11 Thread Barry Smith
You can run your code with -ksp_view_mat binary -ksp_view_rhs binary this will cause it to save the matrices and right hand sides to the linear systems in a file called binaryoutput, then email the file to petsc-ma...@mcs.anl.gov (don't worry this email address accepts large attachments).

Re: [petsc-users] SuperLU_dist issue in 3.7.4

2016-10-11 Thread Satish Balay
This log looks truncated. Are there any valgrind messages before this? [like from your application code - or from MPI] Perhaps you can send the complete log - with: valgrind -q --tool=memcheck --leak-check=yes --num-callers=20 --track-origins=yes [and if there were more valgrind messages from

Re: [petsc-users] SuperLU_dist issue in 3.7.4

2016-10-11 Thread Anton Popov
Valgrind immediately detects interesting stuff:
==25673== Use of uninitialised value of size 8
==25673==    at 0x178272C: static_schedule (static_schedule.c:960)
==25674== Use of uninitialised value of size 8
==25674==    at 0x178272C: static_schedule (static_schedule.c:960)
==25674==    by

Re: [petsc-users] SuperLU_dist issue in 3.7.4

2016-10-11 Thread Anton Popov
On 10/10/2016 07:11 PM, Satish Balay wrote: That's from petsc-3.5 Anton - please post the stack trace you get with --download-superlu_dist-commit=origin/maint I guess this is it: [0]PETSC ERROR: [0] SuperLU_DIST:pdgssvx line 421

Re: [petsc-users] SuperLU_dist issue in 3.7.4

2016-10-10 Thread Satish Balay
That's from petsc-3.5 Anton - please post the stack trace you get with --download-superlu_dist-commit=origin/maint Satish On Mon, 10 Oct 2016, Xiaoye S. Li wrote: > Which version of superlu_dist does this capture? I looked at the original > error log, it pointed to pdgssvx: line 161. But

Re: [petsc-users] SuperLU_dist issue in 3.7.4

2016-10-10 Thread Xiaoye S. Li
Which version of superlu_dist does this capture? I looked at the original error log, it pointed to pdgssvx: line 161. But that line is in comment block, not the program. Sherry On Mon, Oct 10, 2016 at 7:27 AM, Anton Popov wrote: > > > On 10/07/2016 05:23 PM, Satish

Re: [petsc-users] SuperLU_dist issue in 3.7.4

2016-10-10 Thread Anton Popov
On 10/07/2016 05:23 PM, Satish Balay wrote: On Fri, 7 Oct 2016, Kong, Fande wrote: On Fri, Oct 7, 2016 at 9:04 AM, Satish Balay wrote: On Fri, 7 Oct 2016, Anton Popov wrote: Hi guys, is there any news about fixing buggy behavior of SuperLU_DIST, exactly what is

Re: [petsc-users] SuperLU_dist issue in 3.7.4

2016-10-07 Thread Barry Smith
Fande, If you can reproduce the problem with PETSc 3.7.4 please send us sample code that produces it so we can work with Sherry to get it fixed ASAP. Barry > On Oct 7, 2016, at 10:23 AM, Satish Balay wrote: > > On Fri, 7 Oct 2016, Kong, Fande wrote: > >> On

Re: [petsc-users] SuperLU_dist issue in 3.7.4

2016-10-07 Thread Satish Balay
On Fri, 7 Oct 2016, Kong, Fande wrote: > On Fri, Oct 7, 2016 at 9:04 AM, Satish Balay wrote: > > > On Fri, 7 Oct 2016, Anton Popov wrote: > > > > > Hi guys, > > > > > > is there any news about fixing buggy behavior of SuperLU_DIST, exactly > > what > > > is described here: >

Re: [petsc-users] SuperLU_dist issue in 3.7.4

2016-10-07 Thread Matthew Knepley
On Fri, Oct 7, 2016 at 10:16 AM, Kong, Fande wrote: > On Fri, Oct 7, 2016 at 9:04 AM, Satish Balay wrote: > >> On Fri, 7 Oct 2016, Anton Popov wrote: >> >> > Hi guys, >> > >> > is there any news about fixing buggy behavior of SuperLU_DIST, exactly >> what

Re: [petsc-users] SuperLU_dist issue in 3.7.4

2016-10-07 Thread Kong, Fande
On Fri, Oct 7, 2016 at 9:04 AM, Satish Balay wrote: > On Fri, 7 Oct 2016, Anton Popov wrote: > > > Hi guys, > > > > is there any news about fixing buggy behavior of SuperLU_DIST, exactly > what > > is described here: > > > >

Re: [petsc-users] SuperLU_dist issue in 3.7.4

2016-10-07 Thread Satish Balay
On Fri, 7 Oct 2016, Anton Popov wrote: > Hi guys, > > is there any news about fixing buggy behavior of SuperLU_DIST, exactly what > is described here: > > http://lists.mcs.anl.gov/pipermail/petsc-users/2015-August/026802.html ? > > I'm using 3.7.4 and still get SEGV in pdgssvx routine.

[petsc-users] SuperLU_dist issue in 3.7.4

2016-10-07 Thread Anton Popov
Hi guys, is there any news about fixing buggy behavior of SuperLU_DIST, exactly what is described here: http://lists.mcs.anl.gov/pipermail/petsc-users/2015-August/026802.html ? I'm using 3.7.4 and still get SEGV in pdgssvx routine. Everything works fine with 3.5.4. Do I still have to