Hello,

I made a test to try to reproduce the error.
To do so, I modified the file $PETSC_DIR/src/dm/examples/tests/ex35.c
I attach the file in case it is needed.

The same error is reproduced with 1024 MPI ranks. I tested two problem sizes, 
(2*512+1) x (2*64+1) x (2*256+1) and (2*1024+1) x (2*128+1) x (2*512+1), and the 
error occurred in both cases; the first size is also the one I used to run 
before the OS and MPI updates.
I also ran the code with -malloc_debug and nothing more appeared.
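For reference, the 1024-rank runs are launched along these lines (the launcher and executable name below are only placeholders for what is actually used on the machine):

  mpiexec -n 1024 ./ex35_test -malloc_debug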

I also attach the configure command I used to build a debug version of PETSc.

Thank you for your time,
Sincerely,
Anthony Jourdon


________________________________
From: Zhang, Junchao <[email protected]>
Sent: Thursday, January 16, 2020 4:49 PM
To: Anthony Jourdon <[email protected]>
Cc: [email protected] <[email protected]>
Subject: Re: [petsc-users] DMDA Error

It seems the problem is triggered by DMSetUp(). You can write a small test 
creating the DMDA with the same sizes as in your code, to see if you can 
reproduce the problem. If yes, it would be much easier for us to debug it.
--Junchao Zhang


On Thu, Jan 16, 2020 at 7:38 AM Anthony Jourdon <[email protected]> wrote:

Dear PETSc developers,


I need assistance with an error.


I run a code that uses DMDA-related functions. I'm using petsc-3.8.4.


This code used to run very well on a supercomputer running SLES11.

PETSc was built using an Intel MPI 5.1.3.223 module and Intel MKL version 
2016.0.2.181.

The code ran with no problem on 1024 and more MPI ranks.


Recently, the OS of the machine was updated to RHEL7.

I rebuilt PETSc using the newly available versions of Intel MPI (2019U5) and MKL 
(2019.0.5.281); the compilers and MKL are from the same release.

Since then, I have run the exact same code on 8, 16, 24, 48, 512, and 1024 
MPI ranks.

Below 1024 MPI ranks there is no problem, but at 1024 ranks an error related to 
DMDA appeared. I quote the first lines of the error stack here; the full error 
stack is attached.


[534]PETSC ERROR: #1 PetscGatherMessageLengths() line 120 in /scratch2/dlp/appli_local/SCR/OROGEN/petsc3.8.4_MPI/petsc-3.8.4/src/sys/utils/mpimesg.c
[534]PETSC ERROR: #2 VecScatterCreate_PtoS() line 2288 in /scratch2/dlp/appli_local/SCR/OROGEN/petsc3.8.4_MPI/petsc-3.8.4/src/vec/vec/utils/vpscat.c
[534]PETSC ERROR: #3 VecScatterCreate() line 1462 in /scratch2/dlp/appli_local/SCR/OROGEN/petsc3.8.4_MPI/petsc-3.8.4/src/vec/vec/utils/vscat.c
[534]PETSC ERROR: #4 DMSetUp_DA_3D() line 1042 in /scratch2/dlp/appli_local/SCR/OROGEN/petsc3.8.4_MPI/petsc-3.8.4/src/dm/impls/da/da3.c
[534]PETSC ERROR: #5 DMSetUp_DA() line 25 in /scratch2/dlp/appli_local/SCR/OROGEN/petsc3.8.4_MPI/petsc-3.8.4/src/dm/impls/da/dareg.c
[534]PETSC ERROR: #6 DMSetUp() line 720 in /scratch2/dlp/appli_local/SCR/OROGEN/petsc3.8.4_MPI/petsc-3.8.4/src/dm/interface/dm.c


Thank you for your time,

Sincerely,


Anthony Jourdon

Attachment: Configure_petsc_debug
Description: Configure_petsc_debug

static char help[] = "MatLoad test for loading matrices that are created by DMCreateMatrix() and\n\
                      stored in binary via MatView_MPI_DA. MatView_MPI_DA stores the matrix\n\
                      in natural ordering. Hence MatLoad() has to read the matrix first in\n\
                      natural ordering and then permute it back to the application ordering. This\n\
                      example is used for testing the subroutine MatLoad_MPI_DA\n\n";

#include <petscdm.h>
#include <petscdmda.h>

int main(int argc,char **argv)
{
  PetscInt       X = 1024,Y = 128,Z = 512;
  PetscErrorCode ierr;
  DM             da;
  PetscViewer    viewer;
  Mat            A;

  ierr = PetscInitialize(&argc,&argv,(char*)0,help);if (ierr) return ierr;
  ierr = PetscViewerBinaryOpen(PETSC_COMM_WORLD,"temp.dat",FILE_MODE_WRITE,&viewer);CHKERRQ(ierr);

  /* Read options */
//  ierr = PetscOptionsGetInt(NULL,NULL,"-X",&X,NULL);CHKERRQ(ierr);
//  ierr = PetscOptionsGetInt(NULL,NULL,"-Y",&Y,NULL);CHKERRQ(ierr);
//  ierr = PetscOptionsGetInt(NULL,NULL,"-Z",&Z,NULL);CHKERRQ(ierr);
//  X = 512;
//  Y = 64;
//  Z = 256;
  /* Create distributed array and get vectors */
  ierr = DMDACreate3d(PETSC_COMM_WORLD,DM_BOUNDARY_NONE,DM_BOUNDARY_NONE,DM_BOUNDARY_NONE,DMDA_STENCIL_BOX,2*X+1,2*Y+1,2*Z+1,PETSC_DECIDE,PETSC_DECIDE,PETSC_DECIDE,3,2,NULL,NULL,NULL,&da);CHKERRQ(ierr);
  ierr = DMSetFromOptions(da);CHKERRQ(ierr);
  ierr = DMSetUp(da);CHKERRQ(ierr);
  /* The matrix view/load part of ex35 is commented out below so that only DMSetUp() is exercised */
/*  ierr = DMSetMatType(da,MATMPIAIJ);CHKERRQ(ierr);
  ierr = DMCreateMatrix(da,&A);CHKERRQ(ierr);
  ierr = MatShift(A,X);CHKERRQ(ierr);
  ierr = MatView(A,viewer);CHKERRQ(ierr);
  ierr = MatDestroy(&A);CHKERRQ(ierr);
  ierr = PetscViewerDestroy(&viewer);CHKERRQ(ierr);

  ierr = PetscViewerBinaryOpen(PETSC_COMM_WORLD,"temp.dat",FILE_MODE_READ,&viewer);CHKERRQ(ierr);
  ierr = DMCreateMatrix(da,&A);CHKERRQ(ierr);
  ierr = MatLoad(A,viewer);CHKERRQ(ierr);
*/
  /* Free memory */
//  ierr = MatDestroy(&A);CHKERRQ(ierr);
  ierr = PetscViewerDestroy(&viewer);CHKERRQ(ierr);
  ierr = DMDestroy(&da);CHKERRQ(ierr);
  ierr = PetscFinalize();
  return ierr;
}

Attachment: TEST_DMDA_x512y64z256.err
Description: TEST_DMDA_x512y64z256.err

Attachment: TEST_DMDA_x1024y128z512.err
Description: TEST_DMDA_x1024y128z512.err
