Hi Everyone,

I get a strange error during a call to MatAssemblyBegin. The error message is
triggered by Intel MPI, as shown below. The error does not always occur, which
is even stranger.
[333:node1179] unexpected disconnect completion event from [163:node1254]
Assertion failed in file ../../dapl_conn_rc.c at line 1128: 0

All ranks output the same error message with their own node number. I did a bit
of research and some say that switching to MPICH2 solves this issue. Since our
group is keen on using Intel MPI, I would like to solve this issue at the root.

A few important points:

- At the moment, we are assembling the matrix with a single
  MatAssemblyBegin/End and MAT_FINAL_ASSEMBLY after doing MatSetValuesBlocked.
  Can it be due to memory overflow in the buffers?

- We are using -genv I_MPI_FABRICS shm:dapl in the submission script.

- I tried using -malloc_log and -log_summary, but the crash prevents the log
  output from being written.

Has anyone already faced this issue?
Any suggestion is welcome.
Best regards,
Antoine DeBlois

Specialiste ingenierie, MDO lead / Engineering Specialist, MDO lead
Aéronautique / Aerospace
514-855-5001, x 50862

2351 Blvd Alfred-Nobel
Montreal, Qc
H4S 1A9
