Hello Eric,

Is the failure seen with the same two tests, or is it random
which tests fail?  If it's not random, would you be able to post
the tests to the list?

Also, if possible, it would be great if you could test against a master
snapshot:

https://www.open-mpi.org/nightly/master/


Thanks,

Howard

-- 
Howard Pritchard

HPC-DES
Los Alamos National Laboratory





On 9/13/16, 9:38 AM, "devel on behalf of Eric Chamberland"
<devel-boun...@lists.open-mpi.org on behalf of
eric.chamberl...@giref.ulaval.ca> wrote:

>Other relevant info: I never saw this problem with OpenMPI 1.6.5, 1.8.4
>and 1.10.[3,4], which run the same test suite...
>
>thanks,
>
>Eric
>
>
>On 13/09/16 11:35 AM, Eric Chamberland wrote:
>> Hi,
>>
>> This is the third time this has happened in the last 10 days.
>>
>> While running nightly tests (~2200), we have one or two tests that fail
>> at the very beginning with this strange error:
>>
>> [lorien:142766] [[9325,5754],0] usock_peer_recv_connect_ack: received
>> unexpected process identifier [[9325,0],0] from [[5590,0],0]
>>
>> But I can't reproduce the problem right now, i.e., if I launch this test
>> alone "by hand", it is successful. The same test was also successful
>> yesterday.
>>
>> Is there some kind of "race condition" that can happen on the creation
>> of "tmp" files if many tests run together on the same node? (We are
>> oversubscribing even sequential runs...)
>>
>> Here are the build logs:
>>
>> http://www.giref.ulaval.ca/~cmpgiref/dernier_ompi/2016.09.13.01h16m01s_config.log
>>
>> http://www.giref.ulaval.ca/~cmpgiref/dernier_ompi/2016.09.13.01h16m01s_ompi_info_all.txt
>>
>>
>> Thanks,
>>
>> Eric
>> _______________________________________________
>> devel mailing list
>> devel@lists.open-mpi.org
>> https://rfd.newmexicoconsortium.org/mailman/listinfo/devel
