Folks,

Mellanox Jenkins marks recent PRs as failed for very surprising reasons.


mpirun --mca btl sm,self ...


failed because the processes could not contact each other. I was able to reproduce this once on my workstation, and found the root cause was a dirty build and/or install dir.


I added some debug output to autogen.sh and found that:

- the workspace (install dir) contains some old files

- it seems all PRs use the same workspace (if it were clean, that would be OK as long as Jenkins processes only one PR at a time)

- there are currently two PRs being processed for the ompi-release repo, and per the log, they seem to run from the very same directory

- Jenkins for the pmix repo seems to suffer the same issue
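
To make the point about a clean workspace concrete, here is a rough sketch of one way a job could give each build its own build and install tree. This is only an illustration, not how the Mellanox job is actually set up: the pr-builds layout and the variable names base, tag and build_root are made up for the example, and it assumes the standard Jenkins environment variables WORKSPACE and BUILD_TAG are available (with a fallback when run outside Jenkins).

```shell
#!/bin/sh
# Sketch only: one scratch build/install tree per Jenkins build.
set -e

# WORKSPACE and BUILD_TAG are normally set by Jenkins; the fallbacks
# allow the script to run by hand outside Jenkins.
base="${WORKSPACE:-${TMPDIR:-/tmp}}"
tag="${BUILD_TAG:-local-$$}"

# Hypothetical layout: <base>/pr-builds/<tag>/{build,install}
build_root="$base/pr-builds/$tag"

# Wipe anything left over from an earlier run, then recreate the tree,
# so that no stale files can end up in the build or install dir.
rm -rf "$build_root"
mkdir -p "$build_root/build" "$build_root/install"

echo "build dir:   $build_root/build"
echo "install dir: $build_root/install"
```

With something along these lines, two PRs processed at the same time would not share a directory, and each build would start from an empty install dir.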


Could someone have a look at this?


Cheers,


Gilles
