The issue has been identified deep inside the tuned collective component. It
was fixed in the trunk and in 1.5 a while back, but the fix was never pushed into
the 1.4 series. I attached a patch to the ticket so it can force its way into the
next 1.4 release.
Thanks,
george.
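Until a patched 1.4 release is available, one way to check whether the tuned
collective component is to blame is to exclude it at run time so the basic
implementations are used instead. A minimal sketch, assuming a standard Open MPI
install and a test binary named ./bcast_test (the binary name is an assumption,
not from the thread):

  # Exclude the tuned coll component; the basic algorithms take over
  mpirun -np 4 --mca coll ^tuned ./bcast_test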
On Feb 14, 2011, at 13:11, Jeff wrote:
Thanks Jeremiah; I filed the following ticket about this:
https://svn.open-mpi.org/trac/ompi/ticket/2723
On Feb 10, 2011, at 3:24 PM, Jeremiah Willcock wrote:
I forgot to mention that this was tested with 3 or 4 ranks, connected via
TCP.
-- Jeremiah Willcock
On Thu, 10 Feb 2011, Jeremiah Willcock wrote:
Here is a small test case that hits the bug on 1.4.1:
#include <mpi.h>
int arr[1142];
int main(int argc, char** argv) {
  int rank, my_size;
  MPI_Init(&argc, &argv);
  MPI_Comm_rank(MPI_COMM_WORLD, &rank);
  /* Ranks disagree on the count: rank 1 posts 1142 ints while the root
     (rank 0) broadcasts only 1088, which is what hits the bug. */
  my_size = (rank == 1) ? 1142 : 1088;
  MPI_Bcast(arr, my_size, MPI_INT, 0, MPI_COMM_WORLD);
  MPI_Finalize();
  return 0;
}
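For reference, a hedged sketch of building and running the case above over TCP
with a few ranks, matching the "3 or 4 ranks, connected via TCP" setup mentioned
above (the file and binary names are assumptions):

  mpicc bcast_test.c -o bcast_test
  # Restrict to the TCP and self BTLs and use 3 ranks, as in the report
  mpirun -np 3 --mca btl tcp,self ./bcast_test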
FYI, I am having trouble finding a small test case that will trigger this
on 1.5; I'm either getting deadlocks or MPI_ERR_TRUNCATE, so it could have
been fixed. What are the triggering rules for the different broadcast
algorithms? It could be that only certain message sizes or only certain BTLs
trigger it.
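The tuned component's decision functions choose a broadcast algorithm based on
message size and communicator size, which is why only certain count/rank
combinations reach the buggy code path. As a rough sketch of how to make a
reproducer more deterministic (parameter names as reported by ompi_info on a
typical build; exact names and values may differ between versions):

  # List the tuned component's tunables, including the bcast algorithm choices
  ompi_info --param coll tuned
  # Override the size-based decision and force one specific bcast algorithm
  mpirun -np 4 --mca coll_tuned_use_dynamic_rules 1 \
               --mca coll_tuned_bcast_algorithm 3 ./bcast_test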
Nifty! Yes, I agree that that's a poor error message. It's probably
(unfortunately) being propagated up from the underlying point-to-point system,
where an ERR_IN_STATUS would actually make sense.
I'll file a ticket about this. Thanks for the heads up.
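For context, here is a small sketch (my own illustration, not code from the
thread) of where MPI_ERR_IN_STATUS is actually meaningful: multi-request
completion calls such as MPI_Waitall report per-request failures through the
MPI_ERROR field of each status, whereas MPI_Bcast takes no status argument at
all, which is why seeing that error class from a broadcast is so confusing.

#include <mpi.h>
#include <stdio.h>
int main(int argc, char** argv) {
  int rank, rc, i;
  int sendbuf[2] = {1, 2}, recvbuf[2] = {0, 0};
  MPI_Request reqs[2];
  MPI_Status stats[2];
  MPI_Init(&argc, &argv);
  MPI_Comm_rank(MPI_COMM_WORLD, &rank);
  /* Return error codes instead of aborting so the statuses can be examined. */
  MPI_Comm_set_errhandler(MPI_COMM_WORLD, MPI_ERRORS_RETURN);
  /* Post two receives, then send the matching messages to ourselves. */
  MPI_Irecv(&recvbuf[0], 1, MPI_INT, rank, 0, MPI_COMM_WORLD, &reqs[0]);
  MPI_Irecv(&recvbuf[1], 1, MPI_INT, rank, 1, MPI_COMM_WORLD, &reqs[1]);
  MPI_Send(&sendbuf[0], 1, MPI_INT, rank, 0, MPI_COMM_WORLD);
  MPI_Send(&sendbuf[1], 1, MPI_INT, rank, 1, MPI_COMM_WORLD);
  rc = MPI_Waitall(2, reqs, stats);
  if (rc == MPI_ERR_IN_STATUS) {
    /* MPI_ERR_IN_STATUS says: check each status's MPI_ERROR field. */
    for (i = 0; i < 2; i++)
      printf("request %d error class %d\n", i, stats[i].MPI_ERROR);
  }
  MPI_Finalize();
  return 0;
}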
On Wed, 9 Feb 2011, Jeremiah Willcock wrote:
I get the following Open MPI error from 1.4.1:
*** An error occurred in MPI_Bcast
*** on communicator MPI COMMUNICATOR 3 SPLIT FROM 0
*** MPI_ERR_IN_STATUS: error code in status
*** MPI_ERRORS_ARE_FATAL (your MPI job will now abort)
(hostname and