The issue was identified deep inside the tuned collective component. It was
fixed in the trunk and in 1.5 a while back, but the fix was never pushed into
the 1.4 branch. I attached a patch to the ticket and will force it into the
next 1.4 release.

  Thanks,
    george.

On Feb 14, 2011, at 13:11, Jeff Squyres wrote:

> Thanks Jeremiah; I filed the following ticket about this:
> 
>    https://svn.open-mpi.org/trac/ompi/ticket/2723
> 
> 
> On Feb 10, 2011, at 3:24 PM, Jeremiah Willcock wrote:
> 
>> I forgot to mention that this was tested with 3 or 4 ranks, connected via 
>> TCP.
>> 
>> -- Jeremiah Willcock
>> 
>> On Thu, 10 Feb 2011, Jeremiah Willcock wrote:
>> 
>>> Here is a small test case that hits the bug on 1.4.1:
>>> 
>>> #include <mpi.h>
>>> 
>>> int arr[1142];
>>> 
>>> int main(int argc, char** argv) {
>>>   int rank, my_size;
>>>   MPI_Init(&argc, &argv);
>>>   MPI_Comm_rank(MPI_COMM_WORLD, &rank);
>>>   /* rank 1 passes a larger count than the root (rank 0) broadcasts */
>>>   my_size = (rank == 1) ? 1142 : 1088;
>>>   MPI_Bcast(arr, my_size, MPI_INT, 0, MPI_COMM_WORLD);
>>>   MPI_Finalize();
>>>   return 0;
>>> }
>>> 
>>> I tried it on 1.5.1, and I get MPI_ERR_TRUNCATE instead, so this might have 
>>> already been fixed.
>>> 
>>> -- Jeremiah Willcock
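
For context, MPI requires that the amount of data specified to MPI_Bcast
(count times the extent of the datatype) be the same on every rank of the
communicator; the test case above violates this on purpose to reproduce the
bug. A minimal sketch of a conforming variant, where only the root knows the
real size and broadcasts it before the payload (illustrative only, not part
of the original report):

  #include <mpi.h>

  int arr[1142];

  int main(int argc, char** argv) {
    int rank, count;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* Only the root knows how many elements it will send. */
    count = (rank == 0) ? 1088 : 0;

    /* Broadcast the count first so every rank ends up with the same value... */
    MPI_Bcast(&count, 1, MPI_INT, 0, MPI_COMM_WORLD);

    /* ...then broadcast the payload with a consistent count everywhere. */
    MPI_Bcast(arr, count, MPI_INT, 0, MPI_COMM_WORLD);

    MPI_Finalize();
    return 0;
  }

Broadcasting the size first costs one extra small broadcast, but it guarantees
that the count matches on every rank, which is what the standard requires.
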
>>> 
>>> 
>>> On Thu, 10 Feb 2011, Jeremiah Willcock wrote:
>>> 
>>>> FYI, I am having trouble finding a small test case that will trigger this 
>>>> on 1.5; I'm either getting deadlocks or MPI_ERR_TRUNCATE, so it could have 
>>>> been fixed.  What are the triggering rules for different broadcast 
>>>> algorithms?  It could be that only certain sizes or only certain BTLs 
>>>> trigger it.
>>>> -- Jeremiah Willcock
>>>> On Thu, 10 Feb 2011, Jeff Squyres wrote:
>>>>> Nifty!  Yes, I agree that that's a poor error message.  It's probably 
>>>>> (unfortunately) being propagated up from the underlying point-to-point 
>>>>> system, where an ERR_IN_STATUS would actually make sense.
>>>>> I'll file a ticket about this.  Thanks for the heads up.
>>>>> On Feb 9, 2011, at 4:49 PM, Jeremiah Willcock wrote:
>>>>>> On Wed, 9 Feb 2011, Jeremiah Willcock wrote:
>>>>>>> I get the following Open MPI error from 1.4.1:
>>>>>>> *** An error occurred in MPI_Bcast
>>>>>>> *** on communicator MPI COMMUNICATOR 3 SPLIT FROM 0
>>>>>>> *** MPI_ERR_IN_STATUS: error code in status
>>>>>>> *** MPI_ERRORS_ARE_FATAL (your MPI job will now abort)
>>>>>>> (hostname and port removed from each line).  There is no MPI_Status 
>>>>>>> returned by MPI_Bcast, so I don't know what the error actually is.  Is 
>>>>>>> this something that people have seen before?
>>>>>> For the record, this appears to be caused by specifying inconsistent 
>>>>>> data sizes on the different ranks in the broadcast operation.  The error 
>>>>>> message could still be improved, though.
>>>>>> -- Jeremiah Willcock
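
As an aside on the error-reporting discussion: with the default
MPI_ERRORS_ARE_FATAL handler the job simply aborts, so the message above is
the only diagnostic. A minimal sketch (not from this thread) that switches
MPI_COMM_WORLD to MPI_ERRORS_RETURN and turns the code returned by MPI_Bcast
into a readable string with MPI_Error_string, which can make mismatches like
this one easier to spot:

  #include <mpi.h>
  #include <stdio.h>

  int arr[1142];

  int main(int argc, char** argv) {
    int rank, my_size, err, msg_len;
    char msg[MPI_MAX_ERROR_STRING];

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* Have MPI return error codes to the caller instead of aborting. */
    MPI_Comm_set_errhandler(MPI_COMM_WORLD, MPI_ERRORS_RETURN);

    my_size = (rank == 1) ? 1142 : 1088;  /* same mismatch as the test case above */
    err = MPI_Bcast(arr, my_size, MPI_INT, 0, MPI_COMM_WORLD);

    if (err != MPI_SUCCESS) {
      MPI_Error_string(err, msg, &msg_len);
      fprintf(stderr, "rank %d: MPI_Bcast failed: %s\n", rank, msg);
    }

    MPI_Finalize();
    return 0;
  }

Note that MPI makes no guarantee the job can continue cleanly after a reported
error, so this is only useful for diagnosing the mismatch, not recovering from it.
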

"I disapprove of what you say, but I will defend to the death your right to say 
it"
  -- Evelyn Beatrice Hall

