This seems to fix my problem.  I have also decided that my original
production code that exposed this was a little dangerous, and it has been
improved because of this.  I didn't realize that MPI_UNDEFINED was returned;
I was relying on outcount being zero or less, which it happens to be.
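
For reference, a minimal sketch of the safer check (illustrative only, not
the production code discussed above): after MPI_Testsome, compare the
returned outcount against MPI_UNDEFINED explicitly rather than relying on it
happening to be zero or less.

///////////////////////////////////////////

#include "mpi.h"
#include <stdio.h>

int main( int argc, char *argv[])
{
    MPI_Request req[4];
    MPI_Status  stat[4];
    int         indices[4];
    int         i, num_done;

    MPI_Init(&argc,&argv);

    /* No communication started: every handle in the list is inactive. */
    for (i = 0; i < 4; i++)
        req[i] = MPI_REQUEST_NULL;

    MPI_Testsome( 4, req, &num_done, indices, stat);

    if (num_done == MPI_UNDEFINED) {
        /* No active handles in the list -- nothing completed. */
        printf("no active requests\n");
    } else {
        for (i = 0; i < num_done; i++)
            printf("request %i completed\n", indices[i]);
    }

    MPI_Finalize();

    return 0;
}

/////////////////////////////////////////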

Thanks for your help,
Tom



On 4/1/06 1:44 AM, "George Bosilca" <bosi...@cs.utk.edu> wrote:

> There seems to be a sentence in the MPI standard about this case. The
> standard states:
> 
> If there is no active handle in the list it returns outcount =
> MPI_UNDEFINED.
> 
> Revision 9513 follows the standard.
> 
>    Thanks,
>      george.
> 
> 
> On Mar 31, 2006, at 6:38 PM, Brunner, Thomas A. wrote:
> 
>> After compiling revision 9505 of the trunk, my original test code now
>> core dumps.  I can run the test code with the Testsome line
>> commented out.
>> Here is the output from a brief gdb session:
>> 
>> --------------------------------------------------------------
>> 
>> gdb a.out /cores/core.28141
>> GNU gdb 6.1-20040303 (Apple version gdb-437) (Sun Dec 25 08:31:29
>> GMT 2005)
>> Copyright 2004 Free Software Foundation, Inc.
>> GDB is free software, covered by the GNU General Public License,
>> and you are
>> welcome to change it and/or distribute copies of it under certain
>> conditions.
>> Type "show copying" to see the conditions.
>> There is absolutely no warranty for GDB.  Type "show warranty" for
>> details.
>> This GDB was configured as "powerpc-apple-darwin"...Reading symbols
>> for
>> shared libraries ..... done
>> 
>> Core was generated by `a.out'.
>> #0  0x010b2a90 in ?? ()
>> (gdb) bt
>> #0  0x010b2a90 in ?? ()
>> #1  0x010b2a3c in ?? ()
>> warning: Previous frame identical to this frame (corrupt stack?)
>> #2  0x00002c18 in grow_table (table=0x1, soft=3221222188, hard=0) at
>> class/ompi_pointer_array.c:352
>> (gdb) up
>> #1  0x010b2a3c in ?? ()
>> (gdb) up
>> #2  0x00002c18 in grow_table (table=0x1, soft=3221222188, hard=0) at
>> class/ompi_pointer_array.c:352
>> 352         if (table->size >= OMPI_FORTRAN_HANDLE_MAX) {
>> 
>> ---------------------------------------------------------------
>> This is the output from the code.
>> 
>> Hello from processor 0 of 1
>> Signal:10 info.si_errno:0(Unknown error: 0) si_code:1(BUS_ADRALN)
>> Failing at addr:0x0
>> *** End of error message ***
>> ------------------------------------------------------------
>> 
>> Perhaps in the MPI_Wait* and MPI_Test* functions, if incount == 0, then
>> *outcount should be set to zero and the function should return
>> immediately?  (After checking that outcount != 0 too, of course.)
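
To make the suggestion quoted above concrete, here is a rough sketch of that
kind of early return, written as a hypothetical user-side wrapper rather than
the actual Open MPI argument-checking code.  Note that, per the standard text
quoted further up the thread, revision 9513 instead reports outcount =
MPI_UNDEFINED when the list contains no active handles.

///////////////////////////////////////////

#include "mpi.h"
#include <stddef.h>

/* Hypothetical wrapper illustrating the suggested incount == 0 guard; the
   real check would live inside the library, not in user code.  The standard
   (see the reply above) actually calls for outcount = MPI_UNDEFINED when the
   list has no active handles. */
int testsome_with_guard( int incount, MPI_Request reqs[], int *outcount,
                         int indices[], MPI_Status stats[])
{
    if (incount == 0) {
        if (outcount != NULL) {
            *outcount = 0;   /* nothing to test: report zero completions */
        }
        return MPI_SUCCESS;
    }
    return MPI_Testsome( incount, reqs, outcount, indices, stats);
}

/////////////////////////////////////////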
>> 
>> Tom
>> 
>> 
>> 
>> On 3/31/06 1:35 PM, "George Bosilca" <bosi...@cs.utk.edu> wrote:
>> 
>>> When we're checking the arguments, we check that the request array is
>>> not NULL without looking at the number of requests. I think that makes
>>> sense, as I don't see why the user would call these functions
>>> with 0 requests ... But the other way around makes sense too. Since I
>>> can't find anything in the MPI standard that stops the user from doing
>>> that, I added the additional check to all MPI_Wait* and MPI_Test* functions.
>>> 
>>> Please get the version from trunk after revision 9504.
>>> 
>>>    Thanks,
>>>      george.
>>> 
>>> On Mar 31, 2006, at 2:56 PM, Brunner, Thomas A. wrote:
>>> 
>>>> 
>>>> I have an algorithm that collects information in a tree-like manner
>>>> using nonblocking communication.  Some nodes do not receive
>>>> information from other nodes, so there are no outstanding requests on
>>>> those nodes.  On all processors, I check for the incoming messages
>>>> using MPI_Testsome().  MPI_Testsome fails with OpenMPI, however, if
>>>> the request list length is zero.  Here is code that can be run with
>>>> only one processor and shows the same behavior:
>>>> 
>>>> ///////////////////////////////////////////
>>>> 
>>>> #include "mpi.h"
>>>> #include <stdio.h>
>>>> 
>>>> int main( int argc, char *argv[])
>>>> {
>>>>     int myid, numprocs;
>>>> 
>>>>     MPI_Init(&argc,&argv);
>>>>     MPI_Comm_size(MPI_COMM_WORLD,&numprocs);
>>>>     MPI_Comm_rank(MPI_COMM_WORLD,&myid);
>>>> 
>>>>     printf("Hello from processor %i of %i\n", myid, numprocs);
>>>> 
>>>>     int size = 0;
>>>>     int num_done = 0;
>>>>     MPI_Status* stat = 0;
>>>>     MPI_Request* req = 0;
>>>>     int* done_indices = 0;
>>>> 
>>>>     MPI_Testsome( size, req, &num_done, done_indices, stat);
>>>> 
>>>>     printf("Finalizing on processor %i of %i\n", myid, numprocs);
>>>> 
>>>>     MPI_Finalize();
>>>> 
>>>>     return 0;
>>>> }
>>>> 
>>>> /////////////////////////////////////////
>>>> 
>>>> The output using OpenMPI is:
>>>> 
>>>> Hello from processor 0 of 1
>>>> [mymachine:09115] *** An error occurred in MPI_Testsome
>>>> [mymachine:09115] *** on communicator MPI_COMM_WORLD
>>>> [mymachine:09115] *** MPI_ERR_REQUEST: invalid request
>>>> [mymachine:09115] *** MPI_ERRORS_ARE_FATAL (goodbye)
>>>> 1 process killed (possibly by Open MPI)
>>>> 
>>>> 
>>>> Many other MPI implementations support this, and reading the
>>>> standard, it
>>>> seems like it should be OK.
>>>> 
>>>> Thanks,
>>>> Tom
>>>> 
>>>> <config.log.bz2>
>>>> <testsome_test.out>
>>>> <testsome_test.c>
>>>> <ompi_info.out>
> 
> _______________________________________________
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
> 

