Hi George,
thank you for your reply and the hint about using MPI_BOTTOM. I changed
this part of the code and still get the same segmentation fault.
Unfortunately I cannot post a full example, but below is the code that
seems most relevant to the problem.
The mechanism is as follows: from an object that needs to be transmitted,
a list is created that describes its members with their type, offset
and stride (the MemoryMapDescr). MemoryMap::mapType puts the members
into this list, the so-called MemoryMap.
From this vector of MemoryMapDescr an MPI_Datatype is constructed,
which is then used to transmit the object.
Maybe you could have a look at the code fragments and see if you spot
something that does not play well with Open MPI.
Today's testing again showed that the size of the data structures
triggers the problem. This could be probabilistic (more processing
gives a higher chance that something goes wrong), or there could be a
real dependence, e.g. some buffer is too small or the differences
between the addresses in memory become too large; I don't know what
else to think of.
Thank you for your help.
Regards,
Michael
int createMPIDataType(const std::vector<MemoryMapDescr> &memorymap,
                      MPI_Datatype &newtype)
{
    int err = MPI_SUCCESS;
    int num = memorymap.size();
    MPI_Datatype *types = new MPI_Datatype[num];
    int *lengths = new int[num];
    MPI_Aint *addresses = new MPI_Aint[num];
    // copy the vector with information about the type into temp.
    // arrays to be handled by MPI_Type_struct
    for (int i = 0; i < num; i++)
    {
        types[i] = MPIDataType[memorymap[i].type];
        lengths[i] = memorymap[i].len;
        // create address map according to actual memory layout
        err = MPI_Address(memorymap[i].addr, &addresses[i]);
        if (err != MPI_SUCCESS)
        {
            std::ostringstream msg;
            msg << "invalid address at index " << i;
            msg << " for type " << DataTypeNames[memorymap[i].type];
            msg << " at address " << memorymap[i].addr;
            GP_THROW_ERR(CommunicationErr, eMPIAddressError, msg.str());
        }
    }
    // create MPI datatype with equivalent information about types and offsets
    err = MPI_Type_struct(num, lengths, addresses, types, &newtype);
    if (err != MPI_SUCCESS)
    {
        GP_THROW_ERR(CommunicationErr, eMPIDatatypeError,
                     "invalid MPI datatype");
    }
    err = MPI_Type_commit(&newtype);
    // A failure here means an invalid datatype argument, possibly an
    // uncommitted MPI_Datatype (see MPI_Type_commit).
    if (err != MPI_SUCCESS)
    {
        GP_THROW_ERR(CommunicationErr, eMPIDatatypeError,
                     "invalid MPI datatype");
    }
    // delete temp. arrays
    delete [] types;
    delete [] lengths;
    delete [] addresses;
    return err;
}
// Memory map descriptor.
// TODO: Add support for strided vectors.
struct MemoryMapDescr
{
    MemoryMapDescr(DataType t, void* a, int l);
    //! Data type.
    DataType type;
    //! Address of data in memory.
    void* addr;
    //! Number of data elements.
    int len;
    //! Stride.
    // TODO: Add support for strided vectors.
    int stride;
    //! Type name string.
    std::string typeName() const;
};
template <typename T>
void MemoryMap::mapType(const T &var)
{
    memoryMap_.push_back(MemoryMapDescr(DataTypeConverter<T>::type,
                                        (void*)&var, 1));
}
// With specializations such as the following, exemplified by a vector
// of doubles.
template <>
void MemoryMap::mapType< std::vector<double> >(const std::vector<double> &var)
{
    if (var.size() > 0)
        memoryMap_.push_back(MemoryMapDescr(DataTypeConverter<double>::type,
                                            (void*)&var[0], var.size()));
}
Date: Wed, 11 Apr 2007 12:33:25 -0400
From: George Bosilca
Subject: Re: [OMPI users] Open MPI - Signal: Segmentation fault (11) Problem
To: Open MPI Users
Michael,
The MPI standard is quite clear: in order to have correct and
portable MPI code, you are not allowed to use (void*)0. Use
MPI_BOTTOM instead.
We have plenty of tests which exercise the exact behavior you describe
in your email, and they all pass. I will take a look at what happens,
but I need either the code or at least the part which creates the
datatype.
Thanks,
george.
On Apr 11, 2007, at 3:54 AM, Michael Gauckler wrote:
> Dear Open MPI Users and Developers,
>
> I encountered a problem with Open MPI when porting an application,
> which successfully ran with LAM MPI and MPICH.
>
> The program produces a segmentation fault