I think that's a reasonable solution. However, the words "not it"
come to mind. Sorry, but I have way too much on my plate this month.
By the way, in case no one noticed, I had e-mailed my findings to
devel. Someone might want to reply to Dorian's e-mail on users.
Brian
On Dec 11, 2008, at 2:31 PM, George Bosilca wrote:
Brian,
You're right, the datatype is being too cautious with the boundaries
when detecting the overlap. There is no good solution to detect the
overlap except parsing the whole memory layout to check the status
of every predefined type. As one can imagine this is a very
expensive operation. This is the reason I preferred to use the true
extent and the size of the data to try to detect the overlap. This
approach is a lot faster, but has poor accuracy.
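To make the trade-off concrete, here is a minimal C++ sketch of the two
strategies described above. It is not Open MPI's code: it assumes a
datatype has already been flattened into a list of byte blocks, and the
names Block, overlaps_exact and overlaps_cheap are hypothetical. The
cheap test shown is only one possible size-versus-span bound and may
differ from what is_overlapped() actually does.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical description of one contiguous piece of a datatype, in bytes.
struct Block {
    long offset;         // displacement from the start of the buffer
    std::size_t length;  // number of bytes covered
};

// Exact check: sort the blocks by offset and compare each block with its
// successor. Cost is O(n log n) in the number of blocks.
bool overlaps_exact(std::vector<Block> blocks) {
    std::sort(blocks.begin(), blocks.end(),
              [](const Block& a, const Block& b) { return a.offset < b.offset; });
    for (std::size_t i = 1; i < blocks.size(); ++i) {
        long prev_end = blocks[i - 1].offset +
                        static_cast<long>(blocks[i - 1].length);
        if (blocks[i].offset < prev_end) return true;
    }
    return false;
}

// Cheap check: uses only the total size and the spanned byte range, which a
// datatype engine could cache. If more bytes are covered than fit in the
// span, some bytes must be covered twice. Overlaps hidden by gaps are missed.
bool overlaps_cheap(const std::vector<Block>& blocks) {
    if (blocks.empty()) return false;
    long lo = blocks[0].offset, hi = lo;
    std::size_t size = 0;
    for (const Block& b : blocks) {
        lo = std::min(lo, b.offset);
        hi = std::max(hi, b.offset + static_cast<long>(b.length));
        size += b.length;
    }
    return size > static_cast<std::size_t>(hi - lo);
}
```

With this sketch, four one-byte blocks at offsets 3, 2, 1, 0 are reported
as non-overlapping by the exact check, while a layout such as {0,2},
{1,2}, {10,2} overlaps but slips past the cheap bound. That shows why a
size/extent test alone cannot be fully accurate.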
The best solution I can think of in the short term is to remove the
overlap check completely. This will have absolutely no impact on
the way we pack the data, but can lead to unexpected results when we
unpack and the data overlap. But I guess this can be considered a
user error, as the MPI standard clearly states that the result of
such an operation is ... unexpected.
george.
On Dec 10, 2008, at 22:20 , Brian Barrett wrote:
Hi all -
I looked into this, and it appears to be datatype related. If the
displacements are set to 3, 2, 1, 0, then the datatype will fail
the type checks for one-sided because is_overlapped() returns 1 for
the datatype. My reading of the standard seems to indicate this
should not be. I haven't looked into the problems with the
displacements set to 0, 1, 2, 3, but I'm guessing it has something
to do with the reverse problem.
This looks like a datatype issue, so it's out of my realm of
expertise. Can someone else take a look?
Brian
Begin forwarded message:
From: doriankrause <doriankra...@web.de>
Date: December 10, 2008 4:07:55 PM MST
To: us...@open-mpi.org
Subject: [OMPI users] Onesided + derived datatypes
Reply-To: Open MPI Users <us...@open-mpi.org>
Hi List,
I have an MPI program which uses one-sided communication with derived
datatypes (MPI_Type_create_indexed_block). I developed the code with
MPICH2 and unfortunately didn't think about trying it out with
OpenMPI. Now that I'm "porting" the application to OpenMPI I'm facing
some problems. On most machines I get a SIGSEGV in MPI_Win_fence;
sometimes an invalid datatype shows up. I ran the program in Valgrind
and didn't get anything valuable. Since I can't see a reason for this
problem (at least if I understand the standard correctly), I wrote the
attached test program.
Here are my experiences:
* If I compile without ONESIDED defined, everything works and V1 and V2
give the same results.
* If I compile with ONESIDED and V2 defined (MPI_Type_contiguous), it
works.
* ONESIDED + V1 + O2: No errors, but obviously nothing is sent? (Am I
right in assuming that V1+O2 and V2 should be equivalent?)
* ONESIDED + V1 + O1:
[m02:03115] *** An error occurred in MPI_Put
[m02:03115] *** on win
[m02:03115] *** MPI_ERR_TYPE: invalid datatype
[m02:03115] *** MPI_ERRORS_ARE_FATAL (goodbye)
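The equivalence asked about in the third bullet can be reasoned about
through type maps. The C++ sketch below does not call MPI. It assumes
that V1 builds an indexed-block type of 4 blocks of length 1 and that
V2 is a contiguous type of 4 elements. The actual values in ompitest.cc
may differ, and the helper names are hypothetical. Displacements are
counted in multiples of the old type's extent, as
MPI_Type_create_indexed_block does.

```cpp
#include <vector>

// Element offsets (in units of the old type's extent) touched by an
// indexed-block type, in the order the type map lists them.
std::vector<long> indexed_block_map(int blocklength,
                                    const std::vector<int>& displs) {
    std::vector<long> map;
    for (int d : displs)
        for (int j = 0; j < blocklength; ++j)
            map.push_back(static_cast<long>(d) + j);
    return map;
}

// Element offsets touched by a contiguous type of `count` elements.
std::vector<long> contiguous_map(int count) {
    std::vector<long> map;
    for (int i = 0; i < count; ++i) map.push_back(i);
    return map;
}
```

Under these assumptions, displacements 0, 1, 2, 3 give exactly the same
map as the contiguous type, so both should transfer the same data.
Displacements 3, 2, 1, 0 visit the same four elements in reverse order,
and no element appears twice.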
I didn't get a segfault as in the "real life example", but if
ompitest.cc is correct it means that OpenMPI is buggy when it comes to
one-sided communication and (some) derived datatypes, so it is
probably not a problem in my code.
I'm using OpenMPI-1.2.8 with the newest gcc 4.3.2, but the same
behaviour can be seen with gcc-3.3.1 and intel 10.1.
Please correct me if ompitest.cc contains errors. Otherwise I would be
glad to hear how I should report these problems to the developers (if
they don't read this).
Thanks + best regards
Dorian
<ompitest.tar.gz>
_______________________________________________
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users
_______________________________________________
devel mailing list
de...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/devel