No problem-o.
George -- can you please file a bug?
On Dec 13, 2008, at 3:11 PM, Brian Barrett wrote:
Sorry, I really won't have time to look until after Christmas. I'll
put it on the to-do list, but that's as soon as it has a prayer of
reaching the top.
Brian
On Dec 13, 2008, at 1:02 PM, George Bosilca wrote:
Brian,
I found a second problem with rebuilding the datatype on the
remote. Originally, the displacements were wrongly computed. This is
now fixed. However, the data at the end of the fence is still not
correct on the remote.
I can confirm that the packed message contains only zeros instead of
the real values, but I couldn't figure out how those zeros got there.
The pack function works correctly for MPI_Send, and I don't see any
reason it shouldn't do the same for MPI_Put. As you're the one-sided
guy in ompi, can you take a look at MPI_Put to see why the data is
incorrect?
george.
On Dec 11, 2008, at 19:14 , Brian Barrett wrote:
I think that's a reasonable solution. However, the words "not it"
come to mind. Sorry, but I have way too much on my plate this
month. By the way, in case no one noticed, I had e-mailed my
findings to devel. Someone might want to reply to Dorian's e-mail
on users.
Brian
On Dec 11, 2008, at 2:31 PM, George Bosilca wrote:
Brian,
You're right, the datatype is being too cautious with the
boundaries when detecting the overlap. There is no good solution
to detect the overlap except parsing the whole memory layout to
check the status of every predefined type. As one can imagine,
this is a very expensive operation. This is the reason I preferred
to use the true extent and the size of the data to try to detect
the overlap. This approach is a lot faster, but has poor accuracy.
The best solution I can think of in the short term is to remove
the overlap check completely. This will have absolutely no impact
on the way we pack the data, but can lead to unexpected results
when we unpack and the data overlap. But I guess this can be
considered a user error, as the MPI standard clearly states
that the result of such an operation is ... unexpected.
george.
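The trade-off George describes can be sketched in plain C, without MPI.
This is only an illustration under stated assumptions, not Open MPI's
actual is_overlapped() code: block_t, overlap_exact() and
overlap_heuristic() are hypothetical names, the datatype is assumed to be
already flattened into non-empty (displacement, length) blocks, and the
heuristic shown (total size larger than the true extent) is just one
cheap test that uses the two quantities he mentions.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* One contiguous piece of a flattened datatype, covering the half-open
 * range [disp, disp + len). Units are arbitrary; len is assumed > 0.
 * (Hypothetical type, not an Open MPI structure.) */
typedef struct {
    long disp;
    long len;
} block_t;

static int cmp_block(const void *a, const void *b)
{
    const block_t *x = (const block_t *)a;
    const block_t *y = (const block_t *)b;
    return (x->disp > y->disp) - (x->disp < y->disp);
}

/* Exact check: examine the whole layout.
 * Returns 1 if two blocks share a location, 0 if none do,
 * -1 if the temporary copy cannot be allocated. */
int overlap_exact(const block_t *blocks, size_t n)
{
    assert(blocks != NULL && n > 0);
    block_t *sorted = malloc(n * sizeof *sorted);
    if (sorted == NULL)
        return -1;
    memcpy(sorted, blocks, n * sizeof *sorted);
    qsort(sorted, n, sizeof *sorted, cmp_block);

    int found = 0;
    /* After sorting by start, any overlap shows up between neighbours. */
    for (size_t i = 1; i < n && !found; i++) {
        if (sorted[i].disp < sorted[i - 1].disp + sorted[i - 1].len)
            found = 1;
    }
    free(sorted);
    return found;
}

/* Cheap check: one pass, no sorting. If the blocks hold more data than
 * fits in the true extent they must overlap. A result of 0 only means
 * "could not prove an overlap", so overlaps can be missed. */
int overlap_heuristic(const block_t *blocks, size_t n)
{
    assert(blocks != NULL && n > 0);
    long lo = blocks[0].disp;
    long hi = blocks[0].disp + blocks[0].len;
    long size = 0;
    for (size_t i = 0; i < n; i++) {
        long end = blocks[i].disp + blocks[i].len;
        if (blocks[i].disp < lo)
            lo = blocks[i].disp;
        if (end > hi)
            hi = end;
        size += blocks[i].len;
    }
    return size > hi - lo; /* hi - lo is the true extent */
}
```

For four unit-length blocks at displacements 3, 2, 1, 0 both functions
report no overlap, which matches Brian's reading that such a type should
pass. For blocks {0,2}, {1,2}, {10,1} the exact check finds the overlap
while this heuristic does not, which is the accuracy cost of the faster
test.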
On Dec 10, 2008, at 22:20 , Brian Barrett wrote:
Hi all -
I looked into this, and it appears to be datatype related. If
the displacements are set to 3, 2, 1, 0, then the datatype
will fail the type checks for one-sided because is_overlapped()
returns 1 for the datatype. My reading of the standard seems to
indicate this should not be. I haven't looked into the problems
with displacements set to 0, 1, 2, 3, but I'm guessing it has
something to do with the reverse problem.
This looks like a datatype issue, so it's out of my realm of
expertise. Can someone else take a look?
Brian
Begin forwarded message:
From: doriankrause <doriankra...@web.de>
Date: December 10, 2008 4:07:55 PM MST
To: us...@open-mpi.org
Subject: [OMPI users] Onesided + derived datatypes
Reply-To: Open MPI Users <us...@open-mpi.org>
Hi List,
I have an MPI program which uses one-sided communication with derived
datatypes (MPI_Type_create_indexed_block). I developed the code with
MPICH2 and unfortunately didn't think about trying it out with
OpenMPI. Now that I'm "porting" the application to OpenMPI I'm facing
some problems. On most machines I get a SIGSEGV in MPI_Win_fence;
sometimes an invalid datatype shows up. I ran the program in Valgrind
and didn't get anything valuable. Since I can't see a reason for this
problem (at least if I understand the standard correctly), I wrote the
attached test program.
Here are my experiences:
* If I compile without ONESIDED defined, everything works and V1 and V2
give the same results.
* If I compile with ONESIDED and V2 defined (MPI_Type_contiguous) it
works.
* ONESIDED + V1 + O2: No errors, but obviously nothing is sent? (Am I
right in assuming that V1+O2 and V2 should be equivalent?)
* ONESIDED + V1 + O1:
[m02:03115] *** An error occurred in MPI_Put
[m02:03115] *** on win
[m02:03115] *** MPI_ERR_TYPE: invalid datatype
[m02:03115] *** MPI_ERRORS_ARE_FATAL (goodbye)
I didn't get a segfault as in the "real life example", but if
ompitest.cc is correct it means that OpenMPI is buggy when it comes to
one-sided communication and (some) derived datatypes, so that it is
probably not a problem in my code.
I'm using OpenMPI-1.2.8 with the newest gcc 4.3.2, but the same
behaviour can be seen with gcc-3.3.1 and intel 10.1.
Please correct me if ompitest.cc contains errors. Otherwise I would be
glad to hear how I should report these problems to the developers (if
they don't read this).
Thanks + best regards
Dorian
<ompitest.tar.gz>
_______________________________________________
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users
_______________________________________________
devel mailing list
de...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/devel
--
Jeff Squyres
Cisco Systems