OK, I have to wait until tomorrow; they have some problems with the
network...
On 09/18/2014 01:27 PM, Nick Papior Andersen wrote:
I am not sure whether MPI_Test will cover this... You should check it...
I attach my example script, which shows two working cases and one
that does not work (you can watch the memory usage while it runs and see
that the first two work, while the last one goes ballistic in memory).
Just check it with MPI_Test to see if it works...
2014-09-18 13:20 GMT+02:00 XingFENG <xingf...@cse.unsw.edu.au>:
Thanks very much for your reply!
To Sir Jeff Squyres:
I think it fails due to truncation errors. I am now logging
information of each send and receive to find out the reason.
To Sir Nick Papior Andersen:
Oh, wait (mpi_wait) is never called in my code.
What I do is call MPI_Irecv once. Then MPI_Test is called
several times to check whether new messages are available. If new
messages are available, some functions to process these messages
are called.
I will add the wait function and check the running results.
On Thu, Sep 18, 2014 at 8:47 PM, Nick Papior Andersen
<nickpap...@gmail.com> wrote:
To complement Jeff's answer, I would add that using asynchronous
messages REQUIRES that you wait (mpi_wait) for all messages at
some point. Even though this might not seem obvious, it is due
to memory allocations "behind the scenes", which are only
de-allocated upon completion through a wait statement.
2014-09-18 12:36 GMT+02:00 Jeff Squyres (jsquyres)
<jsquy...@cisco.com>:
On Sep 18, 2014, at 2:43 AM, XingFENG
<xingf...@cse.unsw.edu.au> wrote:
> a. How can I get more information about errors? I got
> errors like the ones below. They say that the program exited
> abnormally in the function MPI_Test(). But is there a way to
> know more about the error?
>
> *** An error occurred in MPI_Test
> *** on communicator MPI_COMM_WORLD
> *** MPI_ERR_TRUNCATE: message truncated
> *** MPI_ERRORS_ARE_FATAL: your MPI job will now abort
For the purpose of this discussion, let's take a
simplification that you are sending and receiving the same
datatypes (e.g., you're sending MPI_INT and you're
receiving MPI_INT).
This error means that you tried to receive a message with
too small a buffer.
Specifically, MPI says that if you send a message that is
X elements long (e.g., 20 MPI_INTs), then the matching
receive must be Y elements, where Y>=X (e.g., *at least*
20 MPI_INTs). If the receiver provides a Y where Y<X,
this is a truncation error.
Unfortunately, Open MPI doesn't report a whole lot more
information about these kinds of errors than what you're
seeing, sorry.
> b. Is there anything to note about asynchronous
> communication? I use MPI_Isend, MPI_Irecv, and MPI_Test to
> implement asynchronous communication. My program works
> well on small data sets (10K-node graphs), but it exits
> abnormally on large data sets (1M-node graphs).
Is it failing due to truncation errors, or something else?
--
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/
_______________________________________________
users mailing list
us...@open-mpi.org
Subscription:
http://www.open-mpi.org/mailman/listinfo.cgi/users
Link to this post:
http://www.open-mpi.org/community/lists/users/2014/09/25344.php
--
Kind regards Nick
--
Best Regards.