Thank you for your reply! I am still working on my code, and I will update the post once I have fixed the bugs.
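For reference, here is a minimal two-rank sketch of the pattern discussed in the quoted thread below: one nonblocking receive polled with MPI_Test, and the nonblocking send completed with MPI_Wait so that Open MPI can release the resources it allocates for each request. The buffer size, tag, and message length are placeholder values for illustration, not taken from my actual program.

    #include <mpi.h>
    #include <stdio.h>

    #define MAX_MSG_LEN 1024  /* placeholder: must be >= the largest message a peer can send */

    int main(int argc, char **argv)
    {
        int rank, buf[MAX_MSG_LEN];
        MPI_Request req;
        MPI_Status  status;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        if (rank == 0) {
            int msg[20] = {0};
            /* Nonblocking send of 20 ints to rank 1. */
            MPI_Isend(msg, 20, MPI_INT, 1, 0, MPI_COMM_WORLD, &req);
            /* Every nonblocking call must eventually be completed; MPI_Wait
               (or a successful MPI_Test) releases the request's resources. */
            MPI_Wait(&req, MPI_STATUS_IGNORE);
        } else if (rank == 1) {
            int flag = 0, count;
            /* Post one nonblocking receive, then poll it with MPI_Test. */
            MPI_Irecv(buf, MAX_MSG_LEN, MPI_INT, 0, MPI_ANY_TAG,
                      MPI_COMM_WORLD, &req);
            while (!flag) {
                /* A successful MPI_Test completes the request, so no extra
                   MPI_Wait is needed for this message. */
                MPI_Test(&req, &flag, &status);
                /* ... do other useful work between polls ... */
            }
            MPI_Get_count(&status, MPI_INT, &count);
            printf("rank 1 received %d ints\n", count);
        }

        MPI_Finalize();
        return 0;
    }

Run with at least two processes (e.g., mpirun -np 2 ./a.out).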
On Thu, Sep 18, 2014 at 9:48 PM, Nick Papior Andersen <nickpap...@gmail.com> wrote:

> I just checked: if the tests return "Received" for all messages, it will
> not go into a memory burst. Hence doing MPI_Test will be enough. :)
>
> Hence, if at any time the MPI layer is notified about the success of a
> send/recv, it will clean up. This makes sense :)
>
> See the updated code.
>
> 2014-09-18 13:39 GMT+02:00 Tobias Kloeffel <tobias.kloef...@fau.de>:
>
>> OK, I have to wait until tomorrow; they have some problems with the
>> network...
>>
>> On 09/18/2014 01:27 PM, Nick Papior Andersen wrote:
>>
>> I am not sure whether test will cover this... You should check it...
>>
>> I attach my example script, which shows two working cases and one that
>> does not work (you can watch the memory usage simultaneously and see that
>> the first two work, while the last one goes ballistic in memory).
>>
>> Just check it with test to see if it works...
>>
>> 2014-09-18 13:20 GMT+02:00 XingFENG <xingf...@cse.unsw.edu.au>:
>>
>>> Thanks very much for your reply!
>>>
>>> To Jeff Squyres: I think it fails due to truncation errors. I am now
>>> logging information about each send and receive to find out the reason.
>>>
>>> To Nick Papior Andersen: Oh, wait (MPI_Wait) is never called in my code.
>>> What I do is call MPI_Irecv once. Then MPI_Test is called several times
>>> to check whether new messages are available. If new messages are
>>> available, functions to process them are called. I will add the wait
>>> calls and check the results.
>>>
>>> On Thu, Sep 18, 2014 at 8:47 PM, Nick Papior Andersen <nickpap...@gmail.com> wrote:
>>>
>>>> To complement Jeff's answer, I would add that using asynchronous
>>>> messages REQUIRES that you wait (MPI_Wait) for all messages at some
>>>> point. Even though this might not seem obvious, it is because of memory
>>>> allocations "behind the scenes" which are only de-allocated upon
>>>> completion through a wait statement.
>>>>
>>>> 2014-09-18 12:36 GMT+02:00 Jeff Squyres (jsquyres) <jsquy...@cisco.com>:
>>>>
>>>>> On Sep 18, 2014, at 2:43 AM, XingFENG <xingf...@cse.unsw.edu.au> wrote:
>>>>>
>>>>> > a. How can I get more information about errors? I got errors like the
>>>>> > ones below. This says that the program exited abnormally in
>>>>> > MPI_Test(), but is there a way to learn more about the error?
>>>>> >
>>>>> > *** An error occurred in MPI_Test
>>>>> > *** on communicator MPI_COMM_WORLD
>>>>> > *** MPI_ERR_TRUNCATE: message truncated
>>>>> > *** MPI_ERRORS_ARE_FATAL: your MPI job will now abort
>>>>>
>>>>> For the purpose of this discussion, let's make the simplifying
>>>>> assumption that you are sending and receiving the same datatype (e.g.,
>>>>> you're sending MPI_INT and you're receiving MPI_INT).
>>>>>
>>>>> This error means that you tried to receive a message with too small a
>>>>> buffer.
>>>>>
>>>>> Specifically, MPI says that if you send a message that is X elements
>>>>> long (e.g., 20 MPI_INTs), then the matching receive must be Y elements,
>>>>> where Y >= X (e.g., *at least* 20 MPI_INTs). If the receiver provides a
>>>>> Y where Y < X, this is a truncation error.
>>>>>
>>>>> Unfortunately, Open MPI doesn't report a whole lot more information
>>>>> about these kinds of errors than what you're seeing, sorry.
>>>>>
>>>>> > b. Is there anything to note about asynchronous communication? I use
>>>>> > MPI_Isend, MPI_Irecv, and MPI_Test to implement asynchronous
>>>>> > communication. My program works well on small data sets (10K-node
>>>>> > graphs), but it exits abnormally on large data sets (1M-node graphs).
>>>>>
>>>>> Is it failing due to truncation errors, or something else?
>>>>>
>>>>> --
>>>>> Jeff Squyres
>>>>> jsquy...@cisco.com
>>>>> For corporate legal information go to:
>>>>> http://www.cisco.com/web/about/doing_business/legal/cri/
>>>>
>>>> --
>>>> Kind regards Nick
>>>
>>> --
>>> Best Regards.
>>
>> --
>> Kind regards Nick
>
> --
> Kind regards Nick

--
Best Regards.
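P.S. To make the truncation rule above concrete: the sketch below posts a receive whose count (10) is smaller than the matching send's count (20), so Open MPI aborts with MPI_ERR_TRUNCATE. The counts are arbitrary illustration values, not taken from my program; the fix is to make every receive count at least as large as the largest matching send (or to size the buffer first with MPI_Probe and MPI_Get_count).

    #include <mpi.h>

    int main(int argc, char **argv)
    {
        int rank;
        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        if (rank == 0) {
            int send_buf[20] = {0};
            /* X = 20 elements are sent ... */
            MPI_Send(send_buf, 20, MPI_INT, 1, 0, MPI_COMM_WORLD);
        } else if (rank == 1) {
            int recv_buf[10];
            /* ... but the receive only allows Y = 10 elements. Since Y < X,
               this matching receive fails with MPI_ERR_TRUNCATE. Changing
               the count (and buffer) to 20 or more makes it succeed. */
            MPI_Recv(recv_buf, 10, MPI_INT, 0, 0, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
        }

        MPI_Finalize();
        return 0;
    }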