Re: [OMPI devel] Using external libevent
On May 1, 2013, at 7:32 PM, Orion Poplawski wrote:

> On 04/29/2013 11:04 AM, Ralph Castain wrote:
>>
>> On Apr 27, 2013, at 7:37 PM, Orion Poplawski wrote:
>>
>>> On 04/26/2013 08:53 PM, Ralph Castain wrote:
>>>>
>>>> On Apr 26, 2013, at 7:40 PM, Orion Poplawski wrote:
>>>>>
>>>>> So it looks like I will need to shortly be looking at how to link
>>>>> against an external libevent. Any help with that would be greatly
>>>>> appreciated.
>>>>
>>>> As I said, I'll take a look at it, but can't commit to having it
>>>> available any time soon. It isn't something I would suggest someone
>>>> try who isn't fully versed in OMPI's code base.
>>>
>>> Yeah, I'm not looking forward to it. I get to at least wait until the
>>> non-threaded version of libevent is available.
>>
>> I hate to see someone suffer, so I went ahead and added the external
>> libevent connection this morning. Not trivial, but it seems to work. It is
>> in our developer's trunk if you want to test it. As Jeff has said, we would
>> prefer you not do this until the 1.9 series is released, and we won't be
>> porting this change to the 1.7 series anyway.
>>
>> Just put it in so we can begin the investigation, and we always appreciate
>> input and help in exploring the impacts!
>> Ralph
>
> Great! I'll try to take a look at it next week.

You might wait a bit - Jeff is working on corner cases for it, so things will likely change. I'm not sure when he expects to finish.

>
> I noticed another message about using a threaded libevent after all on the
> devel list. What is the status of that? Do we still need to produce a
> non-threaded libevent in Fedora?

I would hold off. I've been running some tests, and it looks to me like it punishes TCP messaging, but not too much (around 1%). Can't vouch that there won't be other problems, but it may prove to be okay. Let's see what happens once Jeff completes his work.

>
> Thanks again.
>
> --
> Orion Poplawski
> Technical Manager                     303-415-9701 x222
> NWRA/CoRA Division                    FAX: 303-415-9702
> 3380 Mitchell Lane                    or...@cora.nwra.com
> Boulder, CO 80301                     http://www.cora.nwra.com
> ___
> devel mailing list
> de...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/devel
Re: [OMPI devel] Using external libevent
On 04/29/2013 11:04 AM, Ralph Castain wrote:
>
> On Apr 27, 2013, at 7:37 PM, Orion Poplawski wrote:
>
>> On 04/26/2013 08:53 PM, Ralph Castain wrote:
>>>
>>> On Apr 26, 2013, at 7:40 PM, Orion Poplawski wrote:
>>>>
>>>> So it looks like I will need to shortly be looking at how to link
>>>> against an external libevent. Any help with that would be greatly
>>>> appreciated.
>>>
>>> As I said, I'll take a look at it, but can't commit to having it
>>> available any time soon. It isn't something I would suggest someone
>>> try who isn't fully versed in OMPI's code base.
>>
>> Yeah, I'm not looking forward to it. I get to at least wait until the
>> non-threaded version of libevent is available.
>
> I hate to see someone suffer, so I went ahead and added the external
> libevent connection this morning. Not trivial, but it seems to work. It is
> in our developer's trunk if you want to test it. As Jeff has said, we would
> prefer you not do this until the 1.9 series is released, and we won't be
> porting this change to the 1.7 series anyway.
>
> Just put it in so we can begin the investigation, and we always appreciate
> input and help in exploring the impacts!
> Ralph

Great! I'll try to take a look at it next week.

I noticed another message about using a threaded libevent after all on the devel list. What is the status of that? Do we still need to produce a non-threaded libevent in Fedora?

Thanks again.

--
Orion Poplawski
Technical Manager                     303-415-9701 x222
NWRA/CoRA Division                    FAX: 303-415-9702
3380 Mitchell Lane                    or...@cora.nwra.com
Boulder, CO 80301                     http://www.cora.nwra.com
[hwloc-devel] hwloc-1.7 woes
One more issue with hwloc-1.7 on the Mac.

http://git.mpich.org/mpich.git/commitdiff/d9a67f40

This showed up when we did a strict build of mpich. I believe this can be reproduced with "-Wall -Werror -O2", but I can find the exact set of minimum required flags, if needed.

 -- Pavan

--
Pavan Balaji
http://www.mcs.anl.gov/~balaji
Re: [OMPI devel] [OMPI svn] svn:open-mpi r28435 - in trunk: . conf db db/revprops db/revprops/0 db/revs db/revs/0 db/transactions db/txn-protorevs hooks locks
Never mind. Figured it out.

-Nathan

On Wed, May 01, 2013 at 10:06:08AM -0600, Nathan Hjelm wrote:
> *&&*$# . Can someone undo this?
>
> -Nathan
>
> On Wed, May 01, 2013 at 12:01:48PM -0400, svn-commit-mai...@open-mpi.org wrote:
> > Author: hjelmn (Nathan Hjelm)
> > Date: 2013-05-01 12:01:48 EDT (Wed, 01 May 2013)
> > New Revision: 28435
> > URL: https://svn.open-mpi.org/trac/ompi/changeset/28435
> >
> > Log:
> > import
> >
> > Added:
> >    trunk/README.txt
> >    trunk/conf/
> >    trunk/conf/authz
> >    trunk/conf/passwd
> >    trunk/conf/svnserve.conf
> >    trunk/db/
> >    trunk/db/current
> >    trunk/db/format
> >    trunk/db/fs-type
> >    trunk/db/fsfs.conf
> >    trunk/db/min-unpacked-rev
> >    trunk/db/revprops/
> >    trunk/db/revprops/0/
> >    trunk/db/revprops/0/0
> >    trunk/db/revs/
> >    trunk/db/revs/0/
> >    trunk/db/revs/0/0
> >    trunk/db/transactions/
> >    trunk/db/txn-current
> >    trunk/db/txn-current-lock
> >    trunk/db/txn-protorevs/
> >    trunk/db/uuid
> >    trunk/db/write-lock
> >    trunk/format
> >    trunk/hooks/
> >    trunk/hooks/post-commit.tmpl
> >    trunk/hooks/post-lock.tmpl
> >    trunk/hooks/post-revprop-change.tmpl
> >    trunk/hooks/post-unlock.tmpl
> >    trunk/hooks/pre-commit.tmpl
> >    trunk/hooks/pre-lock.tmpl
> >    trunk/hooks/pre-revprop-change.tmpl
> >    trunk/hooks/pre-unlock.tmpl
> >    trunk/hooks/start-commit.tmpl
> >    trunk/locks/
> >    trunk/locks/db-logs.lock
> >    trunk/locks/db.lock
> >
> > Diff not shown due to size (32936 bytes).
> > To see the diff, run the following command:
> >
> >     svn diff -r 28434:28435 --no-diff-deleted
> >
> > ___
> > svn mailing list
> > s...@open-mpi.org
> > http://www.open-mpi.org/mailman/listinfo.cgi/svn
Re: [OMPI devel] [OMPI svn] svn:open-mpi r28435 - in trunk: . conf db db/revprops db/revprops/0 db/revs db/revs/0 db/transactions db/txn-protorevs hooks locks
*&&*$# . Can someone undo this?

-Nathan

On Wed, May 01, 2013 at 12:01:48PM -0400, svn-commit-mai...@open-mpi.org wrote:
> Author: hjelmn (Nathan Hjelm)
> Date: 2013-05-01 12:01:48 EDT (Wed, 01 May 2013)
> New Revision: 28435
> URL: https://svn.open-mpi.org/trac/ompi/changeset/28435
>
> Log:
> import
>
> Added:
>    trunk/README.txt
>    trunk/conf/
>    trunk/conf/authz
>    trunk/conf/passwd
>    trunk/conf/svnserve.conf
>    trunk/db/
>    trunk/db/current
>    trunk/db/format
>    trunk/db/fs-type
>    trunk/db/fsfs.conf
>    trunk/db/min-unpacked-rev
>    trunk/db/revprops/
>    trunk/db/revprops/0/
>    trunk/db/revprops/0/0
>    trunk/db/revs/
>    trunk/db/revs/0/
>    trunk/db/revs/0/0
>    trunk/db/transactions/
>    trunk/db/txn-current
>    trunk/db/txn-current-lock
>    trunk/db/txn-protorevs/
>    trunk/db/uuid
>    trunk/db/write-lock
>    trunk/format
>    trunk/hooks/
>    trunk/hooks/post-commit.tmpl
>    trunk/hooks/post-lock.tmpl
>    trunk/hooks/post-revprop-change.tmpl
>    trunk/hooks/post-unlock.tmpl
>    trunk/hooks/pre-commit.tmpl
>    trunk/hooks/pre-lock.tmpl
>    trunk/hooks/pre-revprop-change.tmpl
>    trunk/hooks/pre-unlock.tmpl
>    trunk/hooks/start-commit.tmpl
>    trunk/locks/
>    trunk/locks/db-logs.lock
>    trunk/locks/db.lock
>
> Diff not shown due to size (32936 bytes).
> To see the diff, run the following command:
>
>     svn diff -r 28434:28435 --no-diff-deleted
Re: [OMPI devel] MPI_Mrecv(..., MPI_STATUS_IGNORE) in Open MPI 1.7.1
Right you are -- many thanks for finding the issue.

I just committed a fix to the trunk in SVN r28430; I'll CMR it over to v1.7.

On May 1, 2013, at 4:56 AM, Lisandro Dalcin wrote:

> It seems that Mrecv() tries to write on the status arg, even when it
> is STATUS_IGNORE. Looking at the sources (pmrecv.c and pmprobe.c),
> there are some memcheck code paths that access status but do not check
> for STATUS_IGNORE, please review them.
>
> $ cat tmp.c
> #include <mpi.h>
>
> int main(int argc, char *argv[])
> {
>   MPI_Message message;
>   MPI_Init(&argc, &argv);
>   message = MPI_MESSAGE_NO_PROC;
>   MPI_Mrecv(NULL, 0, MPI_BYTE, &message, MPI_STATUS_IGNORE);
>   MPI_Finalize();
>   return 0;
> }
>
> $ mpicc tmp.c
> $ valgrind ./a.out
> ...
> ==17489==
> ==17489== Invalid write of size 8
> ==17489==    at 0x4CA811C: PMPI_Mrecv (pmrecv.c:62)
> ==17489==    by 0x400816: main (in /tmp/a.out)
> ==17489==  Address 0x0 is not stack'd, malloc'd or (recently) free'd
> ==17489==
> [localhost:17489] *** Process received signal ***
> [localhost:17489] Signal: Segmentation fault (11)
> [localhost:17489] Signal code: Address not mapped (1)
> [localhost:17489] Failing at address: (nil)
> ...
>
> --
> Lisandro Dalcin
> ---
> CIMEC (INTEC/CONICET-UNL)
> Predio CONICET-Santa Fe
> Colectora RN 168 Km 472, Paraje El Pozo
> 3000 Santa Fe, Argentina
> Tel: +54-342-4511594 (ext 1011)
> Tel/Fax: +54-342-4511169

--
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/
Re: [OMPI devel] [OMPI svn-full] svn:open-mpi r27880 - trunk/ompi/request
George,

As I wrote in the ticket a few minutes ago, your patch looks good and it passed my test. My previous patch didn't take generalized requests into account, so your patch is better.

Thanks,
Takahiro Kawashima,
from my home

> Takahiro,
>
> I went over this ticket and attached a new patch. Basically I went over all
> the possible cases, both in test and wait, and ensured the behavior is always
> consistent. Please give it a try, and let us know of the outcome.
>
>   Thanks,
>     George.
>
> On Jan 25, 2013, at 00:53 , "Kawashima, Takahiro" wrote:
>
>> Jeff,
>>
>> I've filed the ticket.
>> https://svn.open-mpi.org/trac/ompi/ticket/3475
>>
>> Thanks,
>> Takahiro Kawashima,
>> MPI development team,
>> Fujitsu
>>
>>> Many thanks for the summary!
>>>
>>> Can you file tickets about this stuff against 1.7? Include your patches,
>>> etc.
>>>
>>> These are pretty obscure issues and I'm ok not fixing them in the 1.6
>>> branch (unless someone has a burning desire to get them fixed in 1.6).
>>>
>>> But we should properly track and fix these in the 1.7 series. I'd mark
>>> them as "critical" so that they don't get lost in the wilderness of other
>>> bugs.
>>>
>>> Sent from my phone. No type good.
>>>
>>> On Jan 22, 2013, at 8:57 PM, "Kawashima, Takahiro" wrote:
>>>
>>>> George,
>>>>
>>>> I reported the bug three months ago.
>>>> Your commit r27880 resolved one of the bugs I reported,
>>>> using a different approach.
>>>>
>>>> http://www.open-mpi.org/community/lists/devel/2012/10/11555.php
>>>>
>>>> But other bugs are still open.
>>>>
>>>> "(1) MPI_SOURCE of MPI_Status for a null request must be MPI_ANY_SOURCE."
>>>> in my previous mail is not fixed yet. This can be fixed by my patch
>>>> (ompi/mpi/c/wait.c and ompi/request/request.c part only) attached
>>>> to another mail of mine.
>>>>
>>>> http://www.open-mpi.org/community/lists/devel/2012/10/11561.php
>>>>
>>>> "(2) MPI_Status for an inactive request must be an empty status."
>>>> in my previous mail is partially fixed. MPI_Wait is fixed by your
>>>> r27880. But MPI_Waitall and MPI_Testall should be fixed.
>>>> Code similar to your r27880 should be inserted into
>>>> ompi_request_default_wait_all and ompi_request_default_test_all.
>>>>
>>>> You can confirm the fixes with the test program status.c attached to
>>>> my previous mail. Run with -n 2.
>>>>
>>>> http://www.open-mpi.org/community/lists/devel/2012/10/11555.php
>>>>
>>>> Regards,
>>>> Takahiro Kawashima,
>>>> MPI development team,
>>>> Fujitsu
>>>>
>>>>> To be honest it was hanging in one of my repos for some time. If I'm not
>>>>> mistaken it is somehow related to one active ticket (but I couldn't find
>>>>> the info). It might be good to push it upstream.
>>>>>
>>>>>   George.
>>>>>
>>>>> On Jan 22, 2013, at 16:27 , "Jeff Squyres (jsquyres)" wrote:
>>>>>
>>>>>> George --
>>>>>>
>>>>>> Is there any reason not to CMR this to v1.6 and v1.7?
>>>>>>
>>>>>> On Jan 21, 2013, at 6:35 AM, svn-commit-mai...@open-mpi.org wrote:
>>>>>>
>>>>>>> Author: bosilca (George Bosilca)
>>>>>>> Date: 2013-01-21 06:35:42 EST (Mon, 21 Jan 2013)
>>>>>>> New Revision: 27880
>>>>>>> URL: https://svn.open-mpi.org/trac/ompi/changeset/27880
>>>>>>>
>>>>>>> Log:
>>>>>>> My understanding is that an MPI_WAIT() on an inactive request should
>>>>>>> return the empty status (MPI 3.0 page 52 line 46).
>>>>>>>
>>>>>>> Text files modified:
>>>>>>>    trunk/ompi/request/req_wait.c | 3 +++
>>>>>>>    1 files changed, 3 insertions(+), 0 deletions(-)
>>>>>>>
>>>>>>> Modified: trunk/ompi/request/req_wait.c
>>>>>>> ==============================================================================
>>>>>>> --- trunk/ompi/request/req_wait.c    Sat Jan 19 19:33:42 2013    (r27879)
>>>>>>> +++ trunk/ompi/request/req_wait.c    2013-01-21 06:35:42 EST (Mon, 21 Jan 2013)    (r27880)
>>>>>>> @@ -61,6 +61,9 @@
>>>>>>>      }
>>>>>>>      if( req->req_persistent ) {
>>>>>>>          if( req->req_state == OMPI_REQUEST_INACTIVE ) {
>>>>>>> +            if (MPI_STATUS_IGNORE != status) {
>>>>>>> +                *status = ompi_status_empty;
>>>>>>> +            }
>>>>>>>              return OMPI_SUCCESS;
>>>>>>>          }
>>>>>>>          req->req_state = OMPI_REQUEST_INACTIVE;
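
The thread above asks for the r27880 guard (return the empty status for an inactive request, unless the caller passed the ignore sentinel) to be repeated in the wait-all and test-all paths. A minimal sketch of that idea follows, on mock types only. Every name in it (mock_status_t, mock_request_t, mock_wait_all, the MOCK_* constants and their numeric values) is hypothetical and is not Open MPI's code. It also assumes every request has already completed, so it only illustrates how the per-request statuses might be filled in, not how waiting works.

```c
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not Open MPI's real types. */
typedef struct { int source; int tag; int error; size_t count; } mock_status_t;
typedef enum { MOCK_REQ_INACTIVE, MOCK_REQ_ACTIVE } mock_state_t;
typedef struct {
    int persistent;        /* non-zero for a persistent request */
    mock_state_t state;
    mock_status_t status;  /* status recorded when the request completed */
} mock_request_t;

/* Assumed values; a real MPI library defines its own constants. */
#define MOCK_ANY_SOURCE (-1)
#define MOCK_ANY_TAG (-1)
#define MOCK_SUCCESS 0
#define MOCK_STATUSES_IGNORE ((mock_status_t *)NULL)

/* Empty status: any source, any tag, success, zero count. */
static const mock_status_t mock_status_empty = {
    MOCK_ANY_SOURCE, MOCK_ANY_TAG, MOCK_SUCCESS, 0
};

/* Fill in statuses for `count` requests that are assumed to be complete. */
static int mock_wait_all(size_t count, mock_request_t *reqs,
                         mock_status_t *statuses)
{
    for (size_t i = 0; i < count; i++) {
        mock_request_t *req = &reqs[i];

        if (req->persistent && req->state == MOCK_REQ_INACTIVE) {
            /* Same guard as in the r27880 diff, applied per element. */
            if (statuses != MOCK_STATUSES_IGNORE) {
                statuses[i] = mock_status_empty;
            }
            continue;
        }
        if (statuses != MOCK_STATUSES_IGNORE) {
            statuses[i] = req->status;
        }
        if (req->persistent) {
            /* A completed persistent request goes back to inactive. */
            req->state = MOCK_REQ_INACTIVE;
        }
    }
    return MOCK_SUCCESS;
}
```

With one inactive and one active persistent request, the first status slot would receive the empty status and the second the recorded one; passing MOCK_STATUSES_IGNORE writes nothing at all.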
[OMPI devel] MPI_Mrecv(..., MPI_STATUS_IGNORE) in Open MPI 1.7.1
It seems that Mrecv() tries to write on the status arg, even when it is STATUS_IGNORE. Looking at the sources (pmrecv.c and pmprobe.c), there are some memcheck code paths that access status but do not check for STATUS_IGNORE, please review them.

$ cat tmp.c
#include <mpi.h>

int main(int argc, char *argv[])
{
  MPI_Message message;
  MPI_Init(&argc, &argv);
  message = MPI_MESSAGE_NO_PROC;
  MPI_Mrecv(NULL, 0, MPI_BYTE, &message, MPI_STATUS_IGNORE);
  MPI_Finalize();
  return 0;
}

$ mpicc tmp.c
$ valgrind ./a.out
...
==17489==
==17489== Invalid write of size 8
==17489==    at 0x4CA811C: PMPI_Mrecv (pmrecv.c:62)
==17489==    by 0x400816: main (in /tmp/a.out)
==17489==  Address 0x0 is not stack'd, malloc'd or (recently) free'd
==17489==
[localhost:17489] *** Process received signal ***
[localhost:17489] Signal: Segmentation fault (11)
[localhost:17489] Signal code: Address not mapped (1)
[localhost:17489] Failing at address: (nil)
...

--
Lisandro Dalcin
---
CIMEC (INTEC/CONICET-UNL)
Predio CONICET-Santa Fe
Colectora RN 168 Km 472, Paraje El Pozo
3000 Santa Fe, Argentina
Tel: +54-342-4511594 (ext 1011)
Tel/Fax: +54-342-4511169
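
The report above comes down to one missing check: a code path writes through the status pointer without first comparing it against the ignore sentinel. The valgrind trace shows the write landing at address 0x0, which is consistent with the sentinel being a null pointer in that build. A minimal sketch of the guard follows, using mock types rather than MPI itself. The names sketch_status_t, sketch_fill_no_proc_status, and the SKETCH_* constants (including their numeric values) are hypothetical. The values written follow the MPI-3 rule that receiving on MPI_MESSAGE_NO_PROC yields source MPI_PROC_NULL, tag MPI_ANY_TAG, and count 0.

```c
#include <stddef.h>

/* Hypothetical stand-in for MPI_Status; not the real layout. */
typedef struct { int source; int tag; size_t count; } sketch_status_t;

/* Assumed values; a real MPI library defines its own constants. */
#define SKETCH_STATUS_IGNORE ((sketch_status_t *)NULL)
#define SKETCH_PROC_NULL (-2)
#define SKETCH_ANY_TAG (-1)

/*
 * Fill in the status for a receive on a "no proc" message.
 * Returns 1 if the status was written, 0 if the caller asked to ignore it.
 */
static int sketch_fill_no_proc_status(sketch_status_t *status)
{
    if (status == SKETCH_STATUS_IGNORE) {
        /* Never dereference the ignore sentinel. */
        return 0;
    }
    status->source = SKETCH_PROC_NULL;
    status->tag = SKETCH_ANY_TAG;
    status->count = 0;
    return 1;
}
```

Any instrumentation that touches the status fields, such as a memory-checker hook, would need to sit behind the same comparison.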