On 6/19/08 3:31 PM, "Jeff Squyres" <[email protected]> wrote:

> Yo Ralph --
>
> Is the "bad" grpcomm component both new and the default? Further, is
> the old "basic" grpcomm component now the non-default / testing
> component?

Yes to both

> If so, I wonder if what happened was that Pasha did an "svn up", but
> without re-running autogen/configure, he wouldn't have seen the new
> "bad" component and therefore was falling back on the old "basic"
> component that is now the non-default / testing component...?

Could be - though I thought that if you do a "make" in that situation, it
would force a re-autogen/configure when it saw a new component? Of course,
if he didn't do a "make" at the top level, and he is in a dynamic build,
then maybe OMPI wouldn't figure out that something was different... Don't
know - but we have had problems with svn missing things in the past too, so
it could be a number of things. <shrug>
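
FWIW, a quick sanity check after an "svn up" that brings in a new component
would be to re-run the autogen/configure/make cycle and then look at what
ompi_info reports - something along these lines (an illustrative sketch for
an in-tree trunk build; the install prefix and configure options here are
placeholders, substitute whatever was actually used):

  ./autogen.sh
  ./configure --prefix=$HOME/ompi-trunk-install   # same options as the original build
  make all install

  # confirm that the grpcomm components, including the new "bad" one, were built
  ompi_info | grep grpcomm

If "bad" doesn't show up in that list, the runtime would have to fall back on
whatever grpcomm component it can find.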
> On Jun 19, 2008, at 4:21 PM, Pavel Shamis (Pasha) wrote:
>
>> I did a fresh checkout and everything works well.
>> So it looks like some svn up screwed up my svn checkout.
>> Ralph, thanks for the help!
>>
>> Ralph H Castain wrote:
>>> Hmmm...something isn't right, Pasha. There is simply no way you should be
>>> encountering this error. You are picking up the wrong grpcomm module.
>>>
>>> I went ahead and fixed the grpcomm/basic module, but as I note in the
>>> commit message, that is now an experimental area. The grpcomm/bad module
>>> is the default for that reason.
>>>
>>> Check to ensure you have the orte/mca/grpcomm/bad directory, and that it
>>> is getting built. My guess is that you have a corrupted checkout or build
>>> and that the component is either missing or not getting built.
>>>
>>> On 6/19/08 1:37 PM, "Pavel Shamis (Pasha)" <[email protected]> wrote:
>>>
>>>> Ralph H Castain wrote:
>>>>
>>>>> I can't find anything wrong so far. I'm waiting in a queue on Odin to
>>>>> try there since Jeff indicated you are using rsh as a launcher, and
>>>>> that's the only access I have to such an environment. Guess Odin is
>>>>> being pounded because the queue isn't going anywhere.
>>>>
>>>> I use ssh; here is the command line:
>>>> ./bin/mpirun -np 2 -H sw214,sw214 -mca btl openib,sm,self
>>>> ./osu_benchmarks-3.0/osu_latency
>>>>
>>>>> Meantime, I'm building on RoadRunner and will test there (TM enviro).
>>>>>
>>>>> On 6/19/08 1:18 PM, "Pavel Shamis (Pasha)" <[email protected]> wrote:
>>>>>
>>>>>>> You'll have to tell us something more than that, Pasha. What kind of
>>>>>>> environment, what rev level were you at, etc.
>>>>>>
>>>>>> Ahh, sorry :) I run on Linux x86_64 SLES 10 SP1, Open MPI 1.3a1r18682M,
>>>>>> OFED 1.3.1.
>>>>>> Pasha.
>>>>>>
>>>>>>> So far as I know, the trunk is fine.
>>>>>>>
>>>>>>> On 6/19/08 12:01 PM, "Pavel Shamis (Pasha)" <[email protected]> wrote:
>>>>>>>
>>>>>>>> I tried to run the trunk on my machines and I got the following error:
>>>>>>>>
>>>>>>>> [sw214:04367] [[16563,1],1] ORTE_ERROR_LOG: Data unpack would read
>>>>>>>> past end of buffer in file base/grpcomm_base_modex.c at line 451
>>>>>>>> [sw214:04367] [[16563,1],1] ORTE_ERROR_LOG: Data unpack would read
>>>>>>>> past end of buffer in file grpcomm_basic_module.c at line 560
>>>>>>>> [sw214:04365]
>>>>>>>> --------------------------------------------------------------------------
>>>>>>>> It looks like MPI_INIT failed for some reason; your parallel process
>>>>>>>> is likely to abort. There are many reasons that a parallel process
>>>>>>>> can fail during MPI_INIT; some of which are due to configuration or
>>>>>>>> environment problems. This failure appears to be an internal failure;
>>>>>>>> here's some additional information (which may only be relevant to an
>>>>>>>> Open MPI developer):
>>>>>>>>
>>>>>>>>   orte_grpcomm_modex failed
>>>>>>>>   --> Returned "Data unpack would read past end of buffer" (-26)
>>>>>>>>       instead of "Success" (0)
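
To pin down which grpcomm module a failing run is actually selecting, one
option is to force the component explicitly on the same command line Pasha
posted - a sketch using the standard MCA selection syntax (the
grpcomm_base_verbose parameter follows the usual <framework>_base_verbose
convention and may or may not be wired up in this particular build):

  ./bin/mpirun -np 2 -H sw214,sw214 \
      -mca btl openib,sm,self \
      -mca grpcomm bad \
      -mca grpcomm_base_verbose 5 \
      ./osu_benchmarks-3.0/osu_latency

If forcing "bad" runs cleanly while forcing "basic" reproduces the modex
error, that would confirm the experimental module was the one being picked up.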
