I agree with Ralph: this code should work fine (we do the same thing internally in orte_ras_base_node_query()). You may try adding a 'dump' of the GPR to make sure that the node segment has information on it. Add a call like the following to your function:
orte_gpr.dump_segment(NULL);

or, better yet:

orte_gpr.dump_segment(ORTE_NODE_SEGMENT);

That should print out the node segment that the call would be reading from. The problem may lie elsewhere, and this will help us pinpoint it.

Cheers,
Josh

> I'm running this on my Mac, where I expected to get back only the
> localhost. I upgraded to 1.0.2 a little while back; I had been using one
> of the alphas (I think it was alpha 9, but I can't be sure) up until that
> point, when this function returned '1' on my Mac.
>
> -- Nathan
> Correspondence
> ---------------------------------------------------------------------
> Nathan DeBardeleben, Ph.D.
> Los Alamos National Laboratory
> Parallel Tools Team
> High Performance Computing Environments
> phone: 505-667-3428
> email: ndeb...@lanl.gov
> ---------------------------------------------------------------------
>
> Ralph H Castain wrote:
>> An rc of 0 indicates that the "get" function was successful, so this
>> means that there were no nodes on the NODE_SEGMENT. Were you running
>> this in an environment where nodes had been allocated to you? Or were
>> you expecting to find only "localhost" on the segment?
>>
>> I'm not entirely sure, but I don't believe there have been significant
>> changes in 1.0.2 for some time. My guess is that something has changed
>> on your system, as opposed to in the Open MPI code you're using. Did
>> you do an update recently and then begin seeing this behavior? Your
>> revision level is 1000+ behind the current repository, so my guess is
>> that you haven't updated for a while; since 1.0.2 is under maintenance
>> for bugs only, that shouldn't be a problem. I'm just trying to
>> understand why your function is doing something different if the
>> Open MPI code you're using hasn't changed.
>>
>> Ralph
>>
>> On 7/5/06 2:40 PM, "Nathan DeBardeleben" <ndeb...@lanl.gov> wrote:
>>
>>>> Open MPI: 1.0.2
>>>> Open MPI SVN revision: r9571
>>>
>>> The rc value returned by the 'get' call is '0'.
>>> All I'm doing is calling init with my own daemon name; it comes up
>>> fine, and then I immediately call this to figure out how many nodes
>>> are associated with this machine.
>>>
>>> -- Nathan
>>>
>>> Ralph H Castain wrote:
>>>
>>>> Hi Nathan
>>>>
>>>> Could you tell us which version of the code you are using, and print
>>>> out the rc value that was returned by the "get" call? I see nothing
>>>> obviously wrong with the code, but much depends on what happened
>>>> prior to this call too.
>>>>
>>>> BTW: you might want to release the memory stored in the returned
>>>> values; it could represent a substantial memory leak.
>>>>
>>>> Ralph
>>>>
>>>> On 7/5/06 9:28 AM, "Nathan DeBardeleben" <ndeb...@lanl.gov> wrote:
>>>>
>>>>> I used to use this code to get the number of nodes in a cluster /
>>>>> machine / whatever:
>>>>>
>>>>>> int
>>>>>> get_num_nodes(void)
>>>>>> {
>>>>>>     int rc;
>>>>>>     size_t cnt;
>>>>>>     orte_gpr_value_t **values;
>>>>>>
>>>>>>     rc = orte_gpr.get(ORTE_GPR_KEYS_OR|ORTE_GPR_TOKENS_OR,
>>>>>>                       ORTE_NODE_SEGMENT, NULL, NULL, &cnt,
>>>>>>                       &values);
>>>>>>
>>>>>>     if (rc != ORTE_SUCCESS) {
>>>>>>         return 0;
>>>>>>     }
>>>>>>
>>>>>>     return cnt;
>>>>>> }
>>>>>
>>>>> This now returns '0' on my Mac, when it used to return 1. Is this
>>>>> not an acceptable way of doing this? Is there a cleaner / better
>>>>> way these days?
_______________________________________________
devel mailing list
de...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/devel