Interesting - still, I see no reason for OMPI to fail just because of that. We can run just fine with the uid, so I'll make things a little more flexible.
Thanks for tracking it down!

On Jan 22, 2014, at 7:54 PM, Paul Hargrove <phhargr...@lbl.gov> wrote:

> Not lacking getpwuid():
>
> [phh1@biou2 BLD]$ grep HAVE_GETPWUID */include/*_config.h
> opal/include/opal_config.h:#define HAVE_GETPWUID 1
>
> I also can't see why the quoted code could fail.
> The following is working fine:
>
> [phh1@biou2 BLD]$ cat q.c
> #include <stdio.h>
> #include <unistd.h>
> #include <sys/types.h>
> #include <pwd.h>
> int main(void) {
>   uid_t uid = getuid();
>   printf("uid = %d\n", (int)uid);
>   struct passwd *p = getpwuid(uid);
>   if (p) printf("name = %s\n", p->pw_name);
>   return 0;
> }
>
> [phh1@biou2 BLD]$ gcc -std=c99 q.c && ./a.out
> uid = 44154
> name = phh1
>
> HOWEVER, building for ILP32 target (as in the reported failure) fails:
>
> [phh1@biou2 BLD]$ gcc -m32 -std=c99 q.c && ./a.out
> uid = 44154
>
> So, I am going to guess that this *is* a system misconfiguration (maybe
> missing the 32-bit foo.so for the appropriate nsswitch resolver?) just as the
> error message said.
>
> Sorry for the false alarm,
> -Paul
>
>
> On Wed, Jan 22, 2014 at 7:36 PM, Ralph Castain <r...@open-mpi.org> wrote:
> Here is the offending code:
>
>     /* get the name of the user */
>     uid = getuid();
> #ifdef HAVE_GETPWUID
>     pwdent = getpwuid(uid);
> #else
>     pwdent = NULL;
> #endif
>     if (NULL != pwdent) {
>         user = strdup(pwdent->pw_name);
>     } else {
>         orte_show_help("help-orte-runtime.txt",
>                        "orte:session:dir:nopwname", true);
>         return ORTE_ERR_OUT_OF_RESOURCE;
>     }
>
> Is it possible on this platform that you don't have getpwuid? I'm surprised
> at the code as we could just use the uid instead - not sure why this more
> stringent test was applied
>
>
>
> On Jan 22, 2014, at 7:02 PM, Paul Hargrove <phhargr...@lbl.gov> wrote:
>
>> On yet another test platform I see the following:
>>
>> $ mpirun -mca btl sm,self -np 1 examples/ring_c
>> --------------------------------------------------------------------------
>> Open MPI was unable to obtain the username in order to create a path
>> for its required temporary directories. This type of error is usually
>> caused by a transient failure of network-based authentication services
>> (e.g., LDAP or NIS failure due to network congestion), but can also be
>> an indication of system misconfiguration.
>>
>> Please consult your system administrator about these issues and try
>> again.
>> --------------------------------------------------------------------------
>> [biou2.rice.edu:30021] [[40214,0],0] ORTE_ERROR_LOG: Out of resource in file
>> /home/phh1/SCRATCH/OMPI/openmpi-1.7-latest-linux-ppc32-xlc-11.1/openmpi-1.7.4rc2r30361/orte/util/session_dir.c
>> at line 380
>> [biou2.rice.edu:30021] [[40214,0],0] ORTE_ERROR_LOG: Out of resource in file
>> /home/phh1/SCRATCH/OMPI/openmpi-1.7-latest-linux-ppc32-xlc-11.1/openmpi-1.7.4rc2r30361/orte/mca/ess/hnp/ess_hnp_module.c
>> at line 599
>> --------------------------------------------------------------------------
>> It looks like orte_init failed for some reason; your parallel process is
>> likely to abort. There are many reasons that a parallel process can
>> fail during orte_init; some of which are due to configuration or
>> environment problems. This failure appears to be an internal failure;
>> here's some additional information (which may only be relevant to an
>> Open MPI developer):
>>
>>   orte_session_dir failed
>>   --> Returned value Out of resource (-2) instead of ORTE_SUCCESS
>> --------------------------------------------------------------------------
>>
>>
>> An "-np 2" run fails in the same manner.
>> This is a production system and there is no problem with "whoami" or "id",
>> leaving me doubting the explanation provided by the error message.
>>
>> [phh1@biou2 ~]$ whoami
>> phh1
>> [phh1@biou2 ~]$ id
>> uid=44154(phh1) gid=2016(hpc)
>> groups=2016(hpc),3803(hpcusers),3805(sshgw),3808(biou)
>>
>> The "ompi_info --all" output is attached.
>> Please let me know what additional info is needed.
>>
>> -Paul
>>
>> --
>> Paul H. Hargrove                          phhargr...@lbl.gov
>> Future Technologies Group
>> Computer and Data Sciences Department     Tel: +1-510-495-2352
>> Lawrence Berkeley National Laboratory     Fax: +1-510-486-6900
>> <biou2_info.txt.bz2>
>> _______________________________________________
>> devel mailing list
>> de...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/devel