OK, so the various fixes needed to get CVS to the point of successfully loading the clients on RH9 have been committed. I haven't yet looked at why the cluster test is failing, but you can now see the failure, now that the --wait processing is in an END block.
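
For anyone unfamiliar with the idea behind the END block mentioned above: a Perl END block runs as the program exits, including after a die, so wait-for-keypress handling placed there still runs when the test fails and the window stays open long enough to read the error. The following is only a rough Python analog of that pattern using atexit, not the code that was committed; the names hold_window and register_wait are made up for illustration.

```python
import atexit
import sys


def hold_window(prompt="Press Enter to close this window... ",
                stream_in=sys.stdin, stream_out=sys.stdout):
    """Block until a line is read, so the terminal window stays open."""
    stream_out.write(prompt)
    stream_out.flush()
    stream_in.readline()


def register_wait(argv, register=atexit.register):
    """Register hold_window as an exit handler if --wait is in argv.

    Handlers registered with atexit run at normal interpreter shutdown,
    which includes sys.exit() and exit after an uncaught exception,
    so the pause happens on failure as well as on success.
    Returns True if the handler was registered.
    """
    if "--wait" in argv:
        register(hold_window)
        return True
    return False
```

A script would call register_wait(sys.argv[1:]) once near the top, before doing any work that might fail. Note that, unlike a clean exit, os._exit() or a fatal signal skips atexit handlers.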
NOTE: Jeff, I made extra work for you by temporarily hacking packages/lam/config.xml to avoid the missing GM dependency :-/

--
David N. Lombard

My comments represent my opinions, not those of Intel Corporation.

>-----Original Message-----
>From: Jeff Squyres [mailto:[EMAIL PROTECTED]
>Sent: Sunday, July 11, 2004 6:18 AM
>To: Lombard, David N
>Cc: OSCAR-devel
>Subject: Re: [Oscar-devel] Current version, the saga continues...
>
>Dave --
>
>This all sounds most excellent.
>
>Any chance you can commit all your recent fixes to CVS in the near future
>(even if it's still a work in progress)?
>
>Many thanks!
>
>
>On Thu, 8 Jul 2004, Lombard, David N wrote:
>
>> In the last installment, I had appeared to have successfully completed
>> "build OSCAR client image" and failed at "define OSCAR clients". In
>> order to successfully complete the client definition, I had to make the
>> following mods:
>>
>> 1) The client image was missing /opt/mpich-1.2.5.10-ch_p4-gcc/bin. This
>> was caused by a missing <rpmlist> block in packages/mpich/config.xml.
>> Adding the block corrected the initial failure to include any mpich
>> RPMs.
>>
>> 2) Once the mpich RPMs were included, update-rpms failed to install
>> mpich as it incorrectly reported a missing dependency. A correction to
>> update-rpms then allowed the mpich RPMs to be installed.
>>
>> As of now, I can successfully complete the "Setup networking..." step.
>> However, the clients will not build because the
>> /tftpboot/pxelinux.cfg/default file is incorrect, it's a "localboot"
>> file, not the expected SI kernel/initrd download file. I manually
>> copied /etc/systemimager/pxelinux.cfg/syslinux.cfg to
>> /tftpboot/pxelinux.cfg/default; this allowed the client to boot and
>> build. I will report this problem on the systemimager list, it may be
>> an SI problem or a failure to update oscar (scripts/setup_pxe) for the
>> current SI packages.
>>
>> Having successfully built the client, the "complete cluster setup"
>> reported a successful completion, however, the "test cluster setup"
>> failed. The first time the cluster test was run, it failed and
>> immediately closed the xterm window after only printing a couple of
>> lines; on the second run, the test hung with:
>>
>> SSH server->node [EMAIL PROTECTED]'s password:
>>
>> This failure occurred because oscartst does not exist as a user on
>> oscarnode1, i.e., /etc/passwd was not properly updated.
>>
>> Beyond that,
>> - ssh as oscartst to the headnode worked properly
>> - ssh as root to oscarnode1 worked properly
>> - /home was NFS mounted from the headnode.
>>
>> Back to the salt mines ;-)
>>
>> --
>> David N. Lombard
>>
>> My comments represent my opinions, not those of Intel Corporation.
>>
>>
>> -------------------------------------------------------
>> This SF.Net email sponsored by Black Hat Briefings & Training.
>> Attend Black Hat Briefings & Training, Las Vegas July 24-29 -
>> digital self defense, top technical experts, no vendor pitches,
>> unmatched networking opportunities. Visit www.blackhat.com
>> _______________________________________________
>> Oscar-devel mailing list
>> [EMAIL PROTECTED]
>> https://lists.sourceforge.net/lists/listinfo/oscar-devel
>>
>
>--
>{+} Jeff Squyres
>{+} [EMAIL PROTECTED]
>{+} http://www.lam-mpi.org/
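
Regarding the password prompt in the quoted message above, which was traced to oscartst being absent from /etc/passwd on oscarnode1: a quick way to confirm that kind of problem is to list which expected accounts are missing from a passwd-format file, whether read from the node itself or from the client image's copy. This is a minimal sketch and not part of OSCAR; the function names are hypothetical, and it only looks at the login-name field of each line.

```python
def passwd_users(passwd_text):
    """Return the set of login names found in passwd-format text."""
    users = set()
    for line in passwd_text.splitlines():
        line = line.strip()
        # Skip blank lines and comment lines.
        if not line or line.startswith("#"):
            continue
        # The login name is the first colon-separated field.
        users.add(line.split(":", 1)[0])
    return users


def missing_users(passwd_text, required):
    """Return the names in `required` that have no passwd entry, in order."""
    present = passwd_users(passwd_text)
    return [name for name in required if name not in present]
```

For example, after reading a node's passwd file into a string, missing_users(text, ["root", "oscartst"]) returning ["oscartst"] would match the symptom described above. It says nothing about why the entry is missing, only that it is.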
