OK, the various fixes needed to get CVS to the point of successfully
loading the clients on RH9 have been committed.  I haven't yet looked at
why the cluster test is failing, but with the --wait processing now in
an END block, you can at least see the failure.

NOTE: Jeff, I made extra work for you by temporarily hacking
packages/lam/config.xml to avoid the missing GM dependency :-/

-- 
David N. Lombard
 
My comments represent my opinions, not those of Intel Corporation.

>-----Original Message-----
>From: Jeff Squyres [mailto:[EMAIL PROTECTED]
>Sent: Sunday, July 11, 2004 6:18 AM
>To: Lombard, David N
>Cc: OSCAR-devel
>Subject: Re: [Oscar-devel] Current version, the saga continues...
>
>Dave --
>
>This all sounds most excellent.
>
>Any chance you can commit all your recent fixes to CVS in the near future
>(even if it's still a work in progress)?
>
>Many thanks!
>
>
>On Thu, 8 Jul 2004, Lombard, David N wrote:
>
>> In the last installment, I had appeared to have successfully completed
>> "build OSCAR client image" and failed at "define OSCAR clients". In
>> order to successfully complete the client definition, I had to make the
>> following mods:
>>
>> 1) The client image was missing /opt/mpich-1.2.5.10-ch_p4-gcc/bin. This
>> was caused by a missing <rpmlist> block in packages/mpich/config.xml.
>> Adding the block corrected the initial failure to include any mpich
>> RPMs.
>>
>> 2) Once the mpich RPMs were included, update-rpms failed to install
>> mpich as it incorrectly reported a missing dependency.  A correction to
>> update-rpms then allowed the mpich RPMs to be installed.
>>
>> As of now, I can successfully complete the "Setup networking..." step.
>> However, the clients will not build because the
>> /tftpboot/pxelinux.cfg/default file is incorrect: it's a "localboot"
>> file, not the expected SI kernel/initrd download file.  I manually
>> copied /etc/systemimager/pxelinux.cfg/syslinux.cfg to
>> /tftpboot/pxelinux.cfg/default; this allowed the client to boot and
>> build.  I will report this problem on the systemimager list; it may be
>> an SI problem or a failure to update oscar (scripts/setup_pxe) for the
>> current SI packages.
>>
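The manual copy described above could be scripted roughly as follows. This
is only a sketch: the two paths come from the message, but the function
names and the "localboot" heuristic (a LOCALBOOT directive with no KERNEL
line) are assumptions made for illustration, not part of OSCAR or
SystemImager.

```python
import shutil
from pathlib import Path

# Paths taken from the message above; pass other paths when testing.
SI_CONFIG = Path("/etc/systemimager/pxelinux.cfg/syslinux.cfg")
PXE_DEFAULT = Path("/tftpboot/pxelinux.cfg/default")


def is_localboot_only(config_text: str) -> bool:
    """Heuristic (an assumption): a LOCALBOOT directive and no KERNEL line."""
    directives = [
        line.split()[0].lower()
        for line in config_text.splitlines()
        if line.strip() and not line.lstrip().startswith("#")
    ]
    return "localboot" in directives and "kernel" not in directives


def restore_si_config(si_config: Path = SI_CONFIG,
                      pxe_default: Path = PXE_DEFAULT) -> bool:
    """Copy the SI config over the default file if it is a localboot stub.

    Returns True when the file was replaced, False when it was left alone.
    """
    if not is_localboot_only(pxe_default.read_text()):
        return False
    shutil.copyfile(si_config, pxe_default)
    return True
```

Calling restore_si_config() with no arguments would act on the real files
and needs write access to /tftpboot; it leaves the default file untouched
when it already looks like a kernel/initrd download config.
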
>> After I successfully built the client, the "complete cluster setup" step
>> reported a successful completion; however, the "test cluster setup" step
>> failed.  The first time the cluster test was run, it failed and
>> immediately closed the xterm window after printing only a couple of
>> lines; on the second run, the test hung with:
>>
>>  SSH server->node        [EMAIL PROTECTED]'s password:
>>
>> This failure occurred because oscartst does not exist as a user on
>> oscarnode1, i.e., /etc/passwd was not properly updated.
>>
>> Beyond that,
>> - ssh as oscartst to the headnode worked properly
>> - ssh as root to oscarnode1 worked properly
>> - /home was NFS mounted from the headnode.
>>
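The missing oscartst account described above can be confirmed with a quick
check of the passwd file on the node or in the client image. A minimal
sketch, assuming a standard passwd(5)-format file; the function name is
hypothetical and not part of the OSCAR test scripts.

```python
from pathlib import Path


def passwd_has_user(passwd_path: Path, username: str) -> bool:
    """Return True if a passwd(5)-format file has an entry for username."""
    for line in passwd_path.read_text().splitlines():
        # The first colon-separated field is the login name.
        if line.split(":", 1)[0] == username:
            return True
    return False
```

For example, passwd_has_user(Path("/etc/passwd"), "oscartst") run on
oscarnode1 would be expected to return False in the failing state
described above.
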
>> Back to the salt mines ;-)
>>
>> --
>> David N. Lombard
>>
>> My comments represent my opinions, not those of Intel Corporation.
>>
>>
>> -------------------------------------------------------
>> This SF.Net email sponsored by Black Hat Briefings & Training.
>> Attend Black Hat Briefings & Training, Las Vegas July 24-29 -
>> digital self defense, top technical experts, no vendor pitches,
>> unmatched networking opportunities. Visit www.blackhat.com
>> _______________________________________________
>> Oscar-devel mailing list
>> [EMAIL PROTECTED]
>> https://lists.sourceforge.net/lists/listinfo/oscar-devel
>>
>
>--
>{+} Jeff Squyres
>{+} [EMAIL PROTECTED]
>{+} http://www.lam-mpi.org/


