Okay, I figured it out.  It's the kernel version.  I had downloaded every
update from the distribution, which included the new kernel.  I guess rsync
does not like working with kernel version 2.6.10-1.12.  I switched back to
2.6.5-1.358 and everything started loading right again.

I'm trying to decide on a way to update all my nodes.  What do some of you
do?  I was thinking of just using the same yum repository as I used for my
head node and just updating after a new image has been installed.  I don't
have any experience with SIS and I'm afraid I'd break something.
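
To make the idea concrete, here is a rough sketch of what I mean by updating
each node from the same repository once it has been imaged.  It only builds
the ssh command lines and prints them so they can be checked before anything
is run.  The node names are made up, passwordless root ssh from the head node
is assumed, and the "--exclude=kernel*" argument is just one possible way to
hold the kernel back given the problem above, not something I have tested.

```python
import shlex


def build_update_commands(nodes, yum_args=("-y", "update")):
    """Return one ssh command (as an argv list) per node.

    Each command runs yum on the remote node with the given arguments.
    Arguments are shell-quoted because ssh hands the string to a remote shell.
    """
    remote = "yum " + " ".join(shlex.quote(arg) for arg in yum_args)
    return [["ssh", node, remote] for node in nodes]


if __name__ == "__main__":
    # Hypothetical node names; substitute the real client hostnames.
    nodes = ["oscarnode1", "oscarnode2"]
    # Assumption: skip kernel packages, since the new kernel broke imaging.
    args = ("-y", "--exclude=kernel*", "update")
    for argv in build_update_commands(nodes, args):
        # Print only; pass argv to subprocess.run() once it looks right.
        print(" ".join(argv))
```

This leaves the SIS image itself untouched, so a freshly imaged node would
need the same update run again after each rebuild.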



-----Original Message-----
From: Michael Edwards [mailto:[EMAIL PROTECTED] 
Sent: Monday, February 14, 2005 3:01 PM
To: Johnston Michael J Contr AFRL/DES
Subject: Re: [Oscar-users] Problems with rsync and halting nodes

Did you just run yum on the head node, or did you update the image too?

I haven't actually updated yet myself, but I was going to see if I
could replicate (or avoid) your problems when I did since it was on my
short list of things to do.

Also, how did you "push" the image?  Did you use the SIS command or
did you just network boot (or use the install floppy)?  If you network
boot, the head node should use the same process it uses on install. 
Might be worth a try if you pushed it out explicitly with SIS.  I
haven't played with those utilities outside the wizard.

On Mon, 14 Feb 2005 20:58:04 -0000, Johnston Michael J Contr AFRL/DES
<[EMAIL PROTECTED]> wrote:
> 
> 
> My cluster has been running awesome, until I just updated all the Fedora 2
> RPMs using yum.  The install went great, but when I reload an image my
> nodes get halfway through a rebuild and die.  I was having this exact same
> problem before, when I built up the head node, updated it, and then tried to
> push to the clients.  So I took someone's advice this time and built the
> cluster before adding any updates, and it worked just fine.
> 
>  
> 
> Now when a node starts rebuilding from the TFTP image, it starts loading
> normally until it gets to the point where it's placing the files on the
> drive.  Then it stops at a random spot and crashes out.  Here is the error
> that I'm getting:
> 
>  
> 
> ########################
> 
>  
> 
> Rsync: read error: Connection reset by peer
> 
> Rsync: error: error in rsync protocol data stream (code 12) at io.c(177)
> 
> Rsync: connection unexpectedly closed (1729062 bytes read so far)
> 
> Rsync: error in rsync protocol data stream (code 12) at io.c(165) Killing
> off running processes
> 
>  
> 
> #######################
> 
>  
> 
> I had noticed this error before, so I did not update rsync with yum.  I left
> that RPM out.  Any ideas what I can do to repair this?  I don't want to
> re-install this again.  It would be nice if there was a way to remove all
> the patches that I added to the head node, but there are about 644 of them.
> 
>  
> 
> **Oh, and apart from that problem, it seems like half the time I halt a node
> it comes up unclean after I restart it and I have to reload the image.
> Is that normal?  It seems much pickier than a straight install.


