Hi DongInn,

On Monday 16 April 2007 11:22:38 pm DongInn Kim wrote:
> Hi Ivan,
>
> Ivan Adzhubey wrote:
> > Hi Michael,
> >
> > Ok, I downloaded and put all FC5 updates on the server node, updated
> > local fedora-5 repository, regenerated oscarimage and reimaged nodes.
> > This time it worked! Not a single rsync timeout anymore. So it was a
> > rsync bug, perhaps triggered by a particular combination of
> > hardware/kernel/drivers.
> >
> > Now I am having another problem, with SGE configuration on slave nodes
> > picking up the wrong server name. My server node has two network
> > interfaces, private and public one, pretty standard configuration.
> > Machine hostname is set to reflect public interface, not private one.
> > Now, SGE install scripts set qmaster hostname to that public name and SGE
> > execd daemons on slave nodes obviously fail to contact qmaster by this
> > name since they have no access to the public network. Easy to fix
> > manually but still I think this is a bug...
>
> Are you installing the official OSCAR 5.0 on FC5? If so, can you please
> try to test it with trunk?
> A network configuration is a little bit polished and it may fix your
> problem.

That's right, I used 5.0 release. I did think about giving 5.1 a try at some
point, after all the frustration with (non)rsyncing images, but eventually
went another route and upgraded FC5 instead... How close is 5.1 to release
status in your opinion? I was given a generous 3 days to upgrade our 24-node
cluster from OSCAR 4.0/FC2 to OSCAR5.0/FC5 and I have already spent 6 days,
so I am a bit short on time right now. However, I have a spare head node and
can borrow a couple of slave nodes for experiments so I might still try 5.1
as well.

> > On Monday 16 April 2007 06:16:16 pm Michael Edwards wrote:
> >> When you update rsync on the head node also update it on the image.
> >>
> >> copy the rpm into the image directory (probably something like
> >> /var/lib/systemimager/images/oscarimage/tmp)
> >> chroot /var/lib/systemimager/images/oscarimage
> >> install the rpm from /tmp in the new environment
> >> exit the chroot and try imaging the nodes
> >>
> >> I think there are some scripts to do this as well, but I haven't gotten
> >> around to learning how to use them yet :)
> >
> > There are indeed, and reading the documentation can sometimes save you a
> > lot of time ;-). That said, OSCAR documentation is horrible. Not only it
> > is incomplete and fragmented, it also has a good number of typographic
> > errors in examples, that makes it look more like a puzzle. It was not
> > especially difficult to solve but I wonder why these outstanding errors
> > are still in there? It would take 5 minutes to fix them and regenerate
> > the PDF file...
>
> Can you please forward to oscar-users@ list or me or more preferably
> Mike anything wrong you found in our documents? Please forgive me for my
> laziness. I am sure that your keen eyes will save a lot of time in
> updating the documents.

Hmm, wouldn't it be easier to do it directly on wiki pages? I thought that's
what it is supposed to be for.
By the way, some of the errors (not all of them) are already corrected in
main OSCAR wiki, so I was just wondering why official documentation was not
at least synced...

Cheers,
Ivan