Hi, I am a new OSCAR user. I started setting up a cluster with OSCAR 6.0.2 on Scientific Linux 5.3 about two weeks ago.
After some difficulties with OSCAR on the headnode (which I guess were mostly because Scientific Linux is not officially supported), I ran into the problem of nodes not booting after imaging. The mailing lists gave me the impression that the SAS disks in the compute nodes might be the problem (on the server I have a RAID array). I tried all the solutions suggested (modprobe.conf, UYOK, etc.), but none helped.

I finally had a look at the initrd image on the compute node and compared it with the one produced by a direct Scientific Linux installation on that node. I found two main problems: the OSCAR initrd does not load any SATA/SAS drivers, and it does not mount the root file system correctly:

mkrootdev -t -o defaults,ro

So I manually edited the OSCAR initrd to make it similar to the one from the Scientific Linux installation (loaded the correct drivers and added mkrootdev -t ext3 -o defaults,ro /dev/sda9), used it to replace the one in the /var/lib/systemimager/images/oscarimage/boot directory, and imaged the nodes... and everything worked perfectly! The OSCAR tests somehow still do not work, but the cluster is fully functional otherwise.

By the way, I have the same kernel in the OSCAR image and on the headnode.

To conclude, I think that if SystemImager is supposed to build the correct initrd image during the imaging process, it does not do that.

Hope it helps.

Regards,
shanavas

On Thu, 2009-04-30 at 06:24 +0200, geoffroy.val...@free.fr wrote:
> Ansgar,
>
> First of all, sorry for not replying earlier, it has been a busy day.
>
> If you do not mind, I still need more details about your exact situation.
>
> First of all, I know for sure that if your system has SATA drives and if you
> use the default SystemImager kernel to image the compute nodes, you will most
> certainly have issues; and again, this is mainly due to the fact that the
> kernel used for imaging and the one used by CentOS do not name the SATA
> drives the same way.
> As a result, the imaging process is successful (SATA
> drives are recognized as SCSI disks and the system is set up accordingly) but
> when you reboot the compute nodes, the kernel (old version) actually
> recognizes the SATA drives as IDE and therefore cannot boot (the system
> configuration does not fit the configuration).
>
> So typically can you answer the following questions:
> - Are the two kernels (OSCAR headnode and images) exactly the same, or are
> they slightly different (I know you already gave me this information, I just
> want to double-check that we are on the same page).
> - CentOS is using a 2.6.18.x kernel and SystemImager a newer kernel; did you
> use UYOK in order to use a more CentOS-like kernel for compute node
> deployment?
>
> Sorry if I make you repeat what you may have already said, I am trying to clarify
> the details so we can try to identify the exact problem together and fix it.
>
> Regards,
>
> ----- Original Message -----
> From: "Ansgar Esztermann" <aesz...@gwdg.de>
> To: oscar-users@lists.sourceforge.net
> Sent: Wednesday, 29 April 2009 06:00:43 GMT -05:00 US/Canada Eastern
> Subject: Re: [Oscar-users] 6.0.2, 6.0.3 and broken initrd
>
> Geoffroy,
>
> On Apr 28, 2009, at 17:24 , geoffroy.val...@free.fr wrote:
>
> > Sorry but I cannot help you. You do not give me enough details, and keep
> > saying that OSCAR is broken even when I give you other
> > points to investigate. Sorry, I cannot do more based on the
> > information I have and again, for __ALL__
>
> I am sorry if I did give that impression -- I do not claim that OSCAR
> is broken; I am merely maintaining that the initrd created by the
> kernel rpm is broken. As far as I understand, OSCAR or SIS are
> supposed to create a new initrd. For some reason, this does not work
> correctly with my machine. I would like to find out why and fix the
> problem, but so far have been unable to.
> Is there documentation available that explains how OSCAR works?
> I do know that an image is created, and SystemImager is used to install it
> to the compute nodes; but in between, it gets somewhat sketchy. For
> example, how and where exactly is the new initrd created? I have taken
> a look at various scripts, and it seems that an initrd is created
> after rsync'ing the image to the node only if no initrd exists in the
> image (but it does on my installation). Surely I am overlooking
> something.
>
> > users who had similar issues (and I know about), the problem was
> > because the kernel in the image is 2.6.18 based whereas the
> > SystemImager kernel to image the node is > 2.6.19 (SATA naming issue
> > and so on -- remember that selecting the SCSI template when creating
> > the nodes may not be enough), or because the kernel shipped with
> > RHEL was not working correctly on their platforms.
>
> Well, the kernel does work on my compute nodes if I manually create
> the initrd; there may be some hidden problem due to the 2.6.18/2.6.19
> conflict, although I do not see how this would silently prevent an
> initrd from being generated.
>
> > PS: for the kernel difference between the headnode and the compute
> > nodes, we use online repositories for both of them. This is not an
> > OSCAR issue, we do not enforce the version. So most certainly your
> > headnode is not up-to-date or not using the same repositories, one
> > of them being out-of-date.
>
> OK, thank you for the suggestion. I have made sure the headnode has
> the latest CentOS kernel. A new image is being created, and I will
> report back to the list.
>
> Regards,
>
> A.
>
> --
> Ansgar Esztermann
> DV-Systemadministration
> Max-Planck-Institut für biophysikalische Chemie, Abteilung 105
>
> ------------------------------------------------------------------------------
> Register Now & Save for Velocity, the Web Performance & Operations
> Conference from O'Reilly Media.
> Velocity features a full day of
> expert-led, hands-on workshops and two days of sessions from industry
> leaders in dedicated Performance & Operations tracks. Use code vel09scf
> and Save an extra 15% before 5/3. http://p.sf.net/sfu/velocityconf
> _______________________________________________
> Oscar-users mailing list
> Oscar-users@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/oscar-users
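
The initrd comparison described at the top of this message can be partly automated. What follows is only a minimal Python sketch, assuming the initrd has already been unpacked and that its init file is a nash script in the RHEL 5 style, with drivers loaded by insmod lines and the root device set by a single mkrootdev line. The names check_init_script, parse_mkrootdev and STORAGE_HINTS are hypothetical and chosen for illustration; they are not part of OSCAR or SystemImager, and the driver list is a guess that should be adapted to the actual hardware.

```python
# Module names treated as storage drivers. This list is an illustrative
# assumption; adjust it to the controllers actually present in the nodes.
STORAGE_HINTS = {"scsi_mod", "sd_mod", "libata", "ahci", "ata_piix",
                 "mptbase", "mptscsih", "mptsas"}


def parse_mkrootdev(args):
    """Split mkrootdev arguments into (fstype, options, device)."""
    fstype = options = device = None
    i = 0
    while i < len(args):
        arg = args[i]
        if arg in ("-t", "-o"):
            # A flag only has a value if the next word is not another flag.
            value = None
            if i + 1 < len(args) and not args[i + 1].startswith("-"):
                value = args[i + 1]
                i += 1
            if arg == "-t":
                fstype = value
            else:
                options = value
        else:
            device = arg
        i += 1
    return fstype, options, device


def check_init_script(text):
    """Return the storage modules loaded and any problems found."""
    modules, problems = [], []
    seen_mkrootdev = False
    for raw in text.splitlines():
        words = raw.split()
        if not words or words[0].startswith("#"):
            continue
        if words[0] == "insmod" and len(words) > 1:
            # Reduce "/lib/ahci.ko" to "ahci".
            name = words[1].rsplit("/", 1)[-1]
            if name.endswith(".ko"):
                name = name[:-3]
            if name in STORAGE_HINTS:
                modules.append(name)
        elif words[0] == "mkrootdev":
            seen_mkrootdev = True
            fstype, _options, device = parse_mkrootdev(words[1:])
            if fstype is None:
                problems.append("mkrootdev has no filesystem type")
            if device is None:
                problems.append("mkrootdev has no root device")
    if not modules:
        problems.append("no storage driver modules loaded")
    if not seen_mkrootdev:
        problems.append("no mkrootdev line found")
    return {"storage_modules": modules, "problems": problems}


if __name__ == "__main__":
    # The incomplete line reported in this thread.
    print(check_init_script("mkrootdev -t -o defaults,ro\n"))
```

Run against the incomplete mkrootdev line quoted above, the check flags the missing filesystem type, the missing root device, and the absence of storage drivers; the corrected line with ext3 and /dev/sda9, preceded by the relevant insmod lines, passes without problems.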