Jason
At 02:15 PM 6/9/2003 -0500, Antonio M. Ferreira, Ph.D. wrote:
No errors. The mptscsih driver that's included works fine, it just doesn't load after a network install.
- -----Original Message-----
- From: Jason B. [mailto:[EMAIL PROTECTED]]
- Sent: Monday, June 09, 2003 2:14 PM
- To: Antonio M. Ferreira, Ph.D.
- Subject: RE: [Oscar-users] Nodes fail to boot after network install ... continued
- Are there any errors during the client installation? Do your machines require a special driver for their SCSI controller, or does the included driver work properly?
- Jason
- At 02:02 PM 6/9/2003 -0500, Antonio M. Ferreira, Ph.D. wrote:
- One was created on the Master node (the one that fails to load the SCSI module) and the other was created on a working client.
- -----Original Message-----
- From: Jason B. [mailto:[EMAIL PROTECTED]]
- Sent: Monday, June 09, 2003 2:01 PM
- To: Antonio M. Ferreira, Ph.D.
- Subject: RE: [Oscar-users] Nodes fail to boot after network install ... continued
- What is the difference between the images?
- Jason
- At 01:56 PM 6/9/2003 -0500, Antonio M. Ferreira, Ph.D. wrote:
- Jason,
- I have two images that I'm playing with right now. One gets to the "LI" of LILO and hangs, the other seems to boot properly, but misses the mptscsih drivers for some reason and then complains that it can't find the "init" and gives a kernel panic error.
- Tony
- -----Original Message-----
- From: Jason B. [mailto:[EMAIL PROTECTED]]
- Sent: Monday, June 09, 2003 10:34 AM
- To: Antonio M. Ferreira, Ph.D.
- Subject: RE: [Oscar-users] Nodes fail to boot after network install ... continued
- If they are dual-processor boxes, then you SHOULD use the smp kernel. At what point in the boot process do the clients fail?
- Jason
- At 10:06 AM 6/9/2003 -0500, Antonio M. Ferreira, Ph.D. wrote:
- Jason,
- All of the nodes are dual-processor boxes. I'll try the LILO trick and see if that helps.
- Tony
- -----Original Message-----
- From: Jason B. [mailto:[EMAIL PROTECTED]]
- Sent: Monday, June 09, 2003 9:44 AM
- To: Antonio M. Ferreira, Ph.D.
- Subject: RE: [Oscar-users] Nodes fail to boot after network install ... continued
- Do your clients have a single CPU or multiple CPUs? If they have a single CPU, you should try forcing them to boot with the uni-processor kernel. When the LILO prompt comes up, enter the UP kernel name. To get a list, press TAB, and the UP kernel should be something like 2.4.7-10 (NOT the one with smp at the end).
- If the above solution doesn't work:
- How far do the clients get in the boot process?
- Jason
- At 03:38 PM 6/6/2003 -0500, Antonio M. Ferreira, Ph.D. wrote:
- Jason,
- I agree that there's no need to install RedHat on the clients, but my problem has been that I can't seem to get a working image transferred to them. My purpose for doing the hand install was to see if I could use SystemImager to create an image of a working client and then install that image back onto the client from the master node ... it didn't work.
- My kernels all came off the same set of CDs, so there shouldn't be any compatibility issues there. I haven't modified the image (I assume you mean the image in /var/lib/systemimager/images) before the install. My network cards are on the motherboard. I'm running Dell Precision 450's, so it's a Planar motherboard with the Intel Pro 1000 NIC. My drivers for RedHat 7.3 came from a source at Intel.
- What continues to perplex me is that the image comes over from the master node just fine, it just won't boot once it's there. I'm wondering if it could be a problem with my partition table.
- Tony
- Antonio M. Ferreira, Ph.D.
- Department of Chemistry
- The University of Memphis
- J.M. Smith Chemistry Building
- 3744 Walker Avenue
- Memphis, Tennessee 38152
- e-mail: [EMAIL PROTECTED]
- Phone: (901) 678-2630
- Fax: (901) 678-3447
- He who will not risk cannot win
- - John Paul Jones
- I must go down to the seas again,
- to the lonely sea and the sky.
- And all I ask is a tall ship
- and a star to steer her by.
- - John Masefield
- Computational Research On Materials Institute at the University of Memphis
- -----Original Message-----
- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]]On Behalf Of Jason B.
- Sent: Thursday, June 05, 2003 1:01 PM
- To: Antonio M. Ferreira, Ph.D.; [EMAIL PROTECTED]
- Subject: Re: [Oscar-users] Nodes fail to boot after network install ... continued
- First, there is no need to install RedHat on the clients, since the OSCAR installation process handles this step. Did you install a newer kernel on the clients? Did you modify the image filesystem before the clients were installed? Which network card do you have?
- Jason
- At 02:22 PM 6/3/2003 -0500, Antonio M. Ferreira, Ph.D. wrote:
- Hello again everyone ... here's what I've got.
- I started from scratch and reinstalled RedHat 7.3 on the Master node and all the clients. Our (hopefully soon-to-be) cluster consists of one single-processor box for the Master node and seven dual-processor boxes for the compute nodes. I'm running the smp kernel on the Master node to avoid compatibility issues. There's one important note, however:
- I had to compile some drivers provided by Intel to get support for the on-board network cards. The drivers were installed in the usual manner with an "insmod" command. Everything seemed to work fine. The MAC collection worked without error as did the creation of the autoinstall boot floppy.
- When I reboot the nodes with the boot floppy everything seems to proceed normally. I can watch the network traffic come over from the Master node and all the files appear to be installed. However, when I remove the boot floppy and reboot the node, I get the following messages: (Does anyone know a way to scroll back and read the lines that have scrolled past? I've tried Ctrl-PageUp and Ctrl-B, but neither seems to work.)
- <long list of unresolved symbol errors regarding mptscsi.o>
- ERROR: /bin/insmod exited abnormally!
- mounting /proc filesystem
- Creating root device
- Mounting root filesystem
- kmod: failed to exec /sbin/modprobe -s -k block-major-8, errno = 2
- mount: error 6 mounting ext2
- pivotroot: pivot_root(/sysroot,/sysroot/initrd) failed: 2
- Freeing unused kernel memory: 384k freed
- Kernel panic: No init found. Try passing init= option to kernel.
- <the system hangs here>
- What's curious to me about this is that the same SCSI controller is in each of the machines; however, the drivers for the SCSI controller are never loaded upon reboot after the network installation.
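One thing that might be worth checking here, offered as a sketch rather than a confirmed diagnosis: on Red Hat releases of this era, mkinitrd normally decides which SCSI modules to put in the initrd from the `alias scsi_hostadapter...` lines in /etc/modules.conf. If the image's copy of that file lacks an entry for mptscsih, the initrd could end up without a usable driver. The Python helper below only parses the text of a modules.conf file and reports which required drivers are not listed. The function names, the sample file contents, and the default `required` list are illustrative assumptions, not part of OSCAR or SystemImager.

```python
import re


def scsi_hostadapters(modules_conf_text):
    """Return module names found on 'alias scsi_hostadapter*' lines."""
    modules = []
    for line in modules_conf_text.splitlines():
        line = line.split("#", 1)[0].strip()  # ignore comments and padding
        match = re.match(r"alias\s+scsi_hostadapter\d*\s+(\S+)$", line)
        if match:
            modules.append(match.group(1))
    return modules


def missing_drivers(modules_conf_text, required=("mptscsih",)):
    """List required drivers that have no scsi_hostadapter alias.

    The default 'required' tuple is an assumption taken from the driver
    named in this thread; adjust it for other controllers.
    """
    present = scsi_hostadapters(modules_conf_text)
    return [name for name in required if name not in present]


# Hypothetical file contents, for illustration only.
IMAGE_WITHOUT_ALIAS = """\
alias eth0 e1000
# alias scsi_hostadapter mptscsih
"""

IMAGE_WITH_ALIAS = """\
alias eth0 e1000
alias scsi_hostadapter mptbase
alias scsi_hostadapter1 mptscsih
"""

if __name__ == "__main__":
    print(missing_drivers(IMAGE_WITHOUT_ALIAS))
    print(missing_drivers(IMAGE_WITH_ALIAS))
```

To try it against a real image, read the etc/modules.conf file from the image directory under /var/lib/systemimager/images and pass its contents to `missing_drivers`. A non-empty result would suggest adding the alias and rebuilding the initrd inside the image, though other causes (such as a kernel and module version mismatch) are not ruled out by this check.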
- Should I create my network install image on one of the client nodes and port this to the Master node? If so, what set of RPMs do I need to include in my client installation? Also, has anyone done this before?
- Thanks for all of your help.
- Tony
