Hey Jason:

I would try booting a Knoppix live CD to find out which NIC you have on the new 1750s.
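A minimal sketch of that check from a shell on the booted live CD. It assumes lspci (from pciutils) is available there; list_nics is just a helper name made up for this example, not part of OSCAR or SystemImager.

```shell
# Hypothetical helper: keep only the Ethernet controller lines from
# lspci-style text on stdin, so it also works on saved output.
list_nics() {
    grep -i 'ethernet controller'
}

# On the node itself (assumes lspci is installed on the live CD):
if command -v lspci >/dev/null 2>&1; then
    lspci | list_nics || echo "no Ethernet controller listed"
fi
```

Comparing the lines this prints on an old 1750 and on a new 1750 should show whether the onboard NIC actually changed between the two batches.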
As for kernel files in tftpboot, just make sure that initrd.img and kernel are the same in both /tftpboot and /usr/share/systemimager/boot/i386/standard/ (i.e. the ones from Frank).

Cheers,

Bernard

> -----Original Message-----
> From: Jason Hlady [mailto:[EMAIL PROTECTED]
> Sent: Tuesday, September 21, 2004 12:23
> To: Bernard Li
> Cc: [EMAIL PROTECTED]; Jason Hlady; Frank Crawford
> Subject: Re: [Oscar-users] RH9.0, Oscar3.0, Dell 1750PowerEdge, tg3 install woes
>
> I just tried using the stock kernel files from OSCAR;
> I got very similar results:
>
> tg3.c: v1.2 (Nov 14, 2002)
> tg3: Problem fetching invariants of chip, aborting
> tg3: Problem fetching invariants of chip, aborting
> sk98lin: no adapter found
>
> <stuff>
>
> VFS: Mounted root (cramfs filesystem)
> Mounted devfs on /dev
> Freeing unused kernel memory: 524k freed
> kernel NULL pointer dereference at virtual address 00000000
> kernel panic
>
> Yes, it is possible that the NICs that are in these 1750s have been
> changed from the previous version of 1750s, without them telling us,
> and possibly the NICs no longer work with the drivers. Grrrr. The
> fact that the errors we are getting are so similar to previous errors
> that other people have seen (i.e. this tg3 invariants error),
> however, makes me wonder if I'm just doing something wrong.
>
> Am I correct that making a modification to
> /usr/share/systemimager/boot/i386/standard/* (i.e. kernel, config,
> boel_binaries.gz) and then restarting the "setup networking" should
> be sufficient? That is, I don't actually need to modify the files in
> /tftpboot because they will have been automatically changed by those
> two steps?
>
> Jason
>
> On Sep 21, 2004, at 12:01 PM, Bernard Li wrote:
>
> > Hi Jason:
> >
> > I have used Frank Crawford's files with a wide variety of bcm57xx
> > nics and they all work fine... I guess it's possible that you are
> > using nics that don't work with the drivers?
> >
> > Have you also tried using the stock kernel files from OSCAR and see
> > if that gives you different error messages?
> >
> > Cheers,
> >
> > Bernard
> >
> >> -----Original Message-----
> >> From: Jason Hlady [mailto:[EMAIL PROTECTED]
> >> Sent: Tuesday, September 21, 2004 10:33
> >> To: Bernard Li
> >> Cc: [EMAIL PROTECTED]; Jason Hlady; Frank Crawford
> >> Subject: Re: [Oscar-users] RH9.0, Oscar3.0, Dell 1750PowerEdge, tg3 install woes
> >>
> >> On Sep 21, 2004, at 11:10 AM, Bernard Li wrote:
> >>
> >>> Hey Jason:
> >>>
> >>> Just a quick question - are the specs of the 2 sets of PowerEdge
> >>> 1750 identical? It seems kind of odd that it worked on the
> >>> original hardware but not the newer one, unless they have some
> >>> subtle changes... Also, do they both have SCSI harddrives...?
> >>
> >> No, I imagine that the hardware is not exactly the same, simply
> >> because it works on one and not the other. They both have SCSI
> >> hard drives (different size and speed, but that shouldn't matter),
> >> both using the onboard NICs.... As to the exact specs, that's
> >> something I haven't actually tracked down, but given that it's
> >> having trouble with the NICs (and possibly the SCSI system) I
> >> didn't check out component-by-component what has changed. I
> >> certainly can do that.
> >>
> >> Jason
> >>
> >>>
> >>> Cheers,
> >>>
> >>> Bernard
> >>>
> >>>> -----Original Message-----
> >>>> From: [EMAIL PROTECTED]
> >>>> [mailto:[EMAIL PROTECTED] On Behalf Of Jason Hlady
> >>>> Sent: Tuesday, September 21, 2004 9:52
> >>>> To: [EMAIL PROTECTED]
> >>>> Cc: Frank Crawford; Jason Hlady
> >>>> Subject: [Oscar-users] RH9.0, Oscar3.0, Dell 1750PowerEdge, tg3 install woes
> >>>>
> >>>> Hi all,
> >>>>
> >>>> This is an especially frustrating help letter to have to write. :)
> >>>> I will explain why:
> >>>>
> >>>> 1) I have successfully used (December '03) OSCAR 3.0 on Redhat
> >>>> 9.0 to install on 32 Dell PowerEdge 1750 servers.
> >>>>
> >>>> 2) Recently, another researcher has purchased a 64 node cluster
> >>>> of Dell PowerEdge 1750 servers (the servers arrived September
> >>>> '04) and I am setting this cluster up using RH9 and OSCAR 3.0.
> >>>> I am attempting to use the exact same configuration as I used
> >>>> for MY cluster, which is happily running OSCAR right now.
> >>>>
> >>>> I pursued the standard install of OSCAR. Because I've done this
> >>>> once before on what should presumably be identical hardware, I
> >>>> remembered to:
> >>>>
> >>>> a) replace /usr/share/systemimager/boot/i386/standard/* with the
> >>>> tarfile from Frank Crawford, who had given me the
> >>>> boel_binaries.tar.gz, kernel, config, and initrd.img files that
> >>>> will be used. This EXACT set of files enabled me to do the
> >>>> imaging on my cluster of PowerEdge 1750s in December of 2003.
> >>>>
> >>>> b) create an
> >>>> /var/lib/systemimager/override/IMAGENAME/etc/modules.conf
> >>>> containing the EXACT same file as that file on my previous
> >>>> cluster so that the machines would remember to load the drivers.
> >>>>
> >>>> However, when I start up network boot on the new server, and
> >>>> netboot one of the new clients, it gets DHCP, receives the
> >>>> correct DHCP address, and then begins to load the imaging
> >>>> kernel. However, I get the following errors (I had to write
> >>>> them down, so these are just excerpts, albeit in chronological
> >>>> order)
> >>>>
> >>>> tg3: (02:00.0) phy probe failed, err -16
> >>>> tg3: problem fetching invariants of chip, aborting
> >>>> tg3: (02:00.1) phy probe failed, err -16
> >>>> tg3: problem fetching invariants of chip, aborting
> >>>>
> >>>> <stuff>
> >>>>
> >>>> SCSI subsystem driver Revision: 1.00
> >>>> kmod: failed to execv /sbin/modprobe -s -k scsi_hostadapter, errno = 2
> >>>>
> >>>> < stuff>
> >>>>
> >>>> FusionMPT base driver 2.03.00
> >>>> mptbase: Initiating ioc0 bringup
> >>>> mptbase: ioc0: WARNING: unexpected doorbell active
> >>>> mptbase: ioc0: ERROR: doorbell ACK timeout (2)
> >>>>
> >>>> <more stuff>
> >>>>
> >>>> VFS: Mounted root (cramfs filesystem)
> >>>> Mounted devfs on /dev
> >>>> Freeing unused kernel memory: 524k freed
> >>>> Unable to handle kernel NULL pointer dereference at virtual address <>
> >>>> EIP: 0060:<c0264257>
> >>>>
> >>>> < BUNCH OF NUMBERS>
> >>>>
> >>>> Kernel panic: attempted to stop init!
> >>>>
> >>>> And then it dies.
> >>>>
> >>>> This is pretty annoying. I had assumed that it would JUST WORK
> >>>> given that the hardware, software, and operating system (except
> >>>> for the head node, which is a 2650) is (nominally?) identical in
> >>>> both cases.
> >>>>
> >>>> I did a little searching on the net for this "tg3: problem
> >>>> fetching invariants of chip, aborting" error, and turned up this
> >>>> link,
> >>>>
> >>>> http://www.mail-archive.com/[EMAIL PROTECTED]/msg00705.html
> >>>>
> >>>> which has another source of these boel_binaries, etc that should
> >>>> ALSO work. They do not work either for this new cluster I am
> >>>> attempting to install: they get a similar tg3 error, and then
> >>>> fail.
> >>>>
> >>>> What is going on here? I've read that some people have been
> >>>> happy with tg3 and some with bc5700... I was perfectly happy
> >>>> with tg3 until they don't seem to work for these *particular*
> >>>> Dell 1750s. :-(
> >>>>
> >>>> And what about the kmod: failed to execv /sbin/modprobe -s -k
> >>>> scsi_hostadapter, errno = 2 error? Does that suggest that it
> >>>> hasn't correctly loaded the SCSI driver EITHER?
> >>>>
> >>>> Does anyone have any suggestions? What exactly is involved (as
> >>>> much detail as possible would be appreciated) in trying to make
> >>>> my very own set of boel_binaries/kernel/initrd.img?
> >>>>
> >>>> Have I missed something really obvious? Can anybody suggest
> >>>> something? Everyone was so helpful getting it to work correctly
> >>>> the first time that I thought I'd take another crack at the
> >>>> list. :-)
> >>>>
> >>>> Thanks a bunch,
> >>>>
> >>>> Jason
> >>>>
> >>>> --------------
> >>>> Jason Hlady, B. Sc., M. Sc. (Chem), Adv. Cert. (Comp. Sci.)
> >>>> Programmer/Analyst (Bioinformatics Specialist)
> >>>> U of Saskatchewan, Bioinformatics Research Laboratory (BIRL)
> >>>> [EMAIL PROTECTED] (306) 966-2075
> >>>>
> >>>> -------------------------------------------------------
> >>>> This SF.Net email is sponsored by: YOU BE THE JUDGE. Be one of
> >>>> 170 Project Admins to receive an Apple iPod Mini FREE for your
> >>>> judgement on who ports your project to Linux PPC the best.
> >>>> Sponsored by IBM.
> >>>> Deadline: Sept. 24. Go here: http://sf.net/ppc_contest.php
> >>>> _______________________________________________
> >>>> Oscar-users mailing list
> >>>> [EMAIL PROTECTED]
> >>>> https://lists.sourceforge.net/lists/listinfo/oscar-users
> >>>>
> >> --------------
> >> Jason Hlady, B. Sc., M. Sc. (Chem), Adv. Cert. (Comp. Sci.)
> >> Programmer/Analyst (Bioinformatics Specialist)
> >> U of Saskatchewan, Bioinformatics Research Laboratory (BIRL)
> >> [EMAIL PROTECTED] (306) 966-2075
> >>
> --------------
> Jason Hlady, B. Sc., M. Sc. (Chem), Adv. Cert. (Comp. Sci.)
> Programmer/Analyst (Bioinformatics Specialist)
> U of Saskatchewan, Bioinformatics Research Laboratory (BIRL)
> [EMAIL PROTECTED] (306) 966-2075
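On the /tftpboot point at the top of this message: one way to confirm that kernel and initrd.img really are the same in both places is a byte-for-byte comparison. This is only a sketch, not part of OSCAR or SystemImager. compare_boot_files is a hypothetical helper name, and it relies only on the standard cmp utility.

```shell
# Hypothetical helper: compare the imaging kernel and initrd in two
# boot directories byte-for-byte.
# Usage: compare_boot_files DIR_A DIR_B
# Returns 0 only if both files exist in both directories and match.
compare_boot_files() {
    dir_a=$1
    dir_b=$2
    status=0
    for f in kernel initrd.img; do
        # cmp -s is silent; it exits 0 when the files are identical and
        # non-zero when they differ or one of them cannot be read.
        if cmp -s "$dir_a/$f" "$dir_b/$f"; then
            echo "same: $f"
        else
            echo "DIFFERENT or missing: $f"
            status=1
        fi
    done
    return $status
}

# Example call on the head node:
#   compare_boot_files /tftpboot /usr/share/systemimager/boot/i386/standard
```

If either file is reported as different, the copy in /tftpboot was not refreshed from the standard directory, which would also answer the question of whether the files there need to be updated by hand.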