Re: nfs-root booting works for 2.2.* client, but not 2.4.19

2002-09-21 Thread Jorge L. deLyra

 I'm going nuts here.  In June, I set up a scheme to have an old PC boot from a
 local image (floppy and/or hd) and run off an nfs-root partition on my
 server. Works very well with 2.2.20, and I replicated it with 2.2.21.

 But for the life of me, I cannot get it to work with 2.4.19. Neither the
 stock kernel (from the sources .deb) nor the one with the openMosix patch
 works. I tried numerous config options, and all I get, invariably, is a
 kernel panic relating to init right after the local hd* devices are found.
 At this point 2.2.* seems to find and use the eth0 device and the remote
 partitions.

 Any suggestions welcome. Please CC me as I'm not subscribed on -user or
 -beowulf.

How about posting in detail the kernel messages just before the crash?
What are the contents of that nfs-root (which distribution)? Since you
get to the init message, I presume it is mounted OK by the kernel? What is
the set of boot line parameters you use for that floppy kernel?

Are you using DHCP to configure the network? Can you diff the kernel
configuration files for 2.2.20 and 2.4.19 to check the differences? One
funny thing I remember is that in the 2.4 kernels you must turn on the
dhcp option in the kernel-level autoconfiguration section even if you are
not actually going to use it...
Cheers,


Jorge L. deLyra,  Associate Professor of Physics
The University of Sao Paulo,  IFUSP-DFMA
   For more information: finger [EMAIL PROTECTED]
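
The suggestion above to diff the 2.2.20 and 2.4.19 configuration files can be sketched as follows. This is only a minimal illustration, not anything posted in the thread: the function names and the keyword filter are made up for the example, and it assumes the usual .config layout ("CONFIG_X=value" lines and "# CONFIG_X is not set" comments).

```python
import re
import sys

_OPTION = re.compile(r"^(CONFIG_\w+)=(.*)$")
_UNSET = re.compile(r"^# (CONFIG_\w+) is not set$")


def parse_config(text):
    """Map each CONFIG_ symbol in a .config to its value ('n' if marked unset)."""
    options = {}
    for line in text.splitlines():
        line = line.strip()
        match = _OPTION.match(line)
        if match:
            options[match.group(1)] = match.group(2)
            continue
        match = _UNSET.match(line)
        if match:
            options[match.group(1)] = "n"
    return options


def diff_configs(old_text, new_text, keywords=()):
    """Return sorted (symbol, old, new) tuples for symbols whose value differs.

    A symbol missing from one file is reported with None on that side.
    If keywords is non-empty, only symbols containing one of them are kept.
    """
    old, new = parse_config(old_text), parse_config(new_text)
    changes = []
    for symbol in sorted(set(old) | set(new)):
        if keywords and not any(key in symbol for key in keywords):
            continue
        before, after = old.get(symbol), new.get(symbol)
        if before != after:
            changes.append((symbol, before, after))
    return changes


if __name__ == "__main__" and len(sys.argv) == 3:
    # Usage (hypothetical file names): python cfgdiff.py config-2.2.20 config-2.4.19
    with open(sys.argv[1]) as f_old, open(sys.argv[2]) as f_new:
        for symbol, before, after in diff_configs(
                f_old.read(), f_new.read(), keywords=("PNP", "NFS", "NET")):
            print(f"{symbol}: {before} -> {after}")
```

Filtering on a few keywords keeps the output readable, since many unrelated symbols changed between the 2.2 and 2.4 series.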



-- 
To UNSUBSCRIBE, email to [EMAIL PROTECTED] 
with a subject of unsubscribe. Trouble? Contact [EMAIL PROTECTED]




Re: nfs-root booting works for 2.2.* client, but not 2.4.19

2002-09-21 Thread Dirk Eddelbuettel

On Sat, Sep 21, 2002 at 02:32:10PM -0300, Jorge L. deLyra wrote:
  I'm going nuts here.  In June, I set up a scheme to have an old PC boot from a
  local image (floppy and/or hd) and run off an nfs-root partition on my
  server. Works very well with 2.2.20, and I replicated it with 2.2.21.
 
  But for the life of me, I cannot get it to work with 2.4.19. Neither the
  stock kernel (from the sources .deb) nor the one with the openMosix patch
  works. I tried numerous config options, and all I get, invariably, is a
  kernel panic relating to init right after the local hd* devices are found.
  At this point 2.2.* seems to find and use the eth0 device and the remote
  partitions.
 
  Any suggestions welcome. Please CC me as I'm not subscribed on -user or
  -beowulf.
 
 How about posting in detail the kernel messages just before the crash?

Sure, I was just being lazy as I would have to type those manually, and it
is a screen full of them. Also, the server end does work (as 2.2.20 and
2.2.21 run just fine off it) and I am fairly certain that my problem is with
the 2.4.19 configuration.

 What are the contents of that nfs-root (which distribution)? Since you

Debian testing, installed using debootstrap, and recently updated using a
simple chroot call to start a session on the server, then the usual apt-get
gymnastics. 

The nfs-root partition is down to 70-ish packages which is nice. All it does
is query a remote xdm (well, kdm) session on the server. So the
nfs-root partition acts as the hd for the 'thin client' pc which uses it to
bootstrap itself to be a remote x11 terminal.  That all works, and I would
like it to be an openmosix client too. Hence the need for 2.4.19.

 get to the init message, I presume it is mounted OK by the kernel? What is
 the set of boot line parameters you use for that floppy kernel?

Same for both 2.2.20/2.2.21 (which works) and 2.4.19 (which fails). I
currently use loadlin, and it passes the set of network boot parameters.
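
For readers following along, the network boot parameters mentioned here have roughly the following shape. This is a sketch only: the addresses, host name and export path are hypothetical placeholders, not the values used in this setup. The field order of ip= follows the kernel's nfsroot documentation (client, server, gateway, netmask, hostname, device, autoconf), with autoconf set to "off" for a static assignment.

```python
def nfsroot_cmdline(client_ip, server_ip, root_dir, netmask, hostname,
                    gateway="", device="eth0", autoconf="off"):
    """Build the kernel parameters for an NFS-mounted root filesystem.

    Produces: root=/dev/nfs nfsroot=<server>:<dir>
              ip=<client>:<server>:<gw>:<netmask>:<hostname>:<device>:<autoconf>
    An empty gateway leaves that field blank, which the kernel accepts.
    """
    ip_fields = [client_ip, server_ip, gateway, netmask, hostname, device, autoconf]
    return f"root=/dev/nfs nfsroot={server_ip}:{root_dir} ip={':'.join(ip_fields)}"


# Hypothetical values, for illustration only.
print(nfsroot_cmdline("192.168.1.50", "192.168.1.1", "/export/thinclient",
                      "255.255.255.0", "thin1"))
```

With loadlin, a string like this would be appended after the kernel image name on the command line.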

 Are you using DHCP to configure the network? Can you diff the kernel

No, static assignment as I have only one thin client so far...

 configuration files for 2.2.20 and 2.4.19 to check the differences? One

I avoided that so far as many other things have changed between the kernels.
But I guess I need to go there, given that I have the working setup...

 funny thing I remember is that in the 2.4 kernels you must turn on the
 dhcp option in the kernel-level autoconfiguration section even if you are
 not actually going to use it...

Yes, I activate the lot:

CONFIG_IP_PNP=y
CONFIG_IP_PNP_ENABLE=y
CONFIG_IP_PNP_DHCP=y
CONFIG_IP_PNP_BOOTP=y
CONFIG_IP_PNP_RARP=y

Thanks,  Dirk

-- 
Good judgement comes from experience; experience comes from bad judgement. 
-- Fred Brooks




Re: nfs-root booting works for 2.2.* client, but not 2.4.19

2002-09-21 Thread Jonathan D. Proulx


To point out the obvious...

are you sure you flipped the kernel-autoconfig bit in networking and

the allow nfsroot  bit (somewhere else I forget, probably network
file systems)

have the right NIC driver builtin to the boot kernel

I presume you're getting a can't find init message or a panic about
the root device.

One issue I had with 2.4 kernels is that I sometimes made them too
big.  I'm not sure too big for what as the one that works is bigger
than 640k and the ones that failed fit on the floppy with syslinux and
config stuff...

-Jon




Re: nfs-root booting works for 2.2.* client, but not 2.4.19

2002-09-21 Thread Dirk Eddelbuettel

On Sat, Sep 21, 2002 at 02:02:46PM -0400, Jonathan D. Proulx wrote:
 
 To point out the obvious...
 
 are you sure you flipped the kernel-autoconfig bit in networking and
 
 the allow nfsroot  bit (somewhere else I forget, probably network
 file systems)

Yes, I checked that:

CONFIG_IP_PNP=y            # this should be the kernel-autoconfig
CONFIG_IP_PNP_ENABLE=y
CONFIG_IP_PNP_DHCP=y
CONFIG_IP_PNP_BOOTP=y
CONFIG_IP_PNP_RARP=y
[...]
CONFIG_NFS_FS=y
CONFIG_NFS_V3=y
CONFIG_ROOT_NFS=y          # this is the root-NFS part
CONFIG_NFSD=y
CONFIG_NFSD_V3=y
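
A check like this one can also be done mechanically. The sketch below is an illustration rather than part of the original exchange: it assumes that for an nfs-root boot without an initrd the autoconfig, NFS client, root-NFS and NIC driver options all have to be built in (=y, not =m), because modules cannot be loaded before the root is mounted. The function name and the default NIC symbol are choices made for the example.

```python
REQUIRED_BUILTIN = ("CONFIG_IP_PNP", "CONFIG_NFS_FS", "CONFIG_ROOT_NFS")


def missing_builtin(config_text, nic_driver="CONFIG_NE2000"):
    """Return the required symbols that are not set to 'y' in a .config text.

    Only the first token after '=' is used, so trailing annotations on a
    line (as in a hand-annotated listing) are ignored.
    """
    values = {}
    for line in config_text.splitlines():
        line = line.strip()
        if line.startswith("CONFIG_") and "=" in line:
            symbol, rest = line.split("=", 1)
            tokens = rest.split()
            values[symbol] = tokens[0] if tokens else ""
    return [symbol for symbol in REQUIRED_BUILTIN + (nic_driver,)
            if values.get(symbol) != "y"]


sample = """CONFIG_IP_PNP=y
CONFIG_NFS_FS=y
CONFIG_ROOT_NFS=y
CONFIG_NE2000=m
"""
print(missing_builtin(sample))  # the NIC driver is only a module here
```

An empty result means the options are all built in. It does not rule out other causes, such as the crash discussed in this thread.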

 have the right NIC driver builtin to the boot kernel

Yep. It is, IIRC, an old ISA card. I must assume that 2.4.* does support
them as the options are still there.

CONFIG_NET_ETHERNET=y
CONFIG_RTL8139TOO=y        # this one I still need to install (100 Mbps)
CONFIG_8139TOO_8129=y
CONFIG_NET_ISA=y
CONFIG_NE2000=y            # this one is installed, and works for 2.2.*
CONFIG_NET_EISA=y
CONFIG_EEXPRESS_PRO100=y   # these two are used in other boxen here
CONFIG_NE2K_PCI=y

Those 8129/8139 variants shouldn't trample on each other, should they?

 I presume you're getting a can't find init message or a panic about
 the root device.

Something like that. Retyping from the screen next to me:

All is well up to and including

hda: ...
hdb: ...
Uniform CD-ROM driver Rev
Partition check:
 hda:
 
and the trouble starts on that last line with
  <1>Unable to handle kernel paging request at virtual address 22c476e8
  printing eip
  [ screenful of gooblygoo relating to the registers, stack and trace ]
  <0>Kernel panic: Attempted to kill init!

 One issue I had with 2.4 kernels is that I sometimes made them too
 big.  I'm not sure too big for what as the one that works is bigger
 than 640k and the ones that failed fit on the floppy with syslinux and
 config stuff...

Yes. The 2.4.* ones that fail are around 820 to 840kb, the 2.2.* ones that
work are around 620 to 695kb, depending on what other (unused) stuff I left
enabled.

Dirk

-- 
Good judgement comes from experience; experience comes from bad judgement. 
-- Fred Brooks




Re: nfs-root booting works for 2.2.* client, but not 2.4.19

2002-09-21 Thread Jorge L. deLyra

  I presume you're getting a can't find init message or a panic about
  the root device.

 Something like that. Retyping from the screen next to me:

 All is well up to and including

 hda: ...
 hdb: ...
 Uniform CD-ROM driver Rev
 Partition check:
  hda:

 and the trouble starts on that last line with
   <1>Unable to handle kernel paging request at virtual address 22c476e8
   printing eip
   [ screenful of gooblygoo relating to the registers, stack and trace ]
   <0>Kernel panic: Attempted to kill init!

This is a bit confusing. This is neither a fail-to-mount-root panic nor a
can't-find-init panic, it's an Oops, a processing error within the kernel.
The partition check happens before mounting the root and before init comes in.
Here is the sequence from my box, with 2.4.19:

-
...
Partition check:
 hda: hda1 hda2 hda3 hda4  hda5 hda6 
 hdb: hdb1
 hdc: hdc1
 hdd: hdd1 hdd2
[ yours crashes here, right? ]
NET4: Linux TCP/IP 1.0 for NET4.0
IP Protocols: ICMP, UDP, TCP, IGMP
IP: routing cache hash table of 2048 buckets, 16Kbytes
TCP: Hash tables configured (established 16384 bind 16384)
Linux IP multicast router 0.06 plus PIM-SM
NET4: Unix domain sockets 1.0/SMP for Linux NET4.0.
VFS: Mounted root (ext2 filesystem) readonly.
[ root mounted, init comes in below ]
...
-

Looks like the problem is with setting up the network. The message looks
like a memory problem. Are you sure the kernel is guessing the correct
amount of RAM? In any case, here is a method we use here which will allow
you to cut-and-paste the kernel boot messages into an email message. It is
a bit complicated but useful for a lot of things: use a serial console.

1) Enable the serial console in the kernel, boot with the parameter
   console=ttyS0,9600 (or some other speed that works for you).

2) Have available some other machine with X11 running. Run seyon in an X11
   session, attached to some serial port. Configure seyon for the correct
   speed, 1 stop bit, no parity, CR translations, etc. You can test this by
   using it as a terminal on some working machine where you put a getty on
   one of the serial ports. You have to change /etc/inittab for this:

# Example of how to put a getty on a serial line (for a terminal)
T0:23:respawn:/sbin/getty -L ttyS0 9600 vt100

3) Build a simple 3-wire crossed serial cable and interconnect the two
   ports. Boot the node, you will be able to see the whole kernel boot
   procedure within your X11 seyon window. Just cut and paste...

Note that your X11 session could be anywhere, not necessarily on the
console of the machine with the serial port connected to the node. You log
into the machine via the network and open the seyon window anywhere. Lilo
and Etherboot also can be configured to use the serial port. You can use
seyon as a terminal, reboot a node in the server room and look at the
whole boot process (except the node's hardware cycling, of course) from
the comfort of your office.
Cheers,


Jorge L. deLyra,  Associate Professor of Physics
The University of Sao Paulo,  IFUSP-DFMA
   For more information: finger [EMAIL PROTECTED]
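
Once the boot messages have been captured over the serial console as described above, picking out the part worth posting can be automated. The sketch below is one possible way to do it and not part of the method described in the message: the function name and the list of marker strings are assumptions for this example, chosen to match the messages quoted in this thread.

```python
OOPS_MARKERS = ("Unable to handle kernel", "Kernel panic", "Oops:")


def context_before_oops(log_text, lines_before=5):
    """Return the lines leading up to and including the first Oops/panic line.

    Returns an empty list if none of the marker strings occurs in the log.
    """
    lines = log_text.splitlines()
    for index, line in enumerate(lines):
        if any(marker in line for marker in OOPS_MARKERS):
            start = max(0, index - lines_before)
            return lines[start:index + 1]
    return []


# A shortened capture modelled on the messages quoted in this thread.
captured = """hda: ...
hdb: ...
Partition check:
 hda:
<1>Unable to handle kernel paging request at virtual address 22c476e8
<0>Kernel panic: Attempted to kill init!
"""
for line in context_before_oops(captured, lines_before=2):
    print(line)
```

The last few lines before the first marker are usually the ones that show how far the boot got before the kernel faulted.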





Re: nfs-root booting works for 2.2.* client, but not 2.4.19

2002-09-21 Thread Jorge L. deLyra

 I now crash right here. The last good lines are
   VFS: Mounted root (nfs filesystem)
   Freeing unused kernel memory: 76k freed
   Unable to handle kernel paging request at virtual address 14d58d54
 printing eip:
   [ gooblehoo ... ]

 And you're right -- that is exactly where 2.2.2* pass control to init. The
 next line is 'Activating swap'.  It starts to smell like a memory issue.

Yes, this is where diskless nodes with bad memory crash most often. Since
2.2 and 2.4 probably use memory in different ways, it might be that you
are just lucky to get through with 2.2. I guess a standalone memtest run
is in order. You can run it from floppy and it is a sure-fire thing...

Cheers,


Jorge L. deLyra,  Associate Professor of Physics
The University of Sao Paulo,  IFUSP-DFMA
   For more information: finger [EMAIL PROTECTED]


