Bugs item #1068036, was opened at 2004-11-17 06:19
Message generated for change (Comment added) made by nobody
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=109368&aid=1068036&group_id=9368
Category: Installation
Group: 4.0
Status: Open
Resolution: None
Priority: 9
Submitted By: Thomas Naughton (naughtont)
Assigned to: Bernard Li (bernardli)
Summary: Fedora (FC2) - scsi HD bug
Initial Comment:
When testing with oscar-4.0 material on Fedora Core 2
(x86) with a SCSI HD I had problems after the node
rebooted.
The node seems to build properly (step 6), but on the
next reboot after the SI install you get a kernel panic:
can't find root (sda6). The sda6 partition exists and
appears to have the correct data, i.e. it is mountable
by hand from a rescue disk.
I/we tried playing with a few Grub options to no avail.
Bernard had similar issues (kernel panic, can't mount
root) on some of his systems, and it seemed to be due to
entries that were missing (apparently unneeded) from
modules.conf (2.4 kernel) and were therefore also missing
from the new modprobe.conf (2.6 kernel). I do not have the
test system available to confirm this for our issue.
Note, the headnode had FC2/SCSI but was built by hand
from CDs and worked just fine.
REPRODUCE PROBLEM:
+ use oscar-4.0beta tarball on Fedora Core 2 system
with SCSI harddrives.
+ After building nodes, upon reboot, node(s) should
have kernel panic
----------------------------------------------------------------------
Comment By: Nobody/Anonymous (nobody)
Date: 2004-12-02 14:31
Message:
Logged In: NO
In case anyone wants the output from the kernel boot, I
finally got a console hooked up. Now I am really confused, as
it is seeing the SATA drives as sda and sdb, so my fstab
with /dev/sda should work.
Thanks,
Jason
[EMAIL PROTECTED]
Here is my output:
Enabling unmasked SIMD FPU exception support... done.
Checking 'hlt' instruction... OK.
CPU0: Intel(R) Xeon(TM) CPU 3.20GHz stepping 04
per-CPU timeslice cutoff: 2924.82 usecs.
task migration cache decay timeout: 3 msecs.
masked ExtINT on CPU#0
ESR value before enabling vector: 00000000
ESR value after enabling vector: 00000000
Booting processor 1/1 eip 2000
CPU 1 irqstacks, hard=023ec000 soft=023cc000
Initializing CPU#1
masked ExtINT on CPU#1
ESR value before enabling vector: 00000000
ESR value after enabling vector: 00000000
Calibrating delay loop... 6389.76 BogoMIPS
monitor/mwait feature present.
CPU: Trace cache: 12K uops, L1 D cache: 16K
CPU: L2 cache: 1024K
CPU: Physical Processor ID: 0
Intel machine check architecture supported.
Intel machine check reporting enabled on CPU#1.
CPU1: Intel P4/Xeon Extended MCE MSRs (24) available
CPU1: Thermal monitoring enabled
CPU1: Intel(R) Xeon(TM) CPU 3.20GHz stepping 04
Booting processor 2/6 eip 2000
CPU 2 irqstacks, hard=023ed000 soft=023cd000
Initializing CPU#2
masked ExtINT on CPU#2
ESR value before enabling vector: 00000000
ESR value after enabling vector: 00000000
Calibrating delay loop... 6389.76 BogoMIPS
monitor/mwait feature present.
CPU: Trace cache: 12K uops, L1 D cache: 16K
CPU: L2 cache: 1024K
CPU: Physical Processor ID: 3
Intel machine check architecture supported.
Intel machine check reporting enabled on CPU#2.
CPU2: Intel P4/Xeon Extended MCE MSRs (24) available
CPU2: Thermal monitoring enabled
CPU2: Intel(R) Xeon(TM) CPU 3.20GHz stepping 04
Booting processor 3/7 eip 2000
CPU 3 irqstacks, hard=023ee000 soft=023ce000
Initializing CPU#3
masked ExtINT on CPU#3
ESR value before enabling vector: 00000000
ESR value after enabling vector: 00000000
Calibrating delay loop... 6389.76 BogoMIPS
monitor/mwait feature present.
CPU: Trace cache: 12K uops, L1 D cache: 16K
CPU: L2 cache: 1024K
CPU: Physical Processor ID: 3
Intel machine check architecture supported.
Intel machine check reporting enabled on CPU#3.
CPU3: Intel P4/Xeon Extended MCE MSRs (24) available
CPU3: Thermal monitoring enabled
CPU3: Intel(R) Xeon(TM) CPU 3.20GHz stepping 04
Total of 4 processors activated (25460.73 BogoMIPS).
ENABLING IO-APIC IRQs
..TIMER: vector=0x31 pin1=2 pin2=-1
Using local APIC timer interrupts.
calibrating APIC timer ...
..... CPU clock speed is 3199.0652 MHz.
..... host bus clock speed is 199.0978 MHz.
checking TSC synchronization across 4 CPUs: passed.
Brought up 4 CPUs
zapping low mappings.
checking if image is initramfs...it isn't (no cpio magic); looks
like an initrd
Freeing initrd memory: 777k freed
NET: Registered protocol family 16
PCI: PCI BIOS revision 2.10 entry at 0xfd75e, last bus=7
PCI: Using MMCONFIG
mtrr: v2.0 (20020519)
ACPI: Subsystem revision 20040326
ACPI: Interpreter enabled
ACPI: Using IOAPIC for interrupt routing
ACPI: PCI Root Bridge [PCI0] (00:00)
PCI: Probing PCI hardware (bus 00)
PCI: Ignoring BAR0-3 of IDE controller 0000:00:1f.2
PCI: Transparent bridge - 0000:00:1e.0
ACPI: PCI Interrupt Link [LP00] (IRQs *11)
ACPI: PCI Interrupt Link [LP01] (IRQs *3)
ACPI: PCI Interrupt Link [LP02] (IRQs) *0, disabled.
ACPI: PCI Interrupt Link [LP03] (IRQs *3)
ACPI: PCI Interrupt Link [LP04] (IRQs) *0, disabled.
ACPI: PCI Interrupt Link [LP05] (IRQs) *0, disabled.
ACPI: PCI Interrupt Link [LP06] (IRQs) *0, disabled.
ACPI: PCI Interrupt Link [LP07] (IRQs *5)
Linux Plug and Play Support v0.97 (c) Adam Belay
usbcore: registered new driver usbfs
usbcore: registered new driver hub
PCI: Using ACPI for IRQ routing
ACPI: PCI interrupt 0000:00:02.0[A] -> GSI 16 (level, low) ->
IRQ 169
ACPI: PCI interrupt 0000:00:04.0[A] -> GSI 16 (level, low) ->
IRQ 169
ACPI: PCI interrupt 0000:00:06.0[A] -> GSI 16 (level, low) ->
IRQ 169
ACPI: PCI interrupt 0000:00:07.0[A] -> GSI 16 (level, low) ->
IRQ 169
ACPI: PCI interrupt 0000:00:1d.0[A] -> GSI 16 (level, low) ->
IRQ 169
ACPI: PCI interrupt 0000:00:1d.1[B] -> GSI 19 (level, low) ->
IRQ 177
ACPI: PCI interrupt 0000:00:1d.7[D] -> GSI 23 (level, low) ->
IRQ 185
ACPI: PCI interrupt 0000:00:1f.2[A]: no GSI
ACPI: PCI interrupt 0000:00:1f.3[B] -> GSI 17 (level, low) ->
IRQ 193
ACPI: PCI interrupt 0000:06:00.0[A] -> GSI 16 (level, low) ->
IRQ 169
ACPI: PCI interrupt 0000:07:00.0[A] -> GSI 16 (level, low) ->
IRQ 169
ACPI: PCI interrupt 0000:01:01.0[A] -> GSI 16 (level, low) ->
IRQ 169
testing the IO APIC.......................
Using vector-based indexing
.................................... done.
PCI: Cannot allocate resource region 1 of device 0000:00:00.0
vesafb: probe of vesafb0 failed with error -6
IBM machine detected. Enabling interrupts during APM calls.
apm: BIOS not found.
audit: initializing netlink socket (disabled)
audit(1102008222.359:0): initialized
highmem bounce pool size: 64 pages
Total HugeTLB memory allocated, 0
VFS: Disk quotas dquot_6.5.1
Dquot-cache hash table entries: 1024 (order 0, 4096 bytes)
SELinux: Registering netfilter hooks
Initializing Cryptographic API
ksign: Installing public key data
Loading keyring
- Added public key 9488DB81FF525AA3
- User ID: Red Hat, Inc. (Kernel Module GPG key)
ksign: invalid packet (ctb=00)
Unable to load default keyring: error=74
pci_hotplug: PCI Hot Plug PCI Core version: 0.5
ACPI: Processor [CPU3] (supports C1)
ACPI: Processor [CPU2] (supports C1)
ACPI: Processor [CPU1] (supports C1)
ACPI: Processor [CPU0] (supports C1)
isapnp: Scanning for PnP cards...
isapnp: No Plug & Play device found
Real Time Clock Driver v1.12
Linux agpgart interface v0.100 (c) Dave Jones
Serial: 8250/16550 driver $Revision: 1.90 $ 8 ports, IRQ
sharing enabled
ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A
RAMDISK driver initialized: 16 RAM disks of 16384K size 1024
blocksize
Uniform Multi-Platform E-IDE driver Revision: 7.00alpha2
ide: Assuming 33MHz system bus speed for PIO modes;
override with idebus=xx
hda: HL-DT-STDVD-ROM GDR8083N, ATAPI CD/DVD-ROM drive
ide1: I/O resource 0x170-0x177 not free.
ide1: ports already in use, skipping probe
Using cfq io scheduler
ide0 at 0x1f0-0x1f7,0x3f6 on irq 14
hda: ATAPI 24X DVD-ROM drive, 256kB Cache
Uniform CD-ROM driver Revision: 3.20
ide-floppy driver 0.99.newide
usbcore: registered new driver hiddev
usbcore: registered new driver usbhid
drivers/usb/input/hid-core.c: v2.0:USB HID core driver
mice: PS/2 mouse device common for all mice
serio: i8042 AUX port at 0x60,0x64 irq 12
input: PS/2 Logitech Mouse on isa0060/serio1
serio: i8042 KBD port at 0x60,0x64 irq 1
input: AT Translated Set 2 keyboard on isa0060/serio0
md: md driver 0.90.0 MAX_MD_DEVS=256, MD_SB_DISKS=27
NET: Registered protocol family 2
IP: routing cache hash table of 32768 buckets, 512Kbytes
TCP: Hash tables configured (established 262144 bind 43690)
Initializing IPsec netlink socket
NET: Registered protocol family 1
NET: Registered protocol family 17
ACPI: (supports S0 S4 S5)
md: Autodetecting RAID arrays.
md: autorun ...
md: ... autorun DONE.
RAMDISK: Compressed image found at block 0
VFS: Mounted root (ext2 filesystem).
Red Hat nash version 3.5.22 starting
Mounted /proc filesystem
SCSI subsystem initialized
Mounting sysfs
Loading scsi_mod.ko module
Loading sd_mod.ko module
Loading libata.ko module
Loading ata_piix.ko module
ata_piix: combined mode detected
ACPI: PCI interrupt 0000:00:1f.2[A]: no GSI
ata: 0x1f0 IDE port busy
ata1: SATA max UDMA/133 cmd 0x170 ctl 0x376 bmdma
0x488 irq 15
ata1: dev 0 ATA, max UDMA/133, 488397168 sectors: lba48
ata1: dev 1 ATA, max UDMA/133, 488397168 sectors: lba48
ata1: dev 0 configured for UDMA/133
ata1: dev 1 configured for UDMA/133
scsi0 : ata_piix
Vendor: ATA Model: Maxtor 7Y250M0 Rev: YAR5
Type: Direct-Access ANSI SCSI revision: 05
SCSI device sda: 488397168 512-byte hdwr sectors (250059
MB)
SCSI device sda: drive cache: write through
sda: sda1 sda2 sda3
Attached scsi disk sda at scsi0, channel 0, id 0, lun 0
Vendor: ATA Model: Maxtor 7Y250M0 Rev: YAR5
Type: Direct-Access ANSI SCSI revision: 05
SCSI device sdb: 488397168 512-byte hdwr sectors (250059
MB)
SCSI device sdb: drive cache: write through
sdb: sdb1
Attached scsi disk sdb at scsi0, channel 0, id 1, lun 0
Loading raid1.ko module
md: raid1 personality registered as nr 3
Loading jbd.ko module
Loading ext3.ko module
Loading dm-mod.ko module
device-mapper: 4.1.0-ioctl (2003-12-10) initialised:
[EMAIL PROTECTED]
md: Autodetecting RAID arrays.
md: autorun ...
md: ... autorun DONE.
Creating block devices
Making device-mapper control node
Scanning logical volumes
Reading all physical volumes. This may take a while...
No volume groups found
ERROR: /bin/lvm exited abnormally!
Activating logical volumes
No volume groups found
ERROR: /bin/lvm exited abnormally!
Making device nodes
No volume groups found
ERROR: /bin/lvm exited abnormally!
Mounting root filesystem
mount: error 2 mounting ext3
pivotroot: pivot_root(/sysroot,/sysroot/initrd) failed: 2
umount /initrd/proc failed: 2
Freeing unused kernel memory: 176k freed
Kernel panic: No init found. Try passing init= option to kernel.
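
For anyone comparing a log like this against their fstab, here is a minimal sketch (not part of OSCAR or SystemImager) that pulls the partition names out of the kernel's partition-scan lines and checks whether a given root device was actually detected. The function names are made up for illustration, and the root device passed in is a placeholder, since the comment above does not say which partition fstab points at.

```python
import re

# Matches kernel partition-scan lines such as " sda: sda1 sda2 sda3".
# The backreference \1 requires every listed partition to belong to the
# disk named before the colon, which filters out unrelated log lines.
PARTITION_LINE = re.compile(r"^\s*(sd[a-z]+|hd[a-z]+):((?:\s+\1\d+)+)\s*$")


def detected_partitions(boot_log):
    """Collect the partition names reported by the kernel's partition scan."""
    found = set()
    for line in boot_log.splitlines():
        match = PARTITION_LINE.match(line)
        if match:
            found.update(match.group(2).split())
    return found


def root_is_visible(boot_log, root_device):
    """Return True if root_device (e.g. '/dev/sda3') was listed by the kernel."""
    name = root_device.rsplit("/", 1)[-1]
    return name in detected_partitions(boot_log)
```

With the two scan lines from the log above (`sda: sda1 sda2 sda3` and `sdb: sdb1`), `root_is_visible(log, "/dev/sda3")` is true, while a root entry such as `/dev/sda6` would not be found. A mismatch of that kind would be one possible explanation for "mount: error 2", but this check alone cannot confirm it.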
----------------------------------------------------------------------
Comment By: Fernando Laudares Camargos (laudares)
Date: 2004-12-02 13:56
Message:
Logged In: YES
user_id=931808
Yes, a SCSI Maxtor ATLAS10k4
(2.4.21-15.EL)
----------------------------------------------------------------------
Comment By: Nobody/Anonymous (nobody)
Date: 2004-12-02 13:32
Message:
Logged In: NO
We have tried this on a head node that is identical to our
client node, and still have the problem.
I think it is because SystemImager, or more specifically its
2.4.25 kernel, uses /dev/hdX (we have serial ATA disks) and 2.6
uses /dev/sdX for the serial drives. When you say you have a
SCSI drive, do you have real SCSI or serial ATA?
I updated my initrd image from a working system, and then
the kernel was able to see root, but I receive a panic when
the kernel tries to mount / to load init. I have updated
grub.conf and /etc/fstab as well as the initrd. I have no idea
why the system won't boot after I have updated the files.
The kernel complains about an ext3 error when trying to
mount root. I upgraded my partitions from ext2 to ext3 and I
still get the same error, which now results in the kernel not
finding init. One more step to get through. Oh yes, I also
updated modprobe.conf.
Jason
[EMAIL PROTECTED]
----------------------------------------------------------------------
Comment By: Fernando Laudares Camargos (laudares)
Date: 2004-12-02 13:30
Message:
Logged In: YES
user_id=931808
We had this same problem with RHEL 3 (ia64), and that was due
to the fact that the system was recognizing our CD-ROM as
the HD... To correct this we had to modify DISKORDER
in the oscarimage.master script
(/var/lib/systemimager/scripts)
from: echo DISKORDER=${DISKORDER=hd,sd,cciss,ida,rd}
to: echo DISKORDER=${DISKORDER=sd,hd,cciss,ida,rd}
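
To show why swapping the first two entries matters, here is a small sketch that models DISKORDER as a list of device-name prefixes tried in order. This is an assumption about how the value is used, not SystemImager's actual code, and `order_disks` is a hypothetical helper.

```python
def order_disks(devices, diskorder="sd,hd,cciss,ida,rd"):
    """Sort device names by the position of their prefix in diskorder.

    Devices whose prefix is not listed sort last; ties are broken by name.
    """
    prefixes = diskorder.split(",")

    def rank(device):
        for index, prefix in enumerate(prefixes):
            if device.startswith(prefix):
                return index
        return len(prefixes)

    return sorted(devices, key=lambda device: (rank(device), device))


# Old default: the IDE CD-ROM (hda) comes before the SCSI disks.
print(order_disks(["sda", "hda", "sdb"], "hd,sd,cciss,ida,rd"))
# New order: the SCSI disks come first.
print(order_disks(["sda", "hda", "sdb"], "sd,hd,cciss,ida,rd"))
```

Under this model the old order puts `hda` first, which matches the symptom of the CD-ROM being treated as the hard disk, and the new order puts `sda` first.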
----------------------------------------------------------------------
Comment By: Nobody/Anonymous (nobody)
Date: 2004-12-02 13:22
Message:
Logged In: NO
Our head node is the same as our client nodes, and we have
this problem. We are using serial ATA, actually. I think that
SystemImager (2.4.25) uses the device as /dev/hdX and
the newer kernels use the device as /dev/sdX....
After updating the initrd on our client image with one that
supports SCSI, the kernel is able to find root, but it then can't
mount /root to find init. I updated /etc/fstab on the
client, but this did not seem to help. Can I build an image
from a working system with OSCAR? (I know how to do this
with SystemImager, but OSCAR does not give me the option to
select that image when defining clients.)
Jason
[EMAIL PROTECTED]
----------------------------------------------------------------------
Comment By: Nobody/Anonymous (nobody)
Date: 2004-12-02 05:08
Message:
Logged In: NO
I think this is related to the same errors seen under the pvfs
bug. I had to copy over a working initrd to my SystemImager
image directory in order to get the image to boot. It looks
like something is having a problem building the initrd
properly. The initrd that was built by default did not have the
SCSI modules in it.
Jason
[EMAIL PROTECTED]
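
A quick way to confirm a missing-module initrd is to list its contents and look for the module files. The sketch below assumes you already have that listing as a list of paths (how you unpack the initrd is up to you), and the default `required` set is only an assumption based on the modules usually needed for SCSI disks; the controller driver itself is hardware-specific and has to be added by the caller. `missing_modules` is a hypothetical helper, not an OSCAR tool.

```python
import os


def missing_modules(initrd_paths, required=("scsi_mod", "sd_mod")):
    """Return the required module names that do not appear in initrd_paths.

    Module files end in .ko on 2.6 kernels and .o on 2.4 kernels.
    """
    present = set()
    for path in initrd_paths:
        base = os.path.basename(path)
        if base.endswith(".ko"):
            present.add(base[:-3])
        elif base.endswith(".o"):
            present.add(base[:-2])
    return [name for name in required if name not in present]


# A listing with only filesystem modules reports both SCSI modules missing.
print(missing_modules(["lib/jbd.ko", "lib/ext3.ko"]))
```

An empty result only says the files are present; it does not prove the initrd's startup script actually loads them.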
----------------------------------------------------------------------
Comment By: Bernard Li (bernardli)
Date: 2004-11-30 11:13
Message:
Logged In: YES
user_id=879102
- code checked into SVN
- release notes will be updated
- SVN r2764
----------------------------------------------------------------------
Comment By: Bernard Li (bernardli)
Date: 2004-11-26 14:15
Message:
Logged In: YES
user_id=879102
- I think I have a working solution, which is to run
systemconfigurator after /etc/modprobe.conf is generated
- I will perform some more testing before committing it
- The user will still be able to copy over a working
modprobe.conf if they choose to
----------------------------------------------------------------------
Comment By: Bernard Li (bernardli)
Date: 2004-11-23 15:18
Message:
Logged In: YES
user_id=879102
- the fix is, if the headnode has a SCSI HD, simply copy
/etc/modprobe.conf to
/var/lib/systemimager/images/oscarimage/etc (on the headnode)
and delete the script
/var/lib/systemimager/scripts/post-install/22all.generate_modprobe_script
- this worked on a cluster that I tested on during SC04
- I will look into an alternative solution, which is to hardcode
the necessary SCSI module(s) in /etc/modprobe.conf after
it's generated from /etc/modules.conf - this may work as I
have noticed that the necessary module is 'generic'
- this setup should work for both SCSI and IDE drives
- ultimately I believe this has to be fixed for SIS...
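
The manual workaround in the first item above can be sketched as a small script. This is only an illustration of the two steps (copy the head node's modprobe.conf into the image, remove the post-install script), with all paths passed in by the caller; `apply_workaround` is a hypothetical name, and the sketch assumes the image's etc directory already exists.

```python
import os
import shutil


def apply_workaround(head_modprobe_conf, image_root, post_install_script):
    """Copy modprobe.conf into the image and remove the generator script.

    Returns the path of the copied file inside the image.
    """
    dest = os.path.join(image_root, "etc", "modprobe.conf")
    shutil.copy2(head_modprobe_conf, dest)
    # Remove the script so it cannot regenerate modprobe.conf on the node.
    if os.path.exists(post_install_script):
        os.remove(post_install_script)
    return dest
```

On a head node laid out as described in this comment, the call would be `apply_workaround("/etc/modprobe.conf", "/var/lib/systemimager/images/oscarimage", "/var/lib/systemimager/scripts/post-install/22all.generate_modprobe_script")`. Moving the script aside instead of deleting it would make the change easier to undo.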
----------------------------------------------------------------------
Comment By: Nobody/Anonymous (nobody)
Date: 2004-11-22 12:37
Message:
Logged In: NO
Bernard, please explain your fix; then we'll determine if
this is a 4.0 item (i.e. before moving to level=9).
----------------------------------------------------------------------
Comment By: Ed Hill (edhill)
Date: 2004-11-17 08:08
Message:
Logged In: YES
user_id=79627
We also had this problem and it was because the SCSI drivers
we needed were missing from the initrd. And I *think* this
happened because our compute nodes are SCSI while the head
node is IDE.
The ways to fix it are to either build a custom kernel that
has the needed SCSI drivers compiled in or to make sure that
OSCAR has an initrd that has the necessary SCSI drivers.
So how can we specify a specific initrd for a node or a set
of nodes (say, if some nodes are SCSI and some aren't)?
Ed <[EMAIL PROTECTED]>
----------------------------------------------------------------------
_______________________________________________
Oscar-devel mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/oscar-devel