On Thu, 2006-03-02 at 14:31 +0100, Welterlen Benoit wrote:
> Eric W. Biederman wrote:
> > Welterlen Benoit <[EMAIL PROTECTED]> writes:
> >
> >
> > > Hello,
> > >
> > > I made some tests with Khalid Aziz's patches but I have problems with SCSI
> > > devices reset :
> > >
> > > I tried to run kexec on an itanium system with this device :
> > > 03:01.0 SCSI storage controller: Adaptec AHA-3960D / AIC-7899A U160/m
> > > (rev 01)
> > > and this one :
> > > 03:01.0 SCSI storage controller: LSI Logic / Symbios Logic 53c1030 PCI-X
> > > Fusion-MPT Dual Ultra320 SCSI (rev 08)
> > >
> > > I tried on a x86 system with this SCSI device :
> > > 02:04.0 SCSI storage controller: Adaptec AIC-7899P U160/m (rev 01)
> > > and everything works well with a 2.6.15.4 kernel.
> > >
> > >
> > > On IA64, kexec can not reboot the second kernel.
> > > I tried with a 2.6.15 kernel and Khalid's patch, but I have the same
> > > result :
> > >
> >
> > So there is one simple workaround you can try for the pure kexec case:
> > remove the module before calling sys_reboot(LINUX_REBOOT_CMD_KEXEC).
> >
> > Beyond that, it looks like the driver of one kernel puts the card in
> > a state another kernel cannot get it out of. So I would send a bug
> > report to the driver maintainer.
> >
> >
> I tried to remove the module before kexec starts:
> - If I run a second kernel with an initrd, modules are not reloaded
> (udev doesn't see the device?!), so the / file system cannot be found.
> You said in another message that modules are not inserted in the
> second kernel?! ("But they are not inserted into the second kernel.")
> But if the kernel crashes in a module, you have to unload and reload
> the module to have a running kernel, is that right? So I don't
> understand why modules are not reloaded.
>
> - If I run a kernel with the modules built in, the SCSI device is
> correctly found, but the kernel cannot send any command to the device:
>
> Nova login: try to unload aic7xxx module !
> ----------------------------------------------
> ---------------------sd_remove------------------------
> disk sda:
> scsi0 (9:0): rejecting I/O to device being removed
> scsi0 (9:0): rejecting I/O to device being removed
> Buffer I/O error on device sda3, logical block 4343
> lost page write due to I/O error on sda3
> Aborting journal on device sda3.
> journal commit I/O error
> ext3_abort called.
> EXT3-fs error (device sda3): ext3_journal_start_sb: <0>journal commit
> I/O error
> Detected aborted journal
> Remounting filesystem read-only
> scsi0 (9:0): rejecting I/O to device being removed
> Buffer I/O error on device sda3, logical block 65539
> lost page write due to I/O error on sda3
> scsi0 (9:0): rejecting I/O to device being removed
> Buffer I/O error on device sda3, logical block 65544
> lost page write due to I/O error on sda3
> Buffer I/O error on device sda3, logical block 65545
> lost page write due to I/O error on sda3
> scsi0 (9:0): rejecting I/O to device being removed
> Buffer I/O error on device sda3, logical block 65549
> lost page write due to I/O error on sda3
> Buffer I/O error on device sda3, logical block 65550
> lost page write due to I/O error on sda3
> scsi0 (9:0): rejecting I/O to device being removed
> Buffer I/O error on device sda3, logical block 65701
> lost page write due to I/O error on sda3
> scsi0 (9:0): rejecting I/O to device being removed
> Buffer I/O error on device sda3, logical block 65702
> lost page write due to I/O error on sda3
> Buffer I/O error on device sda3, logical block 65703
> lost page write due to I/O error on sda3
> scsi0 (9:0): rejecting I/O to device being removed
> Buffer I/O error on device sda3, logical block 131074
> lost page write due to I/O error on sda3
> scsi0 (9:0): rejecting I/O to device being removed
> scsi0 (9:0): rejecting I/O to device being removed
> scsi0 (9:0): rejecting I/O to device being removed
> scsi0 (9:0): rejecting I/O to device being removed
> scsi0 (9:0): rejecting I/O to device being removed
> scsi0 (9:0): rejecting I/O to device being removed
> scsi0 (9:0): rejecting I/O to device being removed
> scsi0 (9:0): rejecting I/O to device being removed
> scsi0 (9:0): rejecting I/O to device being removed
> scsi0 (9:0): rejecting I/O to dead device
> EXT3-fs error (device sda3): ext3_find_entry: reading directory #33028
> offset 0
> Synchronizing SCSI cache for disk sdb:
> Starting new kernel
> scsi0 (9:0): rejecting I/O to dead device
> Starting new kernel
> scsi0 (9:0): rejecting I/O to dead device
> EXT3-fs error (device sda3): ext3_find_entry: reading directory
> #456994 offset 0
> Linux version 2.6.15.4 ([EMAIL PROTECTED]) (gcc version 3.4.4 20050721 (Red
> Hat 3.4.4-2)) #3 SMP Thu Mar 2 08:25:35 CET 2006
> ...
> scsi0 : Adaptec AIC7XXX EISA/VLB/PCI SCSI HBA DRIVER, Rev 7.0
> <Adaptec 3960D Ultra160 SCSI adapter>
> aic7899: Ultra160 Wide Channel A, SCSI Id=7, 32/253 SCBs
>
> isa bounce pool size: 16 pages
> 0:0:0:0: Attempting to queue an ABORT message
> CDB: 0x12 0x0 0x0 0x0 0x24 0x0
> scsi0: At time of recovery, card was paused
> >>>>>>>>>>>>>>>>>> Dump Card State Begins <<<<<<<<<<<<<<<<<
> scsi0: Dumping Card State while idle, at SEQADDR 0x17e
> Card was paused
> ACCUM = 0x0, SINDEX = 0x48, DINDEX = 0xe4, ARG_2 = 0xf
> HCNT = 0x0 SCBPTR = 0x0
> ...
>
>
> Is anybody else using an AIC7xxx device?
Sorry, I couldn't get to this earlier. I was out of town with no access
to email. Is your root disk connected to the Adaptec card? If not, make
sure you unmount any filesystems on all disks connected to the Adaptec
card and then rmmod aic7xxx. When you did not build the aic7xxx driver
into the kernel and attempted a kexec reboot, did you see the kexec'd
kernel mounting the initrd during bootup?
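
To make the order of operations concrete, here is a rough sketch of the
sequence I have in mind: unmount everything on the Adaptec-attached disks,
rmmod aic7xxx, then jump to the new kernel. The function names and the
mount points are made up for illustration, and it assumes the second
kernel was already loaded with "kexec -l". It only prints the commands
unless dry_run is turned off.

```python
import subprocess


def plan_kexec_reboot(mount_points, module="aic7xxx"):
    """Build the command sequence: unmount, unload the driver, then kexec.

    mount_points is a hypothetical list of filesystems that live on disks
    behind the Adaptec card; the root filesystem must not be among them.
    """
    plan = [["umount", mp] for mp in mount_points]
    plan.append(["rmmod", module])
    # Assumes a kernel was loaded beforehand with "kexec -l".
    plan.append(["kexec", "-e"])
    return plan


def run_plan(plan, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for argv in plan:
        print(" ".join(argv))
        if not dry_run:
            # Stop at the first failure, e.g. a busy filesystem or module.
            subprocess.run(argv, check=True)
```

For example, run_plan(plan_kexec_reboot(["/mnt/data"])) prints the three
commands in order without touching the system. If the umount or rmmod
step fails on a real run, the plan stops before kexec is reached.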
--
Khalid
====================================================================
Khalid Aziz Open Source and Linux Organization
(970)898-9214 Hewlett-Packard
[EMAIL PROTECTED] Fort Collins, CO
"The Linux kernel is subject to relentless development"
- Alessandro Rubini