Do you have ksymoops? Back up your log file to /var/log/ksymoops/`date`
and decode the tracebacks. Which kernel is this? The standard Debian one?
If so, send a mail to [EMAIL PROTECTED] and ask him whether he knows what
this is. The trace points into the raid (md) module, but is that because
the SCSI error is not handled properly, or is it really an LVM problem?
That is the question. Personally, I really don't know the FS layer well,
so I can't tell you much more.
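
In case it helps before feeding things to ksymoops, here is a minimal
Python sketch (my own, not part of ksymoops) that pulls just the oops out
of kern.log, from the "kernel BUG at" line through the "Code:" line. The
syslog prefix pattern is an assumption based on the lines quoted below,
and extract_oops is a hypothetical helper name.

```python
import re

# Syslog prefix such as "Oct 24 05:04:44 nabiki kernel: "
# (format assumed from the quoted log, not taken from any spec).
PREFIX = re.compile(r"^\w{3}\s+\d+\s+[\d:]{8}\s+\S+\s+kernel:\s?")


def extract_oops(lines):
    """Return the lines from 'kernel BUG at'/'Oops' through 'Code:', prefix stripped."""
    out = []
    inside = False
    for line in lines:
        text = PREFIX.sub("", line.rstrip("\n"))
        if not inside and ("kernel BUG at" in text or text.startswith("Oops")):
            inside = True
        if inside:
            out.append(text)
            if text.startswith("Code:"):
                break  # the Code: line closes the oops report
    return out


if __name__ == "__main__":
    import sys
    # Usage (hypothetical): python extract_oops.py < /var/log/kern.log
    for text in extract_oops(sys.stdin):
        print(text)
```

The output can then be saved next to the full log and decoded separately.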

On Mon, 2003-10-27 at 18:58, Vincent Jamart wrote:
> My server @home crashed with this in kern.log. The SCSI disks with LVM
> on them seem to be the cause. It is as if they went to sleep and never
> woke up again on an IRQ. This is the second time it has happened in two
> weeks. Note that the 5th device on the SCSI chain is an 8mm tape drive
> that I power off after the weekly backup (it has the active terminator).
> 
> If I boot it normally after a clean shutdown, lvm sees no VG when
> scanning, and I have to restore the VGDA on each HDD (/dev/sdaN1). After
> that, a vgscan sees the whole VG and the LVs are active. I can then work
> normally on the reiserfs LVs (resize etc. work fine):
> 
> vgscan
> vgdisplay -- no volume groups found
> vgcfgrestore -f /etc/lvmconf/doc_vg.conf -n doc_vg /dev/sda1
> vgcfgrestore -- VGDA for "doc_vg" successfully restored to physical volume "/dev/sda1"
> ...
> vgscan
> vgchange -ay 
> vgchange -- volume group "doc_vg" successfully activated
> [EMAIL PROTECTED]:/LOG# vgdisplay
> --- Volume group ---
> VG Name               doc_vg
> VG Access             read/write
> VG Status             available/resizable
> VG #                  0
> MAX LV                256
> Cur LV                2
> Open LV               0
> MAX LV Size           2 TB
> Max PV                256
> Cur PV                4
> Act PV                4
> VG Size               16.62 GB
> PE Size               32 MB
> Total PE              532
> Alloc PE / Size       288 / 9 GB
> Free  PE / Size       244 / 7.62 GB
> VG UUID               gcyqUX-080P-l7px-arts-BEA3-v1UG-YH9ec2
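
A quick sanity check on those figures, as a small Python sketch: with 32 MB
extents, the PE counts should reproduce the sizes vgdisplay prints. The
helper name pe_to_gb is hypothetical, and I am assuming vgdisplay counts
1 GB as 1024 MB.

```python
PE_SIZE_MB = 32  # "PE Size" from the vgdisplay output above


def pe_to_gb(extents, pe_size_mb=PE_SIZE_MB):
    """Convert a number of physical extents to GB (assuming 1 GB = 1024 MB)."""
    return extents * pe_size_mb / 1024


total, alloc, free = 532, 288, 244

# Allocated plus free extents should account for the whole VG.
assert alloc + free == total

print(pe_to_gb(total))  # 16.625 (displayed as 16.62 GB)
print(pe_to_gb(alloc))  # 9.0    (displayed as 9 GB)
print(pe_to_gb(free))   # 7.625  (displayed as 7.62 GB)
```

So the restored VGDA is at least internally consistent.
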
> 
> [EMAIL PROTECTED]:/LOG# mount -a
> [EMAIL PROTECTED]:/LOG# df
> Filesystem           1K-blocks      Used Available Use% Mounted on
> /dev/hda1              1269056   1151896    117160  91% /
> /dev/hda2              2947828   1141336   1806492  39% /data
> /dev/hdb2               814432     32896    781536   5% /data/ftp
> /dev/doc_vg/lv_cd01    6291260   2599396   3691864  42% /data/www/documentation
> /dev/doc_vg/lv_pg_data
>                        3145628    327912   2817716  11% /var/lib/postgres/data
> 
> and at that point everything is OK.
> 
> That is a first problem, not a blocking one, but to be solved ASAP. Last
> week's crash happened when I added PPs to an LV: one of the 4 disks
> (sdc1) had all its PPs free, and during the resize, crash/bang. After
> unmounting everything and removing the scsi and lvm modules, I powered
> my SCSI tower off and back on, ran modprobe again and remounted
> everything (still after restoring the VGDA), and it was then able to
> grow the LV and the reiserfs, as if disk sdc were OK. This morning, the
> same thing again, but during file I/O.
> 
> There are 4 disks of 4 GB taken from out-of-service pSeries machines,
> and an Exabyte 8mm tape drive.
> 
> Here is the dump (sorry about the size), in case any of you have seen
> this before... I will dig into it on my side tonight:
>  
> Oct 24 05:02:38 nabiki kernel: scsi0:0:2:0: Attempting to queue an ABORT 
> message
> Oct 24 05:02:38 nabiki kernel: scsi0: Dumping Card State in Data-in phase, 
> at SEQADDR 0x9d
> Oct 24 05:02:38 nabiki kernel: ACCUM = 0x0, SINDEX = 0x8, DINDEX = 0x8f, 
> ARG_2 = 0xff
> Oct 24 05:02:38 nabiki kernel: HCNT = 0x0 SCBPTR = 0x1
> Oct 24 05:02:38 nabiki kernel: SCSISEQ = 0x12, SBLKCTL = 0x0
> Oct 24 05:02:38 nabiki kernel:  DFCNTRL = 0x0, DFSTATUS = 0x28
> Oct 24 05:02:38 nabiki kernel: LASTPHASE = 0x40, SCSISIGI = 0x44, SXFRCTL0 
> = 0xa8
> Oct 24 05:02:38 nabiki kernel: SSTAT0 = 0x7, SSTAT1 = 0x2
> Oct 24 05:02:38 nabiki kernel: STACK == 0x9a, 0x19b, 0x15a, 0x0
> Oct 24 05:02:38 nabiki kernel: SCB count = 20
> Oct 24 05:02:38 nabiki kernel: Kernel NEXTQSCB = 5
> Oct 24 05:02:38 nabiki kernel: Card NEXTQSCB = 11
> Oct 24 05:02:38 nabiki kernel: QINFIFO entries: 11
> Oct 24 05:02:38 nabiki kernel: Waiting Queue entries:
> Oct 24 05:02:38 nabiki kernel: Disconnected Queue entries:
> Oct 24 05:02:38 nabiki kernel: QOUTFIFO entries:
> Oct 24 05:02:38 nabiki kernel: Sequencer Free SCB List: 2 0
> Oct 24 05:02:38 nabiki kernel: Sequencer SCB Info: 0(c 0x68, s 0x27, l 0, 
> t 0xff) 1(c 0x68, s 0x27, l 0, t 0x0) 2(c 0x68, s 0x27, l 0, t 0xff)
> Oct 24 05:02:38 nabiki kernel: Pending list: 11(c 0x68, s 0x27, l 0), 0(c 
> 0x68, s 0x27, l 0)
> Oct 24 05:02:38 nabiki kernel: Kernel Free SCB list: 14 2 9 13 4 3 1 19 7 
> 8 10 6 12 15 18 17 16
> Oct 24 05:02:38 nabiki kernel: DevQ(0:0:0): 0 waiting
> Oct 24 05:02:38 nabiki kernel: DevQ(0:2:0): 0 waiting
> Oct 24 05:02:38 nabiki kernel: DevQ(0:3:0): 0 waiting
> Oct 24 05:02:38 nabiki kernel: DevQ(0:5:0): 0 waiting
> Oct 24 05:02:38 nabiki kernel: DevQ(0:6:0): 0 waiting
> Oct 24 05:02:38 nabiki kernel: scsi0:0:2:0: Device is active, asserting 
> ATN
> Oct 24 05:02:38 nabiki kernel: Recovery code sleeping
> Oct 24 05:02:38 nabiki kernel: Recovery code awake
> Oct 24 05:02:38 nabiki kernel: aic7xxx_abort returns 0x2002
> Oct 24 05:02:48 nabiki kernel: scsi0:0:2:0: Attempting to queue an ABORT 
> message
> Oct 24 05:02:48 nabiki kernel: scsi0: Dumping Card State in Data-in phase, 
> at SEQADDR 0x9d
> Oct 24 05:02:48 nabiki kernel: ACCUM = 0x0, SINDEX = 0x8, DINDEX = 0x8f, 
> ARG_2 = 0xff
> Oct 24 05:02:48 nabiki kernel: HCNT = 0x0 SCBPTR = 0x1
> Oct 24 05:02:48 nabiki kernel: SCSISEQ = 0x12, SBLKCTL = 0x0
> Oct 24 05:02:48 nabiki kernel:  DFCNTRL = 0x0, DFSTATUS = 0x28
> Oct 24 05:02:48 nabiki kernel: LASTPHASE = 0x40, SCSISIGI = 0x54, SXFRCTL0 
> = 0xa8
> Oct 24 05:02:48 nabiki kernel: SSTAT0 = 0x7, SSTAT1 = 0x2
> Oct 24 05:02:48 nabiki kernel: STACK == 0x9a, 0x19b, 0x15a, 0x0
> Oct 24 05:02:48 nabiki kernel: SCB count = 20
> Oct 24 05:02:48 nabiki kernel: Kernel NEXTQSCB = 14
> Oct 24 05:02:48 nabiki kernel: Card NEXTQSCB = 11
> Oct 24 05:02:48 nabiki kernel: QINFIFO entries: 11 5
> Oct 24 05:02:48 nabiki kernel: Waiting Queue entries:
> Oct 24 05:02:48 nabiki kernel: Disconnected Queue entries:
> Oct 24 05:02:48 nabiki kernel: QOUTFIFO entries:
> Oct 24 05:02:48 nabiki kernel: Sequencer Free SCB List: 2 0
> Oct 24 05:02:48 nabiki kernel: Sequencer SCB Info: 0(c 0x68, s 0x27, l 0, 
> t 0xff) 1(c 0x68, s 0x27, l 0, t 0x0) 2(c 0x68, s 0x27, l 0, t 0xff)
> Oct 24 05:02:48 nabiki kernel: Pending list: 5(c 0x68, s 0x27, l 0), 11(c 
> 0x68, s 0x27, l 0), 0(c 0x68, s 0x27, l 0)
> Oct 24 05:02:48 nabiki kernel: Kernel Free SCB list: 2 9 13 4 3 1 19 7 8 
> 10 6 12 15 18 17 16
> Oct 24 05:02:48 nabiki kernel: DevQ(0:0:0): 0 waiting
> Oct 24 05:02:48 nabiki kernel: DevQ(0:2:0): 0 waiting
> Oct 24 05:02:48 nabiki kernel: DevQ(0:3:0): 0 waiting
> Oct 24 05:02:48 nabiki kernel: DevQ(0:5:0): 0 waiting
> Oct 24 05:02:48 nabiki kernel: DevQ(0:6:0): 0 waiting
> Oct 24 05:02:48 nabiki kernel: scsi0:0:2:0: Cmd aborted from QINFIFO
> Oct 24 05:02:48 nabiki kernel: aic7xxx_abort returns 0x2002
> Oct 24 05:02:48 nabiki kernel: scsi0:0:2:0: Attempting to queue an ABORT 
> message
> Oct 24 05:02:48 nabiki kernel: scsi0: Dumping Card State in Data-in phase, 
> at SEQADDR 0x9d
> Oct 24 05:02:48 nabiki kernel: ACCUM = 0x0, SINDEX = 0x8, DINDEX = 0x8f, 
> ARG_2 = 0xff
> Oct 24 05:02:48 nabiki kernel: HCNT = 0x0 SCBPTR = 0x1
> Oct 24 05:02:48 nabiki kernel: SCSISEQ = 0x12, SBLKCTL = 0x0
> Oct 24 05:02:48 nabiki kernel:  DFCNTRL = 0x0, DFSTATUS = 0x28
> Oct 24 05:02:48 nabiki kernel: LASTPHASE = 0x40, SCSISIGI = 0x54, SXFRCTL0 
> = 0xa8
> Oct 24 05:02:48 nabiki kernel: SSTAT0 = 0x7, SSTAT1 = 0x2
> Oct 24 05:02:48 nabiki kernel: STACK == 0x9a, 0x19b, 0x15a, 0x0
> Oct 24 05:02:48 nabiki kernel: SCB count = 20
> Oct 24 05:02:48 nabiki kernel: Kernel NEXTQSCB = 11
> Oct 24 05:02:48 nabiki kernel: Card NEXTQSCB = 14
> Oct 24 05:02:48 nabiki kernel: QINFIFO entries: 14
> Oct 24 05:02:48 nabiki kernel: Waiting Queue entries:
> Oct 24 05:02:48 nabiki kernel: Disconnected Queue entries:
> Oct 24 05:02:48 nabiki kernel: QOUTFIFO entries:
> Oct 24 05:02:48 nabiki kernel: Sequencer Free SCB List: 2 0
> Oct 24 05:02:48 nabiki kernel: Sequencer SCB Info: 0(c 0x68, s 0x27, l 0, 
> t 0xff) 1(c 0x68, s 0x27, l 0, t 0x0) 2(c 0x68, s 0x27, l 0, t 0xff)
> Oct 24 05:02:48 nabiki kernel: Pending list: 14(c 0x68, s 0x27, l 0), 0(c 
> 0x68, s 0x27, l 0)
> Oct 24 05:02:48 nabiki kernel: Kernel Free SCB list: 5 2 9 13 4 3 1 19 7 8 
> 10 6 12 15 18 17 16
> Oct 24 05:02:48 nabiki kernel: DevQ(0:0:0): 0 waiting
> Oct 24 05:02:48 nabiki kernel: DevQ(0:2:0): 0 waiting
> Oct 24 05:02:48 nabiki kernel: DevQ(0:3:0): 0 waiting
> ...
> Oct 24 05:04:44 nabiki kernel: scsi0:0:2:0: Cmd aborted from QINFIFO
> Oct 24 05:04:44 nabiki kernel: aic7xxx_abort returns 0x2002
> Oct 24 05:04:44 nabiki kernel: scsi: device set offline - not ready or 
> command retry failed after bus reset: host 0 channel 0 id 2 lun 0
> Oct 24 05:04:44 nabiki kernel: SCSI disk error : host 0 channel 0 id 2 lun 
> 0 return code = 50000
> Oct 24 05:04:44 nabiki kernel:  I/O error: dev 08:11, sector 3949040
> Oct 24 05:04:44 nabiki kernel:  I/O error: dev 08:11, sector 3949048
> Oct 24 05:04:44 nabiki kernel: SCSI disk error : host 0 channel 0 id 2 lun 
> 0 return code = 3f0000
> Oct 24 05:04:44 nabiki kernel:  I/O error: dev 08:11, sector 4012624
> Oct 24 05:04:44 nabiki kernel:  I/O error: dev 08:11, sector 4012632
> Oct 24 05:04:44 nabiki kernel: journal-601, buffer write failed
> Oct 24 05:04:44 nabiki kernel: kernel BUG at prints.c:334!
> Oct 24 05:04:44 nabiki kernel: invalid operand: 0000
> Oct 24 05:04:44 nabiki kernel: CPU:    0
> Oct 24 05:04:44 nabiki kernel: EIP:    
> 0010:[md:__insmod_md_O/lib/modules/2.4.20-k6/kernel/drivers/md/md.o_+-2793383/96]    
> Not tainted
> Oct 24 05:04:44 nabiki kernel: EFLAGS: 00010282
> Oct 24 05:04:44 nabiki kernel: eax: 00000024   ebx: d08a8340   ecx: 
> 00000001   edx: 00000001
> Oct 24 05:04:44 nabiki kernel: esi: c50abc00   edi: c50abc00   ebp: 
> 0000000d   esp: c13a3ee0
> Oct 24 05:04:44 nabiki kernel: ds: 0018   es: 0018   ss: 0018
> Oct 24 05:04:44 nabiki kernel: Process kupdated (pid: 6, 
> stackpage=c13a3000)
> Oct 24 05:04:44 nabiki kernel: Stack: d08a67da d08aa420 d08a8340 c13a3f04 
> d0d7ad88 00000000 d089f0be c50abc00
> Oct 24 05:04:44 nabiki kernel:        d08a8340 00000025 00000012 00000010 
> 00000000 d0d7adbc d0d7adb0 0000000e
> Oct 24 05:04:44 nabiki kernel:        00000000 c77432c0 d08a27be c50abc00 
> d0d7ad88 00000001 c13a3f98 c50abc00
> Oct 24 05:04:44 nabiki kernel: Call Trace:    
> [md:__insmod_md_O/lib/modules/2.4.20-k6/kernel/drivers/md/md.o_+-2721830/96] 
> [md:__insmod_md_O/lib/modules/2.4.20-k6/kernel/drivers/md/md.o_+-2706400/96] 
> [md:__insmod_md_O/lib/modules/2.4.20-k6/kernel/drivers/md/md.o_+-2714816/96] 
> [md:__insmod_md_O/lib/modules/2.4.20-k6/kernel/drivers/md/md.o_+-2752322/96] 
> [md:__insmod_md_O/lib/modules/2.4.20-k6/kernel/drivers/md/md.o_+-2714816/96]
> Oct 24 05:04:44 nabiki kernel:   
> [md:__insmod_md_O/lib/modules/2.4.20-k6/kernel/drivers/md/md.o_+-2738242/96] 
> [md:__insmod_md_O/lib/modules/2.4.20-k6/kernel/drivers/md/md.o_+-2741571/96] 
> [md:__insmod_md_O/lib/modules/2.4.20-k6/kernel/drivers/md/md.o_+-2710961/96] 
> [md:__insmod_md_O/lib/modules/2.4.20-k6/kernel/drivers/md/md.o_+-2803643/96] 
> [sync_supers+222/288] [sync_old_buffers+14/68]
> Oct 24 05:04:44 nabiki kernel:   [kupdate+217/252] [kernel_thread+40/56]
> Oct 24 05:04:44 nabiki kernel:
> Oct 24 05:04:44 nabiki kernel: Code: 0f 0b 4e 01 e0 67 8a d0 68 20 a4 8a 
> d0 85 f6 74 16 0f b7 46
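
Side note on the "dev 08:11" in the I/O errors: 2.4 kernels print the
device there as major:minor in hex. Here is a minimal Python sketch for
decoding it, assuming the usual sd numbering (major 8, 16 minors per
disk). decode_dev and sd_name are hypothetical helper names.

```python
def decode_dev(dev):
    """Split a 'MM:mm' hex device string into (major, minor) integers."""
    major_hex, minor_hex = dev.split(":")
    return int(major_hex, 16), int(minor_hex, 16)


def sd_name(major, minor):
    """Name a SCSI disk or partition, assuming major 8 and 16 minors per disk."""
    if major != 8:
        raise ValueError("only the first SCSI disk major (8) is handled here")
    disk, part = divmod(minor, 16)
    name = "sd" + chr(ord("a") + disk)
    return name if part == 0 else f"{name}{part}"


major, minor = decode_dev("08:11")
print(major, minor)           # 8 17
print(sd_name(major, minor))  # sdb1
```

That gives sdb1, which would fit with SCSI id 2 being the second disk
detected on the chain, though I have not checked that on the machine.
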
> 
> _______________________________________________________
> Linux Mailing List - http://www.unixtech.be
> Subscribe/Unsubscribe: http://www.unixtech.be/mailman/listinfo/linux
> Archives: http://www.mail-archive.com/[EMAIL PROTECTED]
> IRC: efnet.unixtech.be:6667 - #unixtech
-- 

-> Jean-Francois Dive
--> [EMAIL PROTECTED]

I think that God in creating Man somewhat overestimated his ability.
-- Oscar Wilde
