My home server crashed with the output below in kern.log. The SCSI disks
with LVM on them seem to be the cause. It is as if they went to sleep and
never answered an IRQ again. This is the second time it has happened in 2
weeks. Note that the 5th device on the SCSI chain is an 8mm tape drive
that I power off after the weekly backup (it has the active terminator).

If I boot it normally after a clean shutdown, LVM finds no VG during the
scan, and I have to restore the VGDA on each disk (/dev/sdaN1). After
that, vgscan sees the whole VG and the LVs are active. I can then work
normally on the reiserfs LVs (resize etc. work fine):

vgscan
vgdisplay -- no volume groups found
vgcfgrestore -f /etc/lvmconf/doc_vg.conf -n doc_vg /dev/sda1
vgcfgrestore -- VGDA for "doc_vg" successfully restored to physical volume 
"/dev/sda1"
...
vgscan
vgchange -ay 
vgchange -- volume group "doc_vg" successfully activated
[EMAIL PROTECTED]:/LOG# vgdisplay
--- Volume group ---
VG Name               doc_vg
VG Access             read/write
VG Status             available/resizable
VG #                  0
MAX LV                256
Cur LV                2
Open LV               0
MAX LV Size           2 TB
Max PV                256
Cur PV                4
Act PV                4
VG Size               16.62 GB
PE Size               32 MB
Total PE              532
Alloc PE / Size       288 / 9 GB
Free  PE / Size       244 / 7.62 GB
VG UUID               gcyqUX-080P-l7px-arts-BEA3-v1UG-YH9ec2

[EMAIL PROTECTED]:/LOG# mount -a
[EMAIL PROTECTED]:/LOG# df
Filesystem           1K-blocks      Used Available Use% Mounted on
/dev/hda1              1269056   1151896    117160  91% /
/dev/hda2              2947828   1141336   1806492  39% /data
/dev/hdb2               814432     32896    781536   5% /data/ftp
/dev/doc_vg/lv_cd01    6291260   2599396   3691864  42% /data/www/documentation
/dev/doc_vg/lv_pg_data
                       3145628    327912   2817716  11% /var/lib/postgres/data

and at that point everything is OK.
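
The manual restore above could be scripted roughly as follows. This is
only a sketch: the PVS list is an assumption (the post names only
/dev/sda1 and sdc1 explicitly), and the run and restore_vgda helpers and
the DRY_RUN switch are made up for illustration. By default it only
prints the commands it would run.

```shell
#!/bin/sh
# Sketch: restore the doc_vg VGDA on every physical volume, then rescan.
# PVS is an ASSUMPTION; adjust it to the real physical volumes.
VG=doc_vg
CONF=/etc/lvmconf/doc_vg.conf
PVS="/dev/sda1 /dev/sdb1 /dev/sdc1 /dev/sdd1"
DRY_RUN=${DRY_RUN:-1}   # 1 (default): print commands only; 0: execute them

# Hypothetical helper: print the command in dry-run mode, else run it.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Hypothetical helper: same sequence as the manual session above.
restore_vgda() {
    for pv in $PVS; do
        run vgcfgrestore -f "$CONF" -n "$VG" "$pv" || return 1
    done
    run vgscan && run vgchange -ay
}

restore_vgda
```

Running it with DRY_RUN=0 as root would actually execute the commands, so
it is worth checking the printed list first.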

That is a first problem, not a blocking one, but to be fixed ASAP. Last
week's crash happened when I added PPs to an LV: one of the 4 disks
(sdc1) had all its PPs free, and during the resize, crash/bang. After
unmounting everything and removing the scsi and lvm modules, I powered my
SCSI tower off and back on, ran modprobe again and remounted everything
(after restoring the VGDA all the same), and it was then able to grow the
LV and the reiserfs, as if disk sdc were fine.
This morning, the same thing again, but during file I/O.

There are 4 disks of 4 GB each, taken from out-of-service pSeries
machines, and an Exabyte 8mm tape drive.

Here is the dump (sorry about the size), in case any of you have already
run into this... I will look into it on my side tonight:
 
Oct 24 05:02:38 nabiki kernel: scsi0:0:2:0: Attempting to queue an ABORT 
message
Oct 24 05:02:38 nabiki kernel: scsi0: Dumping Card State in Data-in phase, 
at SEQADDR 0x9d
Oct 24 05:02:38 nabiki kernel: ACCUM = 0x0, SINDEX = 0x8, DINDEX = 0x8f, 
ARG_2 = 0xff
Oct 24 05:02:38 nabiki kernel: HCNT = 0x0 SCBPTR = 0x1
Oct 24 05:02:38 nabiki kernel: SCSISEQ = 0x12, SBLKCTL = 0x0
Oct 24 05:02:38 nabiki kernel:  DFCNTRL = 0x0, DFSTATUS = 0x28
Oct 24 05:02:38 nabiki kernel: LASTPHASE = 0x40, SCSISIGI = 0x44, SXFRCTL0 
= 0xa8
Oct 24 05:02:38 nabiki kernel: SSTAT0 = 0x7, SSTAT1 = 0x2
Oct 24 05:02:38 nabiki kernel: STACK == 0x9a, 0x19b, 0x15a, 0x0
Oct 24 05:02:38 nabiki kernel: SCB count = 20
Oct 24 05:02:38 nabiki kernel: Kernel NEXTQSCB = 5
Oct 24 05:02:38 nabiki kernel: Card NEXTQSCB = 11
Oct 24 05:02:38 nabiki kernel: QINFIFO entries: 11
Oct 24 05:02:38 nabiki kernel: Waiting Queue entries:
Oct 24 05:02:38 nabiki kernel: Disconnected Queue entries:
Oct 24 05:02:38 nabiki kernel: QOUTFIFO entries:
Oct 24 05:02:38 nabiki kernel: Sequencer Free SCB List: 2 0
Oct 24 05:02:38 nabiki kernel: Sequencer SCB Info: 0(c 0x68, s 0x27, l 0, 
t 0xff) 1(c 0x68, s 0x27, l 0, t 0x0) 2(c 0x68, s 0x27, l 0, t 0xff)
Oct 24 05:02:38 nabiki kernel: Pending list: 11(c 0x68, s 0x27, l 0), 0(c 
0x68, s 0x27, l 0)
Oct 24 05:02:38 nabiki kernel: Kernel Free SCB list: 14 2 9 13 4 3 1 19 7 
8 10 6 12 15 18 17 16
Oct 24 05:02:38 nabiki kernel: DevQ(0:0:0): 0 waiting
Oct 24 05:02:38 nabiki kernel: DevQ(0:2:0): 0 waiting
Oct 24 05:02:38 nabiki kernel: DevQ(0:3:0): 0 waiting
Oct 24 05:02:38 nabiki kernel: DevQ(0:5:0): 0 waiting
Oct 24 05:02:38 nabiki kernel: DevQ(0:6:0): 0 waiting
Oct 24 05:02:38 nabiki kernel: scsi0:0:2:0: Device is active, asserting 
ATN
Oct 24 05:02:38 nabiki kernel: Recovery code sleeping
Oct 24 05:02:38 nabiki kernel: Recovery code awake
Oct 24 05:02:38 nabiki kernel: aic7xxx_abort returns 0x2002
Oct 24 05:02:48 nabiki kernel: scsi0:0:2:0: Attempting to queue an ABORT 
message
Oct 24 05:02:48 nabiki kernel: scsi0: Dumping Card State in Data-in phase, 
at SEQADDR 0x9d
Oct 24 05:02:48 nabiki kernel: ACCUM = 0x0, SINDEX = 0x8, DINDEX = 0x8f, 
ARG_2 = 0xff
Oct 24 05:02:48 nabiki kernel: HCNT = 0x0 SCBPTR = 0x1
Oct 24 05:02:48 nabiki kernel: SCSISEQ = 0x12, SBLKCTL = 0x0
Oct 24 05:02:48 nabiki kernel:  DFCNTRL = 0x0, DFSTATUS = 0x28
Oct 24 05:02:48 nabiki kernel: LASTPHASE = 0x40, SCSISIGI = 0x54, SXFRCTL0 
= 0xa8
Oct 24 05:02:48 nabiki kernel: SSTAT0 = 0x7, SSTAT1 = 0x2
Oct 24 05:02:48 nabiki kernel: STACK == 0x9a, 0x19b, 0x15a, 0x0
Oct 24 05:02:48 nabiki kernel: SCB count = 20
Oct 24 05:02:48 nabiki kernel: Kernel NEXTQSCB = 14
Oct 24 05:02:48 nabiki kernel: Card NEXTQSCB = 11
Oct 24 05:02:48 nabiki kernel: QINFIFO entries: 11 5
Oct 24 05:02:48 nabiki kernel: Waiting Queue entries:
Oct 24 05:02:48 nabiki kernel: Disconnected Queue entries:
Oct 24 05:02:48 nabiki kernel: QOUTFIFO entries:
Oct 24 05:02:48 nabiki kernel: Sequencer Free SCB List: 2 0
Oct 24 05:02:48 nabiki kernel: Sequencer SCB Info: 0(c 0x68, s 0x27, l 0, 
t 0xff) 1(c 0x68, s 0x27, l 0, t 0x0) 2(c 0x68, s 0x27, l 0, t 0xff)
Oct 24 05:02:48 nabiki kernel: Pending list: 5(c 0x68, s 0x27, l 0), 11(c 
0x68, s 0x27, l 0), 0(c 0x68, s 0x27, l 0)
Oct 24 05:02:48 nabiki kernel: Kernel Free SCB list: 2 9 13 4 3 1 19 7 8 
10 6 12 15 18 17 16
Oct 24 05:02:48 nabiki kernel: DevQ(0:0:0): 0 waiting
Oct 24 05:02:48 nabiki kernel: DevQ(0:2:0): 0 waiting
Oct 24 05:02:48 nabiki kernel: DevQ(0:3:0): 0 waiting
Oct 24 05:02:48 nabiki kernel: DevQ(0:5:0): 0 waiting
Oct 24 05:02:48 nabiki kernel: DevQ(0:6:0): 0 waiting
Oct 24 05:02:48 nabiki kernel: scsi0:0:2:0: Cmd aborted from QINFIFO
Oct 24 05:02:48 nabiki kernel: aic7xxx_abort returns 0x2002
Oct 24 05:02:48 nabiki kernel: scsi0:0:2:0: Attempting to queue an ABORT 
message
Oct 24 05:02:48 nabiki kernel: scsi0: Dumping Card State in Data-in phase, 
at SEQADDR 0x9d
Oct 24 05:02:48 nabiki kernel: ACCUM = 0x0, SINDEX = 0x8, DINDEX = 0x8f, 
ARG_2 = 0xff
Oct 24 05:02:48 nabiki kernel: HCNT = 0x0 SCBPTR = 0x1
Oct 24 05:02:48 nabiki kernel: SCSISEQ = 0x12, SBLKCTL = 0x0
Oct 24 05:02:48 nabiki kernel:  DFCNTRL = 0x0, DFSTATUS = 0x28
Oct 24 05:02:48 nabiki kernel: LASTPHASE = 0x40, SCSISIGI = 0x54, SXFRCTL0 
= 0xa8
Oct 24 05:02:48 nabiki kernel: SSTAT0 = 0x7, SSTAT1 = 0x2
Oct 24 05:02:48 nabiki kernel: STACK == 0x9a, 0x19b, 0x15a, 0x0
Oct 24 05:02:48 nabiki kernel: SCB count = 20
Oct 24 05:02:48 nabiki kernel: Kernel NEXTQSCB = 11
Oct 24 05:02:48 nabiki kernel: Card NEXTQSCB = 14
Oct 24 05:02:48 nabiki kernel: QINFIFO entries: 14
Oct 24 05:02:48 nabiki kernel: Waiting Queue entries:
Oct 24 05:02:48 nabiki kernel: Disconnected Queue entries:
Oct 24 05:02:48 nabiki kernel: QOUTFIFO entries:
Oct 24 05:02:48 nabiki kernel: Sequencer Free SCB List: 2 0
Oct 24 05:02:48 nabiki kernel: Sequencer SCB Info: 0(c 0x68, s 0x27, l 0, 
t 0xff) 1(c 0x68, s 0x27, l 0, t 0x0) 2(c 0x68, s 0x27, l 0, t 0xff)
Oct 24 05:02:48 nabiki kernel: Pending list: 14(c 0x68, s 0x27, l 0), 0(c 
0x68, s 0x27, l 0)
Oct 24 05:02:48 nabiki kernel: Kernel Free SCB list: 5 2 9 13 4 3 1 19 7 8 
10 6 12 15 18 17 16
Oct 24 05:02:48 nabiki kernel: DevQ(0:0:0): 0 waiting
Oct 24 05:02:48 nabiki kernel: DevQ(0:2:0): 0 waiting
Oct 24 05:02:48 nabiki kernel: DevQ(0:3:0): 0 waiting
...
Oct 24 05:04:44 nabiki kernel: scsi0:0:2:0: Cmd aborted from QINFIFO
Oct 24 05:04:44 nabiki kernel: aic7xxx_abort returns 0x2002
Oct 24 05:04:44 nabiki kernel: scsi: device set offline - not ready or 
command retry failed after bus reset: host 0 channel 0 id 2 lun 0
Oct 24 05:04:44 nabiki kernel: SCSI disk error : host 0 channel 0 id 2 lun 
0 return code = 50000
Oct 24 05:04:44 nabiki kernel:  I/O error: dev 08:11, sector 3949040
Oct 24 05:04:44 nabiki kernel:  I/O error: dev 08:11, sector 3949048
Oct 24 05:04:44 nabiki kernel: SCSI disk error : host 0 channel 0 id 2 lun 
0 return code = 3f0000
Oct 24 05:04:44 nabiki kernel:  I/O error: dev 08:11, sector 4012624
Oct 24 05:04:44 nabiki kernel:  I/O error: dev 08:11, sector 4012632
Oct 24 05:04:44 nabiki kernel: journal-601, buffer write failed
Oct 24 05:04:44 nabiki kernel: kernel BUG at prints.c:334!
Oct 24 05:04:44 nabiki kernel: invalid operand: 0000
Oct 24 05:04:44 nabiki kernel: CPU:    0
Oct 24 05:04:44 nabiki kernel: EIP:    
0010:[md:__insmod_md_O/lib/modules/2.4.20-k6/kernel/drivers/md/md.o_+-2793383/96]    
Not tainted
Oct 24 05:04:44 nabiki kernel: EFLAGS: 00010282
Oct 24 05:04:44 nabiki kernel: eax: 00000024   ebx: d08a8340   ecx: 
00000001   edx: 00000001
Oct 24 05:04:44 nabiki kernel: esi: c50abc00   edi: c50abc00   ebp: 
0000000d   esp: c13a3ee0
Oct 24 05:04:44 nabiki kernel: ds: 0018   es: 0018   ss: 0018
Oct 24 05:04:44 nabiki kernel: Process kupdated (pid: 6, 
stackpage=c13a3000)
Oct 24 05:04:44 nabiki kernel: Stack: d08a67da d08aa420 d08a8340 c13a3f04 
d0d7ad88 00000000 d089f0be c50abc00
Oct 24 05:04:44 nabiki kernel:        d08a8340 00000025 00000012 00000010 
00000000 d0d7adbc d0d7adb0 0000000e
Oct 24 05:04:44 nabiki kernel:        00000000 c77432c0 d08a27be c50abc00 
d0d7ad88 00000001 c13a3f98 c50abc00
Oct 24 05:04:44 nabiki kernel: Call Trace:    
[md:__insmod_md_O/lib/modules/2.4.20-k6/kernel/drivers/md/md.o_+-2721830/96] 
[md:__insmod_md_O/lib/modules/2.4.20-k6/kernel/drivers/md/md.o_+-2706400/96] 
[md:__insmod_md_O/lib/modules/2.4.20-k6/kernel/drivers/md/md.o_+-2714816/96] 
[md:__insmod_md_O/lib/modules/2.4.20-k6/kernel/drivers/md/md.o_+-2752322/96] 
[md:__insmod_md_O/lib/modules/2.4.20-k6/kernel/drivers/md/md.o_+-2714816/96]
Oct 24 05:04:44 nabiki kernel:   
[md:__insmod_md_O/lib/modules/2.4.20-k6/kernel/drivers/md/md.o_+-2738242/96] 
[md:__insmod_md_O/lib/modules/2.4.20-k6/kernel/drivers/md/md.o_+-2741571/96] 
[md:__insmod_md_O/lib/modules/2.4.20-k6/kernel/drivers/md/md.o_+-2710961/96] 
[md:__insmod_md_O/lib/modules/2.4.20-k6/kernel/drivers/md/md.o_+-2803643/96] 
[sync_supers+222/288] [sync_old_buffers+14/68]
Oct 24 05:04:44 nabiki kernel:   [kupdate+217/252] [kernel_thread+40/56]
Oct 24 05:04:44 nabiki kernel:
Oct 24 05:04:44 nabiki kernel: Code: 0f 0b 4e 01 e0 67 8a d0 68 20 a4 8a 
d0 85 f6 74 16 0f b7 46
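
As a reading aid for the "I/O error: dev 08:11" lines: as far as I know,
2.4 kernels print the device as major:minor in hexadecimal, and each SCSI
disk on major 8 owns 16 minors. The sketch below decodes such a pair on
that assumption (decode_dev is a hypothetical helper, not an existing
tool).

```shell
#!/bin/sh
# Decode a "MAJOR:MINOR" pair as printed in the I/O error lines.
# ASSUMPTIONS: both fields are hexadecimal, and the major is 8
# (first SCSI disk major, 16 minors per disk).
decode_dev() (
    major=$((0x${1%%:*}))
    minor=$((0x${1##*:}))
    if [ "$major" -ne 8 ]; then
        echo "major $major is not the first SCSI disk major (8)" >&2
        exit 1
    fi
    disk=$((minor / 16))   # 0 -> sda, 1 -> sdb, ...
    part=$((minor % 16))   # 0 means the whole disk
    letter=$(echo abcdefghijklmnop | cut -c$((disk + 1)))
    if [ "$part" -eq 0 ]; then
        echo "/dev/sd$letter"
    else
        echo "/dev/sd$letter$part"
    fi
)

decode_dev 08:11
```

For 08:11 this gives /dev/sdb1, which would fit the failing target at
SCSI id 2 if the disks were numbered in id order (ids 0, 2, 3 and 6 in
the DevQ lines, with the tape presumably at id 5).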




