Re: PROBLEM: raid5 hangs
On Wed, 14 Nov 2007, Peter Magnusson wrote:

> On Wed, 14 Nov 2007, Justin Piszcz wrote:
>> This is a known bug in 2.6.23 and should be fixed in 2.6.23.2 if the
>> RAID5 bio* patches are applied.
>
> Ok, good to know. Do you know when it first appeared? Because it
> existed in linux-2.6.22.3 also...

I am unsure; I and others started noticing it mainly in 2.6.23. Again, not sure -- I will let others answer this one.

Justin.

-
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: PROBLEM: raid5 hangs
Justin Piszcz wrote:
> This is a known bug in 2.6.23 and should be fixed in 2.6.23.2 if the
> RAID5 bio* patches are applied.

Note below that he's running 2.6.22.3, which doesn't have the bug unless -STABLE added it, so it should not really be in 2.6.22.anything. I assume you're talking about the endless-write or bio issue?

> Justin.
>
> On Wed, 14 Nov 2007, Peter Magnusson wrote:
>> Hey.
>>
>> [1.] One line summary of the problem: raid5 hangs and uses 100% cpu
>>
>> [original report snipped; see Peter's full post below]
--
bill davidsen [EMAIL PROTECTED]
  CTO TMR Associates, Inc
  Doing interesting things with small computers since 1979
Re: PROBLEM: raid5 hangs
On Wed, 14 Nov 2007, Bill Davidsen wrote:

> Justin Piszcz wrote:
>> This is a known bug in 2.6.23 and should be fixed in 2.6.23.2 if the
>> RAID5 bio* patches are applied.
>
> Note below that he's running 2.6.22.3, which doesn't have the bug
> unless -STABLE added it, so it should not really be in 2.6.22.anything.
> I assume you're talking about the endless-write or bio issue?

The bio issue is the root cause of the bug, yes? I am uncertain, but I remember this happening in the past; I thought it was something I was doing (possibly in 2.6.23), so it may have been happening earlier than that. I am not positive.

Justin.

> On Wed, 14 Nov 2007, Peter Magnusson wrote:
>> Hey.
>>
>> [1.] One line summary of the problem: raid5 hangs and uses 100% cpu
>>
>> [original report snipped; see Peter's full post below]
> --
> bill davidsen [EMAIL PROTECTED]
>   CTO TMR Associates, Inc
>   Doing interesting things with small computers since 1979
Re: PROBLEM: raid5 hangs
On Nov 14, 2007 5:05 PM, Justin Piszcz [EMAIL PROTECTED] wrote:
> On Wed, 14 Nov 2007, Bill Davidsen wrote:
>> Justin Piszcz wrote:
>>> This is a known bug in 2.6.23 and should be fixed in 2.6.23.2 if the
>>> RAID5 bio* patches are applied.
>>
>> Note below that he's running 2.6.22.3, which doesn't have the bug
>> unless -STABLE added it, so it should not really be in 2.6.22.anything.
>> I assume you're talking about the endless-write or bio issue?
>
> The bio issue is the root cause of the bug, yes?

Not if this is a 2.6.22 issue. Neither of the bugs fixed by "raid5: fix clearing of biofill operations" or "raid5: fix unending write sequence" existed prior to 2.6.23.
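The affected range implied above runs from 2.6.23 up to, but not including, a 2.6.23.2 carrying the bio* fixes. A minimal sketch of checking a version string against that range, using `sort -V` for version ordering (the helper name `affected` is made up for illustration):

```shell
#!/bin/sh
# Return success if $1 falls in the affected range [2.6.23, 2.6.23.2),
# i.e. 2.6.23 and 2.6.23.1, but not 2.6.22.x or a patched 2.6.23.2.
# "affected" is a hypothetical helper name; sort -V does version ordering.
affected() {
    v="$1"
    # v >= 2.6.23: the smaller of the pair must be 2.6.23
    [ "$(printf '%s\n' 2.6.23 "$v" | sort -V | head -n1)" = "2.6.23" ] || return 1
    # v < 2.6.23.2: v must sort first and not equal 2.6.23.2
    [ "$(printf '%s\n' "$v" 2.6.23.2 | sort -V | head -n1)" = "$v" ] || return 1
    [ "$v" != "2.6.23.2" ]
}

affected "$(uname -r)" && echo "kernel may need the raid5 bio* fixes" || true
```

By this check Peter's 2.6.22.3 is outside the range, consistent with the point that neither fix applies there; whatever he hit on 2.6.22.3 would have to be a different bug.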
PROBLEM: raid5 hangs
Hey.

[1.] One line summary of the problem: raid5 hangs and uses 100% cpu

[2.] Full description of the problem/report:

I had used 2.6.18 for 284 days or so until my power supply died; no problems whatsoever during that time. After that forced reboot I made these changes: I put in 2 GB more memory, so I now have 3 GB instead of 1 GB, and two disks in the raid5 had developed bad blocks, so I didn't trust them anymore and bought new disks (I managed to save the raid5). I have 6x300 GB in a raid5. Two of the new disks are 320 GB, so I created a small raid1 as well. The raid5 is encrypted with aes-cbc-plain; the raid1 is encrypted with aes-cbc-essiv:sha256.

I compiled linux-2.6.22.3 and started to use that. I used the same .config as in default FC5; I think I just selected a P4 cpu and the preemptive kernel type. After 11 or 12 days the computer froze. I wasn't home when it happened and couldn't fix it for about 3 days. All I could do was reboot it, as it wasn't possible to log in remotely or on the console; it did respond to ping, however. After the reboot it rebuilt the raid5. Then it happened again after approximately the same time, 11 or 12 days. I noticed that the process md1_raid5 used 100% cpu all the time. After the reboot it rebuilt the raid5.

I compiled linux-2.6.23. And then... it happened again, after about the same time as before. md1_raid5 used 100% cpu. I also noticed that I wasn't able to save anything in my home directory; it froze during the save. I could read from it, however. My home directory isn't on the raid5, but it is encrypted; it isn't on any disk that has to do with raid. This problem didn't happen when I used 2.6.18. Currently I use 2.6.18, as I kinda need the computer stable. After the reboot it rebuilt the raid5.
top looked like this:

02:37:32 up 11 days,  2:00, 29 users,  load average: 21.06, 17.45, 9.38
Tasks: 284 total,   2 running, 282 sleeping,   0 stopped,   0 zombie
Cpu(s):  2.1%us, 51.2%sy,  0.0%ni,  0.0%id, 46.6%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:   3114928k total,  2981720k used,   133208k free,     8244k buffers
Swap:  2096472k total,      252k used,  2096220k free,  1690196k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 2147 root      15  -5     0    0    0 R  100  0.0  80:25.80 md1_raid5
11328 iocc      20   0  536m 374m  28m S    3 12.3 249:32.38 firefox-bin

After some time, just before I rebooted, I had this load:

02:48:36 up 11 days,  2:11, 29 users,  load average: 86.10, 70.80, 40.07

[3.] Keywords (i.e., modules, networking, kernel): raid5, possibly dm_mod

[4.] Kernel version (from /proc/version): Not using 2.6.23 now, but anyway...
Linux version 2.6.18 ([EMAIL PROTECTED]) (gcc version 4.1.1 20060525 (Red Hat 4.1.1-1)) #1 SMP Sun Sep 24 12:58:16 CEST 2006

[5.] Output of Oops.. message (if applicable) with symbolic information resolved (see Documentation/oops-tracing.txt): No oopses; it doesn't log anything.

[6.] A small shell script or example program which triggers the problem (if possible): -

[7.] Environment: Hmm..
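The top output shows md1_raid5 spinning while the box still answers ping; in that state, /proc/mdstat is the other thing worth capturing before rebooting. A sketch of checking member state, using a made-up sample line since no mdstat output was posted (on a live system you would read /proc/mdstat itself):

```shell
#!/bin/sh
# Hypothetical /proc/mdstat line for a 6-disk raid5 like Peter's md1
# (device names invented; none were posted in the report).
mdstat_sample='md1 : active raid5 sdf1[5] sde1[4] sdd1[3] sdc1[2] sdb1[1] sda1[0]
      1465248000 blocks level 5, 64k chunk, algorithm 2 [6/6] [UUUUUU]'

# [UUUUUU] means all six members are up; an underscore in that bitmap
# would mark a failed member.
case "$mdstat_sample" in
    *'[UUUUUU]'*) echo "array healthy" ;;
    *)            echo "array degraded or rebuilding" ;;
esac
```

Since nothing reached the logs, a SysRq task dump (`echo t > /proc/sysrq-trigger`, captured over serial or netconsole) of the stuck md1_raid5 thread would also have been useful diagnostic material.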
Filesystem            Size  Used Avail Use% Mounted on
/dev/sda1             7.8G  7.0G  761M  91% /          <- unencrypted fs
tmpfs                 1.5G     0  1.5G   0% /dev/shm
/dev/mapper/home       24G   23G  1.6G  94% /home      <- encrypted fs
/dev/mapper/temp      1.4T  822G  555G  60% /temp      <- encrypted fs, raid5
/dev/mapper/jb         18G   17G  1.2G  94% /mnt/jb    <- encrypted fs, raid1

[EMAIL PROTECTED] linux-2.6.23]# cryptsetup status home
/dev/mapper/home is active:
  cipher:  aes-cbc-plain
  keysize: 256 bits
  device:  /dev/sda3
  offset:  0 sectors
  size:    50861790 sectors
  mode:    read/write
[EMAIL PROTECTED] linux-2.6.23]# cryptsetup status temp
/dev/mapper/temp is active:
  cipher:  aes-cbc-plain
  keysize: 256 bits
  device:  /dev/md1
  offset:  0 sectors
  size:    2930496000 sectors
  mode:    read/write
[EMAIL PROTECTED] linux-2.6.23]# cryptsetup status jb
/dev/mapper/jb is active:
  cipher:  aes-cbc-essiv:sha256
  keysize: 256 bits
  device:  /dev/md0
  offset:  0 sectors
  size:    37238528 sectors
  mode:    read/write

[7.1.] Software (add the output of the ver_linux script here)
If some fields are empty or look unusual you may have an old version.
Compare to the current minimal requirements in Documentation/Changes.

Linux flashdance.cx 2.6.18 #1 SMP Sun Sep 24 12:58:16 CEST 2006 i686 i686 i386 GNU/Linux

Gnu C                  4.1.1
Gnu make               3.80
binutils               2.16.91.0.6
util-linux             2.13-pre7
mount                  2.13-pre7
module-init-tools      3.2.2
e2fsprogs              1.38
reiserfsprogs          3.6.19
quota-tools            3.13
PPP                    2.4.3
Linux C Library        2.4
Dynamic linker (ldd)   2.4
Procps                 3.2.7
Net-tools              1.60
Kbd                    1.12
oprofile               0.9.1
Sh-utils               5.97
udev                   084
wireless-tools         28

Modules Loaded         vfat fat usb_storage cdc_ether usbnet cdc_acm nfs sha256 aes
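As a consistency check on the environment above (my arithmetic, not part of Peter's report): a 6-member raid5 yields n-1 members' worth of data, which lines up with both the 1.4T /temp filesystem and the 2930496000-sector size cryptsetup reports for /dev/md1.

```shell
#!/bin/sh
# Arithmetic only: 6x300 GB members in raid5 -> (6-1)*300 GB usable.
n=6
member_gb=300
usable_gb=$(( (n - 1) * member_gb ))
echo "raid5 usable: ${usable_gb} GB"    # 1500 GB, roughly 1.4 TiB

# Cross-check against the cryptsetup size for /dev/md1 (512-byte sectors).
sectors=2930496000
bytes=$(( sectors * 512 ))
echo "md1 size: ${bytes} bytes"
```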