Re: ANNOUNCE: mdadm 2.4 - A tool for managing Soft RAID under Linux
Neil Brown wrote:
> I am pleased to announce the availability of mdadm version 2.4.
> It is available at the usual places:
>   http://www.cse.unsw.edu.au/~neilb/source/mdadm/
> and
>   http://www.{countrycode}.kernel.org/pub/linux/utils/raid/mdadm/
> mdadm is a tool for creating, managing and monitoring device arrays using the md
> driver in Linux, also known as Software RAID arrays.
> Release 2.4 primarily adds support for increasing the number of devices in a RAID5
> array, which requires 2.6.17 (or some -rc or -mm prerelease).

That's a really long-awaited feature. But at the same time, wouldn't it finally be possible to convert a non-RAID partition to a RAID1? It's a very common thing, and they say it even works on Windows :-( Just my 2c.

--
Levente
Si vis pacem para bellum!
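For reference, growing the device count with the new feature would look roughly like this (a sketch only; /dev/md0, /dev/sde1 and the new device count are placeholder examples, and the reshape needs mdadm 2.4 plus a 2.6.17 kernel):

   # add a spare, then reshape the raid5 to use it as a data device
   mdadm /dev/md0 --add /dev/sde1
   mdadm --grow /dev/md0 --raid-devices=5

As for converting an existing non-RAID partition to RAID1, the usual workaround today is done by hand: create a degraded mirror with the second member missing, copy the data over, then add the original partition (again a sketch with placeholder device names):

   mdadm --create /dev/md1 --level=1 --raid-devices=2 missing /dev/sdb1
   # copy the data from the old partition onto /dev/md1, adjust fstab/bootloader, then
   mdadm /dev/md1 --add /dev/sda1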
Re: with the latest mdadm
Neil Brown wrote:
> On Tuesday February 7, [EMAIL PROTECTED] wrote:
> > hi,
> > with the latest mdadm-2.3.1 we've got the following message:
> > -
> > md: md4: sync done.
> > RAID1 conf printout:
> >  --- wd:2 rd:2
> >  disk 0, wo:0, o:1, dev:sda2
> >  disk 1, wo:0, o:1, dev:sdb2
> > md: mdadm(pid 8003) used obsolete MD ioctl, upgrade your software to use new ictls.
> > -
> > is this just a warning, or some kind of problem with mdadm?
>
> This is with a 2.4 kernel, isn't it?

No, it's the latest RHEL kernel, 2.6.9-22.0.2.ELsmp.

> The md driver is incorrectly interpreting an ioctl that it doesn't recognise as an
> obsolete ioctl.  In fact it is a new ioctl that 2.4 doesn't know about (I suspect it
> is GET_BITMAP_FILE).  The message should probably be removed from 2.4, but as 2.4 is
> in deep-maintenance mode, I suspect that is unlikely.

It's 2.6 :-(

--
Levente
Si vis pacem para bellum!
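If the message is indeed only cosmetic, the array state can still be confirmed the usual way (a quick sanity check, using the md4 device from the log above):

   cat /proc/mdstat
   mdadm --detail /dev/md4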
Re: raid5 write performance
Neil Brown wrote:
> The other is to use a filesystem that allows the problem to be avoided by making sure
> that the only blocks that can be corrupted are dead blocks.  This could be done with a
> copy-on-write filesystem that knows about the raid5 geometry, and only ever writes to
> a stripe when no other blocks on the stripe contain live data.  I've been working on a
> filesystem which does just this, and hope to have it available in a year or two (it is
> a background 'hobby' project).

Why are you waiting so long? Why not just release the project plan and any pre-pre-alpha code? That's the point of the cathedral and the bazaar: maybe others can help, find bugs, write code, etc.

--
Levente
Si vis pacem para bellum!
Re: why the kernel and mdadm report differently
Neil Brown wrote:
> On Monday September 5, [EMAIL PROTECTED] wrote:
> > hi,
> > one of our raid arrays crashes all the time (once a week),
>
> Any kernel error messages?

I already sent a report about this to this list a few times without any response, but I'll send you a private message with all the logs.

> > and one more strange thing: it's currently not working. the kernel reports it as
> > inactive while mdadm says it's active, degraded. what's more, we can't put this
> > array into the active state.
>
> Looks like you need to stop it (mdadm -S /dev/md2) and re-assemble it with --force:
>    mdadm -A /dev/md2 -f /dev/sd[abcefgh]1
> It looks like the computer crashed and when it came back up it was missing a drive.
> This situation can result in silent data corruption, which is why md won't
> automatically assemble it.  When you do assemble it, you should at least fsck the
> filesystem, and possibly check for data corruption if that is possible.  At least be
> aware that some data could be corrupt (there is a good chance that nothing is, but it
> is by no means certain).

It works. But shouldn't they both report it the same way, either inactive or active? This is mdadm 1.12; just another side note, there is no rpm for version 2.0 :-(

> No.  I seem to remember some odd compile issue with making the RPM and thinking
> "I don't care".  Maybe "I should care a bit more" would be useful.

--
Levente
Si vis pacem para bellum!
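For reference, the full recovery sequence described above would look roughly like this (a sketch only; it assumes the filesystem on /dev/md2 is ext3, and uses the device list from the thread):

   # stop the half-assembled array, then force-assemble it
   mdadm -S /dev/md2
   mdadm -A /dev/md2 -f /dev/sd[abcefgh]1
   # check the filesystem read-only first, then repair if anything is reported
   fsck.ext3 -n /dev/md2
   fsck.ext3 /dev/md2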
Re: why the kernel and mdadm report differently
David M. Strang wrote:
> Farkas Levente wrote:
> > this is mdadm 1.12; just another side note, there is no rpm for version 2.0 :-(
> >
> > No.  I seem to remember some odd compile issue with making the RPM and thinking
> > "I don't care".  Maybe "I should care a bit more" would be useful.
>
> Not trying to be rude; but the install of mdadm is pathetically easy.
> Untar/gunzip the source and type:
>    make
>    make install

Yes, but I simply don't like to put anything onto the main filesystem that can't be checked later (i.e. which package owns it, whether it has the right checksum, etc.), and these come for free with rpm. What's more, I'd like to update packages automatically on all of our servers from our local package repository, so if I put mdadm into that list then all of our servers would get it without a manual install.

--
Levente
Si vis pacem para bellum!
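If no prebuilt package is available, one workaround (just a sketch; it assumes the mdadm tarball ships its mdadm.spec file, and uses 2.0 as an example version) is to build the rpm locally and push it into the local repository:

   rpmbuild -tb mdadm-2.0.tgz
   # the binary package lands under the RPMS/ directory of the rpmbuild tree;
   # install it with rpm -Uvh or publish it to the repository from there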
software raid5 oops
Does anybody have any useful tips about this?
yours.

-------- Original Message --------
hi,
after we switched our servers from centos-3 to centos-4 (aka. rhel-4), one of our servers crashes about once a week without any oops. this happens with both the normal kernel-2.6.9-11.EL and kernel-2.6.9-11.106.unsupported. after we changed the motherboard, the raid controller and the cables too, we still got it. finally we started netdump, and last but not least yesterday we got a crash log and a core file. it seems there is a bug in the raid5 code of the kernel. this is our backup server with 8 x 200GB hdds in a raid5 (for the data) plus 2 x 40GB hdds in a raid1 (for the system), running on a 3ware 8xxx raid controller. i attached the netdump log of the last crash. how can i fix it?
yours.

--
Levente
Si vis pacem para bellum!

RAID5 conf printout:
 --- rd:8 wd:8 fd:0
 disk 0, o:1, dev:sda1
 disk 1, o:1, dev:sdb1
 disk 2, o:1, dev:sdc1
 disk 3, o:1, dev:sdd1
 disk 4, o:1, dev:sde1
 disk 5, o:1, dev:sdf1
 disk 6, o:1, dev:sdg1
 disk 7, o:1, dev:sdh1
Unable to handle kernel NULL pointer dereference at virtual address
 printing eip:
*pde = 0f94a067
Oops: [#1]
Modules linked in: cifs nls_utf8 ncpfs nfsd exportfs lockd sunrpc parport_pc lp parport netconsole netdump i2c_dev i2c_core ipx dm_mod e1000 tg3 floppy ext3 jbd raid5 xor raid1 3w_ sd_mod scsi_mod
CPU:    0
EIP:    0060:[]    Not tainted VLI
EFLAGS: 00010246   (2.6.9-11.106.unsupported)
EIP is at 0x0
eax: c1806138   ebx: c018961c   ecx: 0016   edx: c035c7f4
esi: e7182200   edi: 0001   ebp: c18fb380   esp: f7878f34
ds: 007b   es: 007b   ss: 0068
Process md2_raid5 (pid: 224, threadinfo=f7878000 task=f7872600)
Stack: f7b973c0 f8879a26
 md_thread+0x20d/0x23a
 [c011ceaf] autoremove_wake_function+0x0/0x2d
 [c030ce1a] ret_from_fork+0x6/0x14
 [c011ceaf] autoremove_wake_function+0x0/0x2d
 [c02a183f] md_thread+0x0/0x23a
 [c01041d9] kernel_thread_helper+0x5/0xb
Code: Bad EIP value.
Pid: 224, comm: md2_raid5
EIP: 0060:[] CPU: 0
EIP is at 0x0
EFLAGS: 00010246   Not tainted  (2.6.9-11.106.unsupported)
EAX: c1806138 EBX: c018961c ECX: 0016 EDX: c035c7f4
ESI: e7182200 EDI: 0001 EBP: c18fb380
DS: 007b ES: 007b
CR0: 8005003b CR2: ffd5 CR3: 0fd6b000 CR4: 06d0
 [f8879a26] handle_stripe+0xfca/0x1207 [raid5]
 [f887a7d5] raid5d+0x197/0x2ab [raid5]
 [c02a1a4c] md_thread+0x20d/0x23a
 [c011ceaf] autoremove_wake_function+0x0/0x2d
 [c030ce1a] ret_from_fork+0x6/0x14
 [c011ceaf] autoremove_wake_function+0x0/0x2d
 [c02a183f] md_thread+0x0/0x23a
 [c01041d9] kernel_thread_helper+0x5/0xb

sibling  task  PC  pid  father  child  younger  older
init          S C01458E9   920     1     0     2 (NOTLB)
 f7f44eb0 0086 0055
 xfrm_state_flush+0x2/0x289
 tcp_poll+0x31/0x144
 [c01768e1] do_select+0x347/0x378
 [c0176461] __pollwait+0x0/0x94
 [c0176c05] sys_select+0x2e0/0x43a
 [c030cefb] syscall_call+0x7/0xb
ntpd          S 00D0  2516  2196     1  2219  2172 (NOTLB)
 f0885eb0 0082 0246 00d0 cf8553a0 21cd 5197434f 3abd f697d2a0 f697d42c f6a0d580 7fff
 f0885f74 c030b7e5 f69a5980 f0885f58 f69a5980 f106ed18 c017648e 0246 f106b800 f0885f58
Call Trace:
 [c030b7e5] schedule_timeout+0x50/0x10c
 [c017648e] __pollwait+0x2d/0x94
 [c02aeeac] datagram_poll+0x25/0xd1
 [c01768e1] do_select+0x347/0x378
 [c0176461] __pollwait+0x0/0x94
 [c0176c05] sys_select+0x2e0/0x43a
 [c01058d8] sys_sigreturn+0x1ce/0x1f2
 [c030cefb] syscall_call+0x7/0xb
rpc.rquotad   S 3416  2219     1  2223  2196 (NOTLB)
 f65a4f1c 0082 0001 f697ccd0 00030ee2 966e2a43 0033 f69b1320 f69b14ac 7fff f65a4fa0
 f0888ba0 c030b7e5 f106b580 f65a4fa0 f106e518 c02aeeac f6a7a780 c03806c0 0145 f0888bb0 0001
Call Trace:
 [c030b7e5] schedule_timeout+0x50/0x10c
 [c02aeeac] datagram_poll+0x25/0xd1
 [c02a8f2c] sock_poll+0x12/0x14
 [c0176db3] do_pollfd+0x54/0x77
 [c0176e63] do_poll+0x8d/0xab
 [c0177020] sys_poll+0x19f/0x24f
 [c0176461] __pollwait+0x0/0x94
 [c030cefb] syscall_call+0x7/0xb
nfsd          S FF4DCFB0  2316  2223     1  2224  2219 (L-TLB)
 f05fff10 0046 0002 ff4dcfb0 f69c4dd0 13cc c85b7bd5 37c8 f69b0d50 f69b0edc 03db8cae
 03db8cae 000b c1993c00 c030b886 f5675f18 c035b0d0 03db8cae 1d244b3c 0005 c031a0b5 c031c25c 00a8
Call Trace:
 [c030b886] schedule_timeout+0xf1/0x10c
 [c0129336] process_timeout+0x0/0x5
 [f8ade6bb] svc_recv+0x325/0x65b [sunrpc]
 [c011b856] default_wake_function+0x0/0xc
 [c011b921] __wake_up+0x6e/0xca
 [c011b856] default_wake_function+0x0/0xc
 [c012d587] sigprocmask+0x140/0x1f4
 [f8b2e44d] nfsd+0x1ae/0x540 [nfsd]
 [f8b2e29f] nfsd+0x0/0x540 [nfsd]
 [c01041d9] kernel_thread_helper+0x5/0xb
nfsd          S 37C3  3472  2224     1  2225  2223 (L-TLB)
 f1513f10