Re: RAID5 to RAID6 reshape?
On 17:40, Mark Hahn wrote:
> > Question to other people here - what is the maximum partition size that ext3 can handle, am I correct it 4 TB ?
>
> 8 TB. people who want to push this are probably using ext4 already.

ext3 has supported up to 16T for quite some time. It works fine for me:

[EMAIL PROTECTED]:~ # mount |grep sda; df /dev/sda; uname -a; uptime
/dev/sda on /media/bia type ext3 (rw)
Filesystem      Size  Used Avail Use% Mounted on
/dev/sda         15T  7.8T  7.0T  53% /media/bia
Linux ume 2.6.20.12 #3 SMP Tue Jun 5 14:33:44 CEST 2007 x86_64 GNU/Linux
 13:44:29 up 236 days, 15:12, 9 users, load average: 10.47, 10.28, 10.17

Andre
-- 
The only person who always got his work done by Friday was Robinson Crusoe
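The 16T figure is what one gets from 32-bit block numbers and 4 KiB blocks. A minimal sketch of that arithmetic, assuming the limit is simply 2^32 blocks times the block size and that the shell does 64-bit arithmetic (the function name max_fs_tib is made up for illustration):

```sh
# max_fs_tib BLOCK_SIZE: size in TiB of a filesystem that can address
# 2^32 blocks of BLOCK_SIZE bytes. Assumes 64-bit shell arithmetic.
max_fs_tib() {
    echo $(( (1 << 32) * $1 / (1 << 40) ))
}

max_fs_tib 4096    # prints 16
```

Under the same assumption, 1 KiB and 2 KiB blocks give 4 and 8 respectively.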
Re: question about mdadm
On 23:46, Edward Hourigan wrote:
> mdadm --add /dev/md0 /dev/sdd
> mdadm --add /dev/md0 /dev/sde
> mdadm --grow --raid-devices=5
>
> I forgot about the partitions, sdd1 and sde1, I'm wondering if this is going to be a problem, or if sdd and sde will work? It is currently reshaping the array. Please advise if possible. A response would be much appreciated!

mdadm and the Linux raid code don't care whether a component device of a raid array is a full disk or a partition of a disk. So your setup won't cause any problems.

However, tools like fdisk, which expect a partition table, might get confused. You also might get some error messages at boot time about invalid partition tables. These are usually harmless though.

Andre
Re: raid6 check/repair
On 15:31, Bill Davidsen wrote:
> Thiemo posted metacode which I find appears correct,

It assumes that _exactly_ one disk has bad data, which is hard to verify in practice. But yes, it's probably the best one can do if both P and Q happen to be incorrect.

IMHO mdadm shouldn't do this automatically though, and it should always keep backup copies of the data it overwrites with good data.

Andre
Re: raid5: degraded after reboot
On 10:38, Jon Nelson wrote:
> 4md: kicking non-fresh sda4 from array!
>
> what does that mean?

sda4 was not included because the array has been assembled previously using only sdb4 and sdc4. So the data on sda4 is out of date.

> I also have this:
>
> raid5: raid level 5 set md0 active with 2 out of 3 devices, algorithm 2
> RAID5 conf printout:
> --- rd:3 wd:2 fd:1
> disk 1, o:1, dev:sdb4
> disk 2, o:1, dev:sdc4

This looks normal. The array is up with two working disks.

> Why was /dev/sda4 kicked?

Because it was non-fresh ;)

> md0 : active raid5 sda4[3] sdb4[1] sdc4[2]
>       613409664 blocks level 5, 64k chunk, algorithm 2 [3/2] [_UU]
>       [==..] recovery = 13.1% (40423368/306704832) finish=68.8min speed=64463K/sec

Seems like your init scripts re-added sda4.

> 65-70KB/s is about what these drives can do so the rebuild speed is just peachy.

If the rebuild completes successfully, you're ok again. There's nothing you have to do.

Andre
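The finish estimate in the mdstat line quoted above can be checked by hand. A small sketch, assuming both the progress counters and the speed are in units of 1K (eta_minutes is a name made up here, not an mdadm tool):

```sh
# eta_minutes DONE TOTAL SPEED: whole minutes left for a resync that has
# processed DONE of TOTAL blocks at SPEED blocks per second.
eta_minutes() {
    echo $(( ($2 - $1) / $3 / 60 ))
}

eta_minutes 40423368 306704832 64463    # prints 68
```

That is in line with the finish=68.8min the kernel reported, so the rebuild was running at a steady rate.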
Re: raid5: degraded after reboot
On 11:08, Jon Nelson wrote:
> > sda4 was not included because the array has been assembled previously using only sdb4 and sdc4. So the data on sda4 is out of date.
>
> I don't understand - over months and months it has always been the three devices, /dev/sd{a,b,c}4. I've added and removed bitmaps and done other things but at the time it rebooted the array had been up, clean (non-degraded), and comprised of the three devices for 4-6 weeks.

You said you had to reboot your box using sysrq. Chances are you triggered the reboot at a point where all pending data had been written to sdb4 and sdc4, but not to sda4. So sda4 appears to be non-fresh after the reboot and, since mdadm refuses to use non-fresh devices, it kicks sda4.

> > This looks normal. The array is up with two working disks.
>
> Two of three which, to me, is abnormal (ie, the normal state is three and it's got two).

Sure. I should have said: It's normal if one disk in a raid5 array is missing (or non-fresh).

> > > Why was /dev/sda4 kicked?
> >
> > Because it was non-fresh ;)
>
> OK, but what does that MEAN?

To be precise, it means that the event counter for sda4 is less than the event counter on the other devices in the array. So mdadm must assume the data on sda4 is out of sync and hence the device can't be used. If you are not using bitmaps, there is no other way out than syncing the whole device, i.e. writing good data (computed from sdb4 and sdc4) to sda4.

Hope that helps.
Andre
Re: raid5: degraded after reboot
On 12:05, Jon Nelson wrote:
> Can mdadm be told to use non-fresh devices?

Sure. --force does the trick.

> What about sdb4: I can understand rewinding an event count (sorta), but what does this mean:
>
> mdadm: forcing event count in /dev/sdb4(1) from 327615 upto 327626

Well, it means the event counter was advanced forward by 11 events. I'm not sure under which circumstances this message is printed though. Clearly, after a successful resync the event counter of the added disk is adjusted to match the value of the rest of the array. AFAIK, assembling an array using --force would also cause such an adjustment.

> Since the array is degraded, there are 11 events missing from sdb4 (presumably sdc4 had them). Since sda4 is not part of the array, the events can't be complete, can they?

There's no such thing as a complete event. An event happens, for example, when the array gets assembled, or when the superblock changes due to the user adding bitmap support.

The event counter is simply a number which is written to each device of a raid array and which is increased whenever an event occurs. Note that changes to the underlying file system do not cause events, so the data on the disk may change completely although the event counter stays the same.

Normally all devices contain the same count. But if you, for example, yank out a drive and assemble the array without that drive, the number on that drive isn't increased, obviously. If you plug the drive in again later and try to assemble the array, that drive has a lower event count than all other drives, i.e. it's non-fresh.

Since any number of changes might have happened during the time the array was degraded, the data on the non-fresh drive cannot be trusted, which means it must not be used when assembling the array. So the drive is kicked and a full sync is necessary.

> Why jump *ahead* on events instead of rewinding?

That wouldn't buy you anything. Not a good idea.

Andre
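The freshness rule described above boils down to comparing each device's counter against the highest one. A simplified sketch, assuming the counters have already been collected (for instance from mdadm --examine) into "device count" pairs; the function name and the input format are made up for illustration, and the real md code handles more cases than this:

```sh
# find_non_fresh: read "device event_count" pairs on stdin and print
# every device whose counter is below the highest one seen.
find_non_fresh() {
    awk '
        { dev[NR] = $1; ev[NR] = $2 + 0; if (ev[NR] > max) max = ev[NR] }
        END { for (i = 1; i <= NR; i++) if (ev[i] < max) print dev[i] }
    '
}

# Hypothetical counters: sda4 missed the last few events.
printf 'sda4 327615\nsdb4 327626\nsdc4 327626\n' | find_non_fresh
```

With these made-up numbers only sda4 is reported, which is the device that would be kicked and resynced.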
Re: mismatch_cnt questions
On 00:21, H. Peter Anvin wrote:
> I have just updated the paper at:
>
> http://www.kernel.org/pub/linux/kernel/people/hpa/raid6.pdf
>
> ... with this information (in slightly different notation and with a bit more detail.)

There's a typo in the new section:

	s/By assumption, X_z != D_n/By assumption, X_z != D_z/

Regards
Andre
Re: raid1+raid0 mdadm
On 18:27, Dariusz Malec wrote:
> I can't configure mdadm. I have created 2 raid1 devices and i wanted to connect them with raid0. Everything is fine when i use command line, but after reboot mdadm starts only raid1 devices.

Are you using udev? IIRC, there was an issue with stacked md devices like yours. Your problem might be that the device node for the final raid0 is missing. If your raid1's are md0 and md1, add

	mknod -m 600 /dev/md2 b 9 2

to your init scripts, just before calling mdadm. Of course, that's only an ugly workaround (that once worked for me).

Andre
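In an init script the mknod call is best guarded so that it does nothing when the node already exists. A minimal sketch of the workaround above; ensure_md_node and the MKNOD override are made up for illustration, while major number 9 is the md driver as in the command quoted above:

```sh
# ensure_md_node MINOR [DEVDIR]: create DEVDIR/mdMINOR unless a block
# device node is already there. Major number 9 is the md driver.
# MKNOD can be overridden (e.g. MKNOD=echo) for a dry run.
ensure_md_node() {
    minor=$1
    devdir=${2:-/dev}
    node="$devdir/md$minor"
    [ -b "$node" ] || ${MKNOD:-mknod} -m 600 "$node" b 9 "$minor"
}
```

Calling "ensure_md_node 2" just before mdadm would then correspond to the line suggested above; with MKNOD=echo it only prints the arguments it would pass.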
Re: A random initramfs script
On 07:50, Nix wrote:
> I suppose that if *every* filesystem hanging off / is its own fs, using rootfs as your / is not inefficient because there's still nothing in it. But it still makes me worry: what if some mad script makes a huge file in /? It's happened to me a couple of times, and because /var was on a different fs, all that happened was that / filled up and nothing bad resulted. If / was a ramfs (as rootfs is), you'd run out of memory...

Yes, it's an additional piece of rope, and I already used it to shoot myself in the foot by doing a backup with

	rsync -a /home /mnt

without mounting /mnt. First the machine went slow, then the OOM killer kicked in and killed everything. Finally the system was totally unresponsive and I had to use the "so everything is unusual - boot" thing.

But only root can write to /mnt, and there are much simpler ways for root to halt the system..

> > > [ip(8)] http://lartc.org/ describes many of its myriad extra features in more detail.
> >
> > All that stuff seems to be fairly old, linux-2.6 isn't mentioned at all and the cvs server doesn't work. Is it still up to date?
>
> Well, there's some extra stuff, but it's mostly on the iptables side: the advanced routing has mostly been stable since not just 2.4 but 2.2!

So I downloaded iproute2-2.4.7-now-ss020116-try.tar.gz, but there seems to be a problem with errno.h:

make[1]: Entering directory `/home/work/install/src/iproute2/lib'
gcc -D_GNU_SOURCE -O2 -Wstrict-prototypes -Wall -g -I../include-glibc -I/usr/include/db3 -include ../include-glibc/glibc-bugs.h -I/home/install/w/linux/stable/include -I../include -DRESOLVE_HOSTNAMES -c -o libnetlink.o libnetlink.c
distcc[13445] ERROR: compile /home/install/w/var/ccache/libnetlink.tmp.p133.13441.i on p133 failed
libnetlink.c: In function `rtnl_dump_filter':
libnetlink.c:149: error: `EINTR' undeclared (first use in this function)
libnetlink.c:149: error: (Each undeclared identifier is reported only once
libnetlink.c:149: error: for each function it appears in.)
libnetlink.c: In function `rtnl_talk':
libnetlink.c:248: error: `EINTR' undeclared (first use in this function)
libnetlink.c: In function `rtnl_listen':
libnetlink.c:350: error: `EINTR' undeclared (first use in this function)
libnetlink.c: In function `rtnl_from_file':
libnetlink.c:416: error: `EINTR' undeclared (first use in this function)
make[1]: *** [libnetlink.o] Error 1
make[1]: Leaving directory `/home/work/install/src/iproute2/lib'
make: *** [all] Error 2

> > > mdev is `micro-udev', a 255-line tiny replacement for udev. It's part of busybox.
> >
> > Cool. Guess I'll have to update busybox.. done. The new busybox (from SVN) seems to work fine, just like the old one did. The init script doesn't use mdev yet, but from a first reading this is just a matter of translating /etc/udev/udev.conf to the mdev syntax.
>
> You don't need an mdev.conf at all; by default mdev creates a /dev with the KERNEL= names. All it's needed for is putting things in strange places or fiddling permissions, and that's not necessary for a boot initramfs :)

Nice, and works like a charm. I just removed udev* from the initramfs.

> (I'd recommend managing the *real* /dev with udev, still; it's vastly more flexible...

Yes, and it's needed for hotpluggable devices anyway.

> but of course it's also about fifty times larger at a horrifying 50K plus 70K of rules...

No need for such a huge rules file:

# find /etc/udev/ -type f -printf '%f %s\n'
udev.conf 768
udev.rules 5200
scsi-model.sh 1326
ide-model.sh 1201

> > > You need SVN uClibc too (if you're using uClibc rather than glibc); older versions don't maintain the d_type field in the struct dirent, so mdev's scanning of /sys gets very confused and you end up with an empty /dev.
> >
> > Damn. I just compiled 0.9.28. Guess this one is too old.
>
> Yep. Of course the SVN release has broken binary compatibility, so you need to rebuild everything that depends on it (probably the cross-toolchain too, for safety). I scripted this long ago, of course, because it's a bit annoying otherwise...

I tried to build the cross-compilation toolchain with Buildroot, but it didn't even start building because it couldn't download gcc from mirrors.rcn.net, which appears to be down ATM. Isn't it possible to change the gcc mirror? I did not find a config option for that.

Thanks
Andre
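The rsync accident described above is easy to guard against when / lives in RAM. A minimal sketch, assuming the mountpoint utility (shipped with util-linux and busybox) is available; the function name backup_to is made up for illustration:

```sh
# backup_to SRC TARGET: copy SRC to TARGET with rsync, but only if
# TARGET is a mount point, so the data cannot land on the parent
# filesystem (a ramfs root in the setup discussed here) by accident.
backup_to() {
    src=$1
    target=$2
    if ! mountpoint -q "$target"; then
        echo "refusing to back up: $target is not a mount point" >&2
        return 1
    fi
    rsync -a "$src" "$target"
}
```

"backup_to /home /mnt" would then fail early instead of filling up memory when /mnt has not been mounted.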
Re: A random initramfs script
On 15:23, Neil Brown wrote:
> You shouldn't need portmap to mount an NFS filesystem unless you enable locking,

That's news to me, thanks for pointing it out. But I do need portmap for mounting a NFS filesystem read-only (/usr, which contains portmap). Is that correct?

> > He likes to compare the situation with /etc/fstab. Nobody complains about having to edit /etc/fstab, so why keep people complaining about having to edit /etc/mdadm.conf?
>
> Indeed! And if you plug in some devices off another machine for disaster recovery, you don't want another disaster because you assembled the wrong arrays.

How is such a disaster possible, given each md device contains an ID for the array it belongs to? But yes, it is certainly a good idea to doublecheck everything before assembling the array in such a recovery situation.

> I would like an md superblock to be able to contain some indication of the 'name' of the machine which is meant to host the array, so that once a machine knows its own name, it can automatically find and mount its own arrays, but that isn't near the top of my list of priorities yet.

How about a user-defined name?

	mdadm --create --name the_extra_noisy_array /dev/md0 --level...

would use some fixed algorithm to compute a usual UUID for the new array from the string the_extra_noisy_array, and

	mdadm --assemble /dev/md0 --name the_extra_noisy_array

could use the same algorithm and take into account only those devices which have a UUID equal to the computed one.

Just a thought.
Andre
-- 
The only person who always got his work done by Friday was Robinson Crusoe
-
To unsubscribe from this list: send the line unsubscribe linux-raid in the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
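To make the "fixed algorithm" idea a bit more concrete: any deterministic hash of the name would do. A rough sketch, assuming MD5 (via md5sum) as the hash and the colon-separated 4x8 hex digit layout for the result; both choices and the function name are only illustrations, not anything mdadm implements:

```sh
# name_to_uuid NAME: derive a 128-bit identifier from NAME by hashing it
# and print it as four colon-separated groups of eight hex digits.
# The choice of MD5 is arbitrary here; any fixed hash would work.
name_to_uuid() {
    printf '%s' "$1" | md5sum | cut -c1-32 |
        sed 's/^\(.\{8\}\)\(.\{8\}\)\(.\{8\}\)\(.\{8\}\)$/\1:\2:\3:\4/'
}

name_to_uuid the_extra_noisy_array
```

Since the same name always yields the same value, --create and --assemble could both compute it independently, which is all the proposal needs.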
Re: A random initramfs script
On 00:41, Nix wrote:
> > So I downloaded iproute2-2.4.7-now-ss020116-try.tar.gz, but there seems to be a problem with errno.h:
>
> Holy meatballs that's ancient.

It is the most recent version on the ftp server mentioned in the HOWTO.

> Try http://developer.osdl.org/dev/iproute2/download/iproute2-2.6.15-060110.tar.gz for a rather newer and more capable copy. :)

Much better. Thanks. This version works fine for me, just like busybox ip does.

> Yes, the initial population of /dev is done by firing messages at udevd *from a shell script*. It's gone all the way from devfs's kernel-space hardwiring to something sufficiently extensible that a shell script can do all the necessary stuff to populate /dev :)

Yeah, Linux rulez :)

> [uClibc] Alternatively, just suck down GCC from, say, svn://gcc.gnu.org/svn/gcc/tags/gcc_3_4_5_release, or ftp.gnu.org, or somewhere, and point buildroot at that.

Yep, there's a 'dl' directory which contains all downloads. One can download the tarballs from anywhere else to that directory. Seems to work now.

Thanks
Andre
Re: A random initramfs script
On 20:56, Nix wrote:
> > What I meant is that it never gets overmounted by a real rootfs.
>
> The rootfs never overmounts anything. Do you mean `it never gets overmounted by a real root filesystem'?

Yes, that's what I tried to say. Sorry about the confusion. Basically I have three similar setups that work with the same init script:

- The main server which has all its filesystems on lvm. There is only one vg and only one pv, a raid 10 array. /bin, /etc, /home, /lib, /root, /sbin, /tmp, /usr and /var are all lvs. They are mounted by the init script of the initramfs. That script does the full boot and finally execs busybox init just to start the getty processes.

- A bunch of NFS clients. They boot similarly, but mount everything (except /etc) via NFS.

- The rescue system. Still the same, but does not mount anything as everything is contained in the initramfs.

None of these have a real root filesystem. nfsroot support isn't even compiled in for the NFS clients.

> If so, that's a... most peculiar setup: I don't think I've ever heard of anything but systems meant to run on diskless kiosks and the occasional live cd running like that.

It works pretty well. My diskless desktop has been up for more than 100 days and the main server had a similar uptime before I rebooted some days ago (to try out dccp).

> ... but looking at init/initramfs.c, you should be safe as long as the initramfs can be loaded at all (on i386 I think there's a 16Mb limit on that but it depends on the arch and the bootloader and all sorts of horrid arcana: most arches have a much higher limit).

I just booted a rescue system on i386 which is definitely larger than 16MB compressed.

> > I always thought an N MB initramfs cuts about N MB of the main memory, plus some housekeeping data, but that's all. This view is probably a little bit too naive ;)
>
> It does, *unless* you switch root filesystems appropriately, that is, you unlink everything in the initramfs before `exec chroot'ing into the real root filesystem. In practice this has to be done by a dedicated C program in order to get away with deleting /dev and the very chroot binary itself :) and that's what busybox's switch_root does.

In view of this chicken-egg problem it looks much cleaner to me to avoid the chroot :)

> ifconfig? route? ick. ip(8) is the wave of the future :) far more flexible and actually has a comprehensible command line as well.

Thanks for the tip. I'll give it a try.

> http://lartc.org/ describes many of its myriad extra features in more detail.

All that stuff seems to be fairly old, linux-2.6 isn't mentioned at all and the cvs server doesn't work. Is it still up to date?

> > > I try to avoid running daemons out of initramfs, because all those daemons share *no* inodes with anything else you'll ever run: more permanent memory load for as long as those daemons are running, although at least it's swappable load.
> >
> > Fair enough, but portmap is needed to mount NFS filesystems. How much memory would be saved if I'd kill portmap after the mount and restart it afterwards from a mounted filesystem?
>
> All of it. The only reason running daemons from initramfs wastes memory is that if you've got the binaries open/running and you delete them, in standard Unix fashion the space used by them won't be reclaimed until they're closed/stopped --- and the standard procedure for switching from an initramfs involves unlinking everything on the filesystem.
>
> There's *nothing* special about the rootfs filesystem on which initramfs runs except that it's a ramfs filesystem, and as such nonswappable, and that it's the start and end of the kernel's list of mount points (which means that unmounting it, mount --moving it, and so on, is impossible). It behaves just like a POSIX filesystem because it is one. In fact it's almost exactly the same as a tmpfs except that it's not swap-backed, and its maximum size can't be limited in the way tmpfs can. (tmpfs is a small hack atop ramfs.)

Thanks for the explanation. I really appreciate it!

> so why is dynamic scanning the preferred method in LVM, yet discouraged in the md world? I see some conflicted messages here ;)

I'm afraid only Neil will be able to answer this. BTW: I agree with you and do not see the point in hardwiring the UUIDs either.

> mdev is `micro-udev', a 255-line tiny replacement for udev. It's part of busybox.

Cool. Guess I'll have to update busybox.. done. The new busybox (from SVN) seems to work fine, just like the old one did. The init script doesn't use mdev yet, but from a first reading this is just a matter of translating /etc/udev/udev.conf to the mdev syntax.

> You need SVN uClibc too (if you're using uClibc rather than glibc); older versions don't maintain the d_type field in the struct dirent, so mdev's scanning of /sys gets very confused and you end up with an empty
Re: silent corruption with RAID1
On 18:40, Moses Leslie wrote:
> :00:0a.0 Unknown mass storage controller: Silicon Image, Inc. (formerly CMD Technology Inc) SiI 3112 [SATALink/SATARaid] Serial ATA Controller (rev 02)

Do you have Seagate drives? Some models have problems with this controller..

Andre
Re: silent corruption with RAID1
On 00:57, Moses Leslie wrote:
> On Sun, 26 Feb 2006, Andre Noll wrote:
> > On 18:40, Moses Leslie wrote:
> > > :00:0a.0 Unknown mass storage controller: Silicon Image, Inc. (formerly CMD Technology Inc) SiI 3112 [SATALink/SATARaid] Serial ATA Controller (rev 02)
> >
> > Do you have Seagate drives? Some models have problems with this controller..
>
> They are indeed 300GB Seagate drives. Are there any workarounds? It seems odd that raid1 would be the only thing that has a problem.

You could add your drive to the blacklist just to see if that makes any difference.

> The other two drives are western digitals, is there a blacklist somewhere I could check?

Just look at the top of drivers/scsi/sata_sil.c

Regards
Andre
mdadm-2.3.1 fails to hotadd device
# uname -r; mdadm -V; cat /proc/mdstat
2.6.15.4-gbbde1285
mdadm - v2.3.1 - 6 February 2006
Personalities : [raid1] [raid5] [raid6]
md1 : active raid1 hdd1[12] sdi1[8] sda1[0] hda1[13] sdl1[11] sdk1[10] sdj1[9] sdh1[7] sdg1[6] sdf1[5] sde1[4] sdd1[3] sdc1[2] sdb1[1]
      251776 blocks [14/14] [UU]
unused devices: none
#
# mdadm /dev/md1 -f /dev/hdd1
mdadm: set /dev/hdd1 faulty in /dev/md1
#
# mdadm /dev/md1 -r /dev/hdd1
mdadm: hot removed /dev/hdd1
#
# mdadm /dev/md1 -a /dev/hdd1
mdadm: cannot find valid superblock in this array - HELP
#
# mdadm-1.12.0 /dev/md1 -a /dev/hdd1
mdadm: hot added /dev/hdd1

Any hints?
Andre
Re: mdadm-2.3.1 fails to hotadd device
On 22:41, Neil Brown wrote:
> > # mdadm /dev/md1 -a /dev/hdd1
> > mdadm: cannot find valid superblock in this array - HELP
>
> Can you
>	strace -o /tmp/trace mdadm /dev/md1 -a /dev/hdd1
> and send me /tmp/trace?

Here it comes.

Andre
---
execve(/sbin/mdadm, [mdadm, /dev/md1, -a, /dev/hdd1], [/* 99 vars */]) = 0
uname({sys=Linux, node=raspe, ...}) = 0
brk(0) = 0x806a000
access(/etc/ld.so.preload, R_OK) = -1 ENOENT (No such file or directory)
open(/usr/lib/glibc-2.3.6/etc/ld.so.cache, O_RDONLY) = 3
fstat64(3, {st_mode=S_IFREG|0644, st_size=44713, ...}) = 0
mmap2(NULL, 44713, PROT_READ, MAP_PRIVATE, 3, 0) = 0xb7f4d000
close(3) = 0
open(/lib/libc.so.6, O_RDONLY) = 3
read(3, \177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0\0O\1\000..., 512) = 512
fstat64(3, {st_mode=S_IFREG|0755, st_size=1175704, ...}) = 0
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb7f4c000
mmap2(NULL, 1162428, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0xb7e3
mmap2(0xb7f46000, 16384, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x116) = 0xb7f46000
mmap2(0xb7f4a000, 7356, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0xb7f4a000
close(3) = 0
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb7e2f000
mprotect(0xb7f46000, 4096, PROT_READ) = 0
mprotect(0xb7f6d000, 4096, PROT_READ) = 0
set_thread_area({entry_number:-1 -> 6, base_addr:0xb7e2f6b0, limit:1048575, seg_32bit:1, contents:0, read_exec_only:0, limit_in_pages:1, seg_not_present:0, useable:1}) = 0
munmap(0xb7f4d000, 44713) = 0
time(NULL) = 1140955660
getpid() = 1392
brk(0) = 0x806a000
brk(0x808b000) = 0x808b000
open(/dev/md1, O_RDWR) = 3
fstat64(3, {st_mode=S_IFBLK|0660, st_rdev=makedev(9, 1), ...}) = 0
ioctl(3, 0x800c0910, 0xbfd671d0) = 0
ioctl(3, 0x80480911, 0xbfd67310) = 0
stat64(/dev/hdd1, {st_mode=S_IFBLK|0660, st_rdev=makedev(22, 65), ...}) = 0
open(/dev/hdd1, O_RDONLY|O_EXCL) = 4
ioctl(4, BLKGETSIZE64, 0xbfd670f0) = 0
ioctl(4, BLKFLSBUF, 0) = 0
_llseek(4, 263061504, [263061504], SEEK_SET) = 0
read(4, \374N+\251\0\0\0\0Z\0\0\0\0\0\0\0\0\0\0\0a7h\220hTiC\1..., 4096) = 4096
close(4) = 0
fstat64(3, {st_mode=S_IFBLK|0660, st_rdev=makedev(9, 1), ...}) = 0
ioctl(3, 0x800c0910, 0xbfd67090) = 0
ioctl(3, 0x80140912, 0xbfd672f0) = 0
lstat64(/dev, {st_mode=S_IFLNK|0777, st_size=5, ...}) = 0
ioctl(3, 0x80140912, 0xbfd672f0) = 0
ioctl(3, 0x80140912, 0xbfd672f0) = 0
ioctl(3, 0x80140912, 0xbfd672f0) = 0
ioctl(3, 0x80140912, 0xbfd672f0) = 0
ioctl(3, 0x80140912, 0xbfd672f0) = 0
ioctl(3, 0x80140912, 0xbfd672f0) = 0
ioctl(3, 0x80140912, 0xbfd672f0) = 0
ioctl(3, 0x80140912, 0xbfd672f0) = 0
ioctl(3, 0x80140912, 0xbfd672f0) = 0
ioctl(3, 0x80140912, 0xbfd672f0) = 0
ioctl(3, 0x80140912, 0xbfd672f0) = 0
ioctl(3, 0x80140912, 0xbfd672f0) = 0
ioctl(3, 0x80140912, 0xbfd672f0) = 0
ioctl(3, 0x80140912, 0xbfd672f0) = 0
ioctl(3, 0x80140912, 0xbfd672f0) = 0
ioctl(3, 0x80140912, 0xbfd672f0) = 0
ioctl(3, 0x80140912, 0xbfd672f0) = 0
ioctl(3, 0x80140912, 0xbfd672f0) = 0
ioctl(3, 0x80140912, 0xbfd672f0) = 0
ioctl(3, 0x80140912, 0xbfd672f0) = 0
ioctl(3, 0x80140912, 0xbfd672f0) = 0
ioctl(3, 0x80140912, 0xbfd672f0) = 0
ioctl(3, 0x80140912, 0xbfd672f0) = 0
ioctl(3, 0x80140912, 0xbfd672f0) = 0
ioctl(3, 0x80140912, 0xbfd672f0) = 0
ioctl(3, 0x80140912, 0xbfd672f0) = 0
write(2, mdadm: cannot find valid superbl..., 57) = 57
exit_group(1) = ?
Re: mdadm-2.3.1 fails to hotadd device
On 22:41, Neil Brown wrote:
> On Sunday February 26, [EMAIL PROTECTED] wrote:
> > # uname -r; mdadm -V; cat /proc/mdstat
> > 2.6.15.4-gbbde1285
> > mdadm - v2.3.1 - 6 February 2006
> > Personalities : [raid1] [raid5] [raid6]
> > md1 : active raid1 hdd1[12] sdi1[8] sda1[0] hda1[13] sdl1[11] sdk1[10] sdj1[9] sdh1[7] sdg1[6] sdf1[5] sde1[4] sdd1[3] sdc1[2] sdb1[1]
> >       251776 blocks [14/14] [UU]
>
> You have a 14 drive raid1... cool
>
> > #
> > # mdadm /dev/md1 -a /dev/hdd1
> > mdadm: cannot find valid superblock in this array - HELP

I took another look and added some fprintf's to Manage.c and util.c, see the patch below. The problem appears to be map_dev() always returning NULL because devlist is NULL.

With the patch applied, mdadm /dev/md1 -a /dev/hdd1 gives me the following output:

mdadm: st-maxdevs = 27
mdadm: major/minor/state: 8/1/6
devlist: (nil)
mdadm: major/minor/state: 8/17/6
devlist: (nil)
mdadm: major/minor/state: 8/33/6
devlist: (nil)
mdadm: major/minor/state: 8/49/6
devlist: (nil)
mdadm: major/minor/state: 8/65/6
devlist: (nil)
mdadm: major/minor/state: 8/81/6
devlist: (nil)
mdadm: major/minor/state: 8/97/6
devlist: (nil)
mdadm: major/minor/state: 8/113/6
devlist: (nil)
mdadm: major/minor/state: 8/129/6
devlist: (nil)
mdadm: major/minor/state: 8/145/6
devlist: (nil)
mdadm: major/minor/state: 8/161/6
devlist: (nil)
mdadm: major/minor/state: 8/177/6
devlist: (nil)
mdadm: major/minor/state: 0/0/8
mdadm: major/minor/state: 3/1/6
devlist: (nil)
mdadm: major/minor/state: 0/0/8
mdadm: major/minor/state: 0/0/8
mdadm: major/minor/state: 0/0/8
mdadm: major/minor/state: 0/0/8
mdadm: major/minor/state: 0/0/8
mdadm: major/minor/state: 0/0/8
mdadm: major/minor/state: 0/0/8
mdadm: major/minor/state: 0/0/8
mdadm: major/minor/state: 0/0/8
mdadm: major/minor/state: 0/0/8
mdadm: major/minor/state: 0/0/8
mdadm: major/minor/state: 0/0/8
mdadm: major/minor/state: 0/0/8
mdadm: cannot find valid superblock in this array - HELP

---
diff -urpN mdadm-2.3.1/Manage.c mdadm-2.3.1-hacked/Manage.c
--- mdadm-2.3.1/Manage.c	Mon Dec  5 05:52:22 2005
+++ mdadm-2.3.1-hacked/Manage.c	Sun Feb 26 14:13:07 2006
@@ -236,6 +236,7 @@ int Manage_subdevs(char *devname, int fd
 			if (array.not_persistent == 0) {
+				fprintf(stderr, Name ": st-maxdevs = %d\n", st->max_devs);
 				/* need to find a sample superblock to copy, and
 				 * a spare slot to use
 				 */
@@ -245,6 +246,7 @@ int Manage_subdevs(char *devname, int fd
 					disc.number = j;
 					if (ioctl(fd, GET_DISK_INFO, &disc))
 						continue;
+					fprintf(stderr, Name ": major/minor/state: %d/%d/%d\n", disc.major, disc.minor, disc.state);
 					if (disc.major==0 && disc.minor==0)
 						continue;
 					if ((disc.state & 4)==0) continue; /* sync */
diff -urpN mdadm-2.3.1/util.c mdadm-2.3.1-hacked/util.c
--- mdadm-2.3.1/util.c	Tue Jan 31 00:43:03 2006
+++ mdadm-2.3.1-hacked/util.c	Sun Feb 26 14:06:51 2006
@@ -412,6 +412,7 @@ char *map_dev(int major, int minor)
 		devlist_ready=1;
 	}
+	fprintf(stderr, "devlist: %p\n", devlist);
 	for (p=devlist; p; p=p->next)
 		if (p->major == major &&
 		    p->minor == minor) {
Re: mdadm-2.3.1 fails to hotadd device
On 08:37, Neil Brown wrote:
> Your '/dev' is a symlink, and mdadm doesn't seem to like that. Can you change the 'nftw' call in 'map_dev' in 'util.c' to
>	nftw("/dev/.", add_dev, 10, FTW_PHYS);
> and see if that helps?

Jup, that does the trick:

# ./mdadm /dev/md1 -a /dev/hdd1
mdadm: re-added /dev/hdd1

> It isn't really the right fix, but it is a quick-and-dirty that might work.

Is it considered an invalid configuration to have /dev being a symlink (to /udev as in my case)?

Thanks
Andre
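The effect of the trailing "/." can be reproduced without mdadm. A rough sketch, assuming that find's default behaviour of not following symlinks is a fair stand-in for nftw() with FTW_PHYS; the scratch directory and the file names in it are made up:

```sh
# Build a scratch tree: real/ holds one entry, link -> real.
demo=$(mktemp -d)
mkdir "$demo/real"
touch "$demo/real/hdd1"
ln -s real "$demo/link"

# Physical walk starting at the symlink itself: only the link is visited.
walk_link=$(cd "$demo" && find link | wc -l)

# Same walk starting at "link/.": the link is resolved during path
# lookup, so the directory and its contents are visited as well.
walk_dot=$(cd "$demo" && find link/. | wc -l)

echo "entries via link:   $walk_link"
echo "entries via link/.: $walk_dot"
rm -rf "$demo"
```

The first walk should see a single entry (the symlink), the second one the directory plus the file in it, which matches the empty devlist before the change and the working lookup after it.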
Re: mdadm 2.2 segmentation fault
On 14:12, Stephan van Hienen wrote:
> when i try to start my raid with mdadm 2.2 it gives a segfault :

See http://www.mail-archive.com/linux-raid@vger.kernel.org/msg03242.html
-- 
Jesus not only saves, he also frequently makes backups
mdadm-2.2 SEGFAULT: mdadm --assemble --scan
sorry if this is already known/fixed:

Assemble() is called from mdadm.c with the update argument equal to NULL:

	Assemble(ss, array_list->devname, mdfd, array_list, configfile, NULL, readonly, runstop, NULL, verbose-quiet, force);

But in Assemble.c we have

	if (ident->uuid_set && (!update && strcmp(update, "uuid")!= 0) ...

which yields a segfault in glibc's strcmp(): with update == NULL, !update is true, so strcmp() gets called with a NULL pointer.

Andre
Re: ANNOUNCE: mdadm 2.2 - A tool for managing Soft RAID under Linux
On 17:08, Neil Brown wrote:
> Release 2.2 fixes a few small bugs and add as few small elements of functionality. Possibly the most interesting is the addition of 'README.initramfs' and 'mkinitramfs'. Feedback on these would be most welcome.

From README.initramfs:

> A minimal initramfs for assembling md arrays can be created using 3 files and one directory. These are:
>
> /bin           Directory
> /bin/mdadm     statically linked mdadm binary
> /bin/busybox   statically linked busybox binary
> /bin/sh        hard link to /bin/busybox
> /init          a shell script which call mdadm appropriately.

Don't we need /dev/console as well?

About the example script:

> ==
> #!/bin/sh
> echo 'Auto-assembling boot md array'
> mkdir /proc
> mount -t proc proc /proc
> if [ -n "$rootuuid" ]
> then arg=--uuid=$rootuuid
> elif [ -n "$mdminor" ]
> then arg=--super-minor=$mdminor
> else arg=--super-minor=0
> fi
> echo "Using $arg"
> mdadm -Acpartitions $arg --auto=part /dev/mda
> cd /
> mount /dev/mda1 /root || mount /dev/mda /root
> umount /proc
> cd /root
> exec chroot . /sbin/init < /dev/console > /dev/console 2>&1
> =

(a) mkdir, mount, umount won't be found. 'busybox mkdir /proc' etc. does the job though. Or, create symlinks.

(b) Does mdadm --auto=/dev create the /dev directory? If it does not, the script has to create it. Otherwise, the mdadm manpage should mention this ;)

(c) Documentation/filesystems/ramfs-rootfs-initramfs.txt recommends to mount --move . / before the final chroot.

There is also a trivial typo. Patch below.

Have fun
Andre

--- mdadm-2.2/README.initramfs~	Tue Dec  6 13:57:22 2005
+++ mdadm-2.2/README.initramfs	Tue Dec  6 13:57:46 2005
@@ -84,7 +84,7 @@ Some key points are:
 The --auto flag is given to mdadm so that it will create /dev/md* files
 automatically. This is needed as /dev will not contain
- and md files, and udev will not create them (as udev only created device
+ any md files, and udev will not create them (as udev only created device
 files after the device exists, and mdadm need the device file to create
 the device). Note that the created md files may not exist in /dev
 of the mounted root filesystem. This needs to be deal with separately
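The "create symlinks" option from point (a) could be handled while the initramfs tree is being put together. A minimal sketch; the helper name link_applets and the choice of applets are only examples, and it assumes the tree is built in a staging directory that already holds bin/busybox:

```sh
# link_applets BINDIR APPLET...: create BINDIR/APPLET -> busybox symlinks
# so that the init script can call the applets by their plain names.
link_applets() {
    bindir=$1
    shift
    for applet in "$@"; do
        ln -sf busybox "$bindir/$applet"
    done
}
```

For the script quoted above, something like "link_applets /path/to/staging/bin mkdir mount umount chroot" would cover the commands it uses (the staging path is a placeholder).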
Re: mdadm 2.1: command line option parsing bug?
On 10:21, Neil Brown wrote:
> -a has the same problem (--add vs --auto). I'll see what I can do,

gengetopt? (http://www.gnu.org/software/gengetopt/)

Andre