Archive?
Hi! I'm new to this list and have some questions. However I'd like to first browse through the list archive if it's available somewhere. Is it? :) Thanks, D.
Monitoring?
Hi! Ok, found an archive, but haven't found the questions/answers I was hoping to find. I have a RAID1 setup with kernel 2.2.13 and appropriate patches for 2.2.11 (only two files didn't patch correctly, as they were already patched in 2.2.13) and raidtools-0.90. Everything works nice, even hot-swapping disks (with hot-pluggable SCSI backplane and some caution, of course) didn't cause a problem. However, are there any tools already available to monitor the md device and notify the administrator via mail, modem, pager etc.? Thanks, D.
Re: Monitoring?
On Fri, Nov 12, 1999 at 12:18:44PM +0100, Danilo Godec wrote:
> Hi! Ok, found an archive, but haven't found the questions/answers I was hoping to find. I have a RAID1 setup with kernel 2.2.13 and appropriate patches for 2.2.11 (only two files didn't patch correctly, as they were already patched in 2.2.13) and raidtools-0.90. Everything works nice, even hot-swapping disks (with hot-pluggable SCSI backplane and some caution, of course) didn't cause a problem. However, are there any tools already available to monitor the md device and notify the administrator via mail, modem, pager etc.?

It should be fairly simple to grep for underscores in /proc/mdstat using cron+{perl,grep,whatever} and send a mail if one is found. When a disk dies it is marked in /proc/mdstat like [UU_U].

--
Jakob Østergaard, OZ9ABN : [EMAIL PROTECTED]
And I see the elder races, putrid forms of man:
See him rise and claim the earth, his downfall is at hand.
{Konkhra}
Re: Monitoring?
On Fri, 12 Nov 1999, Jakob Østergaard wrote:
> It should be fairly simple to grep for underscores in /proc/mdstat using cron+{perl,grep,whatever} and send a mail if one is found. When a disk dies it is marked in /proc/mdstat like [UU_U].

Thanks, I think I will do that.

Now for another question: I have a hot-swappable SCSI backplane, so I simulated a dead disk by simply removing it (while there was no I/O activity). If I umount /dev/md0 and stop it (raidstop /dev/md0), I can use /proc/scsi/scsi to first remove the dead-disk entry and then add a new disk (echo "scsi [remove|add]-single-device 0 0 1 0" > /proc/scsi/scsi). Then I can raidhotadd the new disk to /dev/md0 and the world is nice.

However, is there a way to do all this while raid1 is still active, so that users never have to notice that something went wrong with the disks?

Thanks, D.

PS: I thought of adding the new disk with some other ID, but the backplane has fixed IDs so I cannot change them (disk0 = ID 0, disk1 = ID 1).
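For reference, the offline procedure described in this message can be written down as a dry run. This is only a sketch under assumptions: swap_plan is a made-up helper name, /dev/sdb1 in the usage line is a hypothetical partition, and the four numbers are taken to be SCSI host, channel, id and lun. It prints the commands instead of running them. The message does not say at which point the array is started again, so that step is left out rather than guessed.

```shell
#!/bin/sh
# Print (do not run) the disk replacement steps described in the message.
# Arguments: md device, new member partition, SCSI host, channel, id, lun.
# swap_plan is a hypothetical helper, not part of raidtools.
swap_plan() {
    md=$1; part=$2; host=$3; chan=$4; id=$5; lun=$6
    echo "umount $md"
    echo "raidstop $md"
    # Drop the dead disk's entry, then register the replacement on the
    # same address (the backplane IDs are fixed).
    echo "echo \"scsi remove-single-device $host $chan $id $lun\" > /proc/scsi/scsi"
    echo "echo \"scsi add-single-device $host $chan $id $lun\" > /proc/scsi/scsi"
    echo "raidhotadd $md $part"
}

# Usage (prints five lines, touches nothing):
#   swap_plan /dev/md0 /dev/sdb1 0 0 1 0
```

Printing the plan first makes it easy to check the SCSI address before anything is written to /proc/scsi/scsi.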
Re: Monitoring?
hi danilo

I've collected some monitoring tools/scripts people have hacked...
http://www.linux-consulting.com/Raid/docs/raid_monitor*

I also reviewed/collected some regular monitoring tools
http://www.linux-consulting.com/Monitor/monitor.pl
about 40 lines down is a list of various prog/scripts...

have fun raiding
alvin

Danilo Godec wrote:
> Hi! Ok, found an archive, but haven't found the questions/answers I was hoping to find. I have a RAID1 setup with kernel 2.2.13 and appropriate patches for 2.2.11 (only two files didn't patch correctly, as they were already patched in 2.2.13) and raidtools-0.90. Everything works nice, even hot-swapping disks (with hot-pluggable SCSI backplane and some caution, of course) didn't cause a problem. However, are there any tools already available to monitor the md device and notify the administrator via mail, modem, pager etc.? Thanks, D.
Re: Monitoring?
Hi,

> However, are there any tools already available to monitor the md device and notify the administrator via mail, modem, pager etc.?

The simplest thing is to do a crontab entry with a daily cat /proc/mdstat (std output will get mailed). You could additionally use grep and look if [] is there or NOT !?

If you are looking for something nice to put on your screen (on the server screen or on a workstation's screen by redirecting X output via network): I recently did some work on xosview 1.7.1 to additionally show a RAID status monitor with operational disks (green), non-operational disks (red) and rebuild status (growing bar if in progress or green field if idle).

While doing that, I noticed that /proc/mdstat is not nice to parse (at least my own [working] parser code looked quite bad) and wrote a kernel patch to have some format that parses easier. The stuff already works for RAID1 and RAID5 (RAID0 doesn't make much sense because there is no rebuild and because you WILL notice disk failures easily without xosview ;-| ), but it depends on my kernel RAID patch for /proc/mdstat.

I submitted the stuff to Mike Romberg (xosview) and Ingo Molnar (RAID). Mike did some changes and integrated it in the xosview development tree (thanks!). Ingo said he'll have a look at it (he seems to be quite busy nowadays working on RAID stuff. Thanks, too!).

But: There are some open questions, see next mail. So if you are looking for some nice X11 stuff, then just wait a bit and look out for new xosview and raidpatch stuff ...

Thomas

--
Thomas Waldmann (com_ma, Computer nach Masz)
email: [EMAIL PROTECTED] www: www.com-ma.de
Please be patient if sending me email, response may be slow.
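The cron idea can be sketched as a script that stays silent while the array is healthy, since cron mails whatever a job prints. This is a sketch, not a tested monitoring tool: md_degraded and report_if_degraded are made-up names, and the pattern assumes the status field looks like [UU] or [UU_U] as described earlier in the thread.

```shell
#!/bin/sh
# Exit 0 when the mdstat-style file shows at least one failed member.
# The pattern matches an underscore only inside a bracketed run of
# U and _ flags, for example [UU_U] or [_U].
# The file argument defaults to /proc/mdstat.
md_degraded() {
    grep -Eq '\[[U_]*_[U_]*]' "${1:-/proc/mdstat}"
}

# Print a short report only when something is wrong, so that a cron job
# calling this produces mail only on failure.
report_if_degraded() {
    if md_degraded "$1"; then
        echo "RAID member failure detected:"
        cat "${1:-/proc/mdstat}"
    fi
}

# Usage from a script run by cron (the schedule is up to the admin):
#   report_if_degraded
```

Keeping the check and the report in separate functions lets the same test drive mail, a pager script, or anything else.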
RE: Monitoring?
On Behalf Of Jakob Østergaard
Sent: Friday, November 12, 1999 3:42 AM

> On Fri, Nov 12, 1999 at 12:18:44PM +0100, Danilo Godec wrote:
> > Hi! However, are there any tools already available to monitor the md device and notify the administrator via mail, modem, pager etc.?
>
> It should be fairly simple to grep for underscores in /proc/mdstat using cron+{perl,grep,whatever} and send a mail if one is found. When a disk dies it is marked in /proc/mdstat like [UU_U].

But that is not the ONLY underscore:

    [root@raven src]# cat /proc/mdstat
    Personalities : [linear] [raid0] [raid1] [raid5]
    read_ahead 1024 sectors
    md0 : active raid1 hdd1[1] hdc1[0] 6297344 blocks [2/2] [UU]
    unused devices: <none>
    [root@raven src]#

Please note the name "read_ahead".
RE: Monitoring?
} -----Original Message-----
} From: Roeland M.J. Meyer [mailto:[EMAIL PROTECTED]]
}
} But that is not the ONLY underscore;
}
} [root@raven src]# cat /proc/mdstat
} Personalities : [linear] [raid0] [raid1] [raid5]
} read_ahead 1024 sectors
} md0 : active raid1 hdd1[1] hdc1[0] 6297344 blocks [2/2] [UU]
} unused devices: <none>
} [root@raven src]#
}
} Please note the name "read_ahead".

then 'grep md | grep _' or perl, awk etc equivalents.

Stuart
bad example of linear raidtab in HOWTO
In section 3.2 of the HOWTO that comes with raidtools there is an example of a linear raidtab setup. This example doesn't have a line stating the chunk-size.

If I try to create my linear RAID without stating a chunk-size, I get "/dev/md0: Invalid argument" and mkraid fails.

AFAIK chunk sizes don't apply to linear RAID, so maybe it shouldn't be necessary to state one in raidtab. In any case, I guess the HOWTO should match the raidtools behaviour.

I hope this info will be useful the next time the HOWTO is updated.

Thanks
Glenn McGrath
RE: Monitoring?
On Fri, 12 Nov 1999, Roeland M.J. Meyer wrote:
> > It should be fairly simple to grep for underscores in /proc/mdstat using cron+{perl,grep,whatever} and send a mail if one is found. When a disk dies it is marked in /proc/mdstat like [UU_U].
>
> But that is not the ONLY underscore;
>
>     Personalities : [linear] [raid0] [raid1] [raid5]
>     read_ahead 1024 sectors
>     md0 : active raid1 hdd1[1] hdc1[0] 6297344 blocks [2/2] [UU]
>
> Please note the name "read_ahead".

OK, then grep for md[0-9] first and then grep for an underscore.

Jason Clifford
http://www.definitelinux.com/
Definite Linux - The UK's leading distribution with crypto support
Linux Workstation and Server Systems
Re: Monitoring?
[ Friday, November 12, 1999 ] Roeland M.J. Meyer wrote:
> But that is not the ONLY underscore;

--- drivers/block/md.c.orig Fri Nov 12 10:59:44 1999
+++ drivers/block/md.c Fri Nov 12 10:59:59 1999
@@ -884,7 +884,7 @@
 		sz+=sprintf (page+sz, "[%d %s] ", i, pers[i]->name);
 	page[sz-1]='\n';
-	sz+=sprintf (page+sz, "read_ahead ");
+	sz+=sprintf (page+sz, "read-ahead ");
 	if (read_ahead[MD_MAJOR]==INT_MAX)
 		sz+=sprintf (page+sz, "not set\n");
 	else
Re: Tuning readahead
Attached is a program that will let you get or set the read ahead value for any major device. You can easily change the value and then do a performance test.

Lance.

Jakob Østergaard wrote:
> Hi all!
>
> I was looking into tuning the readahead done on disks in a RAID. It seems (from md.c) that levels 0, 4 and 5 are handled in similar ways. The readahead is set to chunk_size*4 per disk, and then increased to 1024*MAX_SECTORS = 1024*128 = 128k if the above equation yielded a result lower than this.
>
> So, besides changing the chunk size to something bigger, is there any way the readahead can be tuned? Should I (and could I safely) just change the equation in md.c?

[Attachment: readahead.c]
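The rule Jakob describes can be checked with a little arithmetic. The sketch below is not the md.c code: md_readahead is a made-up name, all values are assumed to be in bytes, and "per disk" is read as "multiplied by the number of disks", which the message does not spell out.

```shell
#!/bin/sh
# Readahead as described in the message: chunk_size*4 per disk, raised to
# 1024*MAX_SECTORS (1024*128 = 131072, i.e. 128k) if the result is lower.
# Arguments: chunk size in bytes, number of disks (both assumptions).
md_readahead() {
    chunk=$1; disks=$2
    floor=$((1024 * 128))
    ra=$((chunk * 4 * disks))
    if [ "$ra" -lt "$floor" ]; then
        ra=$floor
    fi
    echo "$ra"
}

# Usage:
#   md_readahead 4096 4     # 4k chunks: 65536 is below the floor
#   md_readahead 32768 4    # 32k chunks: above the floor
```

Under these assumptions small chunk sizes all end up at the same 128k floor, which is why changing the chunk size only starts to matter above a certain size.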
Re: kernel SW-RAID implementation questions
There is a constant specifying the maximum number of md devices. But there is no variable stating how many active md devices are around. This wouldn't make much sense anyway, since the md devices are not allocated sequentially. You can start with md3, for example.

You can have a program analyze the /proc/mdstat file to see what md device numbers are currently active and thus not available for new devices.

Lance.

Thomas Waldmann wrote:
> Is there a variable containing the md device count (md0, md1, ..., mdn. n == ?) ?
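A minimal sketch of such an analysis, assuming each active array appears in /proc/mdstat as a line starting with "mdN :". The function names are made up, and the kernel's maximum device count is not checked because its value is not given here.

```shell
#!/bin/sh
# List the numbers of all md devices that appear in an mdstat-style file,
# one per line, in ascending order. Defaults to /proc/mdstat.
active_md_numbers() {
    sed -n 's/^md\([0-9][0-9]*\) *:.*/\1/p' "${1:-/proc/mdstat}" | sort -n
}

# Print the lowest md number that is not in use. Devices need not be
# allocated sequentially, so gaps below the highest number are found too.
first_free_md() {
    n=0
    for used in $(active_md_numbers "$1"); do
        if [ "$used" -eq "$n" ]; then
            n=$((n + 1))
        fi
    done
    echo "$n"
}

# Usage:
#   active_md_numbers
#   echo "next free device: /dev/md$(first_free_md)"
```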
Re: Monitoring?
On Fri, Nov 12, 1999 at 03:43:34PM -, Adamson, Stuart wrote:
> } -----Original Message-----
> } From: Roeland M.J. Meyer [mailto:[EMAIL PROTECTED]]
> }
> } But that is not the ONLY underscore;
> }
> } [root@raven src]# cat /proc/mdstat
> } Personalities : [linear] [raid0] [raid1] [raid5]
> } read_ahead 1024 sectors
> } md0 : active raid1 hdd1[1] hdc1[0] 6297344 blocks [2/2] [UU]
> } unused devices: <none>
> } [root@raven src]#
> }
> } Please note the name "read_ahead".
>
> then 'grep md | grep _' or perl, awk etc equivalents.
>
> Stuart

Actually .. I would be thinking regular expressions would be useful right about here .. something like "md[0-9]+" instead of matching for md alone.

--
"That is precisely what common sense is for, to be jarred into uncommon sense."
-- Eric Temple Bell, Mathematics: Queen of the Sciences

Mark Ferrell: Major 'Trips'
Lead Programmer : Chaotic Dreams Development Team
URL : http://www.planetquake.com/chaotic
E-Mail : [EMAIL PROTECTED]
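Put together, the two-stage filter with the stricter pattern might look like the sketch below. It assumes the one-line-per-array layout shown in the sample output (device line and [UU] flags on the same line); failed_md_lines is a made-up name.

```shell
#!/bin/sh
# Print the md device lines of an mdstat-style file that contain an
# underscore, i.e. that show a failed member. Defaults to /proc/mdstat.
failed_md_lines() {
    # Stage 1: keep only lines that start with "md" plus one or more
    # digits, which drops "read_ahead 1024 sectors". Use [0-9]+ for the
    # digits; a bracket expression cannot express a numeric range such
    # as 0 to 15.
    # Stage 2: keep the lines that contain an underscore.
    grep -E '^md[0-9]+ ' "${1:-/proc/mdstat}" | grep '_'
}

# Usage: prints nothing and returns non-zero when all arrays are healthy.
#   failed_md_lines || echo "all md devices look fine"
```

If the flags ever sit on a separate line from the device name, stage 1 would throw them away, so this form is tied to the layout above.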
Re: Monitoring?
"Roeland M.J. Meyer" wrote:
> But that is not the ONLY underscore;
> Please note the name "read_ahead".

So grep for an underscore with a [ or a U in front of it:

    grep '[\[U]_' /proc/mdstat

--
Mike Marion - Unix SysAdmin/Engineer, Qualcomm Inc.
There's even a parody for people opposed to hunting: Deer Avenger. In it, bazooka-toting deer lure potbellied hunters to their death with such "genuine hunter calls" as a feminine cry of "Help, I'm naked, and I have a pizza." - Joshua Quittner, in an article on the Hunting computer game craze in the 12/7/98 issue of Time
Re: Tuning readahead
On Fri, Nov 12, 1999 at 10:15:40AM -0700, D. Lance Robinson wrote:
> Attached is a program that will let you get or set the read ahead value for any major device. You can easily change the value and then do a performance test.

Fantastic! Thanks a lot!

I just tried it out, but I haven't done any measuring yet... Is this program available anywhere via FTP or so? If I want to refer to it in the HOWTO, how should I go about doing that?

--
Jakob Østergaard, OZ9ABN : [EMAIL PROTECTED]
subscribe
[UPDATE] Re: LOTS OF BAD STUFF ...
On Wed, 10 Nov 1999, Doug Ledford wrote:
> Well, I've never gotten a single SCSI error from the controller... not to mention that the block being requested is WAY beyond the end of the device. If this wasn't a RAID device, this would be one of the 'Attempt to access beyond end of device' errors that non-raid users have reported many times for the 2.2 series kernels. I have also gotten the error when not under any load, about once a month or so, but never with the alarming frequency of last night!
>
> it's 99.99% a problem with the disk. The RAID0 code has not had any significant changes (due to its simplicity) in the last couple of years. We never rule out software bugs, but this is one of those cases where it's way, way down in the list of potential problem sources.
>
> OK, since this raid is on a U2W card, then my current patch set might help. There was a bug in some U2W hardware that was found by Justin Gibbs (sorry, I don't have the tools to be able to identify some of these types of bugs) where a bit on the card's DFSTATUS register was getting set early. It would correct itself within 5 cycles, so the fix was to test the bit 5 times and see if it *still* was set. That would catch the glitch. Go figure. Anyway, I'm attaching a test patch for you to try out. If it fixes your problem then it's a likely candidate as the final 5.1.21 driver patch.

I'm now running a patched 2.2.13 kernel with this patch (plus Andrea's ext2 race patch from a week ago, and the raid0145 patch). I also replaced one of the drives in the array, based practically on a guess.

The system has only been up 24 hours, but has withstood some high load, with no recurrence of the problem. I wish I had done these changes in a more controlled way (one at a time) so as to verify that one of them has made a difference, but I needed stable, and these three fixes (the U2W patch, the ext2 race patch, and the disk replacement) all seemed only likely to increase my chances.

Thanks for your help, I'll let you know if anything else surfaces.

David

--
David Mansfield
[EMAIL PROTECTED]
I2O support?
Hi guys - could someone please tell me if any Linux kernel currently has support for I2O (Intelligent Input/Output)? Sorry to cross-post this - but I need to know urgently...

While I'm here - does anyone know what is a decent amount of I2O cache for a RAID controller? We only seem to have 16MB out of a possible 128MB.. it has been suggested that this could be the cause of an IO bottleneck we're having with an HP NetServer LH4...

Regards,
Matthew Clark.

--
NetDespatch Ltd - The Internet Call Centre.
http://www.netdespatch.com