Re: [CentOS] Dell PERC H800 commandline RAID monitoring tools
On Mon, Mar 07, 2011 at 12:43:03PM -0800, Dr. Ed Morbius wrote:
> OMSA conflicts with mega-cli, though we may find that the latter is the more useful package. Both are pretty byzantine; the Dell stuff simply doesn't have docs (in particular: docs on how to interpret the omconfig log output).

We're using megacli wrapped by perl to provide information about PERC events. It works quite well so far.

-- 
Dominik Zyla

___ CentOS mailing list CentOS@centos.org http://lists.centos.org/mailman/listinfo/centos
Re: [CentOS] Dell PERC H800 commandline RAID monitoring tools
Dominik Zyla wrote on Thu, 10 Mar 2011 09:10:37 +0100:
> We're using megacli wrapped by perl to provide information about PERC events. It works quite well so far.

Do you have a megacli rpm that works with the CentOS-provided drivers, which are MPT 3.something? I googled about this some time ago and there's an rpm mentioned here and there that contains only the megacli utility, but it's not downloadable anymore from anywhere. I got hold of a package that contains version 4, but that doesn't work with the CentOS drivers. LSI themselves provide only the complete MegaRAID driver/package for download, and it's not clear whether the single megacli utility is included or whether installing it may overwrite the built-in driver.

Kai
Re: [CentOS] Dell PERC H800 commandline RAID monitoring tools
On Thu, Mar 10, 2011 at 06:47:09PM +0100, Kai Schaetzl wrote:
> Dominik Zyla wrote on Thu, 10 Mar 2011 09:10:37 +0100:
>> We're using megacli wrapped by perl to provide information about PERC events. It works quite well so far.
>
> Do you have a megacli rpm that works with the CentOS-provided drivers, which are MPT 3.something? I googled about this some time ago and there's an rpm mentioned here and there that contains only the megacli utility, but it's not downloadable anymore from anywhere. I got hold of a package that contains version 4, but that doesn't work with the CentOS drivers. LSI themselves provide only the complete MegaRAID driver/package for download, and it's not clear whether the single megacli utility is included or whether installing it may overwrite the built-in driver.

It's a single-binary version, compiled statically.

-- 
Dominik Zyla
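For anyone wanting to try the "megacli wrapped by a script" approach described above, here is a minimal sketch of the parsing side. It is written in Python rather than the Perl used by the poster, and it is not their wrapper. It assumes the text comes from something like `MegaCli -PDList -aALL` and that the field labels look like the ones in the hypothetical SAMPLE below; labels and layout may differ between MegaCli versions, so check them against your own output.

```python
# Hypothetical sample shaped like `MegaCli -PDList -aALL` output.
# The field labels are an assumption; verify them on your own system.
SAMPLE = """\
Enclosure Device ID: 8
Slot Number: 1
Media Error Count: 0
Predictive Failure Count: 0
Firmware state: Online

Enclosure Device ID: 8
Slot Number: 2
Media Error Count: 3
Predictive Failure Count: 1
Firmware state: Online
"""


def parse_pdlist(text):
    """Split PDList-style output into one dict per physical disk."""
    disks = []
    current = None
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # blank line or a line without "label: value"
        key, value = key.strip(), value.strip()
        if key == "Enclosure Device ID":
            # Assumed to be the first field of each disk record.
            current = {}
            disks.append(current)
        if current is not None:
            current[key] = value
    return disks


def suspect_disks(disks):
    """Return (enclosure, slot) pairs for disks with non-zero error counters."""
    counters = ("Media Error Count", "Predictive Failure Count")
    bad = []
    for disk in disks:
        if any(int(disk.get(name, "0")) > 0 for name in counters):
            bad.append((disk.get("Enclosure Device ID"), disk.get("Slot Number")))
    return bad


if __name__ == "__main__":
    for enclosure, slot in suspect_disks(parse_pdlist(SAMPLE)):
        print(f"enclosure {enclosure} slot {slot}: error counters non-zero")
```

In a cron job the SAMPLE text would be replaced by the captured output of the megacli binary, and the printed lines mailed or logged.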
Re: [CentOS] Dell PERC H800 commandline RAID monitoring tools
On Mar 7, 2011, at 3:43 PM, Dr. Ed Morbius dredmorb...@gmail.com wrote:
> We're looking for tools to be used in monitoring the PERC H800 arrays on a set of database servers running CentOS 5.5. We've installed most of the OMSA (Dell monitoring) suite. Our current alerting is happening through SNMP, though it's a bit hit or miss (we apparently missed a couple of earlier predictive failure alerts on one drive). OMSA conflicts with mega-cli, though we may find that the latter is the more useful package. Both are pretty byzantine; the Dell stuff simply doesn't have docs (in particular: docs on how to interpret the omconfig log output). Ideally we'd like something which could be run as a Nagios plugin or cron job providing information on RAID status and/or possible disk errors. Probably both, actually.

I can't speak about Nagios, but I have my OMSA set up to send traps and, for critical errors, to also send emails, and it works well for us.

If you link the shared lib (I forget the paths) and install megacli with --nodeps, you can have both installed.

-Ross
[CentOS] Dell PERC H800 commandline RAID monitoring tools
We're looking for tools to be used in monitoring the PERC H800 arrays on a set of database servers running CentOS 5.5.

We've installed most of the OMSA (Dell monitoring) suite. Our current alerting is happening through SNMP, though it's a bit hit or miss (we apparently missed a couple of earlier predictive failure alerts on one drive).

OMSA conflicts with mega-cli, though we may find that the latter is the more useful package. Both are pretty byzantine; the Dell stuff simply doesn't have docs (in particular: docs on how to interpret the omconfig log output).

Ideally we'd like something which could be run as a Nagios plugin or cron job providing information on RAID status and/or possible disk errors. Probably both, actually.

Thanks in advance.

-- 
Dr. Ed Morbius, Chief Scientist /            |
  Robot Wrangler / Staff Psychologist        | When you seek unlimited power
Krell Power Systems Unlimited                |                  Go to Krell!
Re: [CentOS] Dell PERC H800 commandline RAID monitoring tools
2011/3/7 Dr. Ed Morbius dredmorb...@gmail.com:
> We're looking for tools to be used in monitoring the PERC H800 arrays on a set of database servers running CentOS 5.5. We've installed most of the OMSA (Dell monitoring) suite. Our current alerting is happening through SNMP, though it's a bit hit or miss (we apparently missed a couple of earlier predictive failure alerts on one drive). OMSA conflicts with mega-cli, though we may find that the latter is the more useful package. Both are pretty byzantine; the Dell stuff simply doesn't have docs (in particular: docs on how to interpret the omconfig log output). Ideally we'd like something which could be run as a Nagios plugin or cron job providing information on RAID status and/or possible disk errors. Probably both, actually.

If your system supports omreport (which comes with OMSA), then this is a good solution: http://folk.uio.no/trondham/software/check_openmanage.html

-- 
Eero
Re: [CentOS] Dell PERC H800 commandline RAID monitoring tools
on 22:57 Mon 07 Mar, Eero Volotinen (eero.voloti...@iki.fi) wrote:
> 2011/3/7 Dr. Ed Morbius dredmorb...@gmail.com:
>> We're looking for tools to be used in monitoring the PERC H800 arrays on a set of database servers running CentOS 5.5. We've installed most of the OMSA (Dell monitoring) suite. Our current alerting is happening through SNMP, though it's a bit hit or miss (we apparently missed a couple of earlier predictive failure alerts on one drive). OMSA conflicts with mega-cli, though we may find that the latter is the more useful package. Both are pretty byzantine; the Dell stuff simply doesn't have docs (in particular: docs on how to interpret the omconfig log output). Ideally we'd like something which could be run as a Nagios plugin or cron job providing information on RAID status and/or possible disk errors. Probably both, actually.
>
> If your system supports omreport (which comes with OMSA), then this is a good solution: http://folk.uio.no/trondham/software/check_openmanage.html

So ... this slots on top of OMSA to provide reporting?

-- 
Dr. Ed Morbius
Re: [CentOS] Dell PERC H800 commandline RAID monitoring tools
2011/3/7 Dr. Ed Morbius dredmorb...@gmail.com:
> on 22:57 Mon 07 Mar, Eero Volotinen (eero.voloti...@iki.fi) wrote:
>> 2011/3/7 Dr. Ed Morbius dredmorb...@gmail.com:
>>> We're looking for tools to be used in monitoring the PERC H800 arrays on a set of database servers running CentOS 5.5. We've installed most of the OMSA (Dell monitoring) suite. Our current alerting is happening through SNMP, though it's a bit hit or miss (we apparently missed a couple of earlier predictive failure alerts on one drive). OMSA conflicts with mega-cli, though we may find that the latter is the more useful package. Both are pretty byzantine; the Dell stuff simply doesn't have docs (in particular: docs on how to interpret the omconfig log output). Ideally we'd like something which could be run as a Nagios plugin or cron job providing information on RAID status and/or possible disk errors. Probably both, actually.
>>
>> If your system supports omreport (which comes with OMSA), then this is a good solution: http://folk.uio.no/trondham/software/check_openmanage.html
>
> So ... this slots on top of OMSA to provide reporting?

This plugin parses omreport output and uses it for Nagios output. The OMSA web server is not required, but a working omreport CLI is. It works great on my servers.

-- 
Eero
Re: [CentOS] Dell PERC H800 commandline RAID monitoring tools
-------- Original Message --------
Subject: [CentOS] Dell PERC H800 commandline RAID monitoring tools
From: Dr. Ed Morbius dredmorb...@gmail.com
To: CentOS User list centos@centos.org
Date: Monday, March 07, 2011 2:43:03 PM
> We're looking for tools to be used in monitoring the PERC H800 arrays on a set of database servers running CentOS 5.5.

If you purchased the server with an add-in DRAC, the DRAC can provide email alerts if an array becomes degraded (or for just about any other hardware fault). This isn't necessarily a replacement for your current monitoring, but it can be used to supplement or complement it.

--Blake
Re: [CentOS] Dell PERC H800 commandline RAID monitoring tools
on 16:04 Mon 07 Mar, Blake Hudson (bl...@ispn.net) wrote:
> -------- Original Message --------
> Subject: [CentOS] Dell PERC H800 commandline RAID monitoring tools
> From: Dr. Ed Morbius dredmorb...@gmail.com
> To: CentOS User list centos@centos.org
> Date: Monday, March 07, 2011 2:43:03 PM
>> We're looking for tools to be used in monitoring the PERC H800 arrays on a set of database servers running CentOS 5.5.
>
> If you purchased the server with an add-in DRAC, the DRAC can provide email alerts if an array becomes degraded (or for just about any other hardware fault). This isn't necessarily a replacement for your current monitoring, but it can be used to supplement or complement it.

The iDRAC /doesn't/ report on RAID / storage configuration or status.

iDRAC 6, Dell r610, onboard PERC H700, offboard PERC H800 (MD1200 array). BIOS version 2.1.15, Firmware 1.54 (Build 15).

We get batteries, fans, intrusion, power, removable flash media, temps, and volts, but not storage.

The iDRAC is pretty good compared with some past Dell offerings. The ability to boot virtual media in particular is very slick (I can specify local removable storage or a drive image and mount it for booting / diagnostics remotely). But no RAID / storage management or monitoring.

-- 
Dr. Ed Morbius
Re: [CentOS] Dell PERC H800 commandline RAID monitoring tools
on 12:43 Mon 07 Mar, Dr. Ed Morbius (dredmorb...@gmail.com) wrote:
> We're looking for tools to be used in monitoring the PERC H800 arrays on a set of database servers running CentOS 5.5.

Pardoning the self-reply, but one issue we've had is reconciling the omconfig log report with the Dell Server Manager syslog messages. omconfig reported a predictive drive failure, but we (and three Dell storage/support techs) had trouble identifying which actual device was being reported as bad.

From 'omconfig storage controller action=exportlog controller=0' output:

    03/04/11 21:42:42: EVT#02959-03/04/11 21:42:42: 96=Predictive failure: PD 00(e0x08/s2)
    03/05/11 14:28:41: EVT#02961-03/05/11 14:28:41: 112=Removed: PD 00(e0x08/s2)

In /var/log/messages (timestamp/hostname trimmed):

    Server Administrator: Storage Service EventID: 2243 The Patrol Read has stopped.: Controller 0 (PERC H800 Adapter)
    Server Administrator: Storage Service EventID: 2049 Physical disk removed: Physical Disk 0:0:2 Controller 0, Connector 0

The Server Administrator reports of a slot 2 failure correspond to the drive which was physically replaced. The OMSA omconfig report is throwing us a bunch of crud about some device, but Dell variously identified it as slot 0 and slot 9. We're now getting from them that /s2 identifies slot 2.

Dell said point blank "you're not going to have any luck with that" as far as documentation of the OMSA log report format and its parsing goes.

Does anyone have a clue as to WTF it's actually trying to say, or what this tool is based off of? (I'm suspecting mega-cli on a general hunch, but not much stronger.)

Enterprise support indeed.

-- 
Dr. Ed Morbius
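Since the exported log format is undocumented, the following is only a sketch of how the two event lines quoted above could be picked apart. The reading of "/s2" as slot 2 is the one Dell eventually gave in this thread; treating "e0x08" as a hexadecimal enclosure ID is a guess, and the regular expression is fitted to just these two lines, so other event types may not match.

```python
import re

# The two event lines quoted in the message above.
LOG = """\
03/04/11 21:42:42: EVT#02959-03/04/11 21:42:42: 96=Predictive failure: PD 00(e0x08/s2)
03/05/11 14:28:41: EVT#02961-03/05/11 14:28:41: 112=Removed: PD 00(e0x08/s2)
"""

# Pattern fitted to the sample lines only; the format itself is undocumented.
EVENT_RE = re.compile(
    r"EVT#(?P<seq>\d+)-(?P<when>\S+ \S+):\s+"
    r"(?P<code>\d+)=(?P<desc>[^:]+): "
    r"PD (?P<pd>\w+)\(e(?P<enc>0x[0-9a-fA-F]+)/s(?P<slot>\d+)\)"
)


def parse_events(text):
    """Return one dict per physical-disk event found in the exported log."""
    events = []
    for line in text.splitlines():
        match = EVENT_RE.search(line)
        if match is None:
            continue  # not a physical-disk event in the shape we know
        events.append({
            "sequence": int(match.group("seq")),
            "timestamp": match.group("when"),
            "code": int(match.group("code")),
            "description": match.group("desc"),
            "pd": match.group("pd"),
            # Assumption: the e0x.. token is a hexadecimal enclosure ID.
            "enclosure": int(match.group("enc"), 16),
            # Per Dell's answer in this thread, /sN is the slot number.
            "slot": int(match.group("slot")),
        })
    return events


if __name__ == "__main__":
    for event in parse_events(LOG):
        print(f"{event['timestamp']} slot {event['slot']}: {event['description']}")
```

Read this way, both events point at slot 2, which agrees with the "Physical Disk 0:0:2" entry in /var/log/messages.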
Re: [CentOS] Dell PERC H800 commandline RAID monitoring tools
on 23:15 Mon 07 Mar, Eero Volotinen (eero.voloti...@iki.fi) wrote:
> 2011/3/7 Dr. Ed Morbius dredmorb...@gmail.com:
>> on 22:57 Mon 07 Mar, Eero Volotinen (eero.voloti...@iki.fi) wrote:
>>> 2011/3/7 Dr. Ed Morbius dredmorb...@gmail.com:
>>>> We're looking for tools to be used in monitoring the PERC H800 arrays on a set of database servers running CentOS 5.5. We've installed most of the OMSA (Dell monitoring) suite. Our current alerting is happening through SNMP, though it's a bit hit or miss (we apparently missed a couple of earlier predictive failure alerts on one drive). OMSA conflicts with mega-cli, though we may find that the latter is the more useful package. Both are pretty byzantine; the Dell stuff simply doesn't have docs (in particular: docs on how to interpret the omconfig log output). Ideally we'd like something which could be run as a Nagios plugin or cron job providing information on RAID status and/or possible disk errors. Probably both, actually.
>>>
>>> If your system supports omreport (which comes with OMSA), then this is a good solution: http://folk.uio.no/trondham/software/check_openmanage.html
>>
>> So ... this slots on top of OMSA to provide reporting?
>
> This plugin parses omreport output and uses it for Nagios output.

Is it running/invoking omreport or relying on periodic runs? I'll dig through the docs, but if you know this off-hand it'd be helpful.

> The OMSA web server is not required, but a working omreport CLI is. It works great on my servers.

Good to know, much appreciated.

-- 
Dr. Ed Morbius
Re: [CentOS] Dell PERC H800 commandline RAID monitoring tools
2011/3/8 Dr. Ed Morbius dredmorb...@gmail.com:
> on 23:15 Mon 07 Mar, Eero Volotinen (eero.voloti...@iki.fi) wrote:
>> 2011/3/7 Dr. Ed Morbius dredmorb...@gmail.com:
>>> on 22:57 Mon 07 Mar, Eero Volotinen (eero.voloti...@iki.fi) wrote:
>>>> 2011/3/7 Dr. Ed Morbius dredmorb...@gmail.com:
>>>>> We're looking for tools to be used in monitoring the PERC H800 arrays on a set of database servers running CentOS 5.5. We've installed most of the OMSA (Dell monitoring) suite. Our current alerting is happening through SNMP, though it's a bit hit or miss (we apparently missed a couple of earlier predictive failure alerts on one drive). OMSA conflicts with mega-cli, though we may find that the latter is the more useful package. Both are pretty byzantine; the Dell stuff simply doesn't have docs (in particular: docs on how to interpret the omconfig log output). Ideally we'd like something which could be run as a Nagios plugin or cron job providing information on RAID status and/or possible disk errors. Probably both, actually.
>>>>
>>>> If your system supports omreport (which comes with OMSA), then this is a good solution: http://folk.uio.no/trondham/software/check_openmanage.html
>>>
>>> So ... this slots on top of OMSA to provide reporting?
>>
>> This plugin parses omreport output and uses it for Nagios output.
>
> Is it running/invoking omreport or relying on periodic runs? I'll dig through the docs, but if you know this off-hand it'd be helpful.

It runs omreport each time Nagios polls it via NRPE or SNMP.

-- 
Eero
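To illustrate the "run omreport on every poll" model described above, here is a rough Python 3 sketch of a Nagios-style check. It is not check_openmanage and does far less. It assumes `omreport storage pdisk controller=0` prints "label : value" records with ID, Status and Failure Predicted fields, as in the hypothetical SAMPLE below; the exact labels and values may differ between OMSA versions. The 0/1/2/3 exit codes follow the usual Nagios plugin convention, and run_check() is a hypothetical helper name.

```python
import subprocess

# Conventional Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3
LABELS = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL", UNKNOWN: "UNKNOWN"}

# Hypothetical sample; real omreport output has more fields per disk.
SAMPLE = """\
ID                : 0:0:1
Status            : Ok
State             : Online
Failure Predicted : No

ID                : 0:0:2
Status            : Non-Critical
State             : Online
Failure Predicted : Yes
"""


def parse_pdisks(text):
    """Split omreport-style output into one dict per physical disk."""
    disks, current = [], None
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue
        key, value = key.strip(), value.strip()
        if key == "ID":
            # Assumed to be the first field of each disk record.
            current = {}
            disks.append(current)
        if current is not None:
            current[key] = value
    return disks


def evaluate(disks):
    """Map parsed disk records to a (Nagios exit code, message) pair."""
    if not disks:
        return UNKNOWN, "no physical disks found in omreport output"
    code, problems = OK, []
    for disk in disks:
        status = disk.get("Status", "").lower()
        predicted = disk.get("Failure Predicted", "No").lower() == "yes"
        if status == "ok" and not predicted:
            continue
        severity = WARNING if status in ("ok", "non-critical") else CRITICAL
        code = max(code, severity)
        problems.append(
            f"disk {disk.get('ID', '?')} status={disk.get('Status', '?')} "
            f"failure_predicted={'yes' if predicted else 'no'}"
        )
    if code == OK:
        return OK, f"{len(disks)} physical disks healthy"
    return code, "; ".join(problems)


def run_check(controller=0):
    """Invoke omreport once, as a poll would, and evaluate its output."""
    command = ["omreport", "storage", "pdisk", f"controller={controller}"]
    try:
        result = subprocess.run(command, capture_output=True, text=True)
    except FileNotFoundError:
        return UNKNOWN, "omreport not found"
    return evaluate(parse_pdisks(result.stdout))


if __name__ == "__main__":
    # Demonstrate on the sample; a real plugin would call run_check()
    # and exit with the returned code.
    code, message = evaluate(parse_pdisks(SAMPLE))
    print(f"{LABELS[code]}: {message}")
```

Used as an actual plugin under NRPE, the last block would call run_check(), print the message and pass the code to sys.exit(). The same script run from cron covers the "probably both" case from the original question.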