[ANNOUNCE] Einarc - universal RAID management/monitoring tool

2007-11-28 Thread Mikhail Yakshin

Hello!

I'd like to announce the release of the Einarc project - a universal
RAID management/monitoring tool.


Some time ago, I started working with various hardware RAID
controllers and was quickly disappointed by the fact that almost all of
them require proprietary utilities for management. For example, it's
virtually impossible to monitor RAID status from a running Linux system
on Areca, Adaptec, LSI, 3ware, CCISS, etc. controllers without
installing such a utility.


Moreover, these proprietary utilities are sometimes hard to find, and
every one of them has a different interface, command-line options, etc.
There seems to be no standard even for the entity hierarchy: Areca uses
a 3-tier hierarchy (physical discs - raidsets - volumesets), while LSI
uses a 2-tier one (physical - logical discs).


So, after giving it some thought and searching for a solution to unite
them all, I found out that almost no such thing exists. ManageEngine
OpStor is a proprietary and pretty expensive product that can do it,
and the only other attempt in this area seemed to be OpenBSD's bioctl
initiative, which faded out in 2005-2006.


Generally, the idea is simple: Einarc works as a translator that makes 
it possible for a user to control all these devices using simple terms 
like “physical disc”, “logical disc”, “adapter”, etc, while 
transparently converting these requests to proprietary RAID paradigms. 
In fact, the system still uses underlying proprietary CLIs, but the user 
doesn’t interact with them directly, staying in a single, 
well-documented interface.


During installation, Einarc automatically downloads these proprietary
utilities (after the user reads and agrees to their licenses), unpacks
and installs them. Then, for example, a command like


einarc -t areca physical list

would be translated into

/usr/local/lib/einarc/areca/cli disk info

and its result is parsed and returned in the same well-documented form
as for any other Einarc-supported adapter.
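
To make the translation layer concrete, here is a minimal sketch of
the idea in Python (an illustration only, not Einarc's actual code;
the dispatch table, CLI arguments and output format below are made up
for the example):

import subprocess

# Hypothetical dispatch table: maps a vendor-neutral command to the
# proprietary CLI invocation for each adapter.  The argv entries are
# illustrative, not Einarc's real ones.
NATIVE_ARGV = {
    ("areca", "physical list"): ["/usr/local/lib/einarc/areca/cli",
                                 "disk", "info"],
}

def physical_list(adapter):
    """Return a vendor-neutral [(id, model, size), ...] list."""
    raw = subprocess.run(NATIVE_ARGV[(adapter, "physical list")],
                         capture_output=True, text=True, check=True).stdout
    if adapter == "areca":
        return parse_areca_disk_info(raw)
    raise NotImplementedError(adapter)

def parse_areca_disk_info(raw):
    # A real parser would pick apart Areca's exact table layout; here
    # we just pretend each line is "id model size".
    discs = []
    for line in raw.splitlines():
        fields = line.split()
        if len(fields) >= 3:
            discs.append((fields[0], fields[1], fields[2]))
    return discs

The point is that callers only ever see the vendor-neutral entry point
and result format; each adapter's quirks stay hidden inside its own
dispatch entry and parser.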


I'd be really happy to hear any opinions / thoughts / feature requests
on this project, and I hope it will be useful to someone :) Einarc can
be found at


http://www.inquisitor.ru/doc/einarc/

for downloads and documentation.

--
WBR, Mikhail Yakshin AKA GreyCat


Re: raid5 reshape/resync

2007-11-28 Thread Neil Brown
On Sunday November 25, [EMAIL PROTECTED] wrote:
 - Message from [EMAIL PROTECTED] -
      Date: Sat, 24 Nov 2007 12:02:09 +0100
      From: Nagilum [EMAIL PROTECTED]
  Reply-To: Nagilum [EMAIL PROTECTED]
   Subject: raid5 reshape/resync
        To: linux-raid@vger.kernel.org
 
  Hi,
  I'm running 2.6.23.8 x86_64 using mdadm v2.6.4.
  I was adding a disk (/dev/sdf) to an existing raid5 (/dev/sd[a-e] - md0).
  During that reshape (at around 4%), /dev/sdd reported read errors and
  went offline.

Sad.

  I replaced /dev/sdd with a new drive and tried to reassemble the array
  (/dev/sdd was shown as removed and now as spare).

There must be a step missing here.
Just because one drive goes offline doesn't mean that you need to
reassemble the array.  It should just continue with the reshape until
that is finished.  Did you shut the machine down, or did it crash, or
what?

  Assembly worked, but it would not run unless I used --force.

That suggests an unclean shutdown.  Maybe it did crash?


  Since I'm always reluctant to use force, I put the bad disk back in,
  this time as /dev/sdg. I re-added the drive and could run the array.
  The array started to resync (since the disk can be read up to the 4%
  mark) and then I marked the disk as failed. Now the array is active,
  degraded, recovering:

It should have restarted the reshape from wherever it was up to, so
it should have hit the read error almost immediately.  Do you remember
where it started the reshape from?  If it restarted from the beginning,
that would be bad.

Did you just --assemble all the drives or did you do something else?

 
  What I find somewhat confusing/disturbing is that md does not appear
  to utilize /dev/sdd. What I see here could be explained by md doing a
  RAID5 resync from the 4 drives sd[a-c,e] to sd[a-c,e,f], but I would
  have expected it to use the new spare sdd for that. Also the speed is

md cannot recover to a spare while a reshape is happening.  It
completes the reshape, then does the recovery (as you discovered).

  unusually low which seems to indicate a lot of seeking as if two
  operations are happening at the same time.

Well, reshape is always slow as it has to read from one part of the
drive and write to another part of the drive.

  Also when I look at the data rates it looks more like the reshape is
  continuing even though one drive is missing (possible but risky).

Yes, that is happening.

  Can someone relieve my doubts as to whether md does the right thing here?
  Thanks,

I believe it is doing the right thing.

 
 - End message from [EMAIL PROTECTED] -
 
 Ok, so the reshape tried to continue without the failed drive and  
 after that resynced to the new spare.

As I would expect.

 Unfortunately the result is a mess. On top of the RAID5 I have

Hmm.  This I would not expect.

 dm-crypt and LVM.
 Although dm-crypt and LVM don't appear to have a problem, the
 filesystems on top are a mess now.

Can you be more specific about what sort of mess they are in?

NeilBrown


 I still have the failed drive, I can read the superblock from that  
 drive and up to 4% from the beginning and probably backwards from the  
 end towards that point.
 So in theory it could be possible to reorder the stripe blocks which
 appear to have been messed up. (?)
 Unfortunately I'm not sure what exactly went wrong or what I did
 wrong. Can someone please give me a hint?
 Thanks,
 Alex.
 
 
 #_  __  _ __ http://www.nagilum.org/ \n icq://69646724 #
 #   / |/ /__  _(_) /_  _  [EMAIL PROTECTED] \n +491776461165 #
 #  // _ `/ _ `/ / / // /  ' \  Amiga (68k/PPC): AOS/NetBSD/Linux   #
 # /_/|_/\_,_/\_, /_/_/\_,_/_/_/_/   Mac (PPC): MacOS-X / NetBSD /Linux #
 #   /___/ x86: FreeBSD/Linux/Solaris/Win2k  ARM9: EPOC EV6 #
 
 
 
 
 cakebox.homeunix.net - all the machine one needs..
 


Re: raid6 check/repair

2007-11-28 Thread Neil Brown
On Thursday November 22, [EMAIL PROTECTED] wrote:
 Dear Neil,
 
 thank you very much for your detailed answer.
 
 Neil Brown wrote:
  While it is possible to use the RAID6 P+Q information to deduce which
  data block is wrong if it is known that either 0 or 1 datablocks is 
  wrong, it is *not* possible to deduce which block or blocks are wrong
  if it is possible that more than 1 data block is wrong.
 
 If I'm not mistaken, this is only partly correct.  Using P+Q redundancy,
 it *is* possible to distinguish three cases:
 a) exactly zero bad blocks
 b) exactly one bad block
 c) more than one bad block
 
 Of course, it is only possible to recover from b), but one *can* tell
 whether the situation is a), b), or c) and act accordingly.

It would seem that either you or Peter Anvin is mistaken.

On page 9 of 
  http://www.kernel.org/pub/linux/kernel/people/hpa/raid6.pdf
at the end of section 4 it says:

  Finally, as a word of caution it should be noted that RAID-6 by
  itself cannot even detect, never mind recover from, dual-disk
  corruption. If two disks are corrupt in the same byte positions,
  the above algorithm will in general introduce additional data
  corruption by corrupting a third drive.
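
For what it's worth, the arithmetic both sides are describing is easy
to check numerically.  Here is a small Python sketch (an illustration
only, not md's code), using the same GF(2^8) conventions as hpa's
paper (polynomial 0x11d, generator g = 2).  With exactly one corrupted
data block, the P and Q syndromes locate it, which is case b) above;
with two corrupted blocks, the same locator can point at an innocent
third drive, which is exactly hpa's warning:

# Multiply in GF(2^8) with the RAID-6 polynomial 0x11d.
def gf_mul(a, b):
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11d
        b >>= 1
    return r

# Log/antilog tables for the generator g = 2.
GF_EXP, GF_LOG = [0] * 255, [0] * 256
x = 1
for i in range(255):
    GF_EXP[i], GF_LOG[x] = x, i
    x = gf_mul(x, 2)

def pq(data):
    # P is plain XOR parity; Q weights data block i by g^i.
    p = q = 0
    for i, d in enumerate(data):
        p ^= d
        q ^= gf_mul(GF_EXP[i], d)
    return p, q

def classify(data, p, q):
    # Recompute parity and compare against the stored P and Q.
    cp, cq = pq(data)
    sp, sq = cp ^ p, cq ^ q
    if sp == 0 and sq == 0:
        return "a) clean"
    if sp and sq:
        # For a single error E in data block z: sp = E, sq = g^z * E,
        # so z = log(sq) - log(sp).
        z = (GF_LOG[sq] - GF_LOG[sp]) % 255
        if z < len(data):
            return "b) single bad data block at index %d" % z
        return "c) more than one bad block"
    return "P or Q itself bad (or multiple errors)"

data = [0x12, 0x34, 0x56, 0x78]
p, q = pq(data)
data[2] ^= 0x5a              # silent corruption on one drive
print(classify(data, p, q))  # -> b) single bad data block at index 2
data[0] ^= 0x99              # corrupt a second drive as well...
print(classify(data, p, q))  # ...and the locator can no longer be trusted

So Thiemo is right that the three cases can usually be told apart, and
hpa is right that a double corruption can masquerade as case b) and
get "repaired" onto the wrong drive.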

 
 The point that I'm trying to make is that there does exist a specific
 case in which recovery is possible, and that implementing recovery for
 that case will not hurt in any way.

Assuming that is true (maybe hpa got it wrong), what specific
conditions would lead to one drive having corrupt data, and would
correcting it on an occasional 'repair' pass be an appropriate
response?

Does the value justify the cost of extra code complexity?

 
  RAID is not designed to protect against bad RAM, bad cables, chipset
  bugs, driver bugs, etc.  It is only designed to protect against drive
  failure, where the drive failure is apparent, i.e. a read must
  return either the same data that was last written, or a failure
  indication.  Anything else is beyond the design parameters for RAID.
 
 I'm taking a more pragmatic approach here.  In my opinion, RAID should
 just protect my data: against drive failure, yes, of course, but if it
 can help me in case of occasional data corruption, I'd happily take
 that, too, especially if it doesn't cost extra... ;-)

Everything costs extra.  Code uses bytes of memory, requires
maintenance, and possibly introduces new bugs.  I'm not convinced the
failure mode that you are considering actually happens with a
meaningful frequency.

NeilBrown



Re: raid6 check/repair

2007-11-28 Thread Neil Brown
On Tuesday November 27, [EMAIL PROTECTED] wrote:
 Thiemo Nagel wrote:
  Dear Neil,
 
  thank you very much for your detailed answer.
 
  Neil Brown wrote:
  While it is possible to use the RAID6 P+Q information to deduce which
  data block is wrong if it is known that either 0 or 1 datablocks is 
  wrong, it is *not* possible to deduce which block or blocks are wrong
  if it is possible that more than 1 data block is wrong.
 
  If I'm not mistaken, this is only partly correct.  Using P+Q redundancy,
  it *is* possible to distinguish three cases:
  a) exactly zero bad blocks
  b) exactly one bad block
  c) more than one bad block
 
  Of course, it is only possible to recover from b), but one *can* tell
  whether the situation is a), b), or c) and act accordingly.
 I was waiting for a response before saying "me too", but that's exactly
 the case: there is a class of failures, other than power failure or
 total device failure, which results in just one identifiable bad
 sector.  Given that the data needs to be read to realize that it is
 bad, why not go the extra inch and fix it properly, instead of redoing
 P+Q, which just makes the problem invisible rather than fixing it?
 
 Obviously this is a subset of all the things which can go wrong, but I
 suspect it's a sizable subset.

Why do you think that it is a sizable subset?  Disk drives have
internal checksums which are designed to prevent corrupted data from
being returned.

If the data is getting corrupted on some bus between the CPU and the
media, then I suspect that your problem is big enough that RAID cannot
meaningfully solve it, and new hardware, plus possibly a restore from
backup, would be the only credible option.

NeilBrown