Re: Question: array locking, possible?

2006-02-08 Thread Chris Osicki
On Tue, 07 Feb 2006 10:16:20 -0800
Mike Hardy [EMAIL PROTECTED] wrote:

 
 
 Chris Osicki wrote:
 
  
  To rephrase my question, is there any way to make it visible to the
  other host that the array is up and running on this host?
  
  Any comments, ideas?
 
 Would that not imply an unlock command before you could run the array
 on the other host?

Yes, it would. I was thinking about an advisory lock, and a well-known
-f option for those who know what they are doing ;-)

 
 Would that not then break the automatic fail-over you want, as no
 machine that died or hung would issue the unlock command, meaning that
 the fail-over node could not then use the disks

If I trust my cluster software it's not a problem; I use the -f.
My concern is, as I said, accidental array activation on the other node.

 
 It's an interesting idea, I just can't think of a way to make it work
 unattended

 
 It might be possible to wrap the 'mdadm' binary with a script that checks
 the host's status (maybe via some deep check using ssh to execute remote
 commands, or just a ping) and prints a little table of host status
 that can only be bypassed by passing a special --yes-i-know flag to the
 wrapper

Something more or less like what you are thinking about has already been
done. The cluster I'm currently working on is Service Guard on Linux; the
original platform is HP-UX. There, LVM is used for mirroring and device
locking is done at the LVM level: the active cluster node activates a
volume group in exclusive mode, which writes a kind of flag onto the
disk. Should the node die without a chance to clear the flag, the node
taking over the service knows what happened and forces the take-over
of the volume group.  This feature is missing on Linux.
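
(A rough sketch of the kind of wrapper Mike describes; the host name, the
ssh-based check and the --yes-i-know handling here are hypothetical, not
something mdadm itself provides:

    #!/bin/sh
    # mdadm wrapper: refuse to assemble the shared array while the peer looks alive.
    PEER=other-node.example.com            # hypothetical peer host name
    if [ "$1" = "--yes-i-know" ]; then
        shift                              # operator explicitly overrides the check
    elif ping -c 1 -w 2 "$PEER" >/dev/null 2>&1 &&
         ssh "$PEER" grep -q '^md0' /proc/mdstat 2>/dev/null; then
        echo "md0 appears active on $PEER; re-run with --yes-i-know to override." >&2
        exit 1
    fi
    exec /sbin/mdadm "$@"
)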

I already have a Linux cluster which has been running for over a
year without problems.  I've just set up three more, and to sleep better
I'm looking for a way to reduce the chances of a disaster due to an
operator fault.

Regards,
Chris

 
 
 -Mike


Re: Question: array locking, possible?

2006-02-08 Thread Chris Osicki


I was thinking about it, but I have no idea how to do it on Linux, if it's possible at all.
I connect over a Fibre Channel SAN, using QLogic QLA2312 HBAs, if it matters.

Anyone any hints?

Thanks and regards,
Chris

On Tue, 07 Feb 2006 14:26:13 -0500
Paul Clements [EMAIL PROTECTED] wrote:

 Chris Osicki wrote:
  The problem now is how to prevent somebody on the other host from
  accidentally assembling the array, because the result of doing so would
  be anything from strange to catastrophic ;-)

  To rephrase my question, is there any way to make it visible to the
  other host that the array is up and running on this host?
 
 I don't know how the storage boxes are attached to the servers, but you 
 might be able to use SCSI reservations, if the storage supports them.
 
 --
 Paul
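
(For what it's worth, SCSI-3 persistent reservations can usually be driven
from Linux with sg_persist from the sg3_utils package, provided the storage
honours them. A minimal sketch -- the device name and reservation key are
placeholders:

    # register a key for this host, then take an exclusive-access reservation
    sg_persist --out --register --param-sark=0xabc1 /dev/sda
    sg_persist --out --reserve --param-rk=0xabc1 --prout-type=3 /dev/sda
    # type 3 = Exclusive Access; see the sg_persist man page for other types

Whether this actually blocks the other node depends on the array and HBA
support, so it is only a starting point.)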
 


Re: mdadm and monitoring

2006-02-08 Thread Gordon Henderson
On Wed, 8 Feb 2006, discman (sent by Nabble.com) wrote:

 Hi.

 Anyone with some experience on mdadm?

Just about everyone here, I'd hope ;-)

 I have a running RAID0 array with mdadm and it's using monitor mode with an
 e-mail address.
 Does anyone know how to remove that e-mail address without deleting the RAID?

It all depends on how you set up mdadm in the first place - or whether it was
supplied with your distro.

mdadm will be running in monitor mode with flags something like:

  /sbin/mdadm -F -i /var/run/mdadm.pid -m root -f -s

The -m flag says who gets emailed. If it's started at boot time then
you'll need to trace through the /etc/init.d/mdadm (probably) file to see
where it gets its parameters from.

Debian (sarge) puts this email address directly in the /etc/init.d/mdadm
file, and stopping and starting the monitor process won't have any effect on
your array whatsoever.
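
(Depending on how the monitor is started, the address may instead come from a
MAILADDR line in mdadm.conf, which mdadm --monitor uses when no -m is given on
the command line. A sketch -- the exact paths are distro-dependent:

    # /etc/mdadm/mdadm.conf  (or /etc/mdadm.conf)
    MAILADDR root@localhost

    # restart only the monitor; the array itself is untouched
    /etc/init.d/mdadm restart
)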

Gordon


RAID 5 inaccessible

2006-02-08 Thread Krekna Mektek
Hi,

I found out that my storage drive was gone, so I went to my server to
check out what was wrong.
I've got 3 400GB disks which form the array.

I found out I had one spare and one faulty drive, and the RAID 5 array
was not able to recover.
After a reboot (because of some stuff with Xen), my main root disk (hda)
was also failing, and the whole machine was not able to boot anymore.
And there I was...
After I tried to commit suicide and did not succeed, I went back to my
server to try something out.
I booted with Knoppix 4.02 and edited the mdadm.conf as follows:

DEVICE /dev/hd[bcd]1
ARRAY /dev/md0 devices=/dev/hdb1,/dev/hdc1,/dev/hdd1


I executed mdrun and the following messages appeared:

Forcing event count in /dev/hdd1(2) from 81190986 upto 88231796
clearing FAULTY flag for device 2 in /dev/md0 for /dev/hdd1
/dev/md0 has been started with 2 drives (out of 3) and 1 spare.
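
(For reference, roughly the same forced assembly can be done directly with
mdadm, without an mdadm.conf, using the device names above:

    mdadm --stop /dev/md0
    mdadm --assemble --force /dev/md0 /dev/hdb1 /dev/hdc1 /dev/hdd1

--force lets mdadm ignore the event-count mismatch, which is what the
"Forcing event count" message above reflects.)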

So I thought I was lucky enough to get my data back, perhaps with a bit
lost because of the missing events in the event count. Am I right?

But when I tried to mount it the next day, that didn't work either. I
ended up with one faulty, one spare and one active device. After
stopping and starting the array a few times it started rebuilding
again, but I found out that the disk it needs to rebuild the array
(hdd1, that is) is getting errors and falls back to faulty again.



    Number   Major   Minor   RaidDevice   State
       0        3      65        0        active sync
       1        0       0        -        removed
       2       22      65        2        active sync

       3       22       1        1        spare rebuilding


and then this:

Rebuild Status : 1% complete

    Number   Major   Minor   RaidDevice   State
       0        3      65        0        active sync
       1        0       0        -        removed
       2        0       0        -        removed

       3       22       1        1        spare rebuilding
       4       22      65        2        faulty

And my dmesg is full of these errors coming from the faulty hdd:
end_request: I/O error, dev hdd, sector 13614775
hdd: dma_intr: status=0x51 { DriveReady SeekComplete Error }
hdd: dma_intr: error=0x40 { UncorrectableError }, LBAsect=13615063,
high=0, low=13615063, sector=13614783
ide: failed opcode was: unknown
end_request: I/O error, dev hdd, sector 13614783


I guess this will never succeed...

Is there a way to get this data back from the individual disks perhaps?


FYI:


[EMAIL PROTECTED] cat /proc/mdstat
Personalities : [raid5]
md0 : active raid5 hdb1[0] hdc1[3] hdd1[4](F)
  781417472 blocks level 5, 64k chunk, algorithm 2 [3/1] [U__]
  []  recovery =  1.7% (6807460/390708736)
finish=3626.9min speed=1764K/sec
unused devices: <none>

Krekna


Re: mdraid0 worries on 2.6

2006-02-08 Thread Richard Mittendorfer
Also sprach Richard Mittendorfer [EMAIL PROTECTED] (Tue, 7 Feb 2006
01:12:20 +0100):
 Hi,
 
 I do have _great_ performance troubles with software raid0 with 2.6.12
 to .15. I have not tried earlier 2.6 kernels, but 2.4 (tested up to 2.4.24)
 does it right.

 In fact, md raid0 over 2 rather old 9G SCSI disks here in my home
 server shows the same throughput as a single drive. I tried tuning
 readahead and various /proc settings, and used reiserfs and xfs, and
 mdadm with different settings -- nothing helped. I'm clueless.

Looks like you're right. First, I didn't measure correctly: my mistake was
to use test files that were too small, so I was measuring RAM as well. No
wonder I got above 20 MB/sec. Here are bonnie++ (v1.03) results on xfs and
reiserfs over a raid0 [sda|sdb]. I did no tuning of drive/mount parameters
or VM with 2.4.

hdparm -tT
2.4.32 disks (first line per device: cached reads, second: buffered disk reads):
/dev/sdb   364 MB in 2.01 sec = 181 MB/sec
            44 MB in 3.14 sec =  14.01 MB/sec
/dev/sdc   360 MB in 2.01 sec = 179.10 MB/sec
            44 MB in 3.12 sec =  14.10 MB/sec
/dev/md4   352 MB in 2.01 sec = 175.12 MB/sec
            44 MB in 3.02 sec =  14.57 MB/sec

2.6.15 disks: (see above)

2.4.23 mem=256M / bonnie++ -s 512M
reiserfs (no mountoptions):
cell,512M,5686,88,21261,46,7710,10,5508,79,16705,9,342.4,3,
 16,3609,92,+,+++,6894,99,3727,95,+,+++,5054,88
xfs (logbufs=8)
cell,512M,6668,96,23106,25,7807,9,5468,79,16902,9,295.7,3,
 16,1170,57,+,+++,1410,48,1123,54,+,+++,758,36

2.4.23 mem=128M / bonnie++ -s 512M
reiserfs (no mountoptions)
cell,512M,6107,95,19152,41,8097,10,5815,84,16976,10,237.9,2,
 16,5710,94,+,+++,6884,99,5804,97,+,+++,5091,88
xfs (logbufs=8)
cell,512M,6716,97,19943,21,8145,9,5809,83,17167,9,209.9,2,
 16,898,44,+,+++,1523,54,863,44,+,+++,936,38

When testing with files four times the size of RAM it turns out that it
(the HBA) can't deliver more than 20 MB/sec. The 25 MB/sec I saw earlier
(like the 23 MB/sec seq. read above) was just due to bad measurement with
files only twice the size of RAM (which, by the way, didn't show up on 2.6).
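
(For anyone repeating the test: with 128 MB of RAM, something along these
lines keeps the working set well beyond the page cache; the directory and
user are placeholders, flags as in bonnie++ 1.03:

    bonnie++ -d /mnt/raidtest -s 512 -r 128 -u nobody

-s is the test file size in MB and -r the physical RAM size bonnie++ should
assume.)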

2.6.15-cks3 mem=128M / bonnie++ -s 512M
reiserfs (notail,noatime)
cell,512M,9134,98,18200,32,8924,13,10224,99,18583,17,238.1,3,
 16,5369,99,+,+++,6301,95,5197,100,+,+++,5443,100
xfs (logbufs=8)
cell,512M,9432,97,19480,25,8376,13,10914,97,18600,18,208.9,3,
 16,854,38,+,+++,1257,40,830,38,+,+++,735,30

sorry for the noise
 
ritch


RAID 106 array

2006-02-08 Thread Bill Davidsen
I was doing some advanced planning for storage, and it came to me that 
there might be benefit from thinking outside the box on RAID config. 
Therefore I offer this for comment.

The object is reliability with performance. The means is to set up two 
arrays on physical devices, one RAID-0 for performance, one RAID-6 for 
reliability. Let's call them four and seven drives for the RAID-0 and 
RAID-6+Spare. Then configure RAID-1 over the two arrays, marking the 
RAID-6 as write-mostly.
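
(For illustration, such a layout could be sketched with mdadm roughly as
follows; the device names are hypothetical and not from the original post:

    mdadm --create /dev/md10 --level=0 --raid-devices=4 /dev/sd[a-d]1
    mdadm --create /dev/md11 --level=6 --raid-devices=6 --spare-devices=1 /dev/sd[e-k]1
    # mirror the two arrays, with the RAID-6 leg marked write-mostly
    mdadm --create /dev/md12 --level=1 --raid-devices=2 /dev/md10 --write-mostly /dev/md11
)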

The intention is that under heavy load the RAID-6 would have a lot of head 
motion going on writing two parity blocks for every one data write, while 
the RAID-0 would be doing as little work as possible for writes and would 
therefore have more ability to handle reads quickly.

Just a thought experiment at the moment.

-- 
bill davidsen [EMAIL PROTECTED]
  CTO TMR Associates, Inc
Doing interesting things with little computers since 1979



Re: Question: array locking, possible?

2006-02-08 Thread Jure Pečar
On Wed, 8 Feb 2006 11:55:49 +0100
Chris Osicki [EMAIL PROTECTED] wrote:

 
 
 I was thinking about it, I have no idea how to do it on Linux if ever 
 possible.
 I connect over fibre channel SAN, using QLogic QLA2312 HBAS, if it matters.
 
 Anyone any hints?

I too am running a jbod with md raid between two machines. So far md has never
caused any kind of problems, although I did have situations where both
machines were syncing mirrors at once.

If there's a little tool to reserve a disk via scsi, I'd like to know about
it too. Even a piece of code would be enough.
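
(Assuming sg_persist from sg3_utils, as sketched earlier in this thread,
checking what is currently registered or reserved on a disk looks roughly
like this -- the device name is a placeholder:

    sg_persist --in --read-keys /dev/sda
    sg_persist --in --read-reservation /dev/sda
)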


-- 

Jure Pečar
http://jure.pecar.org/


Re: [PATCH 000 of 7] md: Introduction - raid5 reshape mark-2

2006-02-08 Thread Neil Brown
On Tuesday February 7, [EMAIL PROTECTED] wrote:
 Hello linux world!
 
 Excuse me for being so ignorant, but /exactly how/ do I go about finding
 out which files to download from kernel.org that these patches will
 apply to?

I always make them against the latest -mm kernel, so that would be a
good place to start.  However, things change quickly and I can't
promise they will apply against whatever is the 'latest' today.

If you would like to nominate a particular recent kernel, I'll create
a patch set that is guaranteed to apply against that. (Testing is
always appreciated, and well worth that small effort on my part).

NeilBrown
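
(For readers who, like Henrik below, only get "chunk failed" from patch: the
usual workflow is to unpack the base kernel the series was made for, bring it
up to the -mm level, then apply the md patches in order. A rough sketch with
placeholder file names -- check the -mm announcement for the exact base tree:

    tar xjf linux-2.6.15.tar.bz2
    cd linux-2.6.15
    bzcat ../2.6.16-rc2-mm1.bz2 | patch -p1        # the -mm patch, on top of its base tree
    for p in ../00?-md-*.patch; do
        patch -p1 --dry-run < "$p" && patch -p1 < "$p"
    done
)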

 
 [snip from N. Brown's initial post]
 
 
  [PATCH 001 of 7] md: Split disks array out of raid5 conf structure so it is 
  easier to grow.
  [PATCH 002 of 7] md: Allow stripes to be expanded in preparation for 
  expanding an array.
  [PATCH 003 of 7] md: Infrastructure to allow normal IO to continue while 
  array is expanding.
  [PATCH 004 of 7] md: Core of raid5 resize process
  [PATCH 005 of 7] md: Final stages of raid5 expand code.
  [PATCH 006 of 7] md: Checkpoint and allow restart of raid5 reshape
  [PATCH 007 of 7] md: Only checkpoint expansion progress occasionally.
 
 
 I only get lots of "chunk failed" messages when running the patch command
 on my src.tar.gz kernels. :-(
 
 Thanks for advice,
 
 Henrik Holst. Certified kernel patch noob.
 
 


Kernels and MD versions (was: md: Introduction - raid5 reshape mark-2)

2006-02-08 Thread Patrik Jonsson
Neil Brown wrote:
 I always make them against the latest -mm kernel, so that would be a
 good place to start.  However things change quickly and I can't
 promise it will apply against whatever is the 'latest' today.
 
 If you would like to nominate a particular recent kernel, I'll create
 a patch set that is guaranteed to apply against that. (Testing is
 always appreciated, and well worth that small effort on my part).

I find this is a major problem for me, too. Even though I try to stay up
to date with the md developments, I have a hard time piecing together
which past patches went with which version, so if I get a recent version
I don't know which patches need to be applied and which are already in
there.

My suggestion is that Neil, should he be willing, keep a log somewhere
which details each kernel version and the major updates in md functionality
that go along with it. Something like
2.6.14 raid5 read error correction
2.6.15 md /sys interface
2.6.16.rc1 raid5 reshape
2.6.16.rc2-mm4 something else cool

It would include the released kernels and the previews of the current
kernel. That way, say I see that the latest FC4 kernel is 2.6.14, I
could look and see that since the raid5 read error correction was
included, I don't have to go looking for the patches.

Maybe this is too much hassle (or maybe it's already out there
somewhere), but I'm thinking simple, and I think it would give a
high-level overview of the development for those of us who are not
intimately involved in every kernel version.

Regards,

/Patrik

