Proposal for enhancement of raid1 driver

2006-10-16 Thread Miroslaw Mieszczak

I would like to propose an enhancement to the raid1 driver in the Linux kernel.
The enhancement would speed up reading data from mirrored partitions.
The idea is simple.
If a partition is mirrored over 2 disks, and these disks are in sync, the data
can be read from both disks simultaneously, in the same way as in raid 0. So
chunk1 would be read from the master and chunk2 from the slave at the same time.
As a result, read operations would be significantly faster (comparable with the
speed of raid 0 disks).
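
The chunk-to-disk mapping described above can be sketched in a few lines of C. This is only an illustration under stated assumptions, not code from the md driver: the function name `pick_mirror`, the `chunk_sectors` parameter, and the numbering of the in-sync mirrors from 0 are all hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper, not part of the md driver: choose which mirror
 * serves a read so that consecutive chunks alternate between the disks,
 * the way RAID 0 distributes chunks across its members.
 */
unsigned int pick_mirror(uint64_t sector, uint32_t chunk_sectors,
                         unsigned int nmirrors)
{
    assert(chunk_sectors > 0 && nmirrors > 0);

    /* Index of the chunk that contains this sector. */
    uint64_t chunk = sector / chunk_sectors;

    /* Round-robin over the mirrors: chunk 0 -> mirror 0, chunk 1 -> mirror 1, ... */
    return (unsigned int)(chunk % nmirrors);
}
```

With two mirrors and 128-sector chunks, sectors 0 to 127 map to mirror 0, sectors 128 to 255 map to mirror 1, and the pattern then repeats.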

-
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: libata hotplug and md raid?

2006-10-16 Thread Neil Brown
On Monday October 16, [EMAIL PROTECTED] wrote:
> > So the question remains: How will hotplug and md work together?
> > 
> > How does md and hotplug work together for current hotplug devices?
> 
> I have the same questions.
> 
> How does this work in a pure SCSI environment? (has it been tested?)
> If something should change, should those changes be in the MD layer?
> Or can this *really* all be done nicely from userspace?  How?

I would imagine that device removal would work like this:
 1/ You unplug the device.
 2/ The kernel notices and generates an unplug event to udev.
 3/ Udev does all the work to try to disconnect the device:
      force unmount (though that doesn't work for most filesystems)
      remove from dm
      remove from md (mdadm /dev/mdwhatever --fail /dev/dead --remove /dev/dead)
 4/ Udev removes the node from /dev.

udev can find out what needs to be done by looking at
/sys/block/whatever/holders. 

I don't know exactly how to get udev to do this, or whether there
would be 'issues' in getting it to work reliably.  However if anyone
wants to try I'm happy to help out where I can.
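
As a rough illustration of the sysfs lookup mentioned above, the following C sketch lists the entries of a holders directory such as /sys/block/sda/holders. The function name `list_holders` and the fixed-size name buffers are assumptions made for this example; it only reads the directory and does not attempt any of the removal steps.

```c
#include <assert.h>
#include <dirent.h>
#include <stdio.h>
#include <string.h>

#define HOLDER_NAME_LEN 64  /* assumed upper bound on a holder name, for this sketch only */

/*
 * Hypothetical helper: copy the names found in a holders directory into
 * `names`. Returns the number of names stored, or -1 if the directory
 * cannot be opened.
 */
int list_holders(const char *holders_dir, char names[][HOLDER_NAME_LEN],
                 int max_names)
{
    assert(holders_dir != NULL && names != NULL && max_names >= 0);

    DIR *dir = opendir(holders_dir);
    if (dir == NULL)
        return -1;

    int count = 0;
    struct dirent *entry;
    while (count < max_names && (entry = readdir(dir)) != NULL) {
        /* Skip the directory's own "." and ".." entries. */
        if (strcmp(entry->d_name, ".") == 0 || strcmp(entry->d_name, "..") == 0)
            continue;
        snprintf(names[count], HOLDER_NAME_LEN, "%s", entry->d_name);
        count++;
    }

    closedir(dir);
    return count;
}
```

Each name returned would identify a device (for example an md array) that still holds the disk and would need to release it before the node can go away.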

NeilBrown


Re: libata hotplug and md raid?

2006-10-16 Thread Mark Lord

Leon Woestenberg wrote:
> Hello all,
>
> On 9/13/06, Tejun Heo <[EMAIL PROTECTED]> wrote:
> > Ric Wheeler wrote:
> > > Leon Woestenberg wrote:
> > >> In short, I use ext3 over /dev/md0 over 4 SATA drives /dev/sd[a-d]
> > >> each driven by libata ahci. I unplug then replug the drive that is
> > >> rebuilding in RAID-5.
> > >>
> > >> When I unplug a drive, /dev/sda is removed, hotplug seems to work to
> > >> the point where /proc/mdstat shows the drive failed, but not removed.
> >
> > Yeap, that sounds about right.
> >
> > >> Every other notion of the drive (in kernel and udev /dev namespace)
> > >> seems to be gone after unplugging. I cannot manually remove the drive
> > >> using mdadm, because it tells me the drive does not exist.
> >
> > I see.  That's a problem.  Can you use /dev/.static/dev/sda instead?  If
> > you can't find those static nodes, just create one w/ 'mknod
> > my-static-sda b 8 0' and use it.
>
> I did further testing of the ideas set out in this thread.
>
> Although I can use (1) static device nodes, or (2) persistent naming
> with the proper udev rules, each has its own kind of problems with md.
>
> As long as the kernel announces drives as disappeared but md still
> holds a lock, replugging drives will map to other major:minor number
> no matter what I try in userspace.
>
> Static device nodes will therefore not help me select the drive that
> was unplugged/plugged per se.
>
> Persistent naming using udev works OK (I used /dev/bay0 through
> /dev/bay3 to pinpoint the drive bays) but these disappear upon
> unplugging, while md keeps a lock to the major:minor, so replugging
> will move it to different major:minor numbers.
>
> So the question remains: How will hotplug and md work together?
>
> How does md and hotplug work together for current hotplug devices?
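
One way to observe the renumbering described above is to read the `dev` attribute that sysfs exposes for each block device (for example /sys/block/sda/dev), which holds the numbers as text in the form MAJOR:MINOR. The C sketch below parses that text. The function names `parse_dev_numbers` and `read_dev_numbers` are hypothetical, and error handling is kept minimal.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical helper: parse text of the form "MAJOR:MINOR" (as found in
 * a sysfs "dev" attribute). Returns 0 on success, -1 on malformed input.
 */
int parse_dev_numbers(const char *text, unsigned int *major_out,
                      unsigned int *minor_out)
{
    assert(text != NULL && major_out != NULL && minor_out != NULL);
    return sscanf(text, "%u:%u", major_out, minor_out) == 2 ? 0 : -1;
}

/*
 * Hypothetical helper: read the first line of a sysfs "dev" attribute
 * file and parse it. Returns 0 on success, -1 on any failure.
 */
int read_dev_numbers(const char *dev_attr_path, unsigned int *major_out,
                     unsigned int *minor_out)
{
    char line[32];
    FILE *fp = fopen(dev_attr_path, "r");
    if (fp == NULL)
        return -1;

    char *got = fgets(line, sizeof line, fp);
    fclose(fp);
    if (got == NULL)
        return -1;

    return parse_dev_numbers(line, major_out, minor_out);
}
```

Comparing the numbers read before unplugging with those read after replugging would show whether the drive came back under a different major:minor pair.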


I have the same questions.

How does this work in a pure SCSI environment? (has it been tested?)
If something should change, should those changes be in the MD layer?
Or can this *really* all be done nicely from userspace?  How?

I've got to fix some problems related to this, for a couple of clients,
and would like to Do It Right, or as close to Right as reality permits.

Cheers
--
Mark Lord
Real-Time Remedies Inc.
[EMAIL PROTECTED]



Re: [PATCH] md: Fix bug where new drives added to an md array sometimes don't sync properly.

2006-10-16 Thread Neil Brown
On Thursday October 12, [EMAIL PROTECTED] wrote:
> Neil Brown wrote:
> []
> > Fix count of degraded drives in raid10.
> > 
> > Signed-off-by: Neil Brown <[EMAIL PROTECTED]>
> > 
> > --- .prev/drivers/md/raid10.c   2006-10-09 14:18:00.0 +1000
> > +++ ./drivers/md/raid10.c   2006-10-05 20:10:07.0 +1000
> > @@ -2079,7 +2079,7 @@ static int run(mddev_t *mddev)
> >  		disk = conf->mirrors + i;
> >  
> >  		if (!disk->rdev ||
> > -		    !test_bit(In_sync, &rdev->flags)) {
> > +		    !test_bit(In_sync, &disk->rdev->flags)) {
> >  			disk->head_position = 0;
> >  			mddev->degraded++;
> >  		}
> 
> Neil, this makes me nervous.  Seriously.

Yes.  Bugs are a problem.

> 
> How many bugs like this has been fixed so far? 10? 50?  I stopped counting
> long time ago.  And it's the same thing in every case - misuse of rdev vs
> disk->rdev.  The same pattern.

I really don't think there have been that many that follow that
pattern that closely. Maybe 2 or 3.

> 
> I wonder if it can be avoided in the first place somehow - maybe don't
> declare and use local variable `rdev' (not by name, but by the semantics
> of it), and always use disk->rdev or mddev->whatever in every place,
> explicitly, and let the compiler optimize the deref if possible?
> 

There certainly are styles of programming and rules for choosing names
that can help reduce bugs.  And the kernel style does encourage some
good practices.
But that won't be enough by itself.  We need good style, and a review
process, and testing.  And still bugs will get through, but there
should be fewer.  You are welcome to help with any of these.

I hope to set up a more structured testing system soon which should be
able to catch this sort of bug at least.
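
As a small illustration of the kind of check such testing could make, here is a self-contained C sketch of the degraded count from the patch above. The structures `member_dev` and `mirror_slot` and the function `count_degraded` are simplified stand-ins invented for this example, not the kernel's definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified, hypothetical stand-ins for the kernel structures. */
struct member_dev {
    bool in_sync;               /* stands in for the In_sync flag bit */
};

struct mirror_slot {
    struct member_dev *rdev;    /* NULL when no device is attached */
    long head_position;
};

/*
 * Count the slots that are missing a device or whose device is not in
 * sync, resetting head_position for each of them as the patched loop does.
 */
int count_degraded(struct mirror_slot *mirrors, int nslots)
{
    assert(mirrors != NULL && nslots >= 0);

    int degraded = 0;
    for (int i = 0; i < nslots; i++) {
        struct mirror_slot *disk = &mirrors[i];

        /* Always test the device attached to this slot (disk->rdev),
         * never a local pointer left over from earlier code. */
        if (disk->rdev == NULL || !disk->rdev->in_sync) {
            disk->head_position = 0;
            degraded++;
        }
    }
    return degraded;
}
```

A test that builds an array with one in-sync device, one empty slot, and one out-of-sync device should see a count of 2; a version that tested an unrelated pointer instead of `disk->rdev` would generally not produce that result.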

> And btw, this is another 2.6.18.1 candidate (if it's not too late already).

Yes, it was too late for 2.6.18.1.  I'll submit it for 2.6.18.2.

Thanks,
NeilBrown