Re: [PATCH 009 of 11] md: Support stripe/offset mode in raid10

2006-05-02 Thread Al Boldi
Neil Brown wrote:
> On Tuesday May 2, [EMAIL PROTECTED] wrote:
> > NeilBrown wrote:
> > > The "industry standard" DDF format allows for a stripe/offset layout
> > > where data is duplicated on different stripes. e.g.
> > >
> > >   A  B  C  D
> > >   D  A  B  C
> > >   E  F  G  H
> > >   H  E  F  G
> > >
> > > (columns are drives, rows are stripes, LETTERS are chunks of data).
> >
> > Presumably, this is the case for --layout=f2 ?
>
> Almost.  mdadm doesn't support this layout yet.
> 'f2' is a similar layout, but the offset stripes are a lot further
> down the drives.
> It will possibly be called 'o2' or 'offset2'.
>
> > If so, would --layout=f4 result in a 4-mirror/striped array?
>
> o4 on a 4 drive array would be
>
>A  B  C  D
>D  A  B  C
>C  D  A  B
>B  C  D  A
>E  F  G  H
>

Yes, so would this give us 4 physically duplicate mirrors?
If not, would it be possible to add a far-offset mode to yield such a layout?

> > Also, would it be possible to have a staged write-back mechanism across
> > multiple stripes?
>
> What exactly would that mean?

Write the first stripe immediately, then write the subsequent duplicate 
stripes when the array is idle, with a maximum delay bound for each 
deferred stripe.

> And what would be the advantage?

Faster burst writes, probably.

Thanks!

--
Al



Re: [PATCH 009 of 11] md: Support stripe/offset mode in raid10

2006-05-02 Thread Neil Brown
On Tuesday May 2, [EMAIL PROTECTED] wrote:
> NeilBrown wrote:
> > The "industry standard" DDF format allows for a stripe/offset layout
> > where data is duplicated on different stripes. e.g.
> >
> >   A  B  C  D
> >   D  A  B  C
> >   E  F  G  H
> >   H  E  F  G
> >
> > (columns are drives, rows are stripes, LETTERS are chunks of data).
> 
> Presumably, this is the case for --layout=f2 ?

Almost.  mdadm doesn't support this layout yet.  
'f2' is a similar layout, but the offset stripes are a lot further
down the drives.
It will possibly be called 'o2' or 'offset2'.
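
Roughly (just a sketch of the idea, not exact block numbers), 'f2'
keeps the second copy in the far half of each drive instead of on the
very next stripe, again shifted by one drive:

   A  B  C  D   <- first half of each drive
   E  F  G  H
   ...
   D  A  B  C   <- second half: the same data, shifted one drive over
   H  E  F  G
   ...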

> If so, would --layout=f4 result in a 4-mirror/striped array?

o4 on a 4 drive array would be 

   A  B  C  D
   D  A  B  C
   C  D  A  B
   B  C  D  A
   E  F  G  H
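
(Illustration only, not md code: copy k of a chunk just lands k drives
further to the right, wrapping around.  A trivial userspace sketch that
reproduces the table above:)

/* Print the first group of stripes for an n-copy "offset" layout on n
 * drives.  Copy k of chunk c sits on drive (c + k) mod n. */
#include <stdio.h>

int main(void)
{
	const int n = 4;	/* drives == copies: the 'o4' case above */

	for (int copy = 0; copy < n; copy++) {		/* one row per copy */
		for (int drive = 0; drive < n; drive++) {
			int chunk = (drive - copy + n) % n;	/* chunk this drive holds */
			printf("%c  ", 'A' + chunk);
		}
		printf("\n");
	}
	return 0;
}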
   

> 
> Also, would it be possible to have a staged write-back mechanism across 
> multiple stripes?

What exactly would that mean?  And what would be the advantage?

NeilBrown


Re: RAID1: can't remove (or set-faulty) a disk during resync with mdadm

2006-05-02 Thread Gil
David Mansfield wrote:
> When I run mdadm --manage -f /dev/md1 /dev/hdb2 it only causes the
> resync to start again from the beginning, it doesn't actually mark it bad.

For grins, does mdadm --manage /dev/md1 -f /dev/hdb2 behave
differently?  Or just mdadm /dev/md1 -f /dev/hdb2?

I ran essentially the latter on a CentOS 4.3 box no more than a week
ago and it worked fine.

--Gil


RAID1: can't remove (or set-faulty) a disk during resync with mdadm

2006-05-02 Thread David Mansfield

Hi,

I'm running CentOS 4.3 with the latest vendor kernel (2.6.9-34.EL), so 
perhaps this is a 'vendor uses an old/modified kernel' problem, but 
anyway, here goes:

I have a degraded mirror.  The rebuild is proceeding with /dev/hda1 
'good' and /dev/hdb1 'syncing'.

I'd like to pull /dev/hdb1 out of the raid and go back to 'degraded' 
mode with no resync.

When I run mdadm --manage -f /dev/md1 /dev/hdb2, it only causes the 
resync to start again from the beginning; it doesn't actually mark the 
disk as faulty.

The same thing happens if there's a write error to /dev/hdb1 during the 
resync: instead of failing the disk, it simply restarts the resync.

I imagine the two are related - maybe 'set faulty' simply simulates an 
I/O error on the member, but during a resync the behavior is 'retry'.

Is there anything that can be done about this (other than politely 
asking the vendor for a fix ;-)?


David


Re: [PATCH 009 of 11] md: Support stripe/offset mode in raid10

2006-05-02 Thread Al Boldi
NeilBrown wrote:
> The "industry standard" DDF format allows for a stripe/offset layout
> where data is duplicated on different stripes. e.g.
>
>   A  B  C  D  
>   D  A  B  C  
>   E  F  G  H  
>   H  E  F  G  
>
> (columns are drives, rows are stripes, LETTERS are chunks of data).

Presumably, this is the case for --layout=f2 ?
If so, would --layout=f4 result in a 4-mirror/striped array?

Also, would it be possible to have a staged write-back mechanism across 
multiple stripes?

Thanks!

--
Al



[patch] stripe_to_pdidx() cleanup

2006-05-02 Thread Coywolf Qi Hunt
Hello,

Cleanup: Remove unnecessary variable x in stripe_to_pdidx().

Signed-off-by: Coywolf Qi Hunt <[EMAIL PROTECTED]>
---

diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
index 3184360..7df6840 100644
--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -1033,13 +1033,12 @@ static void end_reshape(raid5_conf_t *co
 
 static int stripe_to_pdidx(sector_t stripe, raid5_conf_t *conf, int disks)
 {
-	int sectors_per_chunk = conf->chunk_size >> 9;
-	sector_t x = stripe;
 	int pd_idx, dd_idx;
-	int chunk_offset = sector_div(x, sectors_per_chunk);
-	stripe = x;
-	raid5_compute_sector(stripe*(disks-1)*sectors_per_chunk
-			     + chunk_offset, disks, disks-1, &dd_idx, &pd_idx, conf);
+	int sectors_per_chunk = conf->chunk_size >> 9;
+	int chunk_offset = sector_div(stripe, sectors_per_chunk);
+
+	raid5_compute_sector(stripe*(disks-1)*sectors_per_chunk + chunk_offset,
+			     disks, disks-1, &dd_idx, &pd_idx, conf);
 	return pd_idx;
 }
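
For reference: sector_div() divides its first argument in place and
returns the remainder, and 'stripe' is passed to stripe_to_pdidx() by
value, so dividing it directly is safe - the temporary 'x' and the
'stripe = x' write-back added nothing.  A rough userspace analogue of
the semantics (the kernel version is a macro operating on the variable
itself, not a function taking a pointer):

#include <stdio.h>
#include <stdint.h>

typedef uint64_t sector_t;	/* stand-in for the kernel type */

/* Sketch of sector_div() behaviour: divide *s by base in place and
 * return the remainder. */
static unsigned int sector_div_like(sector_t *s, unsigned int base)
{
	unsigned int rem = (unsigned int)(*s % base);

	*s /= base;
	return rem;
}

int main(void)
{
	sector_t stripe = 1000;	/* arbitrary example value */
	unsigned int chunk_offset = sector_div_like(&stripe, 128);

	printf("stripe=%llu chunk_offset=%u\n",
	       (unsigned long long)stripe, chunk_offset);
	return 0;
}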
 

-- 
Coywolf Qi Hunt


Re: md: Change ENOTSUPP to EOPNOTSUPP

2006-05-02 Thread Ric Wheeler



Paul Clements wrote:
> You'll see something like this in your system log if barriers are not
> supported:
>
> Apr  3 16:44:01 adam kernel: JBD: barrier-based sync failed on md0 -
> disabling barriers
>
> Otherwise, assume that they are.  But like Neil said, it shouldn't
> matter to a user whether they are supported or not.  Filesystems will
> work correctly either way.
>
> --
> Paul

File systems will work correctly, but if you are running without 
barriers and with your write cache enabled, you are running the risk of 
data loss or file system corruption on any power loss.

It is an issue of concern since drive companies ship the write cache 
enabled by default.  When we detect a drive that can't support barriers 
(queuing enabled, or no support for the low-level barrier mechanism), 
we disable future barrier requests and leave the write cache enabled.

I guess you could argue that this is what most home users want (i.e., 
best performance at the cost of some possible data loss on a power 
outage, since most people rarely lose power these days), but it is not 
good enough for critical data storage.

I would suggest that if you see this message on ext3 (or the equivalent 
reiserfs message for reiser users), you should run with your write 
cache disabled by default, or disable queuing (which is often the 
reason the barrier ops get disabled).


ric
