Iustin Pop wrote:
On Sun, Aug 12, 2007 at 07:03:44PM +0200, Jan Engelhardt wrote:
On Aug 12 2007 09:39, [EMAIL PROTECTED] wrote:
now, I am not an expert on either option, but there are a couple of things that I
would question about the DRBD+MD option
1. when the remote machine is down, how does
Mike Snitzer wrote:
Here are the steps to reproduce reliably on SLES10 SP1:
1) establish a raid1 mirror (md0) using one local member (sdc1) and
one remote member (nbd0)
2) power off the remote machine, thereby severing nbd0's connection
3) perform IO to the filesystem that is on the md0 device
Shaya Potter wrote:
[please cc: me on responses]
if one reads Documentation/filesystems/ntfs.txt in the Linux kernel, it talks
about accessing Windows RAID volumes in Linux.
If one is not using RAID-5, one can use the device mapper, but if one is
using RAID-5, one has to use the md driver.
However, it
Neil,
The write-behind value does not get converted to/from little endian,
which causes write-behind not to work on big-endian machines (RUN_ARRAY
fails with an invalid argument error):
# mdadm -B /dev/md0 -l1 -n2 --write-behind=256 /dev/sdb10 --write-mostly
/dev/nbd0 --bitmap=/bitmaps/test5
Paul Clements wrote:
--- mdadm-2.5.1/bitmap.c.orig Fri Oct 6 13:40:35 2006
+++ mdadm-2.5.1/bitmap.c Fri Oct 6 13:40:53 2006
@@ -33,6 +33,7 @@ inline void sb_le_to_cpu(bitmap_super_t
sb->chunksize = __le32_to_cpu(sb->chunksize);
sb->daemon_sleep = __le32_to_cpu(sb
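The hunk above converts the little-endian on-disk bitmap superblock fields to host byte order. A minimal userspace sketch of the same pattern follows; the struct is a cut-down stand-in (not mdadm's real bitmap_super_t), le32toh() from glibc's <endian.h> is assumed instead of mdadm's own helpers, and the write_behind line illustrates the conversion the bug report above says was missing:

#include <stdint.h>
#include <endian.h>   /* le32toh(); assumed glibc, not mdadm's byteswap helpers */

/* Hypothetical, cut-down superblock for illustration only. */
typedef struct {
    uint32_t chunksize;
    uint32_t daemon_sleep;
    uint32_t write_behind;   /* the field the report says was left unconverted */
} toy_bitmap_super_t;

/* Convert every little-endian on-disk field to host order so that
 * big-endian machines (e.g. ppc64) see the same values as x86. */
static inline void toy_sb_le_to_cpu(toy_bitmap_super_t *sb)
{
    sb->chunksize    = le32toh(sb->chunksize);
    sb->daemon_sleep = le32toh(sb->daemon_sleep);
    sb->write_behind = le32toh(sb->write_behind);
}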
Michael Tokarev wrote:
Neil Brown wrote:
ffs is closer, but takes an 'int' and we have an 'unsigned long'.
So use ffz(~X) to convert a chunksize into a chunkshift.
So we don't use ffs(int) for an unsigned value because of int vs
unsigned int, but we use ffz() with negated UNSIGNED. Looks
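For readers following along, ffz(~X) simply finds the index of the lowest set bit of an unsigned long. A userspace sketch of the chunksize-to-chunkshift conversion, using a GCC builtin in place of the kernel's ffz():

#include <stdio.h>

/* chunksize is a power of two, so the shift is the bit index of its single
 * set bit. The kernel's ffz(~x) returns the first zero bit of ~x, i.e. the
 * first set bit of x, and operates on unsigned long; a GCC builtin stands
 * in for it here. */
static unsigned long chunk_to_shift(unsigned long chunksize)
{
    return (unsigned long)__builtin_ctzl(chunksize);
}

int main(void)
{
    printf("chunkshift for 1024KB chunks: %lu\n",
           chunk_to_shift(1024UL * 1024UL));   /* prints 20 */
    return 0;
}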
Neil,
We just discovered this problem on a 64-bit IBM POWER (ppc64) system.
The symptom was this BUG():
Sep 20 20:55:51 caspian kernel: kernel BUG in sync_request at
drivers/md/raid1.c:1743!
Sep 20 20:55:51 caspian kernel: Oops: Exception in kernel mode, sig: 5 [#1]
Sep 20 20:55:51 caspian
Tomasz Chmielewski wrote:
[42949428.59] md: md11: raid array is not clean -- starting
background reconstruction
[42949428.62] raid10: raid set md11 active with 4 out of 4 devices
[42949428.63] md: syncing RAID array md11
[42949428.63] md: minimum _guaranteed_ reconstruction
Description:
---
When a filesystem sends a barrier write down to raid1, raid1 tries to
pass the write down to its component devices. However, if one or more of
the devices return EOPNOTSUPP, it means that they do not support
barriers, and raid1 must strip the barrier out of the write
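A toy model of the retry-without-barrier behaviour described above (this is not the raid1 code; the device struct and submit_write() helper are invented for illustration):

#include <errno.h>

/* Toy model only: a lower device that may reject barrier writes. */
#define WRITE_FLAG_BARRIER 0x1

struct lower_dev {
    int supports_barriers;   /* 0 if the device returns EOPNOTSUPP on barriers */
};

/* Hypothetical stand-in for submitting a write to one mirror. */
static int submit_write(struct lower_dev *dev, unsigned int flags)
{
    if ((flags & WRITE_FLAG_BARRIER) && !dev->supports_barriers)
        return -EOPNOTSUPP;
    return 0;   /* pretend the write completed */
}

/* On EOPNOTSUPP, strip the barrier, remember that this device does not
 * support barriers, and resubmit the same write without it. */
static int write_with_barrier_fallback(struct lower_dev *dev)
{
    int err = submit_write(dev, WRITE_FLAG_BARRIER);
    if (err == -EOPNOTSUPP) {
        dev->supports_barriers = 0;
        err = submit_write(dev, 0);
    }
    return err;
}

int main(void)
{
    struct lower_dev d = { .supports_barriers = 0 };
    return write_with_barrier_fallback(&d);   /* succeeds via the fallback */
}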
Neil Brown wrote:
sector, bdev, size are all remembered in r1_bio.
That leaves bi_idx and an array of len/offset pairs that we need
to preserve.
So I guess the first step is to change alloc_behind_pages to
return a new 'struct bio_vec' array rather than just a list of pages,
and we should keep
Paul Clements wrote:
I think bio_clone gives us that already. I may have missed something but
I think we have everything we need:
When a bio comes into raid1's make_request we bio_clone for each drive
and attach those to r1_bio->bios. We also have behind_pages, which
contains the pages. I
Neil Brown wrote:
Is 'bitmap' the best name for the sysfs file?
It seems a bit generic to me.
write-bits-here-to-dirty-them-in-the-bitmap
is probably (no, definitely) too verbose.
dirty-in-bitmap
maybe?
bitmap-set-bits
Any better suggestions?
I like bitmap-set-bits or bitmap-dirty
Mike Snitzer wrote:
On 7/26/06, Paul Clements [EMAIL PROTECTED] wrote:
Mike Snitzer wrote:
Also, what is the interface one should use to collect dirty bits from
the primary's bitmap?
Whatever you'd like. scp the bitmap file over or collect the ranges into
a file and scp that over
Mike Snitzer wrote:
On 7/26/06, Paul Clements [EMAIL PROTECTED] wrote:
Right. At the time of the failover, there were (probably) blocks that
were out of sync between the primary and secondary.
OK, so now that I understand the need to merge the bitmaps... the
various scenarios that create
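The merge step amounts to taking the union of the two bitmaps: a chunk must be resynced if either node dirtied it. A small sketch of that idea (buffer names and sizes are hypothetical, not mdadm code):

#include <stddef.h>
#include <stdint.h>

/* Combined bitmap is the bitwise OR of the two nodes' bitmaps, so the
 * subsequent resync covers every chunk that either side wrote. */
static void merge_bitmaps(uint8_t *dst, const uint8_t *other, size_t len)
{
    for (size_t i = 0; i < len; i++)
        dst[i] |= other[i];
}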
Mike Snitzer wrote:
I tracked down the thread you referenced and these posts (by you)
seem to summarize things well:
http://marc.theaimsgroup.com/?l=linux-raid&m=16563016418&w=2
http://marc.theaimsgroup.com/?l=linux-raid&m=17515400864&w=2
But for clarity's sake, could you elaborate on the
This patch (tested against 2.6.18-rc1-mm1) adds a new sysfs interface
that allows the bitmap of an array to be dirtied. The interface is
write-only, and is used as follows:
echo 1000 > /sys/block/md2/md/bitmap
(dirty the bit for chunk 1000 [offset 0] in the in-memory and on-disk
bitmaps of
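The same operation done programmatically rather than with echo; the sysfs path comes from the patch description above, everything else is plain stdio:

#include <stdio.h>

/* Dirty the bit for chunk 1000 via the write-only sysfs file the patch adds. */
int main(void)
{
    FILE *f = fopen("/sys/block/md2/md/bitmap", "w");
    if (!f) {
        perror("open /sys/block/md2/md/bitmap");
        return 1;
    }
    fprintf(f, "1000\n");
    return fclose(f) == 0 ? 0 : 1;
}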
Janos Farkas wrote:
# for i in hdb3 hdd3 hda3 ; mdadm -X /dev/$i|grep map
Bitmap : 285923 bits (chunks), 0 dirty (0.0%)
Bitmap : 285923 bits (chunks), 0 dirty (0.0%)
Bitmap : 285923 bits (chunks), 65536 dirty (22.9%)
This indicates that the _on-disk_ bits are
David Greaves wrote:
How do I interpret:
bitmap: 0/117 pages [0KB], 1024KB chunk
in the mdstat output?
what does it mean when it's, e.g., 23/117?
This refers to the in-memory bitmap (basically a cache of what's in the
on-disk bitmap -- it allows bitmap operations to be more efficient).
Bill Davidsen wrote:
Paul Clements wrote:
Neil Brown wrote:
I am pleased to announce the availability of
mdadm version 2.5.1
and here's another patch for a compile error on ppc...
Since ppc is big endian, the compiler is complaining because it can't
determine whether the isuper post
Paul Clements wrote:
--- super1.c 2006-06-19 05:17:36.0 -0400
+++ /export/public/clemep/tmp/super1-ppc-compile-error.c 2006-06-19 00:40:26.0 -0400
@@ -124,8 +124,11 @@ static unsigned int calc_sb_1_csum(struc
disk_csum = sb->sb_csum;
sb->sb_csum = 0
Andrew Morton wrote:
The loss of pagecache coherency seems sad. I assume there's never a
requirement for userspace to read this file.
Actually, there is. mdadm reads the bitmap file, so that would be
broken. Also, it's just useful for a user to be able to read the bitmap
(od -x, or
Mike Garey wrote:
I seem to be getting closer.. If I try booting from a kernel without
raid1 and md support, but using an initrd with raid1/md modules, then
I get the "ALERT! /dev/md0 does not exist. Dropping to a shell!"
message. I can't understand why there would be any difference between
Neil Brown wrote:
On Friday March 31, [EMAIL PROTECTED] wrote:
I've been looking at the mdadm monitor, and thought it might be useful
if it allowed extra context information (in the form of command line
arguments) to be sent to the event program, so instead of just:
# mdadm -F /dev/md0 -p
Signed-Off-By: Paul Clements [EMAIL PROTECTED]
Monitor.c | 23 ++-
1 files changed, 22 insertions(+), 1 deletion(-)
--- mdadm-2.4/Monitor.c 2006-03-28 17:59:42.0 -0500
+++ mdadm-2.4-event-args/Monitor.c 2006-03-31 13:19:40.0 -0500
@@ -464,6 +464,27
The following patch makes it possible to tag a device as write-mostly on
--add and --re-add with a non-persistent superblock array. Previously,
this was not working.
Thanks,
Paul
Signed-Off-By: Paul Clements [EMAIL PROTECTED]
Manage.c |2 ++
1 files changed, 2 insertions(+)
--- mdadm
, going back to 2003.
--
Paul
-Original Message-
From: Paul Clements [mailto:[EMAIL PROTECTED]
Sent: Wednesday, March 22, 2006 11:37 PM
To: Yogesh Pahilwan
Cc: 'Neil Brown'; linux-raid@vger.kernel.org
Subject: Re: Linux MD RAID5/6 bitmap patches
Yogesh Pahilwan wrote:
Thanks
Jure Pečar wrote:
I too am running a jbod with md raid between two machines. So far md never
caused any kind of problems, although I did have situations where both
machines were syncing mirrors at once.
If there's a little tool to reserve a disk via scsi, I'd like to know about
it too. Even a
Mirko Benz wrote:
Does a high-speed NVRAM device make sense for Linux SW RAID? E.g. a PCI
card that exports battery-backed memory.
Sure. There are a couple ways I can think of using such a thing:
1) put an md intent bitmap on the NVRAM device for faster resyncs
2) use the NVRAM as a write
Chris Osicki wrote:
The problem now is how to prevent somebody on the other host from
accidentally assembling the array, because the result of doing so would
be anything from strange to catastrophic ;-)
To rephrase my question, is there any way to make it visible to the
other host that the
Hi Neil,
Glad to see this patch is making its way to mainline. I have a couple of
questions on the patch, though...
NeilBrown wrote:
+ if (uptodate || conf->working_disks <= 1) {
Is it valid to mask a read error just because we have only 1 working disk?
+
Bill Davidsen wrote:
One of the advantages of mirroring is that under heavy read load, when
one drive is busy, there is another copy of the data on the other
drive(s). But doing 1MB reads on the mirrored device did not show that
the kernel took advantage of this in any way. In fact, it
Carlos Carvalho wrote:
I think the demand for any solution to the unclean array is indeed low
because of the small probability of a double failure. Those that want
more reliability can use a spare drive that resyncs automatically or
raid6 (or both).
A spare disk would help, but note that
Jeff Breidenbach wrote:
your suggestion about kernel 2.6.13 and intent logging and
having mdadm pull a disk sounds like a winner. I'm going to try it
if the software looks mature enough. Should I be scared?
There have been a couple bug fixes in the bitmap stuff since 2.6.13 was
released,
Al Boldi wrote:
NeilBrown wrote:
If a device is flagged 'WriteMostly' and the array has a bitmap,
and the bitmap superblock indicates that write_behind is allowed,
then write_behind is enabled for WriteMostly devices.
Nice, but why is it dependent on WriteMostly?
WriteMostly is just a
Jeff Breidenbach wrote:
Does chunk size matter *at all* for RAID-1?
mdadm --create /dev/md0 --level=1 --chunk=8 /dev/sdc1 /dev/sdd1
mdadm --create /dev/md0 --level=1 --chunk=128 /dev/sdc1 /dev/sdd1
In my mental model of how RAID works, it can't possibly matter
what my chunk size is whether
Gregory Seidman wrote:
It turns out that if one uses the kernel (2.4.x-2.6.x) RAID support (RAID5,
anyway, since that's all I've tested), the RAID'd disks cannot be moved to
another system with a different endianness. I don't know how hard that
Unfortunately, I haven't gotten into kernel
Gregory Seidman wrote:
1) Does this mean that the fix will be in 2.6.13?
Yes, version 1 superblock support is in 2.6.13. You need the latest
mdadm as well.
2) Does the version 1 you refer to have to do with pre-2.6 RAID support?
No. 2.4 kernels do not understand version 1 superblocks.
Farkas Levente wrote:
anybody has any useful tip about it?
Unable to handle kernel NULL pointer dereference at virtual address
printing eip:
*pde = 0f94a067
Oops: [#1]
Modules linked in: cifs nls_utf8 ncpfs nfsd exportfs lockd sunrpc parport_pc lp
parport netconsole
Dave Jiang wrote:
I'm attempting to implement multihost support of the MD for environments
such as carrier grade Linux. Multihost support allows the RAID array to
be claimed by a particular host via a unique ID (unique SCSI host ID,
FibreChannel WWN, or geographical address of a chassis blade).
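A loose sketch of the idea as described, not an actual md superblock format: record the claiming host's unique ID in the metadata and refuse assembly on any other host unless the claim is released or overridden.

#include <stdint.h>
#include <string.h>

/* Hypothetical metadata extension for illustration only. */
struct md_multihost_claim {
    uint8_t owner_id[16];   /* SCSI host ID, FC WWN, or chassis address */
    uint8_t claimed;        /* non-zero while a host holds the array */
};

/* Assembly-time check: only the claiming host (or a forced override)
 * may bring the array up. */
static int may_assemble(const struct md_multihost_claim *c,
                        const uint8_t my_id[16], int force)
{
    if (!c->claimed || force)
        return 1;
    return memcmp(c->owner_id, my_id, 16) == 0;
}

int main(void)
{
    struct md_multihost_claim c = { .owner_id = {1}, .claimed = 1 };
    uint8_t me[16] = {1};
    return may_assemble(&c, me, 0) ? 0 : 1;   /* same ID, so assembly allowed */
}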
Peter T. Breuer wrote:
Paul Clements [EMAIL PROTECTED] wrote:
Peter T. Breuer wrote:
I don't see that this solves anything. If you had both sides going at
once, receiving different writes, then you are sc**ed, and no
resolution of bitmaps will help you, since both sides have received
different
Luca Berra wrote:
On Mon, Mar 21, 2005 at 11:07:06AM -0500, Paul Clements wrote:
All I'm saying is that in a split-brain scenario, typical cluster
frameworks will make two (or more) systems active at the same time. This
I sincerely hope not.
Perhaps my choice of wording was not the best? I
Peter T. Breuer wrote:
Paul Clements [EMAIL PROTECTED] wrote:
At any rate, this is all irrelevant given the second part of that email
reply that I gave. You still have to do the bitmap combining, regardless
of whether two systems were active at the same time or not.
But why don't we already
Peter T. Breuer wrote:
Lars Marowsky-Bree [EMAIL PROTECTED] wrote:
On 2005-03-18T13:52:54, Peter T. Breuer [EMAIL PROTECTED] wrote:
The problem is that with multiple nodes, both sides have their own bitmap. When a
split scenario occurs,
Here I think you mean that both nodes go their independent ways, due
Peter T. Breuer wrote:
Paul Clements [EMAIL PROTECTED] wrote:
I don't see that this solves anything. If you had both sides going at
once, receiving different writes, then you are sc**ed, and no
resolution of bitmaps will help you, since both sides have received
different (legitimate) data
This patch enables the async write capability in md. It requires the md
bitmap patches and the 117-WriteMostly-update patch.
Signed-Off-By: Paul Clements [EMAIL PROTECTED]
drivers/md/bitmap.c | 26 ++
drivers/md/md.c | 13 +
include
This patch implements async writes in raid1.
Signed-Off-By: Paul Clements [EMAIL PROTECTED]
drivers/md/raid1.c | 107 ++---
include/linux/raid/raid1.h |2
2 files changed, 102 insertions(+), 7 deletions(-)
diff -purN --exclude core --exclude
Neil Brown wrote:
On Wednesday March 9, [EMAIL PROTECTED] wrote:
avoid setting sb->events_lo = 1 when creating a 0.90 superblock -- it
doesn't seem to be necessary and it was causing the event counters to
start at 4 billion+ (events_lo is actually the high part of the events
counter, on
, as in the kernel
if'ed out super1 definition which is now in the kernel headers
included sys/time.h to avoid compile error
Thanks,
Paul
Signed-Off-By: Paul Clements [EMAIL PROTECTED]
bitmap.c | 58 +-
raid1.c |1 +
2 files changed, 34
Here's the mdadm patch...
Paul Clements wrote:
Neil,
here are a couple of patches -- this one for the kernel, the next for
mdadm. They fix a few issues that I found while testing the new bitmap
intent logging code.
Briefly, the issues were:
kernel:
added call to bitmap_daemon_work() from raid1d
Peter T. Breuer wrote:
Neil - can you describe for me (us all?) what is meant by
intentlogging here.
Since I wrote a lot of the code, I guess I'll try...
Well, I can guess - I suppose the driver marks the bitmap before a write
(or group of writes) and unmarks it when they have completed
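That guess matches the general idea of intent logging. A toy model of mark-before-write / clear-after-write (chunk size and structure invented for illustration; this is not the md bitmap code):

#include <stdint.h>

#define CHUNK_SHIFT 20                 /* assume 1MB chunks for the example */
#define MAX_CHUNKS  (1u << 13)         /* enough for an 8GB toy device */

struct intent_bitmap {
    uint8_t bits[MAX_CHUNKS / 8];
};

static unsigned int sector_to_chunk(uint64_t sector)
{
    return (unsigned int)((sector << 9) >> CHUNK_SHIFT);   /* 512-byte sectors */
}

/* Mark the chunk dirty before the write is issued... */
static void intent_mark(struct intent_bitmap *bm, uint64_t sector)
{
    unsigned int c = sector_to_chunk(sector);   /* caller stays within the toy device */
    bm->bits[c / 8] |= (uint8_t)(1u << (c % 8));
}

/* ...and clear it once every mirror has acknowledged the write, so a
 * crash leaves only the marked chunks to be resynced. */
static void intent_clear(struct intent_bitmap *bm, uint64_t sector)
{
    unsigned int c = sector_to_chunk(sector);
    bm->bits[c / 8] &= (uint8_t)~(1u << (c % 8));
}

int main(void)
{
    static struct intent_bitmap bm;
    intent_mark(&bm, 2048);        /* mark before issuing the write */
    /* ... write goes to all mirrors here ... */
    intent_clear(&bm, 2048);       /* clear once all mirrors acknowledge */
    return 0;
}

In the real implementation the clearing is deferred and done lazily (the bitmap_daemon_work() call mentioned in the patch notes above), so that repeated writes to the same area do not thrash the on-disk bitmap.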
Colin McDonald wrote:
Without taking up too much of y'all's time, what would be the best
solution for moving the RAID array over to a new box?
2. Try to boot off of the disks after they have been transferred into
the new machine? I know this will cause all kinds of problems with
kernel/devices, etc
Mario Holbe wrote:
I'm running Linux 2.4.27 i686 single-processor from debian's
kernel-source-2.4.27 and mdadm 1.9.0 in monitor mode:
While stopping a raid1 (raidstop /dev/md8) it seems there
Unable to handle kernel NULL pointer dereference at virtual address 03d8
c024be53
*pde =
Hi Robert,
Robert Heinzmann wrote:
can someone verify if the following statements are true?
- It's not possible to simply convert an existing partition with a
filesystem on it to a raid1 mirror set.
If you create a raid1 without a superblock it is possible (although not
a very common