Re: how to synchronize two devices (RAID-1, but not really?)

2007-05-15 Thread Peter Rabbitson
Tomasz Chmielewski wrote:
 I have a RAID-10 setup of four 400 GB HDDs. As the data grows by several
 GBs a day, I want to migrate it somehow to RAID-5 on separate disks in a
 separate machine.
 
 Which would be easy, if I didn't have to do it online, without stopping
 any services.
 
 

Your /dev/md10 - what is directly on top of it? LVM? XFS? EXT3?
-
To unsubscribe from this list: send the line unsubscribe linux-raid in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: how to synchronize two devices (RAID-1, but not really?)

2007-05-15 Thread Tomasz Chmielewski

Peter Rabbitson wrote:

Tomasz Chmielewski wrote:

I have a RAID-10 setup of four 400 GB HDDs. As the data grows by several
GBs a day, I want to migrate it somehow to RAID-5 on separate disks in a
separate machine.

Which would be easy, if I didn't have to do it online, without stopping
any services.




Your /dev/md10 - what is directly on top of it? LVM? XFS? EXT3?


Good point. I don't want to copy the whole RAID-10.
I want to copy only one LVM-2 volume (which is like 90% of that RAID-10, 
anyway).



So I want to synchronize /dev/LVM2/my-volume (ext3) with /dev/sdr (now 
empty; bigger than /dev/LVM2/my-volume).



(sda2, sdb2, sdc2, sdd2) - RAID-10 - LVM-2 - my volume - ext3


--
Tomasz Chmielewski
http://wpkg.org


Re: how to synchronize two devices (RAID-1, but not really?)

2007-05-15 Thread Gordon Henderson

On Tue, 15 May 2007, Tomasz Chmielewski wrote:

I have a RAID-10 setup of four 400 GB HDDs. As the data grows by several GBs 
a day, I want to migrate it somehow to RAID-5 on separate disks in a separate 
machine.


Which would be easy, if I didn't have to do it online, without stopping any 
services.



M1 - machine 1, RAID-10
M2 - machine 2, RAID-5


My first idea was to copy the data with rsync two or three times (because the 
files change, I would stop the services for the last run) - which turned out 
to be totally unrealistic - I started rsync process two days ago, and it 
still calculates files to be copied (over 100 million files, with hardlinks 
etc.).


I have recent experience of copying up to 1TB partitions to offsite 
backup servers via rsync (and a 10Mb line), and it can be done.


You do need a recent version of rsync and a lot of memory in both servers. 
I'm really surprised it takes days to do this on your server - although 
maybe the source server really is busy? I was doing it in about 2-3 
hours on the servers I was working with. (Several smaller ones backing 
up to one 6TB box.) The first copy is always the longest, but if after 2 
days it's not even started to copy the files, then it's going to be slow. 
Lots of hardlinks will really slow things down though (and cause memory to 
grow), as it has to find out each filename linked to each file. I've not 
looked too closely into it, but I suspect it's as efficient as a bubble 
sort due to what it has to do. Lots of memory will help if it can keep 
everything cached as well as letting rsync build up its data without 
swapping.



Ideas how to synchronize the contents of two devices (device1 - device2)?


Look into the speed of the processors too - rsync will use ssh by default 
- which will incur a CPU overhead to do the encryption. If encryption 
isn't an issue, then look into using rsh rather than ssh. (Although making 
a modern system let you log in as root with rsh without a password is 
sometimes challenging :)
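
A minimal sketch of such an rsync run, assuming the source tree is
/bigVolume and the destination host is remoteServer; the target path is a
placeholder. -a, -H, --numeric-ids and -e are standard rsync options, and -H
is the one that makes rsync track hard links, which is the expensive part
with this many files. The script only prints the command unless DRY_RUN=0 is
set.

```shell
#!/bin/sh
# Dry-run by default: print the command instead of running it.
DRY_RUN=${DRY_RUN:-1}
run() {
    if [ "$DRY_RUN" = 1 ]; then echo "+ $*"; else "$@"; fi
}

SRC=/bigVolume/             # source tree (trailing slash: copy its contents)
DEST=remoteServer:/target/  # hypothetical destination host and path

# -a             archive mode (recursion, permissions, times, owners, symlinks)
# -H             preserve hard links (memory grows with the number of links)
# --numeric-ids  keep uid/gid numbers instead of mapping them by name
# -e rsh         use rsh as the transport, avoiding ssh's encryption cost
rsync_cmd() {
    run rsync -aH --numeric-ids -e rsh "$SRC" "$DEST"
}
rsync_cmd
```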


Another solution might be to use tar, but this is good for snapshotting 
rather than applying incremental updates. Something like


   tar cf - /bigVolume | rsh remoteServer 'cd target && tar xBf -'

or something. (It's years since I've done it this way, so check!) You 
might want to set blocking factors which might improve network throughput. 
(And you never know, a modern tar might be able to do incremental changes.)


Gordon


Re: how to synchronize two devices (RAID-1, but not really?)

2007-05-15 Thread David Greaves
Tomasz Chmielewski wrote:
 Peter Rabbitson wrote:
 Tomasz Chmielewski wrote:
 I have a RAID-10 setup of four 400 GB HDDs. As the data grows by several
 GBs a day, I want to migrate it somehow to RAID-5 on separate disks in a
 separate machine.

 Which would be easy, if I didn't have to do it online, without stopping
 any services.



 Your /dev/md10 - what is directly on top of it? LVM? XFS? EXT3?
 
 Good point. I don't want to copy the whole RAID-10.
 I want to copy only one LVM-2 volume (which is like 90% of that RAID-10,
 anyway).
 
 
 So I want to synchronize /dev/LVM2/my-volume (ext3) with /dev/sdr (now
 empty; bigger than /dev/LVM2/my-volume).
 
 
 (sda2, sdb2, sdc2, sdd2) - RAID-10 - LVM-2 - my volume - ext3
 
 


I've not used iSCSI but I wonder about using nbd : network block device

Use nbd to export /dev/md5 from machine 2.
Import /dev/nbd0 on machine 1.
Add nbd0 to the VG on machine 1
pvmove the data from /dev/md10 to /dev/nbd0 (ie the md5 on machine2 via nbd)
remove /dev/md10 from the VG.
The VG should now exist only on /dev/nbd0 on machine 2
stop the services and lvm on machine 1
start the lvm and services on machine 2.

I'd suggest testing this first <grin>.
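
The steps above can be sketched as the following command sequence for
machine 1. The volume group name LVM2 is taken from /dev/LVM2/my-volume
earlier in the thread; the host name machine2 and port 2000 are placeholders,
and the nbd-server line in the comment uses the old port-based syntax (newer
nbd releases are configured through a config file instead). The script only
prints the commands unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Dry-run by default: print each step instead of executing it.
DRY_RUN=${DRY_RUN:-1}
run() {
    if [ "$DRY_RUN" = 1 ]; then echo "+ $*"; else "$@"; fi
}

VG=LVM2           # volume group name, as in /dev/LVM2/my-volume
OLD_PV=/dev/md10  # RAID-10 on machine 1
NBD=/dev/nbd0     # network block device as seen on machine 1
M2=machine2       # hypothetical host name of machine 2
PORT=2000         # hypothetical TCP port for nbd

# On machine 2, export the RAID-5 first (old-style syntax, an assumption):
#   nbd-server 2000 /dev/md5

migrate_plan() {
    run nbd-client "$M2" "$PORT" "$NBD"  # import the remote device
    run pvcreate "$NBD"                  # label it as an LVM physical volume
    run vgextend "$VG" "$NBD"            # add it to the volume group
    run pvmove "$OLD_PV" "$NBD"          # move all extents off the RAID-10
    run vgreduce "$VG" "$OLD_PV"         # drop the old PV from the VG
}
migrate_plan
```

pvmove works while the logical volumes are in use, which is what makes this
route attractive; the final switch of services to machine 2 still needs a
short stop.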

David


Re: how to synchronize two devices (RAID-1, but not really?)

2007-05-15 Thread Tomasz Chmielewski

Gordon Henderson wrote:

On Tue, 15 May 2007, Tomasz Chmielewski wrote:

I have a RAID-10 setup of four 400 GB HDDs. As the data grows by 
several GBs a day, I want to migrate it somehow to RAID-5 on separate 
disks in a separate machine.


Which would be easy, if I didn't have to do it online, without 
stopping any services.



M1 - machine 1, RAID-10
M2 - machine 2, RAID-5


My first idea was to copy the data with rsync two or three times 
(because the files change, I would stop the services for the last run) 
- which turned out to be totally unrealistic - I started rsync process 
two days ago, and it still calculates files to be copied (over 100 
million files, with hardlinks etc.).


I have recent experience of copying up to 1TB partitions to offsite 
backup servers via rsync (and a 10Mb line), and it can be done.


You do need a recent version of rsync and a lot of memory in both 
servers. I'm really surprised it takes days to do this on your server - 
although maybe the source server really is busy?


Yes, the server is really busy and reads/writes a lot - that's one. Two, 
it's being done over iSCSI in a pretty complicated scenario:



100 Mbit
M1 (RAID-10 storage) -- very busy virtual machine in Xen
   /
  / 1 Gbit
 /
 machine with rsync process -- M2 (RAID-5 storage)
   1 Gbit

So there is a slow point of just 100 Mbit to access the storage, which 
is normally fully filled anyway.


rsync, even 3.0, just doesn't fit here - it has to be done in sort of a 
mirror way.



--
Tomasz Chmielewski
http://wpkg.org


Re: how to synchronize two devices (RAID-1, but not really?)

2007-05-15 Thread Tomasz Chmielewski

David Greaves wrote:

(...)


So I want to synchronize /dev/LVM2/my-volume (ext3) with /dev/sdr (now
empty; bigger than /dev/LVM2/my-volume).


(sda2, sdb2, sdc2, sdd2) - RAID-10 - LVM-2 - my volume - ext3





I've not used iSCSI but I wonder about using nbd : network block device

Use nbd to export /dev/md5 from machine 2.
Import /dev/nbd0 on machine 1.
Add nbd0 to the VG on machine 1
pvmove the data from /dev/md10 to /dev/nbd0 (ie the md5 on machine2 via nbd)
remove /dev/md10 from the VG.



Hmm, maybe using LVM's mirror would help me here:

CONFIG_DM_MIRROR:

  Allow volume managers to mirror logical volumes, also
  needed for live data migration tools such as 'pvmove'.


--
Tomasz Chmielewski
http://wpkg.org


Re: how to synchronize two devices (RAID-1, but not really?)

2007-05-15 Thread Neil Brown
On Tuesday May 15, [EMAIL PROTECTED] wrote:
 
 Now, is there a way I can synchronize the contents of RAID-10, or 
 /dev/md10, with the contents of RAID-5, or /dev/sdr, when /dev/sdr is 
 bigger than /dev/md10, and /dev/md10 has to be synchronized to /dev/sdr, 
 not the other way around (I would expand the filesystem later with ext3online)?

Yes, you can do this.

You want a raid1 with no metadata, an external bitmap, and a
write-mostly, write-behind device.  Simple?

No metadata means no superblock is stored on the device.  You get to
use all of it for data, which is good if it is already all in use for
data.
The down-side of not having a superblock is that you have to track
which device is which yourself and after a restart, manually assemble
it.  But for a short-term situation, that isn't a big problem.

An external bitmap means that if the link goes down, it keeps track of
which blocks are in sync and which aren't, and when the link comes
back up you re-add the missing device and the rebuild continues where
it left off.

A write-mostly device means that it will never read over the slow link
(unless the local device dies).  It will only write over the slow
link.

A write-behind device means that writes to the array will be as fast
as writes to the local device.  There are limits to this.  Long-term,
the write throughput will be limited to the slowest device, but bursts
to the array will happen at the full speed of the local device.

How do you do this?  First make sure you have a recent kernel and
mdadm.

Then you build the array with

  mdadm --build /dev/md22 --level=1 --bitmap=/root/mybitmap \
     --write-behind --raid-disks=2 /dev/localdevice \
     --write-mostly /dev/remotedevice

then use /dev/md22 instead of /dev/localdevice.
This will not destroy the contents of /dev/localdevice.  They will
remain unchanged.  But it will start copying those contents to
/dev/remotedevice.

Obviously you need to unmount /dev/localdevice first, then set up the
raid1, then mount /dev/md22 in its place.

You will want to test this of course - I cannot guarantee that I have
that command exactly right.
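
As a sketch of the whole sequence described above (unmount, build the
raid1, mount the array in place of the local device): the device names and
bitmap path are the placeholders from this message, and the mount point
/srv/data is hypothetical. The bitmap file has to live on a filesystem that
is not stored on the array itself. The script only prints the commands
unless DRY_RUN=0 is set, and it carries the same caveat about testing.

```shell
#!/bin/sh
# Dry-run by default: print each step instead of executing it.
DRY_RUN=${DRY_RUN:-1}
run() {
    if [ "$DRY_RUN" = 1 ]; then echo "+ $*"; else "$@"; fi
}

LOCAL=/dev/localdevice    # device that currently holds the data
REMOTE=/dev/remotedevice  # slow, initially empty copy target
MD=/dev/md22              # the superblock-less raid1
BITMAP=/root/mybitmap     # external bitmap; must not be stored on $MD
MNT=/srv/data             # hypothetical mount point

start_mirror() {
    run umount "$MNT"
    # --write-mostly applies to the devices listed after it, so only the
    # remote device is flagged.
    run mdadm --build "$MD" --level=1 --bitmap="$BITMAP" --write-behind \
        --raid-disks=2 "$LOCAL" --write-mostly "$REMOTE"
    run mount "$MD" "$MNT"
    run cat /proc/mdstat      # shows the progress of the initial copy
}
start_mirror
```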

If the remote device fails (link breaks) it should get marked faulty
in the array.  Once you have it working again you can use mdadm to
remove (--remove) and re-add (--re-add) the device.  It should
complete the copy correctly.   You should test that this actually
works as expected. 
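
A sketch of that recovery step, using the same placeholder device names as
the build command above. --remove and --re-add are standard mdadm manage-mode
options; whether the resync really resumes from the bitmap is exactly what
should be tested. The script only prints the commands unless DRY_RUN=0 is
set.

```shell
#!/bin/sh
# Dry-run by default: print each step instead of executing it.
DRY_RUN=${DRY_RUN:-1}
run() {
    if [ "$DRY_RUN" = 1 ]; then echo "+ $*"; else "$@"; fi
}

MD=/dev/md22
REMOTE=/dev/remotedevice

readd_remote() {
    run mdadm "$MD" --remove "$REMOTE"   # drop the device marked faulty
    run mdadm "$MD" --re-add "$REMOTE"   # re-add; the bitmap limits the resync
    run cat /proc/mdstat                 # watch the recovery progress
}
readd_remote
```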

Feel free to ask questions if you choose to attempt this route.

Good luck.

NeilBrown


Re: how to synchronize two devices (RAID-1, but not really?)

2007-05-15 Thread Tomasz Chmielewski

Neil Brown wrote:

(...)


You will want to test this of course - I cannot guarantee that I have
that command exactly right.

If the remote device fails (link breaks) it should get marked faulty
in the array.  Once you have it working again you can use mdadm to
remove (--remove) and re-add (--re-add) the device.  It should
complete the copy correctly.   You should test that this actually
works as expected. 


Feel free to ask questions if you choose to attempt this route.


Interesting lecture, thanks a lot. I'm gonna take that path.

I have one more question, though.

I want to migrate:
- from 4x400GB HDD RAID-10
- to 4x400GB HDD RAID-5

Obviously, I need 8 disks for that, but I have only 6.

So my idea was to kick one disk from RAID-10, and use it for RAID-5 for 
the time of migration.


This means, I would have only 3 disks out of 4 in RAID-5 array.


Is it possible to create a degraded, 4 disk RAID-5 array with just 3 
drives? I would add the 4th drive once migration from RAID-10 is done.



(I'm aware of the risks - that my degraded RAID-10 will be vulnerable 
during the migration).



--
Tomasz Chmielewski
http://wpkg.org


Re: how to synchronize two devices (RAID-1, but not really?)

2007-05-15 Thread Tomasz Chmielewski

Neil Brown wrote:

(...)


An external bitmap means that if the link goes down, it keeps track of
which blocks are in sync and which aren't, and when the link comes
back up you re-add the missing device and the rebuild continues where
it left off.



  mdadm --build /dev/md22 --level=1 --bitmap=/root/mybitmap \
     --write-behind --raid-disks=2 /dev/localdevice \
     --write-mostly /dev/remotedevice


One more question - is there a way to estimate the size of the bitmap 
file? Does it depend on the size of the array?


What bitmap file size can I expect for a 600 GB array?


--
Tomasz Chmielewski
http://wpkg.org


Re: how to synchronize two devices (RAID-1, but not really?)

2007-05-15 Thread Neil Brown
On Tuesday May 15, [EMAIL PROTECTED] wrote:
 Neil Brown wrote:
 
 (...)
 
  An external bitmap means that if the link goes down, it keeps track of
  which blocks are in sync and which aren't, and when the link comes
  back up you re-add the missing device and the rebuild continues where
  it left off.
 
   mdadm --build /dev/md22 --level=1 --bitmap=/root/mybitmap \
      --write-behind --raid-disks=2 /dev/localdevice \
      --write-mostly /dev/remotedevice
 
 One more question - is there a way to estimate the size of the bitmap 
 file? Does it depend on the size of the array?
 
 What bitmap file size can I expect for a 600 GB array?

Due to internal implementation details, mdadm limits the size of the
bitmap to 2^20 bits.  So the file will be 100K +/- 50%.
That will lead to about 1Meg per bit.  If you have a failure, this
might mean you end up re-copying several megabytes more than you
really need to, but that should add up to less than one second.
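
The arithmetic behind those figures can be sketched as follows. It assumes
that the chunk (the amount of array covered by one bit) is the smallest power
of two that keeps the bitmap within 2^20 bits, that "600 GB" means 600 GiB,
and it ignores the small header at the start of the bitmap file. The
power-of-two rule is an assumption made for illustration, not something
stated in this message.

```shell
#!/bin/sh
# Estimate the write-intent bitmap size for a 600 GiB array.
MAX_BITS=1048576                           # 2^20 bits, the limit quoted above
ARRAY_BYTES=$((600 * 1024 * 1024 * 1024))  # 600 GiB in bytes

# Double the chunk size until the number of bits fits under the limit.
chunk=1
while [ $(( (ARRAY_BYTES + chunk - 1) / chunk )) -gt "$MAX_BITS" ]; do
    chunk=$((chunk * 2))
done

bits=$(( (ARRAY_BYTES + chunk - 1) / chunk ))  # one bit per chunk
bytes=$(( (bits + 7) / 8 ))                    # bitmap payload in bytes

echo "chunk=${chunk} bytes per bit, ${bits} bits, about $((bytes / 1024)) KiB"
```

Under these assumptions the result is 1 MiB per bit and a bitmap of about
75 KiB, which agrees with "about 1Meg per bit" and "100K +/- 50%".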

NeilBrown


Re: how to synchronize two devices (RAID-1, but not really?)

2007-05-15 Thread Neil Brown
On Tuesday May 15, [EMAIL PROTECTED] wrote:
 
 Interesting lecture, thanks a lot. I'm gonna take that path.

:-)

 
 I have one more question, though.
 
 I want to migrate:
 - from 4x400GB HDD RAID-10
 - to 4x400GB HDD RAID-5
 
 Obviously, I need 8 disks for that, but I have only 6.
 
 So my idea was to kick one disk from RAID-10, and use it for RAID-5 for 
 the time of migration.
 
 This means, I would have only 3 disks out of 4 in RAID-5 array.

Yes... that would work.  How much do you trust your drives, and how
much do you value your data?  Presumably

  perceived likelihood of failure * value of data < price of 2 drives.

 
 
 Is it possible to create a degraded, 4 disk RAID-5 array with just 3 
 drives? I would add the 4th drive once migration from RAID-10 is done.

Certainly.  When you create the raid5, use the word "missing" in place
of the drive which is ... missing.  This will create a degraded
array.  When the drive becomes available, simply add it in to the
array.
If your drives are not identical (same number of sectors), make sure
the last drive you add is not the smallest, else you won't be able to
add it.
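
A sketch of the two steps, with /dev/md5 as the RAID-5 device (the name used
earlier in the thread) and hypothetical partition names /dev/sdw1 to
/dev/sdz1. --create, --level, --raid-devices and --add are standard mdadm
options, and "missing" is given in the position of the absent drive. The
script only prints the commands unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Dry-run by default: print each step instead of executing it.
DRY_RUN=${DRY_RUN:-1}
run() {
    if [ "$DRY_RUN" = 1 ]; then echo "+ $*"; else "$@"; fi
}

MD=/dev/md5   # RAID-5 array on the new machine

# Step 1: create a 4-device RAID-5 with only three drives present.
create_degraded() {
    run mdadm --create "$MD" --level=5 --raid-devices=4 \
        /dev/sdw1 /dev/sdx1 /dev/sdy1 missing
}

# Step 2 (only after the migration): add the drive freed from the RAID-10.
add_fourth() {
    run mdadm "$MD" --add /dev/sdz1
}

create_degraded
add_fourth
```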

NeilBrown


Re: how to synchronize two devices (RAID-1, but not really?)

2007-05-15 Thread Tomasz Chmielewski

Neil Brown wrote:

On Tuesday May 15, [EMAIL PROTECTED] wrote:

Neil Brown wrote:

(...)


An external bitmap means that if the link goes down, it keeps track of
which blocks are in sync and which aren't, and when the link comes
back up you re-add the missing device and the rebuild continues where
it left off.

  mdadm --build /dev/md22 --level=1 --bitmap=/root/mybitmap \
     --write-behind --raid-disks=2 /dev/localdevice \
     --write-mostly /dev/remotedevice

One more question - is there a way to estimate the size of the bitmap 
file? Does it depend on the size of the array?


What bitmap file size can I expect for a 600 GB array?


Due to internal implementation details, mdadm limits the size of the
bitmap to 2^20 bits.  So the file will be 100K +/- 50%.
That will lead to about 1Meg per bit.  If you have a failure, this
might mean you end up re-copying several megabytes more than you
really need to, but that should add up to less than one second.


Good, I was wondering if ~200 MB left I have on a filesystem would be 
enough :)



--
Tomasz Chmielewski
http://wpkg.org


Re: how to synchronize two devices (RAID-1, but not really?)

2007-05-15 Thread Gregory Seidman
On Tue, May 15, 2007 at 11:27:16AM +0200, Tomasz Chmielewski wrote:
 Peter Rabbitson wrote:
 Tomasz Chmielewski wrote:
 I have a RAID-10 setup of four 400 GB HDDs. As the data grows by several
 GBs a day, I want to migrate it somehow to RAID-5 on separate disks in a
 separate machine.
 
 Which would be easy, if I didn't have to do it online, without stopping
 any services.
 
 
 
 Your /dev/md10 - what is directly on top of it? LVM? XFS? EXT3?
 
 Good point. I don't want to copy the whole RAID-10.
 I want to copy only one LVM-2 volume (which is like 90% of that RAID-10, 
 anyway).
 
 So I want to synchronize /dev/LVM2/my-volume (ext3) with /dev/sdr (now 
 empty; bigger than /dev/LVM2/my-volume).
[...]

This actually makes it quite a bit easier, assuming you have some
reasonable amount of unallocated space in your LVM volume group. For the
purpose of the commands below, I'm going to call that amount FREESPACE.
Do the following (as root) on the machine with the LVM2 VG:

lvcreate -L FREESPACE -p r -n copy-me -s LVM2/my-volume

This now gives you an unchanging, read-only device /dev/LVM2/copy-me which
you can then dd over netcat or the like to an equal-sized volume on your
destination RAID. Let that finish (I'm sure it will take some time).
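
One way the dd-over-netcat copy might look. The destination device
/dev/destvg/my-volume, the host name machine2 and port 9000 are all
placeholders, and netcat flags differ between variants: the traditional
netcat listens with "-l -p PORT", the OpenBSD one with "-l PORT", and some
need "-q 0" or "-N" on the sending side to close the connection at end of
input. The script only prints the two pipelines unless DRY_RUN=0 is set; the
receiver belongs on the destination machine and has to be started first.

```shell
#!/bin/sh
# Dry-run by default: print each pipeline instead of executing it.
DRY_RUN=${DRY_RUN:-1}
run() {
    if [ "$DRY_RUN" = 1 ]; then echo "+ $*"; else "$@"; fi
}

SNAP=/dev/LVM2/copy-me          # read-only snapshot created above
DEST_DEV=/dev/destvg/my-volume  # hypothetical equal-sized destination volume
DEST_HOST=machine2              # hypothetical destination host
PORT=9000                       # hypothetical TCP port

# Run on the destination machine: listen and write the stream to the device.
receiver() {
    run sh -c "nc -l -p $PORT | dd of=$DEST_DEV bs=1M"
}

# Run on the source machine: read the snapshot and send it over the network.
sender() {
    run sh -c "dd if=$SNAP bs=1M | nc $DEST_HOST $PORT"
}

receiver
sender
```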

Once you're done with that step, run lvdisplay LVM2/copy-me and take a look
at the "Allocated to snapshot" line. That is the percentage of FREESPACE
that is different between the current live volume and the snapshot. Do the
multiplication and figure out how much storage is actually different, then
get rid of the snapshot with:

lvremove -f LVM2/copy-me
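
The multiplication mentioned above, as a small worked example. Both input
values are made up for illustration: a snapshot created with 20 GiB of
FREESPACE, and an lvdisplay reading of 12.5% allocated.

```shell
#!/bin/sh
# Amount of data changed = snapshot size * allocated percentage / 100.
FREESPACE_MIB=20480   # hypothetical: snapshot created with -L 20G
ALLOC_PCT=12.5        # hypothetical percentage read from lvdisplay

changed_mib=$(awk -v s="$FREESPACE_MIB" -v p="$ALLOC_PCT" \
    'BEGIN { printf "%.0f", s * p / 100 }')

echo "about ${changed_mib} MiB changed since the snapshot was taken"
```

With these numbers, 2560 MiB would have to be transferred in the next pass.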

You should know how long you can reasonably have things down. To get an
idea of how long X amount of data will take to rsync, make another lvm
snapshot with the lvcreate commandline above and rsync between the two
actual devices (i.e. not the filesystems, but /dev/LVM2/copy-me and the
destination volume's block device). 

You now have to decide if the amount of data difference will take longer
than the downtime you can accept. If so, repeat the process of
snapshot, rsync, lvdisplay/calculate until it gets to an acceptable level
or you reach a point of diminishing returns. From there, you take down your
services and such, unmount the live LVM volume, rsync one more time, and
bring stuff up on the new RAID.

Depending on how many loops you have to go through this may take some
time and effort, but it will minimize your downtime (which, it seems, is
your goal). Be sure to lvremove the snapshot every time, since you are
reusing the space allocated to it for the next snapshot. Also note that if
you are writing to the live partition faster than you can rsync it, I see
no way to avoid significant downtime.

 Tomasz Chmielewski
--Greg
