Re: A raid in a raid.

2007-07-21 Thread Michael Tokarev
mullaly wrote:
[]
> All works well until a system reboot. md2 appears to be brought up before
> md0 and md1, which causes the raid to start without two of its drives.
>
> Is there any way to fix this?

How about listing the arrays in proper order in mdadm.conf ?
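Something along these lines, for example (device names below are only
placeholders; as far as I understand, mdadm -As assembles the arrays in the
order they are listed, so the inner arrays come up before the outer one):

    DEVICE partitions
    ARRAY /dev/md0 devices=/dev/sda1,/dev/sdb1
    ARRAY /dev/md1 devices=/dev/sdc1,/dev/sdd1
    ARRAY /dev/md2 devices=/dev/md0,/dev/md1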

/mjt



Re: Partitions with == or \approx same size ?

2007-07-21 Thread Seb

Hello Neil,


Thanks for the confirmations!
It's all very clear now. Case closed.


Best,
Seb.

On Sat, 21 Jul 2007, Neil Brown wrote:

| On Saturday July 21, [EMAIL PROTECTED] wrote:
| 
|  Hi Neil,
| 
| 
|  |  Could you tell me if such a mechanism exists in mdadm?
|  |  Or should I accept the loss of the 150 GB?
|  | When you give mdadm a collection of drives to turn into a RAID array,
|  | use bases the size of the array on the smallest device.
| 
|  I'm sorry I don't know what bases are in a RAID array and I can't find
|  this term in the man page. Could you elaborate?
|
| Typo.  Should be
| "It bases the size of the array ..."
|
| i.e. it works out which is the smaller device, and uses that size to
| determine the size of the array.  e.g. if you are making a raid5 with
| 4 drives, then the array will be 3 times the size of the smallest device.
|
| 
|  | You might want to make it a little smaller still in case you have to
|  | replace a device with a slightly smaller device (it happens).  You can
|  | use --size to reduce the used space a little further if you like.
| 
|  Thanks for the pointer to --size! I had overlooked this option. The man
|  page says that "If this is not specified (as it normally is not) the
|  smallest drive (or partition) sets the size."  This implies that partitions
|  need not have exactly the same size and 'mdadm' will still manage.
|
| Exactly.
|
| 
|  So I'll use 249.9GB out of 250GB, skip over the small resulting
|  differences, let mdadm work its magic, and when new disks are inserted
|  after a failure it will suffice to use their total space.
|
| Again, exactly correct.
|
| NeilBrown
|


3ware Auto-Carve Question

2007-07-21 Thread Justin Piszcz
Quick question -- under kernel 2.4 without 2TB support enabled, the only
other option is to use auto-carving to get the maximum amount of space
easily. However, after doing this (2TB, 2TB, 1.8TB) for a 10 x 750GB
array, only the first partition remains after reboot.


Before reboot:

/dev/sdb1 (2TB)
/dev/sdc1 (2TB)
/dev/sdd1 (2TB)

After reboot:

/dev/sdb1 (2TB)
sdc1 vanished
sdd1 vanished

I also tried formatting /dev/sdc1 and /dev/sdd1 with mkfs.ext3 and then
labeling them with e2label; same thing, sdc/sdd disappear after reboot.
Is there some option somewhere to ensure that the sdc/sdd 'carved
partitions' are not forgotten about?


Thanks,

Justin.


Re: 3ware Auto-Carve Question

2007-07-21 Thread Justin Piszcz



On Sat, 21 Jul 2007, Justin Piszcz wrote:

> Quick question -- under kernel 2.4 without 2TB support enabled, the only
> other option is to use auto-carving to get the maximum amount of space
> easily. However, after doing this (2TB, 2TB, 1.8TB) for a 10 x 750GB
> array, only the first partition remains after reboot.



Ah this appears to be my issue:

http://www.3ware.com/KB/article.aspx?id=14177



Re: 3ware Auto-Carve Question

2007-07-21 Thread Justin Piszcz



On Sat, 21 Jul 2007, Justin Piszcz wrote:

> Ah this appears to be my issue:
>
> http://www.3ware.com/KB/article.aspx?id=14177



Hm, did not appear to help.

The card is a 9550SXU-12 -- any thoughts?



Need clarification on raid1 resync behavior with bitmap support

2007-07-21 Thread Mike Snitzer

On 6/1/06, NeilBrown [EMAIL PROTECTED] wrote:


> When an array has a bitmap, a device can be removed and re-added
> and only blocks changed since the removal (as recorded in the bitmap)
> will be resynced.


Neil,

Does the same apply when a bitmap-enabled raid1's member goes faulty?
Meaning even if a member is faulty, when the user removes and re-adds
the faulty device, the raid1 rebuild _should_ leverage the bitmap
during the resync, right?
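(The sequence I have in mind, with a placeholder device name, is roughly:

    mdadm /dev/md0 --fail /dev/sdb1
    mdadm /dev/md0 --remove /dev/sdb1
    ... recheck or replace the disk ...
    mdadm /dev/md0 --re-add /dev/sdb1
    cat /proc/mdstat    # hoping to see only the bitmap-dirty blocks resync

rather than a full rebuild of the whole member.)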

I've seen messages like:
[12068875.690255] raid1: raid set md0 active with 2 out of 2 mirrors
[12068875.690284] md0: bitmap file is out of date (0 < 1) -- forcing
full recovery
[12068875.690289] md0: bitmap file is out of date, doing full recovery
[12068875.710214] md0: bitmap initialized from disk: read 5/5 pages,
set 131056 bits, status: 0
[12068875.710222] created bitmap (64 pages) for device md0

Could you share the other situations where a bitmap-enabled raid1
_must_ perform a full recovery?
- Correct me if I'm wrong, but one that comes to mind is when a server
reboots (after cleanly stopping a raid1 array that had a faulty
member) and then either:
1) assembles the array with the previously faulty member now available

2) assembles the array with the same faulty member missing.  The user
later re-adds the faulty member

AFAIK both scenarios would bring about a full resync.

regards,
Mike


Re: [RFH] Partion table recovery

2007-07-21 Thread Willy Tarreau
On Fri, Jul 20, 2007 at 08:35:45AM +0100, Anton Altaparmakov wrote:
> On 20 Jul 2007, at 06:13, Al Boldi wrote:
>> As always, a good friend of mine managed to scratch my partition table
>> by cat'ing /dev/full into /dev/sda.  I was able to push him out of the
>> way, but at least the first 100MB are gone.  I can probably live without
>> the first partition, but there are many partitions after that, which I
>> hope should easily be recoverable.
>>
>> I tried parted, but it's not working out for me.  Does anybody know of a
>> simple partition recovery tool, that would just scan the disk for lost
>> partitions?
>
> parted and its derivatives are a pile of crap...  They cause corruption
> to totally healthy systems at the best of times.  Don't go near them.
>
> Use TestDisk (http://www.cgsecurity.org/wiki/TestDisk) and be happy.  (-:

Thanks for this link Anton, it looks awesome and I'll add it to my
collection :-)

I mostly use vche (virtual console hex editor) to look for strings, find
offsets and/or fix data by hand. This should save me some time!

Regards,
Willy



Re: pata_via with software raid1: attempt to access beyond end of device

2007-07-21 Thread Dâniel Fraga
On Fri, 20 Jul 2007 12:59:41 +1000
Neil Brown [EMAIL PROTECTED] wrote:

> So reiserfs thinks the device is 64K larger than it really is. I
> wonder how that happened.

resize_reiserfs -s -64K /dev/md1

Neil, I discovered how the 64k larger issue happened...

	It was my fault. I created the Reiser filesystem *before* the
raid device. So when I created the raid device, it warned about an
existing reiserfs filesystem and asked if I wanted to continue. This way,
the underlying filesystem will always be 64k larger than the raid device.

	If, instead, I use the correct method of creating the raid
device *first*, everything is ok, because then I can mkreiserfs the
raid device and the filesystem will have the correct size that the
raid device expects.
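	(In other words, the order should have been roughly the following,
with the partition names being just placeholders for my actual ones:

    mdadm --create /dev/md1 --level=1 --raid-devices=2 /dev/sdaX /dev/sdbX
    mkreiserfs /dev/md1

so that mkreiserfs sizes the filesystem to the md device, not to the raw
partition.)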

Sorry for my mistake and thanks again!

-- 
Linux 2.6.22: Holy Dancing Manatees, Batman!
http://www.lastfm.pt/user/danielfraga
http://u-br.net
Marilyn Manson - Dried Up, Tied and Dead to the World (Antichrist
Superstar - 1996)



Re: raid5:md3: read error corrected , followed by , Machine Check

2007-07-21 Thread Mr. James W. Laferriere

Hello Andrew ,

On Tue, 17 Jul 2007, Andrew Burgess wrote:

>> The MCEs have been ongoing for some time.  I have replaced every item
>> in the system except the chassis & scsi backplane & power supply (750 Watts).
>> Everything.  MB, cpu, memory, scsi controllers, ...
>> These MCEs only happen when I am trying to build or bonnie++ test the
>> md3.  It consists of (now 7+1 spare) 146GB drives in the SuperMicro
>> SYS-6035B-8B's backplane attached to an LSI22320.

> Probably every old timer has a story about chasing a hardware problem
> where changing the power supply finally fixed it. I keep spares now.
>
> If an MCE (which means bad cpu) doesn't go away after changing the cpu,
> it would either have to be temperature, power or a bug in the MCE code.
> What else could it be?


	Thank you for the idea of 'changing out the PS'.  So I did it a bit
differently: I removed the system PS from the raid backplane & dropped in a
known good ps of proper wattage & re-tested, but left the system's ps attached
to only the MB & fans.
	It doesn't appear to be power-load related.  I tried rebuilding my 7
disk raid6 array & I got the same thing, an MCE.
	Now the raid backplane is still in the air stream in front of the cpu's
and memory slots.  So it could be a marginal cpu or memory stick.


	But here's the clincher: when I don't use the two drives in front of
the PS & cpu & memory slots, the array completes its resync.  So I'm back to
testing memory (again).  If that passes then I'll try the new cpu(s) route.


Tnx All ,  JimL
--
+-+
| James   W.   Laferriere | System   Techniques | Give me VMS |
| NetworkEngineer | 663  Beaumont  Blvd |  Give me Linux  |
| [EMAIL PROTECTED] | Pacifica, CA. 94044 |   only  on  AXP |
+-+


RAID and filesystem setup / tuning questions regarding specific scenario

2007-07-21 Thread Michal Soltys
I'm making a raid5 array consisting of 4 disks, 500 GB each, with one standby
spare. In the future it will be expanded with extra disks. The server will be
running on a UPS, with an extra machine doing a daily rsync backup of the
system and anything deemed important. On top of the raid, there will probably
be a simple lvm2 setup - linear, with large extents (512 MB or 1 GB). Files
will be accessed primarily through samba shares, by 10-20 people at the same
time. The server will have 2GB of ram (increasing it, if necessary, won't be
a problem) and a core2 duo cpu, running 64bit linux. Disks are on ICH9R, in
ahci mode.


Currently, users have a ~200 GB partition, almost filled up, with ca. 700,000
files - which gives a rather small average of ~300 KB / file. From what they
say, it will be more or less the same on the new machine.



The two basic questions are raid parameters and filesystem choice. I'll of
course run tests, but the number of possible parameter combinations is not
so small, so I'm looking for starting tips.



Regarding parameters - I've googled / searched this list, and found some
suggestions regarding parameters like chunk_size, nr_requests, read ahead
settings, etc. Still, I have the feeling that the suggested settings are
tailored more towards big file sizes - which is certainly not the case here.
Wouldn't values like a 64 MB read ahead on the whole raid device, or 1MB
chunks, be a bit overkill in my scenario? Maybe someone is running something
similar and could provide some insights?
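(Just to make the numbers concrete, the kind of setup those suggestions
describe would be something like the following - these are the "big" values
in question, not what I intend to use:

    mdadm --create /dev/md0 --level=5 --chunk=1024 \
          --raid-devices=4 --spare-devices=1 /dev/sd[b-f]1
    blockdev --setra 131072 /dev/md0   # 131072 * 512 bytes = 64 MB read ahead
)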



Regarding the filesystem - I've thought about actually using ext3 without the
journal (so effectively ext2), considering the UPS and the regular backups to
a separate machine. Also, judging from what I've found so far, XFS is not
that great with many small files. As for ext2/3, googling revealed documents
like http://ext2.sourceforge.net/2005-ols/paper-html/index.html , which are
quite encouraging (besides, that document is already 2 years old). Another
advantage of ext2/3 is that it can be shrunk, whereas XFS cannot, afaik.

How much of a performance gain would I get from using journalless ext3?
Either way, is testing ext2/3 worthwhile here, or should I jump right to XFS?
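(I assume the comparison would boil down to something like this, with the
device name being just a placeholder:

    mkfs.ext3 /dev/md0                  # regular ext3, with journal
    tune2fs -O ^has_journal /dev/md0    # drop the journal, effectively ext2
)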



Misc. more specific questions:

- lvm2

In my scenario - linear, big extents - is there a [significant] performance
hit coming from using it? Any other caveats I should be aware of?
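(For reference, the setup I have in mind, with placeholder names:

    pvcreate /dev/md0
    vgcreate -s 512M vg0 /dev/md0     # 512 MB extents
    lvcreate -n data -l 2800 vg0      # one big linear LV; extent count is only an example
)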


- read ahead

All the elements of the final setup - the drives themselves, RAID, LVM2 - have
a read ahead parameter. How should they be set relative to each other? Also,
based on http://www.rhic.bnl.gov/hepix/talks/041019pm/schoen.pdf , too big a
read ahead can hurt performance in a multiuser setup. And as mentioned above,
64 MB seems a bit crazy... Availability of memory is not an issue though.
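(I.e. the per-layer values one can inspect and set, with the LV name being a
placeholder:

    blockdev --getra /dev/sdb           # single drive
    blockdev --getra /dev/md0           # the md array
    blockdev --getra /dev/vg0/data      # the LV on top
    blockdev --setra 8192 /dev/md0      # e.g. 8192 * 512 bytes = 4 MB
)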


Any other stacking parameters that should be set properly?

- nr_requests

I've also found a suggestion about increasing this parameter to 256 or 512,
in http://marc.info/?l=linux-raid&m=118302016516950&w=2 . It was also
mentioned in some 3ware technical doc. Any comments on that?
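(The knob in question, per underlying disk - the value is just an example:

    echo 256 > /sys/block/sdb/queue/nr_requests
)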


- max_sectors_kb

Any suggestions regarding this one? I've found suggestions to set it to the
chunk size, but that seems strange (?)
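(Again, the sysfs knob I mean, per disk, with an example value:

    cat /sys/block/sdb/queue/max_hw_sectors_kb      # hardware upper limit
    echo 128 > /sys/block/sdb/queue/max_sectors_kb  # e.g. 128 KB per request
)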


- journal on separate devices

Generally - how fast should the extra drive be, compared to the RAID array?
Also, how big? Are there any side effects of making it big (on ext3, there's
a limit when the journal is on the main partition, due to memory requirements,
but there's no limit when it's on a separate drive)?
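(The variant I'm considering, with placeholder device names:

    mke2fs -O journal_dev /dev/sde1                 # dedicated journal device
    mkfs.ext3 -J device=/dev/sde1 /dev/vg0/data     # fs using the external journal
)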





Re: [RFH] Partition table recovery

2007-07-21 Thread Al Boldi
Theodore Tso wrote:
> On Sat, Jul 21, 2007 at 07:54:14PM +0200, Rene Herman wrote:
>> sfdisk -d already works most of the time. Not as a verbatim tool (I
>> actually semi-frequently use a sfdisk -d /dev/hda | sfdisk invocation
>> as a way to _rewrite_ the CHS fields to other values after changing
>> machines around on a disk) but something you'd backup on the FS level
>> should, in my opinion, need to be less fragile than would be possible
>> with just 512 bytes available.
>
> *IF* you remember to store the sfdisk -d somewhere useful.  In my How
> To Recover From Hard Drive Catastrophes classes, I tell them to
> print out a copy of "sfdisk -l /dev/hda ; sfdisk -d /dev/hda" and tape
> it to the side of the computer.  I also tell them to do regular backups.
> Want to make a guess how many of them actually follow this good advice?
> Far fewer than I would like, I suspect...
>
> What I'm suggesting is the equivalent of sfdisk -d, except we'd be
> doing it automatically without requiring the user to take any kind of
> explicit action.  Is it perfect?  No, although the edge conditions are
> quite rare these days and generally involve users using legacy systems
> and/or doing Weird Shit such that They Really Should Know To Do Their
> Own Explicit Backups.  But for the novice users, it should work Just
> Fine.

Sounds great, but it may be advisable to hook this into the partition
modification routines instead of mkfs/fsck.  That would mean that the
partition manager could ask the kernel to instruct its fs subsystem to
update the backup partition table for each known fs-type that supports
such a feature.
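(For reference, the manual variant being discussed above is essentially just
this pair, with the dump file name being arbitrary:

    sfdisk -d /dev/sda > sda-ptable.txt     # back up the partition table
    sfdisk /dev/sda < sda-ptable.txt        # restore it later
)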


Thanks!

--
Al
