Re: Raid-10 mount at startup always has problem

2007-10-25 Thread Neil Brown
On Wednesday October 24, [EMAIL PROTECTED] wrote:
 Current mdadm.conf:
 DEVICE partitions
 ARRAY /dev/.static/dev/md0 level=raid10 num-devices=4 
 UUID=9d94b17b:f5fac31a:577c252b:0d4c4b2a auto=part
 
 still have the problem where on boot one drive is not part of the 
 array.  Is there a log file I can check to find out WHY a drive is not 
 being added?  It's been a while since the reboot, but I did find some 
 entries in dmesg - I'm appending both the md lines and the physical disk 
 related lines.  The bottom shows one disk not being added (this time it
 was sda) - and the disk that gets skipped on each boot seems to be
 random - there's no consistent failure:

Odd but interesting.
Does it sometimes fail to start the array altogether?

 md: md0 stopped.
 md: md0 stopped.
 md: bind<sdc>
 md: bind<sdd>
 md: bind<sdb>
 md: md0: raid array is not clean -- starting background reconstruction
 raid10: raid set md0 active with 3 out of 4 devices
 md: couldn't update array info. -22
  ^^^

This is the most surprising line, and hence the one most likely to
convey helpful information.

This message is generated when a process calls SET_ARRAY_INFO on an
array that is already running, and the changes implied by the new
array_info are not supportable.

The only way I can see this happening is if two copies of mdadm are
running at exactly the same time and are both are trying to assemble
the same array.  The first calls SET_ARRAY_INFO and assembles the
(partial) array.  The second calls SET_ARRAY_INFO and gets this error.
Not all devices are included because when one mdadm went to look at a
device, the other had it locked, and so the first just ignored it.

I just tried that, and sometimes it worked, but sometimes it assembled
with 3 out of 4 devices.  I didn't get the "couldn't update array info"
message, but that doesn't prove I'm wrong.
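A race like that can be provoked by hand with something along these lines
(array and device names are illustrative only; the parallel launch is the
important part):

   mdadm --stop /dev/md0
   # start two assembles at (nearly) the same time
   mdadm --assemble --scan &
   mdadm --assemble --scan &
   wait
   cat /proc/mdstat     # occasionally shows md0 running with only 3 of 4 devices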

I cannot imagine how that might be happening (two at once), unless
maybe 'udev' had been configured to do something as soon as devices
were discovered, though that seems unlikely.

It might be worth finding out where mdadm is being run in the init
scripts and add a -v flag, and redirecting stdout/stderr to some log
file.
e.g.
   mdadm -As -v > /var/log/mdadm-$$ 2>&1

And see if that leaves something useful in the log file.

BTW, I don't think your problem has anything to do with the fact that
you are using whole devices.
While it is debatable whether that is a good idea or not (I like the
idea, but Doug doesn't and I respect his opinion) I doubt it would
contribute to the current problem.


Your description makes me nearly certain that there is some sort of
race going on (that is the easiest way to explain randomly differing
behaviours).   The race is probably between different code 'locking'
(opening with O_EXCL) the various devices.  Given the above error
message, two different 'mdadm's seems most likely, but an mdadm and a
mount-by-label scan could probably do it too.

NeilBrown
-
To unsubscribe from this list: send the line unsubscribe linux-raid in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Raid-10 mount at startup always has problem

2007-10-25 Thread Doug Ledford
On Wed, 2007-10-24 at 22:43 -0700, Daniel L. Miller wrote:
 Bill Davidsen wrote:
  Daniel L. Miller wrote:
  Current mdadm.conf:
  DEVICE partitions
  ARRAY /dev/.static/dev/md0 level=raid10 num-devices=4 
  UUID=9d94b17b:f5fac31a:577c252b:0d4c4b2a auto=part
 
  still have the problem where on boot one drive is not part of the 
  array.  Is there a log file I can check to find out WHY a drive is 
  not being added?  It's been a while since the reboot, but I did find 
  some entries in dmesg - I'm appending both the md lines and the 
  physical disk related lines.  The bottom shows one disk not being 
  added (this time it was sda) - and the disk that gets skipped on each
  boot seems to be random - there's no consistent failure:
 
  I suspect the base problem is that you are using whole disks instead 
  of partitions, and the problem with the partition table below is 
  probably an indication that you have something on that drive which 
  looks like a partition table but isn't. That prevents the drive from 
  being recognized as a whole drive. You're lucky, if the data looked 
  enough like a partition table to be valid the o/s probably would have 
  tried to do something with it.
  [...]
  This may be the rare case where you really do need to specify the 
  actual devices to get reliable operation.
 OK - I'm officially confused now (I was just unofficially before).  WHY 
 is it a problem using whole drives as RAID components?  I would have 
 thought that building a RAID storage unit with identically sized drives 
 - and using each drive's full capacity - is exactly the way you're 
 supposed to!

As much as anything else this can be summed up as you are thinking of
how you are using the drives and not how unexpected software on your
system might try and use your drives.  Without a partition table, none
of the software on your system can know what to do with the drives
except mdadm when it finds an md superblock.  That doesn't stop other
software from *trying* to find out how to use your drives though.  That
includes the kernel trying to look for a valid partition table, mount
possibly scanning the drive for a file system label, lvm scanning for an
lvm superblock, mtools looking for a dos filesystem, etc.  Under normal
conditions, the random data on your drive will never look valid to these
other pieces of software.  But, once in a great while, it will look
valid.  And that's when all hell breaks loose.  Or worse, you run a
partition program such as fdisk on the device and it initializes the
partition table (something that the Fedora/RHEL installers do to all
disks without partition tables...well, the installer tells you there's
no partition table and asks if you want to initialize it, but if someone
is in a hurry and hits yes when they meant no, bye bye data).

The partition table is the single, (mostly) universally recognized
arbiter of what possible data might be on the disk.  Having a partition
table may not make mdadm recognize the md superblock any better, but it
keeps all that other stuff from even trying to access data that it
doesn't have a need to access and prevents random luck from turning your
day bad.
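
For what it's worth, staking that claim is cheap.  Something along these
lines (illustrative only; the classic sfdisk ',,fd' shorthand and the device
names are assumptions, not taken from this thread) gives each drive a single
full-size partition of type fd and builds the array from the partitions
rather than the bare disks:

   for d in /dev/sda /dev/sdb /dev/sdc /dev/sdd; do
       # one partition spanning the whole disk, type fd (Linux raid autodetect)
       echo ',,fd' | sfdisk "$d"
   done
   mdadm --create /dev/md0 --level=10 --raid-devices=4 /dev/sd[abcd]1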

Oh, and let's not go into what can happen if you're talking about a dual
boot machine and what Windows might do to the disk if it doesn't think
the disk space is already spoken for by a linux partition.

And, in particular with mdadm, I once created a full disk md raid array
on a couple disks, then couldn't get things arranged like I wanted, so I
just partitioned the disks and then created new arrays in the partitions
(without first manually zeroing the superblock for the whole disk
array).  Since I used a version 1.0 superblock on the whole disk array,
and then used version 1.1 superblocks in the partitions, the net result
was that when I ran mdadm -Eb, mdadm would find both the 1.1 and 1.0
superblocks in the last partition on the disk.  Confused both myself and
mdadm for a while.
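
If you ever hit that, the way out is to compare what each device reports and
then zero the stale superblock by hand (device name here is just an example;
look at the --examine output carefully before zeroing anything):

   mdadm --examine /dev/sdb         # leftover whole-disk superblock, if any
   mdadm --examine /dev/sdb1        # superblock of the array actually in use
   mdadm --zero-superblock /dev/sdb # wipe the stale whole-disk superblock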

Anyway, I happen to *like* the idea of using full disk devices, but the
reality is that the md subsystem doesn't have exclusive ownership of the
disks at all times, and without that it really needs to stake a claim on
the space instead of leaving things to chance IMO.

   I should mention that the boot/system drive is IDE, and 
 NOT part of the RAID.  So I'm not worried about losing the system - but 
 I AM concerned about the data.  I'm using four drives in a RAID-10 
 configuration - I thought this would provide a good blend of safety and 
 performance for a small fileserver.
 
 Because it's RAID-10 - I would ASSuME that I can drop one drive (after 
 all, I keep booting one drive short), partition if necessary, and add it 
 back in.  But how would splitting these disks into partitions improve 
 either stability or performance?

-- 
Doug Ledford [EMAIL PROTECTED]
  GPG KeyID: CFBFF194
  http://people.redhat.com/dledford

Infiniband specific RPMs available at
  http://people.redhat.com/dledford/Infiniband

Re: Raid-10 mount at startup always has problem

2007-10-25 Thread Doug Ledford
On Thu, 2007-10-25 at 16:12 +1000, Neil Brown wrote:

  md: md0 stopped.
  md: md0 stopped.
  md: bind<sdc>
  md: bind<sdd>
  md: bind<sdb>
  md: md0: raid array is not clean -- starting background reconstruction
  raid10: raid set md0 active with 3 out of 4 devices
  md: couldn't update array info. -22
   ^^^
 
 This is the most surprising line, and hence the one most likely to
 convey helpful information.
 
 This message is generated when a process calls SET_ARRAY_INFO on an
 array that is already running, and the changes implied by the new
 array_info are not supportable.
 
 The only way I can see this happening is if two copies of mdadm are
 running at exactly the same time and are both are trying to assemble
 the same array.  The first calls SET_ARRAY_INFO and assembles the
 (partial) array.  The second calls SET_ARRAY_INFO and gets this error.
 Not all devices are included because when one mdadm went to look at a
 device, the other had it locked, and so the first just ignored it.

If mdadm copy A gets three of the devices, I wouldn't think mdadm copy B
would have been able to get enough devices to decide to even try and
assemble the array (assuming that once copy A locked the devices during
open, that it then held the devices until time to assemble the array).

-- 
Doug Ledford [EMAIL PROTECTED]
  GPG KeyID: CFBFF194
  http://people.redhat.com/dledford

Infiniband specific RPMs available at
  http://people.redhat.com/dledford/Infiniband


signature.asc
Description: This is a digitally signed message part


Re: Time to deprecate old RAID formats?

2007-10-25 Thread Doug Ledford
On Thu, 2007-10-25 at 09:55 +1000, Neil Brown wrote:

 As for where the metadata should be placed, it is interesting to
 observe that the SNIA's DDFv1.2 puts it at the end of the device.
 And as DDF is an industry standard sponsored by multiple companies it
 must be ..
 Sorry.  I had intended to say "correct", but when it came to it, my
 fingers refused to type that word in that context.
 
 DDF is in a somewhat different situation though.  It assumes that the
 components are whole devices, and that the controller has exclusive
 access - there is no way another controller could interpret the
 devices differently before the DDF controller has a chance.

Putting a superblock at the end of a device works around OS
compatibility issues and other things related to transitioning the
device from part of an array to not, etc.  But, it works if and only if
you have the guarantee you mention.  Long, long ago I tinkered with the
idea of md multipath devices using an end of device superblock on the
whole device to allow reliable multipath detection and autostart,
failover of all partitions on a device when a command to any partition
failed, ability to use standard partition tables, etc. while being 100%
transparent to the rest of the OS.  The second you considered FC
connected devices and multi-OS access, that fell apart in a big way.
Very analogous.

So, I wouldn't necessarily call it wrong, but it's fragile.

-- 
Doug Ledford [EMAIL PROTECTED]
  GPG KeyID: CFBFF194
  http://people.redhat.com/dledford

Infiniband specific RPMs available at
  http://people.redhat.com/dledford/Infiniband


signature.asc
Description: This is a digitally signed message part


deleting mdadm array?

2007-10-25 Thread Janek Kozicki
Hello,

I just created a new array /dev/md1 like this:

mdadm --create --verbose /dev/md1 --chunk=64 --level=raid5 \
   --metadata=1.1  --bitmap=internal \
   --raid-devices=3 /dev/hdc2 /dev/sda2 missing


But later I changed my mind, and I wanted to use chunk 128. Do I need
to delete this array somehow first, or can I just create an array
again (overwriting the current one)?

-- 
Janek Kozicki |
-
To unsubscribe from this list: send the line unsubscribe linux-raid in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Time to deprecate old RAID formats?

2007-10-25 Thread David Greaves
Jeff Garzik wrote:
 Neil Brown wrote:
 As for where the metadata should be placed, it is interesting to
 observe that the SNIA's DDFv1.2 puts it at the end of the device.
 And as DDF is an industry standard sponsored by multiple companies it
 must be ..
 Sorry.  I had intended to say "correct", but when it came to it, my
 fingers refused to type that word in that context.

 For the record, I have no intention of deprecating any of the metadata
 formats, not even 0.90.
 
 strongly agreed

I didn't get a reply to my suggestion of separating the data and location...

ie not talking about superblock versions 0.9, 1.0, 1.1, 1.2 etc but a data
format (0.9 vs 1.0) and a location (end,start,offset4k)?

This would certainly make things a lot clearer to new (and old!) users:

mdadm --create /dev/md0 --metadata 1.0 --meta-location offset4k
or
mdadm --create /dev/md0 --metadata 1.0 --meta-location start
or
mdadm --create /dev/md0 --metadata 1.0 --meta-location end

resulting in:
mdadm --detail /dev/md0

/dev/md0:
Version : 01.0
  Metadata-locn : End-of-device
  Creation Time : Fri Aug  4 23:05:02 2006
 Raid Level : raid0

You provide rational defaults for mortals and this approach allows people like
Doug to do wacky HA things explicitly.

I'm not sure you need any changes to the kernel code - probably just the docs
and mdadm.

 It is conceivable that I could change the default, though that would
 require a decision as to what the new default would be.  I think it
 would have to be 1.0 or it would cause too much confusion.
 
 A newer default would be nice.

I also suspect that a *lot* of people will assume that the highest superblock
version is the best and should be used for new installs etc.

So if you make 1.0 the default then how many users will try 'the bleeding edge'
and use 1.2? So then you have 1.3 which is the same as 1.0? H? So to quote
from an old Soap: Confused, you  will be...

David
-
To unsubscribe from this list: send the line unsubscribe linux-raid in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: deleting mdadm array?

2007-10-25 Thread Neil Brown
On Thursday October 25, [EMAIL PROTECTED] wrote:
 Hello,
 
 I just created a new array /dev/md1 like this:
 
 mdadm --create --verbose /dev/md1 --chunk=64 --level=raid5 \
--metadata=1.1  --bitmap=internal \
--raid-devices=3 /dev/hdc2 /dev/sda2 missing
 
 
 But later I changed my mind, and I wanted to use chunk 128. Do I need
 to delete this array somehow first, or can I just create an array
 again (overwriting the current one)?

Just recreate with new values, overwriting the current one.
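
That is, something like the following (your original command with only the
chunk size changed, shown here just for illustration) is all it takes;
mdadm should notice the existing superblocks and ask for confirmation
before overwriting them:

   mdadm --create --verbose /dev/md1 --chunk=128 --level=raid5 \
      --metadata=1.1  --bitmap=internal \
      --raid-devices=3 /dev/hdc2 /dev/sda2 missing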

NeilBrown
-
To unsubscribe from this list: send the line unsubscribe linux-raid in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: deleting mdadm array?

2007-10-25 Thread David Greaves
Janek Kozicki wrote:
 Hello,
 
 I just created a new array /dev/md1 like this:
 
 mdadm --create --verbose /dev/md1 --chunk=64 --level=raid5 \
--metadata=1.1  --bitmap=internal \
--raid-devices=3 /dev/hdc2 /dev/sda2 missing
 
 
 But later I changed my mind, and I wanted to use chunk 128. Do I need
 to delete this array somehow first, or can I just create an array
 again (overwriting the current one)?

How much later? This will, of course, destroy any data on the array (!) and
you'll need to mkfs again...


To answer the question though: just run mdadm again to create a new array with
new parameters.


I think the only time you need to 'delete' an array before creating a new one is
if you change the superblock version: since different versions are quietly written
to different disk locations, you may end up with 2 superblocks on the disk and
then you get confusion :)
(I'm not sure if mdadm is clever about this though...)

Also, if you don't mind me asking: why did you choose version 1.1 for the
metadata/superblock version?

David

-
To unsubscribe from this list: send the line unsubscribe linux-raid in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: deleting mdadm array?

2007-10-25 Thread Neil Brown
On Thursday October 25, [EMAIL PROTECTED] wrote:
 I think the only time you need to 'delete' an array before creating a new one is
 if you change the superblock version: since different versions are quietly written
 to different disk locations, you may end up with 2 superblocks on the disk and
 then you get confusion :)
 (I'm not sure if mdadm is clever about this though...)
 

Mdadm tries to be clever.

When creating an array, it zeros any superblocks that it finds on the
array in any of the expected locations.
And when guessing the metadata format used, if it finds two or more, it
chooses the one with the more recent create timestamp.

NeilBrown
-
To unsubscribe from this list: send the line unsubscribe linux-raid in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Raid-10 mount at startup always has problem

2007-10-25 Thread Daniel L. Miller

Neil Brown wrote:

It might be worth finding out where mdadm is being run in the init
scripts and add a -v flag, and redirecting stdout/stderr to some log
file.
e.g.
   mdadm -As -v > /var/log/mdadm-$$ 2>&1

And see if that leaves something useful in the log file.

  
I haven't rebooted yet, but here's my /etc/udev/rules.d/70-mdadm.rules 
file (BTW - running on Ubuntu 7.10 Gutsy):


SUBSYSTEM=="block", ACTION=="add|change", 
ENV{ID_FS_TYPE}=="linux_raid*", RUN+="watershed -i udev-mdadm 
/sbin/mdadm -As -v > /var/log/mdadm-$$ 2>&1"


# This next line (only) is put into the initramfs,
#  where we run a strange script to activate only some of the arrays
#  as configured, instead of mdadm -As:
#initramfs# SUBSYSTEM=="block", ACTION=="add|change", 
ENV{ID_FS_TYPE}=="linux_raid*", RUN+="watershed -i udev-mdadm 
/scripts/local-top/mdadm from-udev"



Could that initramfs line be causing the problem?
--
Daniel
-
To unsubscribe from this list: send the line unsubscribe linux-raid in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Raid-10 mount at startup always has problem

2007-10-25 Thread Bill Davidsen

Neil Brown wrote:

On Wednesday October 24, [EMAIL PROTECTED] wrote:
  

Current mdadm.conf:
DEVICE partitions
ARRAY /dev/.static/dev/md0 level=raid10 num-devices=4 
UUID=9d94b17b:f5fac31a:577c252b:0d4c4b2a auto=part


still have the problem where on boot one drive is not part of the 
array.  Is there a log file I can check to find out WHY a drive is not 
being added?  It's been a while since the reboot, but I did find some 
entries in dmesg - I'm appending both the md lines and the physical disk 
related lines.  The bottom shows one disk not being added (this time it
was sda) - and the disk that gets skipped on each boot seems to be
random - there's no consistent failure:



Odd but interesting.
Does it sometimes fail to start the array altogether?

  

md: md0 stopped.
md: md0 stopped.
md: bind<sdc>
md: bind<sdd>
md: bind<sdb>
md: md0: raid array is not clean -- starting background reconstruction
raid10: raid set md0 active with 3 out of 4 devices
md: couldn't update array info. -22


  ^^^

This is the most surprising line, and hence the one most likely to
convey helpful information.

This message is generated when a process calls SET_ARRAY_INFO on an
array that is already running, and the changes implied by the new
array_info are not supportable.

The only way I can see this happening is if two copies of mdadm are
running at exactly the same time and are both are trying to assemble
the same array.  The first calls SET_ARRAY_INFO and assembles the
(partial) array.  The second calls SET_ARRAY_INFO and gets this error.
Not all devices are included because when one mdadm went to look at a
device, the other had it locked, and so the first just ignored it.

I just tried that, and sometimes it worked, but sometimes it assembled
with 3 out of 4 devices.  I didn't get the "couldn't update array info"
message, but that doesn't prove I'm wrong.

I cannot imagine how that might be happening (two at once), unless
maybe 'udev' had been configured to do something as soon as devices
were discovered, though that seems unlikely.

It might be worth finding out where mdadm is being run in the init
scripts and add a -v flag, and redirecting stdout/stderr to some log
file.
e.g.
   mdadm -As -v > /var/log/mdadm-$$ 2>&1

And see if that leaves something useful in the log file.

BTW, I don't think your problem has anything to do with the fact that
you are using whole devices.
  


You don't think the unknown partition table on sdd is related? Because 
I read that as a sure indication that the system isn't considering the 
drive as one without a partition table, and therefore isn't looking for 
the superblock on the whole device. And as Doug pointed out, once you 
decide that there is a partition table lots of things might try to use it.

While it is debatable whether that is a good idea or not (I like the
idea, but Doug doesn't and I respect his opinion) I doubt it would
contribute to the current problem.


Your description makes me nearly certain that there is some sort of
race going on (that is the easiest way to explain randomly differing
behaviours).   The race is probably between different code 'locking'
(opening with O_EXCL) the various devices.  Given the above error
message, two different 'mdadm's seems most likely, but an mdadm and a
mount-by-label scan could probably do it too.
  

--
bill davidsen [EMAIL PROTECTED]
 CTO TMR Associates, Inc
 Doing interesting things with small computers since 1979

-
To unsubscribe from this list: send the line unsubscribe linux-raid in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Time to deprecate old RAID formats?

2007-10-25 Thread Bill Davidsen

Neil Brown wrote:

I certainly accept that the documentation is probably less than
perfect (by a large margin).  I am more than happy to accept patches
or concrete suggestions on how to improve that.  I always think it is
best if a non-developer writes documentation (and a developer reviews
it) as then it is more likely to address the issues that a
non-developer will want to read about, and in a way that will make
sense to a non-developer. (i.e. I'm too close to the subject to write
good doco).


Patches against what's in 2.6.4 I assume? I can't promise to write 
anything which pleases even me, but I will take a look at it.


--
bill davidsen [EMAIL PROTECTED]
 CTO TMR Associates, Inc
 Doing interesting things with small computers since 1979

-
To unsubscribe from this list: send the line unsubscribe linux-raid in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Multipath and nbd

2007-10-25 Thread Bill Davidsen
I am at the design stage for a new server. That's when you try to 
convince a client that they have an unfavorable ratio of requirements to 
budget. I am thinking a raid-1, with a mirror to an nbd device running 
write-mostly. I will have redundant network paths to the other machine, 
one via a dedicated cable and P-t-P connection, and the other via a 
switch going to the general internal network.


Am I going into deep waters to try and do something helpful with 
multipath features here, or should I stick to a network only solution, 
and give the switched route a higher metric to force traffic through the 
dedicated link?
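
For concreteness, the sort of thing I have in mind looks roughly like this
(host name, port and device names invented for the example):

   # remote half of the mirror, exported from the second machine
   nbd-client backupbox 2000 /dev/nbd0
   # local disk is the primary; the nbd member is flagged write-mostly
   mdadm --create /dev/md0 --level=1 --raid-devices=2 \
      /dev/sda1 --write-mostly /dev/nbd0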


--
bill davidsen [EMAIL PROTECTED]
 CTO TMR Associates, Inc
 Doing interesting things with small computers since 1979

-
To unsubscribe from this list: send the line unsubscribe linux-raid in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Time to deprecate old RAID formats?

2007-10-25 Thread David Greaves
Bill Davidsen wrote:
 Neil Brown wrote:
 I certainly accept that the documentation is probably less than
 perfect (by a large margin).  I am more than happy to accept patches
 or concrete suggestions on how to improve that.  I always think it is
 best if a non-developer writes documentation (and a developer reviews
 it) as then it is more likely to address the issues that a
 non-developer will want to read about, and in a way that will make
 sense to a non-developer. (i.e. I'm too close to the subject to write
 good doco).
 
 Patches against what's in 2.6.4 I assume? I can't promise to write
 anything which pleases even me, but I will take a look at it.
 

The man page is a great place for describing, eg, the superblock location; but
don't forget we have
  http://linux-raid.osdl.org/index.php/Main_Page
which is probably a better place for *discussions* (or essays) about the
superblock location (eg the LVM / v1.1 comment Janek picked up on)

In fact I was going to take some of the writings from this thread and put them
up there.

David
-
To unsubscribe from this list: send the line unsubscribe linux-raid in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Raid-10 mount at startup always has problem

2007-10-25 Thread Daniel L. Miller

Bill Davidsen wrote:
You don't think the unknown partition table on sdd is related? 
Because I read that as a sure indication that the system isn't 
considering the drive as one without a partition table, and therefore 
isn't looking for the superblock on the whole device. And as Doug 
pointed out, once you decide that there is a partition table lots of 
things might try to use it.  
Now, would the drive letters (sd[a-d]) change from reboot-to-reboot?  
Because it's not consistent - so far I've seen each of the four drives 
at one time or another fail during the boot.


I've added the verbose logging to the udev mdadm rule, and I've also 
manually specified the drives in mdadm.conf instead of leaving it on 
auto.  Curious what the next boot will bring.


--
Daniel
-
To unsubscribe from this list: send the line unsubscribe linux-raid in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Test

2007-10-25 Thread Daniel L. Miller
Sorry for consuming bandwidth - but all of a sudden I'm not seeing 
messages.  Is this going through?


--
Daniel
-
To unsubscribe from this list: send the line unsubscribe linux-raid in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Time to deprecate old RAID formats?

2007-10-25 Thread Doug Ledford
On Wed, 2007-10-24 at 16:22 -0400, Bill Davidsen wrote:
 Doug Ledford wrote:
  On Mon, 2007-10-22 at 16:39 -0400, John Stoffel wrote:
 

  I don't agree completely.  I think the superblock location is a key
  issue, because if you have a superblock location which moves depending on
  the filesystem or LVM you use to look at the partition (or full disk)
  then you need to be even more careful about how to poke at things.
  
 
  This is the heart of the matter.  When you consider that each file
  system and each volume management stack has a superblock, and some of them
  store their superblocks at the end of devices and some at the beginning,
  and they can be stacked, then it becomes next to impossible to make sure
  a stacked setup is never recognized incorrectly under any circumstance.
  It might be possible if you use static device names, but our users
  *long* ago complained very loudly when adding a new disk or removing a
  bad disk caused their setup to fail to boot.  So, along came mount by
  label and auto scans for superblocks.  Once you do that, you *really*
  need all the superblocks at the same end of a device so when you stack
  things, it always works properly.
 Let me be devil's advocate, I noted in another post that location might 
 be raid level dependent. For raid-1 putting the superblock at the end 
 allows the BIOS to treat a single partition as a bootable unit.

This is true for both the 1.0 and 1.2 superblock formats.  The BIOS
couldn't care less if there is an offset to the filesystem because it
doesn't try to read from the filesystem.  It just jumps to the first 512
byte sector and that's it.  Grub/Lilo are the ones that have to know
about the offset, and they would be made aware of the offset at install
time.

So, we are back to the exact same thing I was talking about.  With the
superblock at the beginning of the device, you don't hinder bootability
with or without the raid working; the raid would be bootable regardless,
as long as you made it bootable.  It only hinders accessing the
filesystem via a running linux installation without bringing up the
raid.
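
If you want to see the offset in question on an existing 1.x member, the
examine output reports it (exact field names from memory, so treat this as a
sketch):

   mdadm --examine /dev/sda1
   # for a v1.2 superblock the metadata sits 4K into the member and the
   # filesystem starts at the reported data offset; that offset is what
   # grub/lilo have to be told about at install time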

  For all 
 other arrangements the end location puts the superblock where it is 
 slightly more likely to be overwritten, and where it must be moved if 
 the partition grows or whatever.
 
 There really may be no right answer.

-- 
Doug Ledford [EMAIL PROTECTED]
  GPG KeyID: CFBFF194
  http://people.redhat.com/dledford

Infiniband specific RPMs available at
  http://people.redhat.com/dledford/Infiniband


signature.asc
Description: This is a digitally signed message part


Test 2

2007-10-25 Thread Daniel L. Miller
Thanks for the test responses - I have re-subscribed...if I see this 
myself...I'm back!

--
Daniel
-
To unsubscribe from this list: send the line unsubscribe linux-raid in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Test

2007-10-25 Thread Justin Piszcz

Success.

On Thu, 25 Oct 2007, Daniel L. Miller wrote:

Sorry for consuming bandwidth - but all of a sudden I'm not seeing messages. 
Is this going through?


--
Daniel
-
To unsubscribe from this list: send the line unsubscribe linux-raid in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


-
To unsubscribe from this list: send the line unsubscribe linux-raid in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Test 2

2007-10-25 Thread Justin Piszcz

Success 2.

On Thu, 25 Oct 2007, Daniel L. Miller wrote:

Thanks for the test responses - I have re-subscribed...if I see this 
myself...I'm back!

--
Daniel
-
To unsubscribe from this list: send the line unsubscribe linux-raid in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


-
To unsubscribe from this list: send the line unsubscribe linux-raid in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Time to deprecate old RAID formats?

2007-10-25 Thread Neil Brown
On Thursday October 25, [EMAIL PROTECTED] wrote:
 Neil Brown wrote:
  I certainly accept that the documentation is probably less than
  perfect (by a large margin).  I am more than happy to accept patches
  or concrete suggestions on how to improve that.  I always think it is
  best if a non-developer writes documentation (and a developer reviews
  it) as then it is more likely to address the issues that a
  non-developer will want to read about, and in a way that will make
  sense to a non-developer. (i.e. I'm too close to the subject to write
  good doco).
 
 Patches against what's in 2.6.4 I assume? I can't promise to write 
 anything which pleases even me, but I will take a look at it.

Any text at all would be welcome, but yes; patches against 2.6.4 would
be easiest.

Thanks
NeilBrown
-
To unsubscribe from this list: send the line unsubscribe linux-raid in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Raid-10 mount at startup always has problem

2007-10-25 Thread Neil Brown
On Thursday October 25, [EMAIL PROTECTED] wrote:
 Neil Brown wrote:
 
  BTW, I don't think your problem has anything to do with the fact that
  you are using whole partitions.

 
 You don't think the unknown partition table on sdd is related? Because 
 I read that as a sure indication that the system isn't considering the 
 drive as one without a partition table, and therefore isn't looking for 
 the superblock on the whole device. And as Doug pointed out, once you 
 decide that there is a partition table lots of things might try to use it.

"unknown partition table" is what I would expect when using a whole
drive.
It just means the first block doesn't look like a partition table,
and if you have some early block of an ext3 (or other) filesystem in
the first block (as you would in this case), you wouldn't expect it to
look like a partition table.
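
Easy enough to check: a real DOS partition table ends sector 0 with the
0x55 0xaa signature, so dumping the first sector of the member shows whether
anything there could be mistaken for one (device name is just the one from
this thread):

   dd if=/dev/sdd bs=512 count=1 2>/dev/null | hexdump -C | tail -3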

I don't understand what you are trying to say with your second
sentence.

NeilBrown
-
To unsubscribe from this list: send the line unsubscribe linux-raid in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html