Re: draft howto on making raids for surviving a disk crash

2008-02-07 Thread Luca Berra

On Wed, Feb 06, 2008 at 04:45:39PM +0100, Keld Jørn Simonsen wrote:

On Wed, Feb 06, 2008 at 10:05:58AM +0100, Luca Berra wrote:

On Sat, Feb 02, 2008 at 08:41:31PM +0100, Keld Jørn Simonsen wrote:
Make each of the disks bootable by lilo:

  lilo -b /dev/sda /etc/lilo.conf1
  lilo -b /dev/sdb /etc/lilo.conf2
There should be no need for that.
to achieve the above effect with lilo you use
raid-extra-boot=mbr-only
in lilo.conf

Make each of the disks bootable by grub
install grub with the command
grub-install /dev/md0


I have already changed the text on the wiki. Still, I am not convinced
the advice described there is the best.



lilo -b /dev/md0 (without a raid-extra-boot line in lilo.conf) will
install lilo on the boot sector of the partitions containing /dev/md0
(and it will break with 1.1 sb)

for grub, do you have any doubt about the grub-install script not
working correctly?

L.



Re: RAID needs more to survive a power hit, different /boot layout for example (was Re: draft howto on making raids for surviving a disk crash)

2008-02-06 Thread Luca Berra

On Mon, Feb 04, 2008 at 07:38:40PM +0300, Michael Tokarev wrote:

Eric Sandeen wrote:
[]

http://oss.sgi.com/projects/xfs/faq.html#nulls

and note that recent fixes have been made in this area (also noted in
the faq)

Also - the above all assumes that when a drive says it's written/flushed
data, that it truly has.  Modern write-caching drives can wreak havoc
with any journaling filesystem, so that's one good reason for a UPS.  If


Unfortunately a UPS does not *really* help here.  Because unless
it has a control program which properly shuts the system down on the loss
of input power, and the battery really has the capacity to power the
system while it's shutting down (anyone tested this?  With a new UPS?
And after a year of use, when the battery is not new?), -- unless
the UPS actually has the capacity to shut the system down, it will cut
the power at an unexpected time, while the disk(s) still have dirty
caches...


if the ups is supported by nut (http://www.networkupstools.org) you can
do this easily.
Obviously you should tune the timeout to give your systems enough time
to shut down in case of a power outage, and periodically check your
battery duration (that means real tests) and re-tune the nut software
(and when you discover your battery is dead, change it).
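a minimal upsmon.conf sketch (the ups name and credentials are just
examples, adjust for your setup):

   MONITOR myups@localhost 1 monuser mypass master
   SHUTDOWNCMD "/sbin/shutdown -h +0"

upsmon runs SHUTDOWNCMD once the ups reports it is on battery and low.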

L.



Re: draft howto on making raids for surviving a disk crash

2008-02-06 Thread Luca Berra

On Sat, Feb 02, 2008 at 08:41:31PM +0100, Keld Jørn Simonsen wrote:

Make each of the disks bootable by lilo:

  lilo -b /dev/sda /etc/lilo.conf1
  lilo -b /dev/sdb /etc/lilo.conf2

There should be no need for that.
to achieve the above effect with lilo you use
raid-extra-boot=mbr-only
in lilo.conf
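e.g. a minimal lilo.conf sketch (untested; kernel image path and label
are just examples):

   boot=/dev/md0
   raid-extra-boot=mbr-only
   image=/boot/vmlinuz
       label=linux
       root=/dev/md0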


Make each of the disks bootable by grub

install grub with the command
grub-install /dev/md0

L.



Re: Raid-10 mount at startup always has problem

2007-10-29 Thread Luca Berra

On Sun, Oct 28, 2007 at 08:21:34PM -0400, Bill Davidsen wrote:

Because you didn't stripe align the partition, your bad.
  
Align to /what/ stripe? Hardware (CHS is fiction), software (of the RAID 

the real stripe (track) size of the storage, you must read the manual
and/or bug technical support for that info.
you're about to create), or ??? I don't notice my FC6 or FC7 install 
programs using any special partition location to start, I have only run 
(tried to run) FC8-test3 for the live CD, so I can't say what it might 
do. CentOS4 didn't do anything obvious, either, so unless I really 
misunderstand your position at redhat, that would be your bad.  ;-)


If you mean start a partition on a pseudo-CHS boundary, fdisk seems to 
use what it thinks are cylinders for that.

Yes, fdisk will create the partition at sector 63 (due to CHS being
braindead as well as fictional: 63 sectors per track).
most arrays use 64 or 128 spt, and array caches are aligned accordingly,
so 63 is almost always the wrong choice.

for the default choice you must consider what spt your array uses, iirc
(this is from memory, so double check these figures)
IBM 64 spt (i think)
EMC DMX 64
EMC CX 128???
HDS (and HP XP) except OPEN-V 96
HDS (and HP XP) OPEN-V 128
HP EVA 4/6/8 with XCS 5.x state that no alignment is needed, even if i
never found a technical explanation for that.
previous HP EVA versions did need it (maybe 64).
you might then want to consider how data is laid out on the storage, but
i believe the storage cache is enough to deal with that issue.

Please note that 0 is always well aligned.

Note to people who are now wondering WTH i am talking about.

consider a storage with 64 spt, an io size of 4k, and a partition starting
at sector 63.
the first io request will require two ios from the storage (one for
sector 63, and one for sectors 64 to 70)
the next 7 ios (71-78, 79-86, 87-94, 95-102, 103-110, 111-118, 119-126)
will be on the same track
the 8th will again need to be split, and so on.
this causes the storage to do 1 unnecessary io every 8. YMMV.
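
to avoid this, start the first partition on a track boundary, e.g.
(a sketch, assuming a 64 spt array; double check the figure for yours):

   echo '128,,L' | sfdisk -uS /dev/sdb

-uS makes sfdisk work in sectors, so the partition starts at sector 128,
a multiple of the track size.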

L.



Re: Raid-10 mount at startup always has problem

2007-10-29 Thread Luca Berra

On Sun, Oct 28, 2007 at 10:59:01PM -0700, Daniel L. Miller wrote:

Doug Ledford wrote:

Anyway, I happen to *like* the idea of using full disk devices, but the
reality is that the md subsystem doesn't have exclusive ownership of the
disks at all times, and without that it really needs to stake a claim on
the space instead of leaving things to chance IMO.
  
I've been re-reading this post numerous times - trying to ignore the 
burgeoning flame war :) - and this last sentence finally clicked with me.



I am sorry Daniel, when i read Doug and Bill stating that your issue
was not having a partition table, i immediately took the bait and forgot
about your original issue.
I have no reason to believe your problem is due to not having a
partition table on your devices.


sda: unknown partition table

sdb: unknown partition table

sdc: unknown partition table

sdd: unknown partition table

the above clearly shows that the kernel does not see a partition table
here. The opposite case, the kernel finding a partition table where there
is none, happens in some circumstances and is what bit Doug so hard.
Note, it does not happen at random: it should happen only if you use a
partitioned md device with a superblock at the end, or if you configure
it wrongly as Doug did. (i am not accusing Doug of being stupid at all,
it is a fairly common mistake to make and we should try to prevent it
in mdadm as much as we can)
Again, having the kernel find a partition table where there is none
should not pose a problem at all, unless there is some badly designed
software like udev/hal that believes it knows better than you about what
you have on your disks.
but _NEITHER OF THESE IS YOUR PROBLEM_ imho

I am also sorry to say that i fail to identify the source of your
problem; we should try harder instead of flaming at each other.

Is it possible to reproduce it on the live system?
e.g. unmount, stop the array, start it again and mount.
I bet it will work flawlessly in this case.
then i would disable starting this array at boot, and start it manually
when the system is up (stracing mdadm, so we can see what it does).
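something like this (device names and mountpoint are examples):

   umount /data
   mdadm --stop /dev/md0
   strace -f -o /tmp/mdadm.trace mdadm --assemble /dev/md0 /dev/sd[abcd]
   mount /dev/md0 /data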

I am also wondering about this:
md: md0: raid array is not clean -- starting background reconstruction
does your system shut down properly?
do you see the message about stopping md at the very end of the
reboot/halt process?

L.




Re: Time to deprecate old RAID formats?

2007-10-29 Thread Luca Berra

On Sun, Oct 28, 2007 at 01:47:55PM -0400, Doug Ledford wrote:

On Sun, 2007-10-28 at 15:13 +0100, Luca Berra wrote:

On Sat, Oct 27, 2007 at 08:26:00PM -0400, Doug Ledford wrote:
It was only because I wasn't using mdadm in the initrd and specifying
uuids that it found the right devices to start and ignored the whole
disk devices.  But, when I later made some more devices and went to
update the mdadm.conf file using mdadm -Eb, it found the devices and
added it to the mdadm.conf.  If I hadn't checked it before remaking my
initrd, it would have hosed the system.  And it would have passed all
the above is not clear to me, afair redhat initrd still uses
raidautorun,


RHEL does, but this is on a personal machine I installed Fedora on, and
latest Fedora has a mkinitrd that installs mdadm and mdadm.conf and
starts the needed devices using the UUID.  My first sentence above
should have read that I *was* using mdadm.

ah, ok i should look again at fedora's mkinitrd, last one i checked was
6.0.9-1 and i see mdadm was added in 6.0.9-2


 which iirc does not work with recent superblocks,
so did you use uuids on the kernel command line?
or do you use something else for initrd?
why would remaking the initrd break it?


Remaking the initrd installs the new mdadm.conf file, which would have
then contained the whole disk devices and their UUIDs.  Therein would
have been the problem.

yes, i read the patch, i don't like that code, as i don't like most of
what has been put in mkinitrd from 5.0 onward.
Imho the correct thing here would not have been copying the existing
mdadm.conf but generating a safe one from output of mdadm -D (note -D,
not -E)


the tests you can throw at it.  Quite simply, there is no way to tell
the difference between those two situations with 100% certainty.  Mdadm
tries to be smart and start the newest devices, but Luca's original
suggestion of skip the partition scanning in the kernel and figure it
out from user space would not have shown mdadm the new devices and would
have gotten it wrong every time.
yes, in this particular case it would have; congratulations, you found a
new creative way of shooting yourself in the foot.


Creative, not so much.  I just backed out of what I started and tried
something else.  Lots of people do that.


maybe mdadm should do checks when creating a device to prevent this kind
of mistake.
i.e.
if creating an array on a partition, check the whole device for a
superblock and refuse in case it finds one

if creating an array on a whole device that has a partition table,
either require --force, or check for superblocks in every possible
partition.


What happens if you add the partition table *after* you make the whole
disk device and there are stale superblocks in the partitions?  This
still isn't infallible.

It depends on what you do with that partitioned device *after* having
created the partition table.
- If you try again to run mdadm on it (and the above is implemented), it
would fail, and you would be given a chance to wipe the stale sb.
- If you don't, and use them as plain devices _and_ leave the line in
mdadm.conf, you will suffer a lot of pain. Since the problem is known, and
since fdisk/sfdisk/parted already do a lot of checks on the device, this
could be another useful one.
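
until then, the check can be done by hand (a sketch; /dev/sdb is just an
example):

   mdadm --examine /dev/sdb       # look for a stale whole-disk superblock
   mdadm --examine /dev/sdb1      # and for one inside the partition
   mdadm --zero-superblock /dev/sdb   # wipe the stale one, if any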

L.



Re: Raid-10 mount at startup always has problem

2007-10-29 Thread Luca Berra

On Mon, Oct 29, 2007 at 11:47:19AM -0400, Doug Ledford wrote:

On Mon, 2007-10-29 at 09:18 +0100, Luca Berra wrote:

On Sun, Oct 28, 2007 at 10:59:01PM -0700, Daniel L. Miller wrote:
Doug Ledford wrote:
Anyway, I happen to *like* the idea of using full disk devices, but the
reality is that the md subsystem doesn't have exclusive ownership of the
disks at all times, and without that it really needs to stake a claim on
the space instead of leaving things to chance IMO.
   
I've been re-reading this post numerous times - trying to ignore the 
burgeoning flame war :) - and this last sentence finally clicked with me.


I am sorry Daniel, when i read Doug and Bill, stating that your issue
was not having a partition table, i immediately took the bait and forgot
about your original issue.


I never said *his* issue was lack of partition table, I just said I
don't recommend that because it's flaky.  The last statement I made

maybe i misread you but Bill was quite clear.


about his issue was to ask about whether the problem was happening
during initrd time or sysinit time to try and identify if it was failing
before or after / was mounted to try and determine where the issue might
lay.  Then we got off on the tangent about partitions, and at the same
time Neil started asking about udev, at which point it came out that
he's running ubuntu, and as much as I would like to help, the fact of
the matter is that I've never touched ubuntu and wouldn't have the
faintest clue, so I let Neil handle it.  At which point he found that
the udev scripts in ubuntu are being stupid, and from the looks of it
are the cause of the problem.  So, I've considered the initial issue
root caused for a bit now.

It seems i made an idiot of myself by missing half of the thread, and i
even knew ubuntu was braindead in their use of udev at startup, since a
similar discussion came up on the lvm or the dm-devel mailing list (that
time iirc it was about lvm over multipath)


like udev/hal that believes it knows better than you about what you have
on your disks.
but _NEITHER OF THESE IS YOUR PROBLEM_ imho


Actually, it looks like udev *is* the problem, but not because of
partition tables.

you are right.

L.



Re: Time to deprecate old RAID formats?

2007-10-29 Thread Luca Berra

On Mon, Oct 29, 2007 at 11:30:53AM -0400, Doug Ledford wrote:

On Mon, 2007-10-29 at 09:41 +0100, Luca Berra wrote:


Remaking the initrd installs the new mdadm.conf file, which would have
then contained the whole disk devices and their UUIDs.  Therein would
have been the problem.
yes, i read the patch, i don't like that code, as i don't like most of
what has been put in mkinitrd from 5.0 onward.

in case you wonder i am referring to things like

emit dm create $1 $UUID $(/sbin/dmsetup table $1)


Imho the correct thing here would not have been copying the existing
mdadm.conf but generating a safe one from output of mdadm -D (note -D,
not -E)


I'm not sure I'd want that.  Besides, what makes you say -D is safer
than -E?


mdadm -D /dev/mdX works on an active md device, so i strongly doubt the
information gathered from there would be stale,
while mdadm -Es will scan disk devices for md superblocks, thus
possibly even finding stale superblocks or leftovers.
I would strongly recommend against blindly doing mdadm -Es >
/etc/mdadm.conf and not supervising the result.
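
i.e. prefer something like (a sketch):

   mdadm --detail --scan >> /etc/mdadm.conf

which only emits ARRAY lines for arrays that are actually active, and
review the result in any case.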



Re: Time to deprecate old RAID formats?

2007-10-29 Thread Luca Berra

On Mon, Oct 29, 2007 at 07:05:42PM -0400, Doug Ledford wrote:

And I agree -D has less chance of finding a stale superblock, but it's
also true that it has no chance of finding non-stale superblocks on

Well, it might be a matter of personal preference, but i would prefer
an initrd doing just the minimum necessary to mount the root filesystem
(and/or activating resume from a swap device), and leaving all the rest
to initscripts, rather than an initrd that tries to do everything.


devices that aren't even started.  So, as a method of getting all the
right information in the event of system failure and rescuecd boot, it
leaves something to be desired ;-)  In other words, I'd rather use a
mode that finds everything and lets me remove the stale than a mode that
might miss something.  But, that's a matter of personal choice.

In case of a rescuecd boot, you will probably not have any md devices
activated, and you will probably run mdadm -Es to check what md arrays
are available; the data should still be on the disk, else you would be
hosed anyway.
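
e.g. from the rescue shell (a sketch):

   mdadm -Es > /tmp/mdadm.conf     # scan the disks for superblocks
   mdadm -As -c /tmp/mdadm.conf    # assemble what was found

reviewing /tmp/mdadm.conf in between, of course.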

L.



Re: Raid-10 mount at startup always has problem

2007-10-28 Thread Luca Berra

On Sat, Oct 27, 2007 at 04:47:30PM -0400, Doug Ledford wrote:

On Sat, 2007-10-27 at 09:50 +0200, Luca Berra wrote:

On Fri, Oct 26, 2007 at 03:26:33PM -0400, Doug Ledford wrote:
On Fri, 2007-10-26 at 11:15 +0200, Luca Berra wrote:
 On Thu, Oct 25, 2007 at 02:40:06AM -0400, Doug Ledford wrote:
 The partition table is the single, (mostly) universally recognized
 arbiter of what possible data might be on the disk.  Having a partition
 table may not make mdadm recognize the md superblock any better, but it
 keeps all that other stuff from even trying to access data that it
 doesn't have a need to access and prevents random luck from turning your
 day bad.
 on a pc maybe, but that is 20 years old design.

So?  Unix is 35+ year old design, I suppose you want to switch to Vista
then?
unix is a 35+ year old design that evolved over time; some ideas were
kept, some ditched.


BSD disk labels are still in use, SunOS disk labels are still in use,

i am not a solaris expert, do they still use disk labels under vxvm?
oh, by the way, disklabels do not support the partition type attribute.


partition tables are somewhat on the way out, but only because they are
being replaced by the new EFI disk partitioning method.  The only place
where partitionless devices is common is in dedicated raid boxes where
the raid controller is the only thing that will *ever* see that disk.

well, i am more used to other os (HP-UX, AIX) where lvm is the common
means of accessing disk devices




by default fdisk misaligns partition tables
and aligning them is more complex than just doing without.


So.  You really need to take the time and to understand the alignment of
the device because then and only then can you pass options to mke2fs to

yes and i am not the only person in the world doing that.


Linux works properly with a partition table, so this is a specious
statement.
It should also work properly without one.


Most of the time it does.  But those times where it can fail, the
failure is due to not taking the precautions necessary to prevent it:
aka labeling disk usage via some sort of partition table/disklabel/etc.

I strongly disagree.
the failure is badly designed software.


Did you stick your mmc card in there during the install of the OS?

My laptop has a built-in mmc slot, so i sometimes leave a card plugged
in. But the mmc thing was just an example, it is not that critical.

i don't count myself as a moron; what i am trying to say is that
partition tables are one way of organizing disk space, not the only one.


Using whole disk devices isn't a means of organizing space.  It's a way
to get a rather miniscule amount of space back by *not* organizing the
space.

if i am using, say, lvm to organize disk space, a partition table is
unnecessary to the organization, and it is natural not to use one.


This whole argument seems to boil down to you wanting to perfectly
optimize your system for your use case which includes controlling the
environment enough that you know it's safe to not partition your disks,
where as I argue that although this works in controlled environments, it
is known to have failure modes in other environments, and I would be
totally remiss if I recommended to my customers that they should take
the risk that you can ignore because of your controlled environment
since I know a lot of my customers *don't* have a controlled environment
such as you do.


The whole argument to me boils down to the fact that not having a
partition table on a device is possible, and software that does not
consider this eventuality is flawed; recommending working around flawed
software is just burying your head in the sand.
But i believe i did not convince you one ounce more than you convinced
me, so i'll quit this thread, which has gone on too far.

Regards,
L.



Re: Time to deprecate old RAID formats?

2007-10-28 Thread Luca Berra

On Sat, Oct 27, 2007 at 04:09:03PM -0400, Doug Ledford wrote:

On Sat, 2007-10-27 at 10:00 +0200, Luca Berra wrote:

On Fri, Oct 26, 2007 at 02:52:59PM -0400, Doug Ledford wrote:
On Fri, 2007-10-26 at 11:54 +0200, Luca Berra wrote:
 On Sat, Oct 20, 2007 at 09:11:57AM -0400, Doug Ledford wrote:
 just apply some rules, so if you find a partition table _AND_ an md
 superblock at the end, read both and you can tell if it is an md on a
 partition or a partitioned md raid1 device.

In fact, no you can't.  I know, because I've created a device that had
both but wasn't a raid device.  And it's matching partner still existed
too.  What you are talking about would have misrecognized this
situation, guaranteed.
then just ignore the device and log a warning, instead of doing a random
choice.
L.


It also happened to be my OS drive pair.  Ignoring it would have
rendered the machine unusable.


I wonder what would have happened if it got it wrong



Re: Time to deprecate old RAID formats?

2007-10-28 Thread Luca Berra

On Sat, Oct 27, 2007 at 08:26:00PM -0400, Doug Ledford wrote:

On Sat, 2007-10-27 at 00:30 +0200, Gabor Gombas wrote:

On Fri, Oct 26, 2007 at 02:52:59PM -0400, Doug Ledford wrote:

 In fact, no you can't.  I know, because I've created a device that had
 both but wasn't a raid device.  And it's matching partner still existed
 too.  What you are talking about would have misrecognized this
 situation, guaranteed.

Maybe we need a 2.0 superblock that contains the physical size of every
component, not just the logical size that is used for RAID. That way if
the size read from the superblock does not match the size of the device,
you know that this device should be ignored.


In my case that wouldn't have helped.  What actually happened was I
create a two disk raid1 device using whole devices and a version 1.0
superblock.  I know a version 1.1 wouldn't work because it would be
where the boot sector needed to be, and wasn't sure if a 1.2 would work
either.  Then I tried to make the whole disk raid device a partitioned
device.  This obviously put a partition table right where the BIOS and
the kernel would look for it whether the raid was up or not.  I also

the only reason i can think of for the above setup not working is udev
mucking with your device too early.


tried doing an lvm setup to split the raid up into chunks and that
didn't work either.  So, then I redid the partition table and created
individual raid devices from the partitions.  But, I didn't think to
zero the old whole disk superblock.  When I made the individual raid
devices, I used all 1.1 superblocks.  So, when it was all said and done,
I had a bunch of partitions that looked like a valid set of partitions
for the whole disk raid device and a whole disk raid superblock, but I
also had superblocks in each partition with their own bitmaps and so on.

OK


It was only because I wasn't using mdadm in the initrd and specifying
uuids that it found the right devices to start and ignored the whole
disk devices.  But, when I later made some more devices and went to
update the mdadm.conf file using mdadm -Eb, it found the devices and
added it to the mdadm.conf.  If I hadn't checked it before remaking my
initrd, it would have hosed the system.  And it would have passed all

the above is not clear to me, afair redhat initrd still uses
raidautorun, which iirc does not work with recent superblocks,
so did you use uuids on the kernel command line?
or do you use something else for initrd?
why would remaking the initrd break it?


the tests you can throw at it.  Quite simply, there is no way to tell
the difference between those two situations with 100% certainty.  Mdadm
tries to be smart and start the newest devices, but Luca's original
suggestion of skip the partition scanning in the kernel and figure it
out from user space would not have shown mdadm the new devices and would
have gotten it wrong every time.

yes, in this particular case it would have; congratulations, you found a
new creative way of shooting yourself in the foot.

maybe mdadm should do checks when creating a device to prevent this kind
of mistake.
i.e.
if creating an array on a partition, check the whole device for a
superblock and refuse in case it finds one

if creating an array on a whole device that has a partition table,
either require --force, or check for superblocks in every possible
partition.

L.


Re: Raid-10 mount at startup always has problem

2007-10-27 Thread Luca Berra

On Fri, Oct 26, 2007 at 03:26:33PM -0400, Doug Ledford wrote:

On Fri, 2007-10-26 at 11:15 +0200, Luca Berra wrote:

On Thu, Oct 25, 2007 at 02:40:06AM -0400, Doug Ledford wrote:
The partition table is the single, (mostly) universally recognized
arbiter of what possible data might be on the disk.  Having a partition
table may not make mdadm recognize the md superblock any better, but it
keeps all that other stuff from even trying to access data that it
doesn't have a need to access and prevents random luck from turning your
day bad.
on a pc maybe, but that is 20 years old design.


So?  Unix is 35+ year old design, I suppose you want to switch to Vista
then?

unix is a 35+ year old design that evolved over time; some ideas were
kept, some ditched.


partition table design is limited because it is still based on C/H/S,
which do not exist anymore.
Put a partition table on a big storage, say a DMX, and enjoy a 20%
performance decrease.


Because you didn't stripe align the partition, your bad.

:)
by default fdisk misaligns partition tables
and aligning them is more complex than just doing without.


Oh, and let's not go into what can happen if you're talking about a dual
boot machine and what Windows might do to the disk if it doesn't think
the disk space is already spoken for by a linux partition.
Why the hell should the existence of windows limit the possibility of
linux working properly.


Linux works properly with a partition table, so this is a specious
statement.

It should also work properly without one.


If i have a pc that dualboots windows i will take care of using the
common denominator of a partition table, if it is my big server i will
probably not. since it won't boot anything else than Linux.


Doesn't really gain you anything, but your choice.  Besides, the
question wasn't why shouldn't Luca Berra use whole disk devices, it
was why I don't recommend using whole disk devices, and my
recommendation wasn't based in the least bit upon a single person's use
scenario.

If i am the only person in the world that believes partition tables
should not be required then i'll shut up.


On the opposite, i once inserted an mmc memory card, which had been
initialized on my mobile phone, into the mmc slot of my laptop, and was
faced with a load of error about mmcblk0 having an invalid partition
table.


So?  The messages are just informative, feel free to ignore them.

but did not anaconda propose to wipe unpartitioned disks?


The phone dictates the format, only a moron would say otherwise.  But,
then again, the phone doesn't care about interoperability and many other
issues on memory cards that it thinks it owns, so only a moron would
argue that because a phone doesn't use a partition table that nothing
else in the computer realm needs to either.

i don't count myself as a moron; what i am trying to say is that
partition tables are one way of organizing disk space, not the only one.


Anyway, I happen to *like* the idea of using full disk devices, but the
reality is that the md subsystem doesn't have exclusive ownership of the
disks at all times, and without that it really needs to stake a claim on
the space instead of leaving things to chance IMO.
Start removing the partition detection code from the blasted kernel and
move it to userspace, which is already in place, but it is not the
default.


Which just moves where the work is done, not what work needs to be done.

and it also permits deciding whether it has to be done at all.

It's a change for no benefit and a waste of time.

the waste of time was having to put code in mdadm to undo partition
detection on component devices, where partition detection should not
have taken place.





Re: Raid-10 mount at startup always has problem

2007-10-27 Thread Luca Berra

On Fri, Oct 26, 2007 at 06:53:40PM +0200, Gabor Gombas wrote:

On Fri, Oct 26, 2007 at 11:15:13AM +0200, Luca Berra wrote:


on a pc maybe, but that is 20 years old design.
partition table design is limited because it is still based on C/H/S,
which do not exist anymore.


The MS-DOS format is not the only possible partition table layout. Other
formats such as GPT do not have such limitations.


Put a partition table on a big storage, say a DMX, and enjoy a 20%
performance decrease.


I assume your big storage uses some kind of RAID. Are your partitions
stripe-aligned? (Btw. that has nothing to do with partitions, LVM can
also suffer if PEs are not aligned).

mine are, unfortunately the default is to start them at 32256 bytes into
the device.


Oh, and let's not go into what can happen if you're talking about a dual
boot machine and what Windows might do to the disk if it doesn't think
the disk space is already spoken for by a linux partition.

Why the hell should the existence of windows limit the possibility of
linux working properly.

what i am saying is that a dual boot machine is not the only scenario we
have.


On the opposite, i once inserted an mmc memory card, which had been
initialized on my mobile phone, into the mmc slot of my laptop, and was
faced with a load of error about mmcblk0 having an invalid partition
table. Obviously it had none, it was a plain fat filesystem.
Is the solution partitioning it? I don't think the phone would
agree.


Well, it said it could not find a valid partition table. That was the
truth. Why is it a problem if the kernel states a fact?

it is random. reformatting it made the kernel message go away.
i wonder if by chance something would decide it is a valid partition
table



Re: Time to deprecate old RAID formats?

2007-10-27 Thread Luca Berra

On Fri, Oct 26, 2007 at 02:52:59PM -0400, Doug Ledford wrote:

On Fri, 2007-10-26 at 11:54 +0200, Luca Berra wrote:

On Sat, Oct 20, 2007 at 09:11:57AM -0400, Doug Ledford wrote:
just apply some rules, so if you find a partition table _AND_ an md
superblock at the end, read both and you can tell if it is an md on a
partition or a partitioned md raid1 device.


In fact, no you can't.  I know, because I've created a device that had
both but wasn't a raid device.  And it's matching partner still existed
too.  What you are talking about would have misrecognized this
situation, guaranteed.

then just ignore the device and log a warning, instead of doing a random
choice.
L.



Re: Time to deprecate old RAID formats?

2007-10-27 Thread Luca Berra

On Fri, Oct 26, 2007 at 07:06:46PM +0200, Gabor Gombas wrote:

On Fri, Oct 26, 2007 at 06:22:27PM +0200, Gabor Gombas wrote:


You got the ordering wrong. You should get userspace support ready and
accepted _first_, and then you can start the
flamew^H^H^H^H^H^Hdiscussion to make the in-kernel partitioning code
configurable.

sorry, i did not intend to start a flamewar.


Oh wait that is possible even today. So you can build your own kernel
without any partition table format support - problem solved.

yes, i can build my own, i just thought it could be useful for someone
other than myself. maybe even Doug's enterprise customers

L.



Re: Time to deprecate old RAID formats?

2007-10-27 Thread Luca Berra

On Sat, Oct 27, 2007 at 12:20:12AM +0200, Gabor Gombas wrote:

On Fri, Oct 26, 2007 at 02:41:56PM -0400, Doug Ledford wrote:


* When using lilo to boot from a raid device, it automatically installs
itself to the mbr, not to the partition.  This can not be changed.  Only
0.90 and 1.0 superblock types are supported because lilo doesn't
understand the offset to the beginning of the fs otherwise.


Huh? I have several machines that boot with LILO and the root is on
RAID1. All install LILO to the boot sector of the mdX device (having
boot=/dev/mdX in lilo.conf), while the MBR is installed by
install-mbr. Since install-mbr has its own prompt that is displayed
before LILO's prompt on boot, I can be pretty sure that LILO did not
write anything to the MBR...


the behaviour is documented in the lilo man page, under the
raid-extra-boot option.




Re: Raid-10 mount at startup always has problem

2007-10-26 Thread Luca Berra

On Thu, Oct 25, 2007 at 02:40:06AM -0400, Doug Ledford wrote:

partition table (something that the Fedora/RHEL installers do to all
disks without partition tables...well, the installer tells you there's
no partition table and asks if you want to initialize it, but if someone
is in a hurry and hits yes when they meant no, bye bye data).

Cool feature



The partition table is the single, (mostly) universally recognized
arbiter of what possible data might be on the disk.  Having a partition
table may not make mdadm recognize the md superblock any better, but it
keeps all that other stuff from even trying to access data that it
doesn't have a need to access and prevents random luck from turning your
day bad.

on a pc maybe, but that is a 20 year old design.
partition table design is limited because it is still based on C/H/S,
which do not exist anymore.
Put a partition table on a big storage, say a DMX, and enjoy a 20%
performance decrease.


Oh, and let's not go into what can happen if you're talking about a dual
boot machine and what Windows might do to the disk if it doesn't think
the disk space is already spoken for by a linux partition.

Why the hell should the existence of windows limit the possibility of
linux working properly.
If i have a pc that dualboots windows i will take care of using the
common denominator of a partition table, if it is my big server i will
probably not. since it won't boot anything else than Linux.


And, in particular with mdadm, I once created a full disk md raid array
on a couple disks, then couldn't get things arranged like I wanted, so I
just partitioned the disks and then created new arrays in the partitions
(without first manually zeroing the superblock for the whole disk
array).  Since I used a version 1.0 superblock on the whole disk array,
and then used version 1.1 superblocks in the partitions, the net result
was that when I ran mdadm -Eb, mdadm would find both the 1.1 and 1.0
superblocks in the last partition on the disk.  Confused both myself and
mdadm for a while.

yes, this is fun.
On the contrary, i once inserted an mmc memory card, which had been
initialized on my mobile phone, into the mmc slot of my laptop, and was
faced with a load of errors about mmcblk0 having an invalid partition
table. Obviously it had none, it was a plain fat filesystem.
Is the solution partitioning it? I don't think the phone would
agree.


Anyway, I happen to *like* the idea of using full disk devices, but the
reality is that the md subsystem doesn't have exclusive ownership of the
disks at all times, and without that it really needs to stake a claim on
the space instead of leaving things to chance IMO.

Start removing the partition detection code from the blasted kernel and
move it to userspace, which is already in place, but it is not the
default.





Re: Time to deprecate old RAID formats?

2007-10-26 Thread Luca Berra

On Sat, Oct 20, 2007 at 09:11:57AM -0400, Doug Ledford wrote:

On Sat, 2007-10-20 at 09:53 +0200, Iustin Pop wrote:


Honestly, I don't see how a properly configured system would start
looking at the physical device by mistake. I suppose it's possible, but
I didn't have this issue.


Mount by label support scans all devices in /proc/partitions looking for
the filesystem superblock that has the label you are trying to mount.

it could probably be smarter, but in any case there is no point in
mounting an md device by label.

LVM (unless told not to) scans all devices in /proc/partitions looking

yes, but lvm, unless told otherwise, will ignore devices having a valid
md superblock.

for valid LVM superblocks.  In fact, you can't build a linux system that
is resilient to device name changes without doing that.

i dislike labels, especially for devices that contain the os. we should
take great care that these are identified correctly, and
mount-by-label does not (usb drives that migrate from one system to
another are so common that you can't ignore them)

you forgot udev ;)

but the fix is easy.
remove the partition detection code from the kernel and start working on
a smart userspace replacement for device detection. we already have
vol_id from udev and blkid from e2fsprogs, which support detection of
many device formats.
just apply some rules, so if you find a partition table _AND_ an md
superblock at the end, read both and you can tell if it is an md on a
partition or a partitioned md raid1 device.


And you can with superblock at the front.  You can create a new single
disk raid1 over the existing superblock or you can munge the partition
table to have it point at the start of your data.  There are options,

Please don't do that;
use device-mapper to set the device up, without mucking with partition
tables.
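
e.g. with a linear target (a sketch, untested; the 2048-sector data
offset is just an example, use the one your superblock version implies):

   SZ=$(blockdev --getsz /dev/sda)   # device size in 512-byte sectors
   echo "0 $((SZ - 2048)) linear /dev/sda 2048" | dmsetup create sda_data

this exposes the data area past the superblock as /dev/mapper/sda_data,
leaving the partition table alone.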

L.




Re: MD RAID1 performance very different from non-RAID partition

2007-09-18 Thread Luca Berra

On Mon, Sep 17, 2007 at 10:58:11AM -0500, Jordan Russell wrote:

Goswin von Brederlow wrote:

Jordan Russell [EMAIL PROTECTED] writes:

It's an ext3 partition, so I guess that doesn't apply?

I tried remounting /dev/sda2 with the barrier=0 option (which I assume
disables barriers, looking at the source), though, just to see if it
would make any difference, but it didn't; the database build still took
31 minutes.


Compare the read ahead settings.


I'm not sure what you mean. There's a per-mount read ahead setting?



per device

compare
blockdev --getra /dev/sda2
and 
blockdev --getra /dev/md0


L.


Re: raid1 resync data direction defined?

2007-07-30 Thread Luca Berra

On Fri, Jul 27, 2007 at 03:07:13PM +0200, Frank van Maarseveen wrote:

I'm experimenting with a live migration of /dev/sda1 using mdadm -B
and network block device as in:

mdadm -B -ayes -n2 -l1 /dev/md1 /dev/sda1 \
--write-mostly -b /tmp/bitm$$ --write-behind /dev/nbd1

not a good idea


/dev/sda1 is to be migrated. During the migration the local system
mounts from /dev/md1 instead. Stracing shows that data flows to the
remote side. But when I do
echo repair > /sys/block/md1/md/sync_action

then the data flows in the other direction: the local disk is written
using data read from the remote side.

I believe stracing nbd will give you a partial view of what happens.
anyway, in the first case, since the second device is write-mostly, all
data is read from local and changes are written to remote.
In the second one the data is read from both sides to be compared; that
is what you are seeing on strace. i am unsure as to which copy is
considered correct, since md does not have info about that.


If that would happen in the first command then it would destroy all

yes

data instead of migrating it so I wonder if this behavior is defined:

no

Do mdadm --build and mdadm --create always use the first component device
on the command-line as the source for raid1 resync?

no

if you are doing a migration, build the initial array with the second
device as missing,
then hot-add it and it will resync correctly,
i.e.
mdadm -B -ayes -n2 -l1 /dev/md1 /dev/sda1 \
   --write-mostly -b /tmp/bitm$$ --write-behind missing
mdadm -a /dev/md1 /dev/nbd1
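
then you can watch the resync copy the local data to the remote side:

   watch -n5 cat /proc/mdstat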




Re: s2disk and raid

2007-04-06 Thread Luca Berra

On Wed, Apr 04, 2007 at 03:20:56PM +1000, Neil Brown wrote:

The trick is to use the 'start_ro' module parameter.
 echo 1 > /sys/module/md_mod/parameters/start_ro

Then md will start arrays assuming read-only.  No resync will be
started, no superblock will be written.  They stay this way until the
first write at which point they become normal read-write and any
required resync starts.


uh, i thought a read-only array was supposed to remain read-only, and
that write attempts would fail.
My bad for not testing my assumptions.

L.



Re: trouble creating array

2007-02-27 Thread Luca Berra

On Mon, Feb 26, 2007 at 03:40:32AM -0800, jahammonds prost wrote:

Ah ha

# ls -l /sys/block/*/holders/*
lrwxrwxrwx 1 root root 0 Feb 26 06:28 /sys/block/sdb/holders/dm-0 -> ../../../block/dm-0
lrwxrwxrwx 1 root root 0 Feb 26 06:28 /sys/block/sdc/holders/dm-0 -> ../../../block/dm-0

which I am assuming is dmraid? I did a quick check, and


no, it is device-mapper


# dmraid -r
No RAID disks


use dmsetup ls
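
and if the mapping turns out to be a leftover, it can be removed so md
can claim the disks (the mapping name comes from the dmsetup ls output):

   dmsetup ls
   dmsetup remove <name>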

L.



Re: md device on Redhat Linux 3

2007-01-31 Thread Luca Berra

On Wed, Jan 31, 2007 at 11:52:17AM +1100, Neil Brown wrote:

On Tuesday January 30, [EMAIL PROTECTED] wrote:

Hi there,

I am currently building a site using the following

2 * IBM DS4300 (SANs)
2 * IBM X346 (intel systems)
2 * HBA on each node

I am using the md device driver to tie the two SANs together and use them
in a mirrored environment. So the layout is 3 file systems sitting under
an LVM volume group sitting under a mirrored SAN (/dev/sdb, /dev/sdc) on
2 Gb fibre using the md driver.

The problem: when we write to one of the SANs while the mirror is broken,
then take this one offline and bring the other SAN online, switch on the
RAID array, varyon the volume group, and finally mount the file system;
then dismount the file system, switch off both the volume group and the
RAID array, and then reboot the server; it finds the most recently
updated disk to be the one written to, not the one last mounted.


It's not really clear to me what you are trying to do here; maybe it
would help if you explained your motivations and expectations.

Usually, if you get in the unpleasant situation of having two different
versions of your data, you _must_ ensure that automatic resynchronization
does not happen at all. (you might need to manually copy data from one
storage to the other)
maybe mdadm could help, by having some options to control startup of
degraded or unclean arrays.
At the moment the best option is to have a script that checks
availability of device nodes before starting the array, and refuses to
start if the array would be degraded.
In case you need to force a copy to start you can use mdadm --create
with the missing option to forcibly kick the other storage from the
array.
Also a nice add-on to mdadm would be a command to increase the event
counter of the remaining devices of a degraded array, to ensure those
will be considered up to date at the next restart.
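
a sketch of the availability-check script mentioned above (untested;
device names are examples):

   #!/bin/sh
   # refuse to assemble if any member device node is missing
   for d in /dev/sdb /dev/sdc; do
       [ -b "$d" ] || { echo "$d missing, not starting degraded" >&2; exit 1; }
   done
   mdadm --assemble /dev/md0 /dev/sdb /dev/sdc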

L.



Re: problem during boot with linux software raid 5

2006-10-14 Thread Luca Berra

On Fri, Oct 13, 2006 at 06:28:21PM -0400, jeff wrote:

I am running mandrake 2006 linux.

s/mandrake/mandriva


uname -a reports
Linux dual_933 2.6.12-12mdksmp #1 SMP Fri Sep 9 17:43:23 CEST 2005 i686
Pentium III (Coppermine) unknown GNU/Linux

not strictly related to your problem, but you should really consider
applying updates from your distribution.

snip


When I rebooted, the reboot hung. I think for some reason it didn't
automatically start the md0 device, and as a result it couldn't mount
the /dev/md0 partition in my /etc/fstab. I went into single-user mode,
and commented out the /dev/md0 line in /etc/fstab, and I was able to
boot. Then I executed the mdadm --create line, uncommented /etc/fstab,
and I was able to access my data.

the command to activate an already existing raid set is mdadm
--assemble, not mdadm --create
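
i.e. (device names are examples):

   mdadm --assemble /dev/md0 /dev/sdb /dev/sdc

re-running mdadm --create rewrites the superblocks, and should be kept
for disaster recovery only.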


I was reading some documentation, and it said that you can use mdadm on
either partitions or on a device (as I did). When you have partitions, I
read that you should set the partition type to 0xFD so they get
autodetected during boot. I can't do this, as I don't have partitions.

this is junk documentation, do not believe it :=)

mandriva boot process uses mdadm to assemble raid devices at boot time,
but you need to tell mdadm which arrays it should find at boot by
editing /etc/mdadm.conf; just run the following code snippet:

#!/bin/sh
grep -qs '^[[:space:]]*DEVICE' /etc/mdadm.conf || \
   echo 'DEVICE partitions' >> /etc/mdadm.conf

mdadm -Esc partitions | awk '
   /^ARRAY[[:space:]]/ {
   print $0, "auto=yes"
   }
' >> /etc/mdadm.conf




Re: ANNOUNCE: mdadm 2.5.4 - A tool for managing Soft RAID under Linux

2006-10-13 Thread Luca Berra

On Fri, Oct 13, 2006 at 10:15:35AM +1000, Neil Brown wrote:


I am pleased to announce the availability of
  mdadm version 2.5.4


it looks like you did not include the patches i posted against 2.5.3



* Sat Aug 19 2006 Luca Berra [EMAIL PROTECTED]
- do not fail when autoassembling everything and some md are already active

--- mdadm-2.5.3/mdadm.c	2006-08-19 17:00:51.0 +0200
+++ mdadm-2.5.3/mdadm.c	2006-08-19 16:30:16.0 +0200
@@ -1020,7 +1020,7 @@
 			}
 			if (ioctl(mdfd, GET_ARRAY_INFO, &array)>=0)
 				/* already assembled, skip */
-				;
+				cnt++;
 			else {
 				rv |= Assemble(ss, array_list->devname, mdfd,
 					       array_list,
--- mdadm-2.5.3/mdopen.c	2006-06-26 07:11:00.0 +0200
+++ mdadm-2.5.3/mdopen.c	2006-08-19 17:03:24.0 +0200
@@ -166,10 +166,7 @@
 	}
 	if (ioctl(mdfd, GET_ARRAY_INFO, &array)==0) {
 		/* already active */
-		close(mdfd);
-		fprintf(stderr, Name ": %s is already active.\n",
-			dev);
-		return -1;
+		return mdfd;
 	} else {
 		if (major != MD_MAJOR && parts > 0)
 			make_parts(dev, parts);
--- mdadm-2.5.3/mdassemble.c.close	2006-06-26 07:11:00.0 +0200
+++ mdadm-2.5.3/mdassemble.c	2006-09-13 17:23:15.0 +0200
@@ -91,13 +91,14 @@
 			rv |= 1;
 			continue;
 		}
-		if (ioctl(mdfd, GET_ARRAY_INFO, &array)>=0)
-			/* already assembled, skip */
-			continue;
-		rv |= Assemble(array_list->st, array_list->devname, mdfd,
-			       array_list,
-			       NULL, NULL,
+		if (ioctl(mdfd, GET_ARRAY_INFO, &array) < 0) {
+			rv |= Assemble(array_list->st, array_list->devname, mdfd,
+				       array_list, NULL, NULL,
 			       readonly, runstop, NULL, NULL, verbose, force);
+		} else {
+			rv |= Manage_ro(array_list->devname, mdfd, -1); /* make it readwrite */
+		}
+		close(mdfd);
 	}
 	return rv;
 }
--- mdadm-2.5.3/Makefile.close	2006-06-20 02:01:17.0 +0200
+++ mdadm-2.5.3/Makefile	2006-09-13 17:54:36.0 +0200
@@ -76,7 +76,7 @@
 STATICSRC = pwgr.c
 STATICOBJS = pwgr.o
 
-ASSEMBLE_SRCS := mdassemble.c Assemble.c config.c dlink.c util.c super0.c super1.c sha1.c
+ASSEMBLE_SRCS := mdassemble.c Assemble.c Manage.c config.c dlink.c util.c super0.c super1.c sha1.c
 ASSEMBLE_FLAGS:= $(CFLAGS) -DMDASSEMBLE
 ifdef MDASSEMBLE_AUTO
 ASSEMBLE_SRCS += mdopen.c mdstat.c
--- mdadm-2.5.3/Manage.c.close	2006-06-26 04:26:07.0 +0200
+++ mdadm-2.5.3/Manage.c	2006-09-13 17:25:31.0 +0200
@@ -72,6 +72,8 @@
 	return 0;
 }
 
+#ifndef MDASSEMBLE
+
 int Manage_runstop(char *devname, int fd, int runstop, int quiet)
 {
 	/* Run or stop the array. array must already be configured
@@ -393,3 +395,5 @@
 	return 0;
 
 }
+
+#endif /* MDASSEMBLE */
--- mdadm-2.5.3/util.c.close	2006-09-13 17:29:19.0 +0200
+++ mdadm-2.5.3/util.c	2006-09-13 18:08:56.0 +0200
@@ -189,6 +189,7 @@
 	}
 }
 
+#ifndef MDASSEMBLE
 int check_ext2(int fd, char *name)
 {
 	/*
@@ -286,6 +287,7 @@
 	fprintf(stderr, Name ": assuming 'no'\n");
 	return 0;
 }
+#endif /* MDASSEMBLE */
 
 char *map_num(mapping_t *map, int num)
 {
@@ -307,7 +309,6 @@
 	return UnSet;
 }
 
-
 int is_standard(char *dev, int *nump)
 {
 	/* tests if dev is a standard md dev name.
@@ -482,6 +483,7 @@
 	return csum;
 }
 
+#ifndef MDASSEMBLE
 char *human_size(long long bytes)
 {
 	static char buf[30];
@@ -534,7 +536,9 @@
 	);
 	return buf;
 }
+#endif /* MDASSEMBLE */
 
+#if !defined(MDASSEMBLE) || defined(MDASSEMBLE) && defined(MDASSEMBLE_AUTO)
 int get_mdp_major(void)
 {
 static int mdp_major = -1;
@@ -618,6 +622,7 @@
 	if (strncmp(name, "/dev/.tmp.md", 12)==0)
 		unlink(name);
 }
+#endif /* !defined(MDASSEMBLE) || defined(MDASSEMBLE) && defined(MDASSEMBLE_AUTO) */

Re: [PATCH] missing close in mdassemble

2006-09-15 Thread Luca Berra

On Wed, Sep 13, 2006 at 04:57:43PM +0200, Luca Berra wrote:

attached, please apply
without this mdassemble cannot activate stacked arrays, i wonder how i
managed to miss it :(


Another patch, which obsoletes the previous one:
this will make mdassemble, if run a second time, try to make
arrays read-write. Useful if one starts arrays read-only as described in
README.initramfs, after resume fails.

L.


--- mdadm-2.5.3/mdassemble.c.close  2006-06-26 07:11:00.0 +0200
+++ mdadm-2.5.3/mdassemble.c2006-09-13 17:23:15.0 +0200
@@ -91,13 +91,14 @@
rv |= 1;
continue;
}
-   if (ioctl(mdfd, GET_ARRAY_INFO, array)=0)
-   /* already assembled, skip */
-   continue;
-   rv |= Assemble(array_list-st, array_list-devname, 
mdfd,
-  array_list,
-  NULL, NULL,
+   if (ioctl(mdfd, GET_ARRAY_INFO, array)  0) {
+   rv |= Assemble(array_list-st, 
array_list-devname, mdfd,
+  array_list, NULL, NULL,
   readonly, runstop, NULL, NULL, 
verbose, force);
+   } else {
+   rv |= Manage_ro(array_list-devname, mdfd, -1); 
/* make it readwrite */
+   }
+   close(mdfd);
}
return rv;
 }
--- mdadm-2.5.3/Makefile.close  2006-06-20 02:01:17.0 +0200
+++ mdadm-2.5.3/Makefile2006-09-13 17:54:36.0 +0200
@@ -76,7 +76,7 @@
 STATICSRC = pwgr.c
 STATICOBJS = pwgr.o
 
-ASSEMBLE_SRCS := mdassemble.c Assemble.c config.c dlink.c util.c super0.c 
super1.c sha1.c
+ASSEMBLE_SRCS := mdassemble.c Assemble.c Manage.c config.c dlink.c util.c 
super0.c super1.c sha1.c
 ASSEMBLE_FLAGS:= $(CFLAGS) -DMDASSEMBLE
 ifdef MDASSEMBLE_AUTO
 ASSEMBLE_SRCS += mdopen.c mdstat.c
--- mdadm-2.5.3/Manage.c.close  2006-06-26 04:26:07.0 +0200
+++ mdadm-2.5.3/Manage.c2006-09-13 17:25:31.0 +0200
@@ -72,6 +72,8 @@
return 0;   
 }
 
+#ifndef MDASSEMBLE
+
 int Manage_runstop(char *devname, int fd, int runstop, int quiet)
 {
/* Run or stop the array. array must already be configured
@@ -393,3 +395,5 @@
return 0;

 }
+
+#endif /* MDASSEMBLE */
--- mdadm-2.5.3/util.c.close2006-09-13 17:29:19.0 +0200
+++ mdadm-2.5.3/util.c  2006-09-13 18:08:56.0 +0200
@@ -189,6 +189,7 @@
}
 }
 
+#ifndef MDASSEMBLE
 int check_ext2(int fd, char *name)
 {
/*
@@ -286,6 +287,7 @@
	fprintf(stderr, Name ": assuming 'no'\n");
return 0;
 }
+#endif /* MDASSEMBLE */
 
 char *map_num(mapping_t *map, int num)
 {
@@ -307,7 +309,6 @@
return UnSet;
 }
 
-
 int is_standard(char *dev, int *nump)
 {
/* tests if dev is a standard md dev name.
@@ -482,6 +483,7 @@
return csum;
 }
 
+#ifndef MDASSEMBLE
 char *human_size(long long bytes)
 {
static char buf[30];
@@ -534,7 +536,9 @@
);
return buf;
 }
+#endif /* MDASSEMBLE */
 
+#if !defined(MDASSEMBLE) || defined(MDASSEMBLE) && defined(MDASSEMBLE_AUTO)
 int get_mdp_major(void)
 {
 static int mdp_major = -1;
@@ -618,6 +622,7 @@
	if (strncmp(name, "/dev/.tmp.md", 12)==0)
unlink(name);
 }
+#endif /* !defined(MDASSEMBLE) || defined(MDASSEMBLE) && defined(MDASSEMBLE_AUTO) */
 
 int dev_open(char *dev, int flags)
 {
--- mdadm-2.5.3/mdassemble.8.close  2006-08-07 03:33:56.0 +0200
+++ mdadm-2.5.3/mdassemble.82006-09-13 18:25:41.0 +0200
@@ -25,6 +25,13 @@
 .B mdassemble
 has the same effect as invoking
 .B mdadm --assemble --scan.
+.PP
+Invoking
+.B mdassemble
+a second time will make all defined arrays read-write; this is useful if
+using the
+.B start_ro
+module parameter.
 
 .SH OPTIONS
 
@@ -54,6 +61,5 @@
 .PP
 .BR mdadm (8),
 .BR mdadm.conf (5),
-.BR md (4).
-.PP
+.BR md (4),
 .BR diet (1).


Re: RAID5 producing fake partition table on single drive

2006-09-15 Thread Luca Berra

On Fri, Sep 15, 2006 at 05:51:12PM +1000, Lem wrote:

On Thu, 2006-09-14 at 18:42 -0400, Bill Davidsen wrote:

Lem wrote:
On Mon, 2006-09-04 at 13:55 -0400, Bill Davidsen wrote:
May I belatedly say that this is sort-of a kernel issue, since 
/proc/partitions reflects invalid data? Perhaps a boot option like 
nopart=sda,sdb or similar would be in order?




My suggestion was to Neil or other kernel maintainers. If they agree 
that this is worth fixing, the option could be added in the kernel. It 
isn't there now, I was soliciting responses on whether this was desirable.


My mistake, sorry. It sounds like a nice idea, and would work well in
cases where the RAID devices are always assigned the same device names
(sda, sdb, sdc etc), which I'd expect to be the case quite frequently.


that is the issue, quite frequently != always

Unfortunately I see no way to avoid data in the partition table 
location, which looks like a partition table, from being used.


Perhaps an alternative would be to convert an array with
non-partition-based devices to partition-based devices, though I
remember Neil saying this would involve relocating all of the data on
the entire array (perhaps could be done through some funky resync
option?).


sorry, i do not agree
ms-dos partitions are a bad idea, and one i would really love to leave
behind.

what i'd do is move the partition detect code to userspace where it
belongs, together with lvm, md, dmraid, multipath and evms


so what userspace would do is:
check if any wholedisk is one of the above mentioned types
or if it is partitionable.

I believe the order would be something like:
dmraid or multipath
evms (*)
md
lvm
partition table (partx or kpartx)
md
lvm

(*) evms should handle all cases by itself

after each check the device list for the next check should be
recalculated removing devices handled and adding new devices just
created.

this is too much to be done in kernel space, but it can be done easily
in initramfs or initscript. just say Y to CONFIG_PARTITION_ADVANCED
and N to all other CONFIG_?_PARTITION
and code something in userspace.

L.

P.S. the op can simply use partx to remove partition tables from the
components of the md array just after assembling.
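for instance (a sketch; device names are placeholders):

   mdadm -A /dev/md0 /dev/sda /dev/sdb
   partx -d /dev/sda    # drop the bogus partition mappings again
   partx -d /dev/sdb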

L.





[PATCH] missing close in mdassemble

2006-09-13 Thread Luca Berra

attached, please apply
without this mdassemble cannot activate stacked arrays, i wonder how i
managed to miss it :(

L.

--- mdadm-2.5.3/mdassemble.c.close  2006-09-13 12:28:00.0 +0200
+++ mdadm-2.5.3/mdassemble.c2006-09-13 12:30:24.0 +0200
@@ -91,13 +91,12 @@
rv |= 1;
continue;
}
-		if (ioctl(mdfd, GET_ARRAY_INFO, &array) >= 0)
-			/* already assembled, skip */
-			continue;
+		if (ioctl(mdfd, GET_ARRAY_INFO, &array) < 0)
 			rv |= Assemble(array_list->st, array_list->devname, mdfd,
 				       array_list,
 				       NULL, NULL,
 				       readonly, runstop, NULL, NULL, verbose, force);
+		close(mdfd);
}
return rv;
 }


Re: RAID5 fill up?

2006-09-08 Thread Luca Berra

On Fri, Sep 08, 2006 at 02:26:31PM +0200, Lars Schimmer wrote:

-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

Michael Tokarev wrote:

Lars Schimmer wrote:

Hi!

I've got a software RAID5 with 6 250GB HDs.
Now I changed one disk after another to a 400GB HD and resynced the
raid5 after each change.
Now the RAID5 has got 6 400GB HDs and still uses only 6*250GB space.
How can I grow the md0 device to use 6*400GB?


mdadm --grow is your friend.


Oh, damn, right. I was focussed on --grow to add a new HD to the RAID
But isn't there a switch to grow to max possible value?
Do I always have to search for the biggest value and type it in by hand?

man mdadm

  -z, --size=

 This value can be set with --grow for RAID level 1/4/5/6. If the
 array  was created with a size smaller than the currently active
 drives, the extra space can be accessed using --grow.  The  size
 can  be given as max which means to choose the largest size that
 fits on all current drives.
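i.e. (a sketch, assuming the array is /dev/md0):

   mdadm --grow /dev/md0 --size=max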




Re: Care and feeding of RAID?

2006-09-06 Thread Luca Berra

On Tue, Sep 05, 2006 at 05:47:57PM -0400, Steve Cousins wrote:



Luca Berra wrote:


On Tue, Sep 05, 2006 at 02:29:48PM -0400, Steve Cousins wrote:




Benjamin Schieder wrote:


On 05.09.2006 11:03:45, Steve Cousins wrote:

Would people be willing to list their setup? Including such things 
as mdadm.conf file, crontab -l, plus scripts that they use to check 
the smart data and the array, mdadm daemon parameters and anything 
else that is relevant to checking and maintaining an array? 




Personally, I use this script from cron:
http://shellscripts.org/project/hdtest



nice race :)


I'm not sure what you mean?


tmp=`mktemp`
rm -f ${tmp}
touch ${tmp}

the last two lines are unneeded and can be tricked to overwrite
arbitrary filenames
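a safer shape for that fragment would be something like (a sketch):

   tmp=$(mktemp) || exit 1     # mktemp already creates the file safely
   trap 'rm -f "$tmp"' EXIT    # clean up on exit instead of pre-creating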


I tried smartctl -t short -d scsi /dev/sdb where /dev/sdb is a 250GB 
SATA drive.


it is '-d ata'

What command do you use for SATA drives?  The sourceforge page implies 
that -d sata doesn't exist yet.  I'm using FC 5 with 2.6.17 kernel and 
smartmontools version 5.33.  Do you have a sample configuration script 
that you could show me?


# monitor two sata disks, show temperature in degrees,
# do a long test every sunday and a short every other day
# at 1am on sda and at 2am on sdb, YMMV
/dev/sda -d ata -a -R 194 -s (L/../../7|S/../../[123456])/01
/dev/sdb -d ata -a -R 194 -s (L/../../7|S/../../[123456])/02



Re: Care and feeding of RAID?

2006-09-06 Thread Luca Berra

On Wed, Sep 06, 2006 at 09:12:24AM +0200, Benjamin Schieder wrote:

Personally, I use this script from cron:
http://shellscripts.org/project/hdtest

nice race :)


As in race condition? Where?


mktemp
rm
touch
why do you do that?


I'm running smartmontools 5.33 here. When did the output change? It still
works fine here.


i retested now with 5.36 and it seems the output did _not_ change, i
don't know what i saw this morning.

but then it errors on the line
IFS= read type status online < <( smartctl -d ata -a ${disk} | grep \#\ 1 | sed 's,  \+,   ,g' | cut -f 2,3,5 )

L.



Re: checking md device parity (forced resync) - is it necessary?

2006-09-05 Thread Luca Berra

On Tue, Sep 05, 2006 at 02:00:03PM +0200, Tomasz Chmielewski wrote:

# by default, run at 01:06 on the first Sunday of each month.
6 1 1-7 * 7 root [ -x /usr/share/mdadm/checkarray ] && /usr/share/mdadm/checkarray --cron --all --quiet


However, it will run at 01:06, on 1st-7th day of each month, and on 
Sundays (Debian etch).

hihihi
monthday and weekday are or-ed in crontab 
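to really get "first Sunday only" you have to test the day-of-month
inside the command instead, e.g. (a sketch):

6 1 * * 7 root [ "$(date +\%d)" -le 7 ] && /usr/share/mdadm/checkarray --cron --all --quiet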


L.



Re: Care and feeding of RAID?

2006-09-05 Thread Luca Berra

On Tue, Sep 05, 2006 at 02:29:48PM -0400, Steve Cousins wrote:



Benjamin Schieder wrote:

On 05.09.2006 11:03:45, Steve Cousins wrote:

Would people be willing to list their setup? Including such things as 
mdadm.conf file, crontab -l, plus scripts that they use to check the 
smart data and the array, mdadm daemon parameters and anything else that 
is relevant to checking and maintaining an array? 



Personally, I use this script from cron:
http://shellscripts.org/project/hdtest


nice race :)


I am checking this out and I see that you are the writer of this script.
I'm getting errors when it comes to lines 76 and 86-90 about the 
arithmetic symbols. This is on a Fedora Core 5 system with bash version 

that is because smartctl output has changed and the grep above returns
no number.

3.1.7(1).   I weeded out the smartctl command and tried it manually with 
no luck on my SATA /dev/sd? drives.


which command?


What do you (or others) recommend for SATA drives?


smartmontools and a recent kernel just work.
also you can schedule smart tests with smartmontools, so you don't need
cron scripts.

L.



Re: Can you IMAGE Mirrored OS Drives?

2006-08-19 Thread Luca Berra

On Sat, Aug 19, 2006 at 03:13:53AM +0200, Gabor Gombas wrote:

but you would have to regenerate the initrd and fight again with lilo :(


Or you can just build a kernel with md built-in and use the kernel-level
RAID autodetection. In situations like this it is _much_ easier and
_much_ more robust than all the initrd-based solutions I have seen.


please, can we try not to resurrect again the kernel-level autodetection
flamewar on this list.

L.




Re: Imaging Mirrored OS Drives

2006-08-16 Thread Luca Berra

On Wed, Aug 16, 2006 at 01:05:39PM -0400, andy liebman wrote:
if the imaging software is not too smart and creates the partitions and
filesystems with the exact same size as the original, yes.
(i mean that there should be some space between the end of the
filesystem and the end of the partition to store the md superblock)

2) Furthermore, if the above is possible, in creating the arrays on the 
new drives is there a way to force mdadm to give the arrays specific 
UUID numbers? It looks like I can do that with mdadm --update? Should I 
create the arrays first using the normal mdadm -C procedure, and then 
update the UUIDs?

never tried that, let us know how you fare.

L.



Re: Making bootable SATA RAID1 array in Mandriva 2006

2006-08-15 Thread Luca Berra

On Mon, Aug 14, 2006 at 06:50:30AM -0400, andy liebman wrote:
-- I edited fstab and lilo.conf on the the RAID1 / partition so that 
they would refer to /dev/md1

-- I ran chroot on the /dev/md1 partition

did you mount /dev, /proc and /sys before chrooting?
i.e
mount --bind /dev /newroot/dev
mount -t proc /proc /newroot/proc
mount -t sysfs /sys /newroot/sys


Why do I have to do this? I haven't seen this in any recipies. My 

you need it because mkinitrd will need info from /proc and /sys to work
correctly and lilo will try to access /proc/partitions and /dev/md1 if
your /boot is on /dev/md1.


Linux setup only has three partitions:   /, swap, and /home.

/dev, /proc and /sys are not disk-based filesystems.
/dev is a ram disk which is populated at runtime by udev;
/proc and /sys are virtual filesystems that expose some of your hw and
kernel configuration to userspace

I'm not sure I understand what you're saying about mounting /dev, /proc 
and /sys.


just run:
mount --bind /dev /newroot/dev
mount -t proc /proc /newroot/proc
mount -t sysfs /sys /newroot/sys
before chrooting

L.
btw, be sure to add auto=yes to the ARRAY lines in /etc/mdadm.conf
or you might find some arrays are not recognized after boot.
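e.g. (a sketch; substitute the UUID that mdadm --detail reports):

ARRAY /dev/md1 UUID=xxxxxxxx:xxxxxxxx:xxxxxxxx:xxxxxxxx auto=yes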

L.




Re: Making bootable SATA RAID1 array in Mandriva 2006

2006-08-15 Thread Luca Berra

On Mon, Aug 14, 2006 at 05:13:47PM +0200, Laurent Lesage wrote:

Hi Andy,

there are options for the mkinitrd command, that are like the 
parameters in mkinitrd.conf (this is the case in Debian). did you 
use the -root=xxx option?


Laurent, 
mkinitrd in debian and mandriva are two completely different beasts;
there is no relation between those two.

besides that, i believe that all Andy has to do for mkinitrd is mounting
/sys on the chroot before running mkinitrd.

there is no mkinitrd.conf on mandriva.
mkinitrd will use fstab to find the root filesystem, so the change Andy
did is correct.

and please, please, stop top-posting and try to quote relevant parts of
messages when answering, or the thread will become unreadable.

L.



Re: Making bootable SATA RAID1 array in Mandriva 2006

2006-08-14 Thread Luca Berra

On Sun, Aug 13, 2006 at 07:51:42PM -0400, andy liebman wrote:
-- I copied the contents of /dev/sda1 (/ partition) and /dev/sda6 (/home 
partition) to /dev/sdb1 and /dev/sdb6 using rsync.

this is not really important, but you should have used the raid devices
as a target.

-- I edited fstab and lilo.conf on the the RAID1 / partition so that 
they would refer to /dev/md1

-- I ran chroot on the /dev/md1 partition

did you mount /dev, /proc and /sys before chrooting?
i.e
mount --bind /dev /newroot/dev
mount -t proc /proc /newroot/proc
mount -t sysfs /sys /newroot/sys

-- I set up an /etc/mdadm.conf file (using mdadm --detail 
--scan/etc/mdadm.conf   -- that's where Mandriva puts it)
-- I added to lilo.conf raid-extra-boot=  and tried both mbr and 
/dev/sda,/dev/sdb

mbr should do

-- I ran mkinitrd and created a new initrd in /boot on /dev/md1.  I got 
an error about not finding the 3w_9xxx driver, but I don't need to load 
that in the initrd anyway so I reran with --builtin=3w_9xxx so that 
mkinitrd would skip that driver that I don't need.


BUT, after all of this, I get a bunch of errors when I try to run lilo:
Fatal: Trying to map files from unnamed device 0x


Regards,
L.



Re: resize new raid problem

2006-08-09 Thread Luca Berra

On Wed, Aug 09, 2006 at 01:55:56PM +0400, Serge Torop wrote:

I need to install softw. RAID1 to working RedHat EL4.

I used rescue mode for creating RAID arrays.

...

resize2fs /dev/md2 and see a error messge:

resize2fs 1.39
/resize2fs: relocation error: - /resize2fs: undefined symbol: ext2fs_open2

Can I resolve this problem (resize2fs bug?)?
(may be using mdadm?)


since you bought a commercial product from redhat you might be better
off opening a support call with them.
if the resize2fs binary you are using comes from EL4, that is.

L.



Re: let md auto-detect 128+ raid members, fix potential race condition

2006-08-02 Thread Luca Berra

On Tue, Aug 01, 2006 at 05:46:38PM -0300, Alexandre Oliva wrote:

Using the mkinitrd patch that I posted before, the result was that
mdadm did try to bring up all raid devices but, because the raid456
module was not loaded in initrd, the raid devices were left inactive.


probably your initrd is broken, it should not have even tried to bring
up an md array that was not needed to mount root.


Then, when rc.sysinit tried to bring them up with mdadm -A -s, that
did nothing to the inactive devices, since they didn't have to be
assembled.  Adding --run didn't help.

My current work-around is to add raid456 to initrd, but that's ugly.
Scanning /proc/mdstat for inactive devices in rc.sysinit and doing
mdadm --run on them is feasible, but it looks ugly and error-prone.

Would it be reasonable to change mdadm so as to, erhm, disassemble ;-)
the raid devices it tried to bring up but that, for whatever reason,
it couldn't activate?  (say, missing module, not enough members,
whatever)


this would make sense if it were an option, patches welcome :)

L.



Re: let md auto-detect 128+ raid members, fix potential race condition

2006-08-02 Thread Luca Berra

On Tue, Aug 01, 2006 at 06:32:33PM -0300, Alexandre Oliva wrote:

Sure enough the LVM subsystem could make things better for one to not
need all of the PVs in the root-containing VG in order to be able to
mount root read-write, or at all, but if you think about it, if initrd

it shouldn't need all of the PVs; you just need the PVs where the
rootfs is.


is set up such that you only bring up the devices that hold the actual
root device within the VG and then you change that, say by taking a
snapshot of root, moving it around, growing it, etc, you'd be better
off if you could still boot.  So you do want all of the VG members to
be around, just in case.

in this case just regenerate the initramfs after modifying the vg that
contains root. I am fairly sure that kernel upgrades are far more
frequent than the addition of PVs to the root VG.
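on a Red Hat-style system that is something like (a sketch):

   mkinitrd -f /boot/initrd-$(uname -r).img $(uname -r)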


Yes, this is an argument against root on LVM, but there are arguments
*for* root on LVM as well, and there's no reason to not support both
behaviors equally well and let people figure out what works best for
them.


No, this is just an argument against misusing root on lvm.

L.



Re: host based mirror distance in a fc-based SAN environment

2006-07-26 Thread Luca Berra

On Wed, Jul 26, 2006 at 07:58:09AM +0200, Stefan Majer wrote:

Hi,

I'm curious if there are some numbers out up to which distance it's possible
to mirror (raid1) 2 FC-LUNs. We have 2 datacenters with a effective
distance of 11km. The fabrics in one datacenter are connected to the
fabrics in the other datacenter with 5 dark fibre both about 11km in
distance.


as you probably already know with LX (1310nm) GBICS and single-mode fiber you 
can
reach up to a theoretical limit of 50Km, and you can double that using 1550 nm 
lasers (ZX?)


I want to set up servers wich mirrors their LUNs across the SAN-boxen in
both datacenters. On top of this mirrored LUN i put lvm2.

So the question is does anybody have some numbers up to which distance
this method works ?


the method is independent of the distance, if your FC hardware can do
that, then you can.
the only thing you should consider (and that is not directly related to
distance) is the bandwidth you have between the two sites (i mean the
number of systems that might be using those 5 fibers)


Regards,
L.



Re: issue with internal bitmaps

2006-07-18 Thread Luca Berra

On Tue, Jul 18, 2006 at 09:34:35AM -0400, Bill Davidsen wrote:
Boy, I didn't say that well... what I meant to suggest is that when -E 
or -X are applied to the array as a whole, would it not be useful to 
iterate them over all of the components rather than looking for
non-existent data in the array itself?

the question i believe is to distinguish the case where an md device is
a component of another md device...
L.



Re: issue with internal bitmaps

2006-07-07 Thread Luca Berra

On Fri, Jul 07, 2006 at 08:16:18AM +1000, Neil Brown wrote:

On Thursday July 6, [EMAIL PROTECTED] wrote:

hello, i just realized that internal bitmaps do not seem to work
anymore.


I cannot imagine why.  Nothing you have listed show anything wrong
with md...

Maybe you were expecting
  mdadm -X /dev/md100
to do something useful.  Like -E, -X must be applied to a component
device.  Try
  mdadm -X /dev/sda1


/me needs some strong coffee. yes you are right, sorry

L.




issue with internal bitmaps

2006-07-06 Thread Luca Berra

hello, i just realized that internal bitmaps do not seem to work
anymore.

kernel 2.6.17
mdadm 2.5.2

[EMAIL PROTECTED] ~]# mdadm --create --level=1 -n 2 -e 1 --bitmap=internal 
/dev/md100 /dev/sda1 /dev/sda2
mdadm: array /dev/md100 started.

... wait awhile ...

[EMAIL PROTECTED] ~]# cat /proc/mdstat
Personalities : [raid1]
md100 : active raid1 sda2[1] sda1[0]
 1000424 blocks super 1.0 [2/2] [UU]
 bitmap: 4/4 pages [16KB], 128KB chunk

unused devices: <none>
[EMAIL PROTECTED] ~]# mdadm -X /dev/md100
   Filename : /dev/md100
  Magic : 
mdadm: invalid bitmap magic 0x0, the bitmap file appears to be corrupted
Version : 0
mdadm: unknown bitmap version 0, either the bitmap file is corrupted or you 
need to upgrade your tools

[EMAIL PROTECTED] ~]# cat /proc/mdstat
Personalities : [raid1]
md100 : active raid1 sda2[1] sda1[0]
 1000424 blocks super 1.0 [2/2] [UU]
 bitmap: 0/4 pages [0KB], 128KB chunk

unused devices: <none>

[EMAIL PROTECTED] ~]# mdadm -D /dev/md100
/dev/md100:
   Version : 01.00.03
 Creation Time : Thu Jul  6 16:05:10 2006
Raid Level : raid1
Array Size : 1000424 (977.14 MiB 1024.43 MB)
   Device Size : 1000424 (977.14 MiB 1024.43 MB)
  Raid Devices : 2
 Total Devices : 2
Preferred Minor : 100
   Persistence : Superblock is persistent

 Intent Bitmap : Internal

   Update Time : Thu Jul  6 16:07:11 2006
 State : active
Active Devices : 2
Working Devices : 2
Failed Devices : 0
 Spare Devices : 0

  Name : 100
  UUID : 60cd0dcb:fde52377:699453f7:da96b9d4
Events : 1

   Number   Major   Minor   RaidDevice State
  0   810  active sync   /dev/sda1
  1   821  active sync   /dev/sda2





Re: [PATCH] enable auto=yes by default when using udev

2006-07-04 Thread Luca Berra

On Tue, Jul 04, 2006 at 12:46:03AM +0200, Luca Berra wrote:

On Mon, Jul 03, 2006 at 09:14:38AM +1000, Neil Brown wrote:

However


+
+   /* if we are using udev and auto is not set, mdadm will almost
+* certainly fail, so we force it here.
+*/
+	if (autof == 0 && access("/dev/.udevdb",F_OK) == 0)
+   autof=2;
+


I'm worried that this test is not very robust.
On my Debian/unstable system running udev, there is no
/dev/.udevdb
though there is a
/dev/.udev/db

I guess I could test for both, but then udev might change
again. I'd really like a more robust check.


is /dev/.udev/db a debianism?

no it is not

in this case a check for both might suffice, else i will have to think
harder about it.




Re: [PATCH] enable auto=yes by default when using udev

2006-07-03 Thread Luca Berra

On Mon, Jul 03, 2006 at 09:14:38AM +1000, Neil Brown wrote:

However


+
+   /* if we are using udev and auto is not set, mdadm will almost
+* certainly fail, so we force it here.
+*/
+	if (autof == 0 && access("/dev/.udevdb",F_OK) == 0)
+   autof=2;
+


I'm worried that this test is not very robust.
On my Debian/unstable system running udev, there is no
/dev/.udevdb
though there is a
/dev/.udev/db

I guess I could test for both, but then udev might change
again. I'd really like a more robust check.


is /dev/.udev/db a debianism?
in this case a check for both might suffice, else i will have to think
harder about it.

L.



[PATCH] enable auto=yes by default when using udev

2006-07-02 Thread Luca Berra

Hello,
the following patch aims at solving an issue that is confusing a lot of
users.
when using udev, device files are created only when devices are
registered with the kernel, and md devices are registered only when
started.
mdadm needs the device file _before_ starting the array.
so when using udev you must add --auto=yes to the mdadm commandline or
to the ARRAY line in mdadm.conf

following patch makes auto=yes the default when using udev

L.


* Sat Jun 24 2006 Luca Berra [EMAIL PROTECTED]
- automatically create devices if using udev

--- mdadm-2.5.1/mdadm.c.autof   2006-06-02 01:51:01.0 -0400
+++ mdadm-2.5.1/mdadm.c 2006-06-24 05:17:45.0 -0400
@@ -857,6 +857,13 @@
fputs(Usage, stderr);
exit(2);
}
+
+   /* if we are using udev and auto is not set, mdadm will almost
+* certainly fail, so we force it here.
+*/
+	if (autof == 0 && access("/dev/.udevdb",F_OK) == 0)
+   autof=2;
+
/* Ok, got the option parsing out of the way
 * hopefully it's mostly right but there might be some stuff
 * missing
@@ -873,7 +880,7 @@
		fprintf(stderr, Name ": an md device must be given in this mode\n");
exit(2);
}
-		if ((int)ident.super_minor == -2 && autof) {
+		if ((int)ident.super_minor == -2 && autof > 2 ) {
			fprintf(stderr, Name ": --super-minor=dev is incompatible with --auto\n");
exit(2);
}


Re: [PATCH*2] mdadm works with uClibc from SVN

2006-06-24 Thread Luca Berra

On Fri, Jun 23, 2006 at 08:45:47PM +0100, Nix wrote:

On Fri, 23 Jun 2006, Neil Brown mused:

Is there some #define in an include file which will allow me to tell
if the current uclibc supports ftw or not?


it does not only depend on the uClibc version, but also on whether ftw
support was compiled in or not.


I misspoke: ftw was split into multiple files in late 2005, but it was
originally added in September 2003, in time for version 0.9.21.

Obviously the #defines in ftw.h don't exist before that date, but
that's a bit late to check, really.

features.h provides the macros __UCLIBC_MAJOR__, __UCLIBC_MINOR__, and
__UCLIBC_SUBLEVEL__: versions above 0.9.20 appear to support ftw()
(at least, they have the function, in 32-bit form at least, which
is certainly enough for this application!)


the following would be the correct check.

#include <features.h>
#ifdef __UCLIBC_HAS_FTW__
.
#else
.
#endif /* __UCLIBC_HAS_FTW__ */




Re: Failed Hard Disk... help!

2006-06-09 Thread Luca Berra

On Fri, Jun 09, 2006 at 07:44:40PM -0400, David M. Strang wrote:

/Patrick wrote:

pretty sure smartctl -d ata -a /dev/sdwhatever will tell you the
serial number. (Hopefully the kernel is new enough that it supports
SATA/smart, otherwise you need a kernel patch which won't be any 
better...)


Yep... 2.6.15 or better... I need the magical patch =\.

Any other options?


scsi_id from udev, if you are lucky enough
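e.g. (a sketch, assuming the sysfs-path syntax of the scsi_id shipped
with udev at the time):

   /sbin/scsi_id -g -s /block/sdb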

L.



Re: problems with raid=noautodetect

2006-05-30 Thread Luca Berra

On Tue, May 30, 2006 at 01:10:24PM -0400, Bill Davidsen wrote:

2) deprecate the DEVICE keyword issuing a warning when it is found in
the configuration file


Not sure I'm so keen on that, at least not in the near term.

Let's not start warning and depreciating powerful features because they 
can be misused... If I wanted someone to make decisions for me I would 
be using this software at all.


you cut the rest of the mail.
i did not propose to deprecate the feature,
just the keyword.

but, ok,
just go on writing 
DEVICE /dev/sda1

DEVICE /dev/sdb1
ARRAY /dev/md0 devices=/dev/sda1,/dev/sdb1

then come on the list and complain when it stops working.
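the robust spelling is (a sketch; use your array's own UUID):

DEVICE partitions
ARRAY /dev/md0 UUID=xxxxxxxx:xxxxxxxx:xxxxxxxx:xxxxxxxx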

L.



Re: adding multipath device without reboot?

2006-05-30 Thread Luca Berra

On Tue, May 30, 2006 at 03:59:33PM +0200, Herta Van den Eynde wrote:
I'm trying to add a new SAN LUN to a system, create a multipath mdadm 
device on it, partition it, and create a new filesystem on it, all 
without taking the system down.


All goes well, up to partitioning the md device:

  # fdisk /dev/md12

wait!
you cannot partition an md device.
if you need you have to use an mdp device,
but do you?
if you just want to create a single filesystem, as you do below, use the
md device directly.
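if you really do need partitions on it, a sketch (names are
placeholders):

   mdadm -C /dev/md_d12 --auto=part --level=multipath --raid-devices=2 /dev/sdx /dev/sdy
   fdisk /dev/md_d12     # partitions then appear as /dev/md_d12p1, ...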

Regards,
L.



Re: problems with raid=noautodetect

2006-05-29 Thread Luca Berra

On Mon, May 29, 2006 at 12:38:25PM +0400, Michael Tokarev wrote:

Neil Brown wrote:

On Friday May 26, [EMAIL PROTECTED] wrote:

I'd suggest the following.

All the other devices are included or excluded from the list of devices
to consider based on the last component in the DEVICE line.  Ie. if it
ends up at !dev, all the rest of devices are included.  If it ends up at
dev (w/o !), all the rest are excluded.  If memory serves me right, it's
how squid ACLs works.

There's no need to introduce new keyword.  Given this rule, a line


as i said the new keyword is to warn on configurations that do not
account for changing device-ids, and if we change the syntax a new
keyword would make it clearer. In case the user tries to use a new
configuration on an old mdadm.


The only possible issue I see here is that with udev, it's possible to
use, say, /dev/disk/by-id/*-like stuff (don't remember exact directory
layout) -- symlinked to /dev/sd* according to the disk serial number or
something like that -- for this to work, mdadm needs to use glob()
internally.


uhm
i think that we would better translate any device found on a DEVICE (or
DEVICEFILTER) line to the corresponding major/minor number and blacklist
based on that.
nothing prevents someone to have an udev rule that creates a device
node, instead of symlinking.

L.



[PATCH] mdadm 2.5 (Was: ANNOUNCE: mdadm 2.5 - A tool for managing Soft RAID under Linux)

2006-05-28 Thread Luca Berra

On Fri, May 26, 2006 at 04:33:08PM +1000, Neil Brown wrote:



I am pleased to announce the availability of
  mdadm version 2.5



hello,
i tried rebuilding mdadm 2.5 on current mandriva cooker, which uses
gcc-4.1.1, glibc-2.4 and dietlibc 0.29 and found the following issues
addressed by patches attached to this message
I would be glad if you could review these patches and include them in
upcoming mdadm releases.

- mdadm-2.3.1-kernel-byteswap-include-fix.patch
reverts a change introduced with mdadm 2.3.1 for redhat compatibility
asm/byteorder.h is an architecture dependent file and does more
stuff than a call to the linux/byteorder/XXX_endian.h
the fact that not calling asm/byteorder.h does not define
__BYTEORDER_HAS_U64__ is just an example of issues that might arise.
if redhat is broken it should be worked around differently than breaking
mdadm.

- mdadm-2.4-snprintf.patch
this is self commenting, just an error in the snprintf call

- mdadm-2.4-strict-aliasing.patch
fix for another strict-aliasing problem, you can typecast a reference to a
void pointer to anything, you cannot typecast a reference to a struct.

- mdadm-2.5-mdassemble.patch
pass CFLAGS to mdassemble build, enabling -Wall -Werror showed some
issues also fixed by the patch.

- mdadm-2.5-rand.patch
Posix dictates rand() versus bsd random() function, and dietlibc
deprecated random(), so switch to srand()/rand() and make everybody
happy.

- mdadm-2.5-unused.patch
glibc 2.4 is pedantic on ignoring return values from fprintf, fwrite and
write, so now we check the rval and actually do something with it.
in the Grow.c case i only print a warning, since i don't think we can do
anything in case we fail invalidating those superblocks (it should never
happen, but then...)

Regards,
L.


* Sat Feb 18 2006 Christiaan Welvaart [EMAIL PROTECTED]
not including asm/byteorder.h will not define __BYTEORDER_HAS_U64__
causing __fswab64 to be undefined and failure compiling mdadm on
big_endian architectures like PPC

--- mdadm-2.3.1/mdadm.h.bak 2006-02-06 04:52:12.0 +0100
+++ mdadm-2.3.1/mdadm.h 2006-02-18 03:51:59.786926267 +0100
@@ -72,16 +72,7 @@
 #include	"bitmap.h"
 
 #include <endian.h>
-/* #include <asm/byteorder.h> Redhat don't like this so... */
-#if __BYTE_ORDER == __LITTLE_ENDIAN
-#  include <linux/byteorder/little_endian.h>
-#elif __BYTE_ORDER == __BIG_ENDIAN
-#  include <linux/byteorder/big_endian.h>
-#elif __BYTE_ORDER == __PDP_ENDIAN
-#  include <linux/byteorder/pdp_endian.h>
-#else
-#  error "unknown endianness."
-#endif
+#include <asm/byteorder.h>
 
 
 
* Sat May 27 2006 Luca Berra [EMAIL PROTECTED]
snprintf size should be at most the size of the buffer

--- mdadm-2.4/util.c.snprintf   2006-05-27 13:53:18.0 +0200
+++ mdadm-2.4/util.c2006-05-27 13:53:38.0 +0200
@@ -439,7 +439,7 @@
}
if (create  !std  !nonstd) {
static char buf[30];
-		snprintf(buf, 1024, "%d:%d", major, minor);
+		snprintf(buf, 30, "%d:%d", major, minor);
nonstd = buf;
}
 
* Sat May 27 2006 Luca Berra [EMAIL PROTECTED]
This is to avoid gcc warnings when building with strict-aliasing optimization

--- mdadm-2.4/dlink.h.alias 2006-05-26 21:05:07.0 +0200
+++ mdadm-2.4/dlink.h   2006-05-27 12:32:58.0 +0200
@@ -4,16 +4,16 @@
 
 struct __dl_head
 {
-struct __dl_head * dh_prev;
-struct __dl_head * dh_next;
+void * dh_prev;
+void * dh_next;
 };
 
 #define	dl_alloc(size)	((void*)(((char*)calloc(1,(size)+sizeof(struct __dl_head)))+sizeof(struct __dl_head)))
 #define	dl_new(t)	((t*)dl_alloc(sizeof(t)))
 #define	dl_newv(t,n)	((t*)dl_alloc(sizeof(t)*n))
 
-#define dl_next(p) *((void**)(((struct __dl_head*)(p))[-1].dh_next))
-#define dl_prev(p) *((void**)(((struct __dl_head*)(p))[-1].dh_prev))
+#define dl_next(p) *(&(((struct __dl_head*)(p))[-1].dh_next))
+#define dl_prev(p) *(&(((struct __dl_head*)(p))[-1].dh_prev))
 
 void *dl_head(void);
 char *dl_strdup(char *);
* Sat May 27 2006 Luca Berra [EMAIL PROTECTED]
add CFLAGS to mdassemble build and fix a couple of non-returning functions

--- mdadm-2.5/mdadm.h.bluca 2006-05-27 14:25:53.0 +0200
+++ mdadm-2.5/mdadm.h   2006-05-27 15:20:37.0 +0200
@@ -44,10 +44,8 @@
 #include   errno.h
 #include   string.h
 #include   syslog.h
-#ifdef __dietlibc__NONO
-int strncmp(const char *s1, const char *s2, size_t n) __THROW __pure__;
-char *strncpy(char *dest, const char *src, size_t n) __THROW;
-#include	<strings.h>
+#ifdef __dietlibc__
+#include	<strings.h>
 #endif
 
 
--- mdadm-2.5/mdassemble.c.bluca2006-05-27 15:11:02.0 +0200
+++ mdadm-2.5/mdassemble.c  2006-05-27 15:15:24.0 +0200
@@ -54,7 +54,7 @@
 };
 
 #ifndef MDASSEMBLE_AUTO
-/* from mdadm.c */
+/* from mdopen.c */
 int

Re: [PATCH] mdadm 2.5 (Was: ANNOUNCE: mdadm 2.5 - A tool for managing Soft RAID under Linux)

2006-05-28 Thread Luca Berra

On Sun, May 28, 2006 at 10:08:19AM -0700, dean gaudet wrote:

On Sun, 28 May 2006, Luca Berra wrote:


- mdadm-2.5-rand.patch
Posix dictates rand() versus bsd random() function, and dietlibc
deprecated random(), so switch to srand()/rand() and make everybody
happy.


fwiw... lots of rand()s tend to suck... and RAND_MAX may not be large 
enough for this use.  glibc rand() is the same as random().  do you know 

the fact that glibc rand() is the same implementation as random() was one of the
reasons i believe we could switch to rand()


if dietlibc's rand() is good enough?

dietlibc rand() and random() are the same function.
but random will throw a warning saying it is deprecated.

L.



Re: problems with raid=noautodetect

2006-05-28 Thread Luca Berra

On Mon, May 29, 2006 at 02:21:09PM +1000, Neil Brown wrote:

3) introduce DEVICEFILTER or similar keyword with the same meaning as
the actual DEVICE keyword


If it has the same meaning, why not leave it called 'DEVICE'???


the idea was to warn people who write

DEVICE /dev/sda1 /dev/sdb1 /dev/sdc1 /dev/sdd1
ARRAY /dev/md0 ...

that it might break since disk naming is not guaranteed to be constant.


However, there is at least the beginnings of a good idea here.

If we assume there is a list of devices provided by a (possibly
default) 'DEVICE' line, then 


DEVICEFILTER   !pattern1 !pattern2 pattern3 pattern4

could mean that any device in that list which matches pattern 1 or 2
is immediately discarded, and remaining device that matches patterns 3
or 4 are included, and the remainder are discard.

The rule could be that the default is to include any devices that
don't match a !pattern, unless there is a pattern without a '!', in
which case the default is to reject non-accepted patterns.
Is that straight forward enough, or do I need an
 order allow,deny
like apache has?


I think that documenting the feature would be enough
L.



Re: [PATCH] mdadm 2.5 (Was: ANNOUNCE: mdadm 2.5 - A tool for managing Soft RAID under Linux)

2006-05-28 Thread Luca Berra

On Mon, May 29, 2006 at 12:08:25PM +1000, Neil Brown wrote:

On Sunday May 28, [EMAIL PROTECTED] wrote:
Thanks for the patches.  They are greatly appreciated.

You're welcome


- mdadm-2.3.1-kernel-byteswap-include-fix.patch
reverts a change introduced with mdadm 2.3.1 for redhat compatibility
asm/byteorder.h is an architecture dependent file and does more
stuff than a call to the linux/byteorder/XXX_endian.h
the fact that not calling asm/byteorder.h does not define
__BYTEORDER_HAS_U64__ is just an example of issues that might arise.
if redhat is broken it should be worked around differently than breaking
mdadm.


I don't understand the problem here.  What exactly breaks with the
code currently in 2.5?  mdadm doesn't need __BYTEORDER_HAS_U64__, so
why does not having it defined break anything?
The comment from the patch says:
 not including asm/byteorder.h will not define __BYTEORDER_HAS_U64__
 causing __fswab64 to be undefined and failure compiling mdadm on
 big_endian architectures like PPC

But again, mdadm doesn't use __fswab64 
More details please.

you use __cpu_to_le64 (ie in super0.c line 987)

   bms->sync_size = __cpu_to_le64(size);

which in byteorder/big_endian.h is defined as

#define __cpu_to_le64(x) ((__force __le64)__swab64((x)))

but __swab64 is defined in byteorder/swab.h (included by
byteorder/big_endian.h) as

#if defined(__GNUC__) && (__GNUC__ >= 2) && defined(__OPTIMIZE__)
#  define __swab64(x) \
(__builtin_constant_p((__u64)(x)) ? \
___swab64((x)) : \
__fswab64((x)))
#else
#  define __swab64(x) __fswab64(x)
#endif /* OPTIMIZE */

and __fswab64 is defined further into byteorder/swab.h as

#ifdef __BYTEORDER_HAS_U64__
static __inline__ __attribute_const__ __u64 __fswab64(__u64 x)
.
#endif /* __BYTEORDER_HAS_U64__ */

so building mdadm on a ppc (or i suppose a sparc) will break


now, if you look at /usr/src/linux/asm-*/byteorder.h you will notice
they are very different files, this makes me believe it is not a good
idea bypassing asm/byteorder.h
And no, just defining __BYTEORDER_HAS_U64__ will break on 32bit
big-endian cpus (and if i do not misread it might just compile and give
incorrect results)



- mdadm-2.4-snprintf.patch
this is self commenting, just an error in the snprintf call


I wonder how that snuck in...
There was an odd extra tab in the patch, but no-matter.
I changed it to use 'sizeof(buf)' to be consistent with other uses
of snprint.  Thanks.

yes, that would be better.


- mdadm-2.4-strict-aliasing.patch
fix for another strict-aliasing problem, you can typecast a reference to a
void pointer to anything, you cannot typecast a reference to a
struct.


Why can't I typecast a reference to a struct??? It seems very
unfair...

that's strict-aliasing optimization for you, i do agree it _is_ unfair.


However I have no problem with the patch.  Applied.  Thanks.
I should really change it to use 'list.h' type lists from the linux
kernel.

hopefully redhat would not object :)


- mdadm-2.5-mdassemble.patch
pass CFLAGS to mdassemble build, enabling -Wall -Werror showed some
issues also fixed by the patch.


yep, thanks.



- mdadm-2.5-rand.patch
Posix dictates rand() versus bsd random() function, and dietlibc
deprecated random(), so switch to srand()/rand() and make everybody
happy.


Everybody?
'man 3 rand' tells me:

  Do not use this function in applications  intended  to  be
  portable when good randomness is needed.

Admittedly mdadm doesn't need to be portable - it only needs to run on
Linux.  But this line in the man page bothers me.

I guess
   -Drandom=rand -Dsrandom=srand
might work... no, stdlib.h doesn't like that.
'random' returns 'long int' while rand returns 'int'.
Interestingly 'random_r' returns 'int' as does 'rand_r'.

#ifdef __dietlibc__
#include	<strings.h>
/* dietlibc has deprecated random and srandom!! */
#define random rand
#define srandom srand
#endif

in mdadm.h.  Will that do you?



yes, mdassemble will build, so it is ok for me.




- mdadm-2.5-unused.patch
glibc 2.4 is pedantic on ignoring return values from fprintf, fwrite and
write, so now we check the rval and actually do something with it.
in the Grow.c case i only print a warning, since i don't think we can do
anything in case we fail invalidating those superblocks (it should never
happen, but then...)


Ok, thanks.


You can see these patches at
  http://neil.brown.name/cgi-bin/gitweb.cgi?p=mdadm


Thanks.




Re: RAID5 kicks non-fresh drives

2006-05-27 Thread Luca Berra

On Fri, May 26, 2006 at 03:37:29PM -0400, Mark Hahn wrote:

i strongly believe it is not correct to let kernel auto-assemble devices
kernel auto-assembly should be disabled and activation should be handled
by mdadm only!


it's a convenience/safety tradeoff, like so many other cases.
without kernel auto-assembly, it's somewhat more annoying to 
boot onto MD raid, right?  you are forced to put MD config stuff

into your initrd, etc.

yes, it is, but initrds are generated by scripts nowadays, so you won't
even notice.

I don't see why auto-assembly is such a bad thing.  it means you 

please read the list archives, it has been explained to boredom

the only argument I see against (kernel) auto-assembly is the 
general principle of moving things out of the kernel where possible.

but that's not a hard/fast rule anyway, so...

please read the list archives, it has been explained to boredom

Regards,
L.

and please, 
do not To: or Cc: me, i do actively read the list.


L.



Re: [patch] install a static build

2006-05-27 Thread Luca Berra

On Sat, May 27, 2006 at 12:08:19PM +0200, Dirk Jagdmann wrote:

Hello developers,

after building my static mdadm I wanted to install it via make
install, but had to tweak the Makefile a little. However my approach
is not the best, since I had to remove the rmconf target for the
static build, issue a cp of the mdadm.static binary and my patch
might interfere with the everything target :-(


maybe you better add an install-static target.

L.



Re: [patch] fix static build

2006-05-27 Thread Luca Berra

On Sat, May 27, 2006 at 12:05:56PM +0200, Dirk Jagdmann wrote:

Hello developers and patch reviewers,

I just tried to update my (old) mdadm to 2.5 and had to apply this
small patch to build a statically linked mdadm.

you should also fix the clean target.

L.



Re: problems with raid=noautodetect

2006-05-26 Thread Luca Berra

On Tue, May 23, 2006 at 08:39:26AM +1000, Neil Brown wrote:

Presumably you have a 'DEVICE' line in mdadm.conf too?  What is it.
My first guess is that it isn't listing /dev/sdd? somehow.


Neil,
i am seeing a lot of people that fall into this same error, and i would
propose a way of avoiding this problem

1) make DEVICE partitions the default if no device line is specified.
2) deprecate the DEVICE keyword issuing a warning when it is found in
the configuration file
3) introduce DEVICEFILTER or similar keyword with the same meaning as
the actual DEVICE keyword
4) optionally add an EXCLUDEDEVICE keyword with the opposite meaning.

L.



Re: problems with raid=noautodetect

2006-05-26 Thread Luca Berra

On Fri, May 26, 2006 at 09:53:08AM +0200, Luca Berra wrote:

On Tue, May 23, 2006 at 08:39:26AM +1000, Neil Brown wrote:

Presumably you have a 'DEVICE' line in mdadm.conf too?  What is it.
My first guess is that it isn't listing /dev/sdd? somehow.


Neil,
i am seeing a lot of people that fall in this same error, and i would
propose a way of avoiding this problem

1) make DEVICE partitions the default if no device line is specified.

oops,
just read your 2.5 announce, you already did that :)

2) deprecate the DEVICE keyword issuing a warning when it is found in
the configuration file
3) introduce DEVICEFILTER or similar keyword with the same meaning as
the actual DEVICE keyword
4) optionally add an EXCLUDEDEVICE keyword with the opposite meaning.





Re: RAID5 kicks non-fresh drives

2006-05-26 Thread Luca Berra

On Fri, May 26, 2006 at 11:06:21AM -0600, Craig Hollabaugh wrote:

On Fri, 2006-05-26 at 12:45 -0400, Mark Hahn wrote:
I think the current situation is good, since there is some danger of 
going too far.  for instance, testing each partition to see whether 
it contains a valid superblock would be pretty crazy, right?  requiring
either the auto-assemble-me partition type, or explicit partitions 
given in a config file is a happy medium...




I created my array in 1/2003, don't know versions of kernel or mdadm I
was using then.

In my situation over the past few days.
 kernel 2.4.30 kicked non-fresh
 kernel 2.6.11.8 kicked non-fresh
 kernel 2.6.18.8 didn't mention anything, just skipped my 'linux'
partitions

These kernels auto-assemble prior to mounting /. So the kernel doesn't
consult my 
/etc/mdadm/mdadm.conf file. Is this correct? 

i strongly believe it is not correct to let kernel auto-assemble devices
kernel auto-assembly should be disabled and activation should be handled
by mdadm only!

L.



Re: strange RAID5 problem

2006-05-09 Thread Luca Berra

On Mon, May 08, 2006 at 11:30:52PM -0600, Maurice Hilarius wrote:

[EMAIL PROTECTED] ~]# mdadm /dev/md3 -a /dev/sdw1

But, I get this error message:
mdadm: hot add failed for /dev/sdw1: No such device

What? We just made the partition on sdw a moment ago in fdisk. It IS there!


I don't believe you, prove it (/proc/partitions)


So. we look around a bit:
# cat /proc/mdstat

md3 : inactive sdq1[0] sdaf1[15] sdae1[14] sdad1[13] sdac1[12] sdab1[11]
sdaa1[10] sdz1[9] sdy1[8] sdx1[7] sdv1[5] sdu1[4] sdt1[3] sds1[2]
sdr1[1]
 5860631040 blocks

Yup, that looks correct, missing sdw1[6]


no, it does not, it is 'inactive'


[EMAIL PROTECTED] ~]# cat /proc/mdstat
Personalities : [raid1] [raid5]

...

md3 : inactive sdq1[0] sdaf1[15] sdae1[14] sdad1[13] sdac1[12] sdab1[11]
sdaa1[10] sdz1[9] sdy1[8] sdx1[7] sdv1[5] sdu1[4] sdt1[3] sds1[2]
sdr1[1]
 5860631040 blocks

...

[EMAIL PROTECTED] ~]# mdadm /dev/md3 -a /dev/sdw1
mdadm: hot add failed for /dev/sdw1: No such device

OK, let's mount the degraded RAID and try to copy the files to somewhere
else, so we can make it from scratch:

[EMAIL PROTECTED] ~]# mount /dev/md3 /all/boxw16/
/dev/md3: Invalid argument
mount: /dev/md3: can't read superblock


it is still inactive, no wonder you cannot access it.

try running the array, or really stop it before assembling.
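i.e. something like (a sketch):

   mdadm --run /dev/md3
or
   mdadm -S /dev/md3
   mdadm -A /dev/md3 ...component devices...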

L.



Re: strange RAID5 problem

2006-05-09 Thread Luca Berra

On Tue, May 09, 2006 at 10:16:25AM -0600, Maurice Hilarius wrote:

Luca Berra wrote:

On Mon, May 08, 2006 at 11:30:52PM -0600, Maurice Hilarius wrote:

[EMAIL PROTECTED] ~]# mdadm /dev/md3 -a /dev/sdw1

But, I get this error message:
mdadm: hot add failed for /dev/sdw1: No such device

What? We just made the partition on sdw a moment ago in fdisk. It IS
there!


I don't believe you, prove it (/proc/partitions)



I understand. Here we go then. Devices in question bracketed with **:


ok, now i do.
is the /dev/sdw1 device file correctly created?
you could try strace'ing mdadm to see what happens

what about the other suggestion? trying to stop the array and restart
it, since it is marked as inactive.
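
for the strace suggestion, a hedged sketch (the trace filter is
illustrative):

  # see which device node mdadm opens and which ioctl returns the error
  strace -f -e trace=open,ioctl mdadm /dev/md3 -a /dev/sdw1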
L.



Re: slackware -current softraid5 boot problem

2006-05-08 Thread Luca Berra

On Tue, May 09, 2006 at 12:06:32AM +0200, Dexter Filmore wrote:
Finally got me a bunch of disks as a raid 5, I don't even want to boot from 
it, plain data array. four sata-II samsungs on a sil3114 controller, all fine 
so far.


Booting the machine I get "no superblock on /dev/sdd, stopping
assembly" (or "assembly aborted" or so)
After boot has completed, running mdadm --assemble --scan --verbose manually 
works just fine and doesn't tell a thing about missing superblocks. 


you don't give a lot of information about your setup,
in any case it could be something like udev and the /dev/sdd device node
not being available at boot?

Regards,
L.



Re: EVMS causing problems with mdadm?

2006-04-23 Thread Luca Berra

On Mon, Apr 24, 2006 at 07:48:00AM +1000, Neil Brown wrote:

On Sunday April 23, [EMAIL PROTECTED] wrote:

Did my latest updates for my Kubuntu (Ubuntu KDE variant) this
morning, and noticed that EVMS has now taken control of my RAID
array. Didn't think much about it until I tried to make a RAID-1 array
with two disks I've just added to the system. Trying to do a create
verbose tells me that device /dev/md1 (or 2 or 3 - I tried a couple
just to see) doesn't exist. And in fact, there are no block devices
listed beyond md0.


Sounds like udev is in use rather than a static /dev.

Add '--auto=md' to the mdadm command line, and it will create the
devices for you.  Or --auto=part if you want partitioned arrays.  See
man page for more details.

I suspect this might need to be come the default in another year or so


i was thinking about stat()ing /dev/.udev and automatically enabling
--auto if found

WDYT?
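
in the meantime, a hedged example of the --auto workaround Neil
describes (level and device names are illustrative, matching the
poster's two new disks):

  # --auto=md makes mdadm create the missing /dev/md1 node itself
  mdadm --create /dev/md1 --auto=md --level=1 --raid-devices=2 /dev/sdc1 /dev/sdd1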

L.



Re: [RAID] forcing a read on a known bad block

2006-04-11 Thread Luca Berra

On Tue, Apr 11, 2006 at 12:37:53PM -1000, Julian Cowley wrote:

On Tue, 11 Apr 2006, dean gaudet wrote:

anyhow this made me wonder if there's some other existing trick to force
such reads/reconstructions to occur... or perhaps this might be a useful
future feature.


For testing RAID, what would be really nice is if there were a virtual
disk device where one could simulate bad sectors (read or write),
non-responsive disks, etc.  It would be virtual in the same sort way
that /dev/full simulates a full disk.


either use the MD faulty personality, or the device-mapper error
target.
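
a hedged sketch of both, on a scratch partition (/dev/sdX1 is a
placeholder; do not point this at real data):

  # md 'faulty' personality: a single-device array that injects errors
  mdadm --create /dev/md9 --level=faulty --raid-devices=1 /dev/sdX1
  # read faults, persistent, roughly every 100th request
  # (mode syntax per the mdadm man page for the faulty personality)
  mdadm --grow /dev/md9 --layout=rp100
  # device-mapper 'error' target: sectors 1000-1007 always fail
  printf '0 1000 linear /dev/sdX1 0\n1000 8 error\n1008 10000 linear /dev/sdX1 1008\n' \
    | dmsetup create baddev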

L.



Re: md/mdadm fails to properly run on 2.6.15 after upgrading from 2.6.11

2006-04-09 Thread Luca Berra

On Sun, Apr 09, 2006 at 02:35:53PM +0200, Marc L. de Bruin wrote:

Hi,

(I just subscribed, sorry if this is a dupe. I did try to match the 
subject from the archives, but couldn't find any...)


I ran into trouble after upgrading a Debian Sarge system from 2.6.11 to 
2.6.15. To be more precise, it turned out that md/mdadm seems to not 
function properly during the boot process of 2.6.15.


My /etc/mdadm/mdadm.conf contains this:

---[mdadm.conf]---
DEVICE /dev/hdi1 /dev/hdg1 /dev/hdc1
ARRAY /dev/md1 level=raid5 num-devices=3 
UUID=09c58ab6:f706e37b:504cf890:1a597046 
devices=/dev/hdi1,/dev/hdg1,/dev/hdc1


DEVICE /dev/hdg2 /dev/hdc2
ARRAY /dev/md2 level=raid1 num-devices=2 
UUID=86210844:6abbf533:dc82f982:fe417066 devices=/dev/hdg2,/dev/hdc2


DEVICE /dev/hda2 /dev/hdb2
ARRAY /dev/md0 level=raid1 num-devices=2 
UUID=da619c37:6c072dc8:52e45423:f4a58b7c devices=/dev/hda2,/dev/hdb2


DEVICE /dev/hda1 /dev/hdb1
ARRAY /dev/md4 level=raid1 num-devices=2 
UUID=bfc30f9b:d2c21677:c4ae5f90:b2bddb75 devices=/dev/hda1,/dev/hdb1


DEVICE /dev/hdc3 /dev/hdg3
ARRAY /dev/md3 level=raid1 num-devices=2 
UUID=fced78ce:54f00a78:8662e7eb:2ad01d0b devices=/dev/hdc3,/dev/hdg3

---[/mdadm.conf]---


replace all the
DEVICE ...
lines with a single

DEVICE partitions

and remove all the devices=... parts from the ARRAY lines.
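
a hedged sketch of the resulting file (uuids copied from the original
above):

  DEVICE partitions
  ARRAY /dev/md1 level=raid5 num-devices=3 UUID=09c58ab6:f706e37b:504cf890:1a597046
  ARRAY /dev/md2 level=raid1 num-devices=2 UUID=86210844:6abbf533:dc82f982:fe417066
  ARRAY /dev/md0 level=raid1 num-devices=2 UUID=da619c37:6c072dc8:52e45423:f4a58b7c
  ARRAY /dev/md4 level=raid1 num-devices=2 UUID=bfc30f9b:d2c21677:c4ae5f90:b2bddb75
  ARRAY /dev/md3 level=raid1 num-devices=2 UUID=fced78ce:54f00a78:8662e7eb:2ad01d0b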

L.



Re: Bitmaps Kernel Versions

2006-03-15 Thread Luca Berra

On Wed, Mar 15, 2006 at 09:08:17PM +1100, Neil Brown wrote:

On Wednesday March 15, [EMAIL PROTECTED] wrote:

Hi,

I'm planning to use bitmaps on some of our RAID1 arrays.

I'm wondering how bitmaps are handled by older kernels.

Eg: I create a raid array with a bitmap under a 2.6.15 kernel.

I now want to boot under 2.6.12, or even 2.4


How is it handled?
Will it work even if this is my / partition?


An older kernel will not notice the bitmap and will behave
'normally'. 


strange,
last time i tried an older kernel would refuse to activate an md with a
bitmap on it.
I am far from home on a business trip and i don't have kernel-sources at
hand, but i seem to remember that the kernel was very strict about the
bitmap feature flag in the superblock.

L.



Re: raid5 wont restart after disk failure, then corrupts

2006-03-01 Thread Luca Berra

On Tue, Feb 28, 2006 at 10:08:11PM +, Chris Allen wrote:


Yesterday morning we had an io error on /dev/sdd1:

Feb 27 10:08:57 snap25 kernel: SCSI error : 0 0 3 0 return code = 0x1
Feb 27 10:08:57 snap25 kernel: end_request: I/O error, dev sdd, sector 50504271
Feb 27 10:08:57 snap25 kernel: raid5: Disk failure on sdd1, disabling device. 
Operation continuing on 7 devices

So, I shutdown the system and replaced drive sdd with a new one. 
When I powered up again, all was not well. The array wouldn't start:


Feb 27 13:36:02 snap25 kernel: md: md0: raid array is not clean -- starting 
background reconstruction



Feb 27 13:36:02 snap25 kernel: raid5: cannot start dirty degraded array for md0


something happened when you shut down the system and the superblock on
the drives was not updated


I tried assembling the array with --force, but this would produce exactly the
same results as above - the array would refuse to start.

QUESTION: What should I have done here? Each time I have tried this in the 
past, I

recreate the array with a missing drive in place of sdd.
mount your fs readonly (as ext2 in case it was ext3) and verify that all
data is readable.
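
a hedged sketch (the seven surviving member names are placeholders; keep
the original device order, chunk size and layout, with 'missing' in
sdd1's slot):

  mdadm --create /dev/md0 --level=5 --raid-devices=8 \
        /dev/sda1 /dev/sdb1 /dev/sdc1 missing /dev/sde1 /dev/sdf1 /dev/sdg1 /dev/sdh1
  mount -t ext2 -o ro /dev/md0 /mnt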


have had no problems restarting the array and adding the new disk. What had gone
wrong, and why wouldn't the array start?

something happened when you shut down the system and the superblock on
the drives was not updated


Then things went from bad to worse.


===
PROBLEM 2 - DATA CORRUPTION
===


1. Any idea what had happened here? Why didn't it notice that sdd1 was stale? 

something happened when you shut down the system and the superblock on
the drives was not updated


2. If I had let it complete its resync would it have sorted out the corruption? 

no

Or would it have made things worse?

possibly yes

L.



Re: Bigendian issue with mdadm

2006-02-20 Thread Luca Berra

On Tue, Feb 21, 2006 at 10:44:22AM +1100, Neil Brown wrote:

On Monday February 20, [EMAIL PROTECTED] wrote:

Hi All,

Please, Help !

I've created a raid5 array on a x86 platform, and now wish to use it
on a mac mini (g4 based). But the problem is : the first is
little-endian, the second big-endian...
And it seams like md superblock disk format is hostendian, so how
should I say mdadm to use a endianness ?



Read the man page several times?

Look for --update=byteorder

You need mdadm-2.0 or later.


besides, IIRC the version 1 superblock is always little-endian.
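
a hedged sketch of Neil's suggestion, run once on the big-endian box
(component devices are placeholders):

  mdadm --assemble /dev/md0 --update=byteorder /dev/sda3 /dev/sdb3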

L.



Re: quota on raid

2006-02-18 Thread Luca Berra

On Sat, Feb 18, 2006 at 02:12:26AM +0100, Antonello PAPA wrote:
i have two disks with raid 1. i can't find documents on how to set the
quotas.
i would like to set the quota on /home, can someone help? thanks and
sorry for bad english.
antonello 


in italian:
http://www.lnf.infn.it/computing/doc/AppuntiLinux/a296.html

L.



Re: quota on raid

2006-02-18 Thread Luca Berra

On Sat, Feb 18, 2006 at 09:42:04AM +0100, Luca Berra wrote:

On Sat, Feb 18, 2006 at 02:12:26AM +0100, Antonello PAPA wrote:
i have two disks with raid 1. i can't find documents on how to set the
quotas.
i would like to set the quota on /home, can someone help? thanks and
sorry for bad english.
antonello 


in italian:
http://www.lnf.infn.it/computing/doc/AppuntiLinux/a296.html


sorry,
this is the _current_ version
http://a2.swlibero.org/a2111.htm



Re: quota on raid

2006-02-18 Thread Luca Berra

On Sat, Feb 18, 2006 at 02:30:47PM +0100, Antonello PAPA wrote:

At 09.49 18/02/06, you wrote:

On Sat, Feb 18, 2006 at 09:42:04AM +0100, Luca Berra wrote:

On Sat, Feb 18, 2006 at 02:12:26AM +0100, Antonello PAPA wrote:
i have two disks with raid 1. i can't find documents on how to set the
quotas.
i would like to set the quota on /home, can someone help? thanks and
sorry for bad english.

antonello


in italian:
http://www.lnf.infn.it/computing/doc/AppuntiLinux/a296.html

sorry,
this is the _current_ version
http://a2.swlibero.org/a2111.htm

i have used that document but it doesn't work. This is my fstab:

/dev/md1     /          ext3    defaults        1 1
/dev/md0     /boot      ext3    defaults        1 2
/dev/devpts  /dev/pts   devpts  gid=5,mode=620  0 0
/dev/shm     /dev/shm   tmpfs   defaults        0 0
/dev/md5     /home      ext3, usrquota, grpquota defaults   1 2


which is _not_ what the document instructs you to do.
read it again.
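
for reference, a sketch of the form the document prescribes (the quota
keywords belong in the mount-options field, not appended to the fs type):

  /dev/md5   /home   ext3   defaults,usrquota,grpquota   1 2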

in any case, configuring disk quotas is not the object of this mailing
list. it does not matter if you want to configure quotas on raid, quotas
are a filesystem feature, not a block device feature.

that said, you don't give any hint about your problem.
please take some time to read http://xoomer.virgilio.it/army1987k
before posting again.

L.



Re: 2+ raid sets, sata and a missing hd question

2006-02-16 Thread Luca Berra

On Wed, Feb 15, 2006 at 06:44:29PM -0500, Mark Hahn wrote:

hint
echo DEVICE partitions > /etc/mdadm.conf
mdadm -Esc partitions | grep ARRAY >> /etc/mdadm.conf

All md partitions are of type fd (Linux raid autodetect).

this is surprisingly not at all relevant


I'm not sure about that.  I've accidentally shuffled disks before,
and 0xfd resulted in all the MD's coming to life again.  I never 
use /etc/mdadm.conf, since there doesn't seem to be much point...


i did not speak about in-kernel autodetection.

don't cc me on list replies unless it is really important



Re: 2+ raid sets, sata and a missing hd question

2006-02-14 Thread Luca Berra

On Wed, Feb 15, 2006 at 01:45:21PM +1100, CaT wrote:

Seeing as how SATA drives can move around if one removes one from a set
(ie given sda, sdb, sdc, if sdb was removed sdc drops to sdb) would md6
come back up without problems if I were to remove either sda or sdb.


if you configured mdadm correctly, you will have no problem :)

hint
echo DEVICE partitions > /etc/mdadm.conf
mdadm -Esc partitions | grep ARRAY >> /etc/mdadm.conf


All md partitions are of type fd (Linux raid autodetect).


this is surprisingly not at all relevant

L.



Re: Question: array locking, possible?

2006-02-13 Thread Luca Berra

On Mon, Feb 13, 2006 at 06:52:47PM +0100, Chris Osicki wrote:


Luca

On Thu, 9 Feb 2006 21:48:48 +0100
Luca Berra [EMAIL PROTECTED] wrote:


On Thu, Feb 09, 2006 at 10:28:58AM -0800, Stern, Rick (Serviceguard Linux) 
wrote:
There is more interest, just not vocal.

May want to look at LVM2 and its ability to use tagging to control enablement 
of VGs. This way it is not HW dependent.

I believe there is space in the md v1 superblock for a cluster/exclusive
flag; if not, the name field could be used


Great if there is space for it there is a hope.
Unfortunately I don't think my programming skills are up to
such a task as making proof-of-concept patches.


i was thinking of adding a bit to the feature_map flags to enable this
kind of behaviour; the downside is that kernel-space code has to
be updated to account for this flag, as it is for anything in the
superblock except for name.

Neil, what would you think of reserving some more space in the superblock for
other data which can be used from user-space?

i believe playing with name is a kludge.


what is missing is an interface between mdadm and cmcld so mdadm can ask
cmcld permission to activate an array with the cluster/exclusive flag
set.


For the time being we could live without it. I'm convinced HP would
make use of it once it's there.


i was thinking of something like a socket-based interface between mdadm and
a generic cluster daemon, not necessarily cmcld.


And I wouldn't say mdadm should get permission from cmcld (for those
who don't know Service Guard cluster software from HP: cmcld is
the Cluster daemon). IMHO cmcld should clear the flag on the array
when initiating a fail-over in case the host which used it crashed.

no, i don't like the flag to be cleared, there is too much space for a
race. The flag should be permanent (unless it is forcibly removed with
mdadm --grow).


Once again, what I would like it for is for preventing two hosts writing
the array at the same time because I accidentally activated it.
Without cmcld's awareness of the cluster/exclusive flag I would
always run mdadm with the '--force' option to enable the array during
package startup, because if I trust the cluster software I know the
fail-over is happening because the other node crashed or it is a
manual (clean) fail-over. 


if you only want this, it could be implemented entirely in mdadm, just by
adding an 'exclusive' flag to the ARRAY line in mdadm.conf.
this is not foolproof, as it will only prevent mdadm -As from assembling
such a device; providing the identification information on the command line,
or running something like mdadm -Asc partitions, will fool it.
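
a sketch of how such a line could look; note the 'exclusive' keyword is
the hypothetical flag from the proof-of-concept patch below, not an
existing mdadm.conf option:

  ARRAY /dev/md0 UUID=<array-uuid> exclusive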


diff -urN mdadm-2.3.1/Assemble.c mdadm-2.3.1.exclusive/Assemble.c
--- mdadm-2.3.1/Assemble.c  2006-01-25 08:01:10.0 +0100
+++ mdadm-2.3.1.exclusive/Assemble.c2006-02-13 22:48:04.0 +0100
@@ -34,7 +34,7 @@
 mddev_dev_t devlist,
 int readonly, int runstop,
 char *update,
-int verbose, int force)
+int verbose, int force, int exclusive)
 {
/*
 * The task of Assemble is to find a collection of
@@ -255,6 +255,15 @@
continue;
}
 
+	if (ident->exclusive != UnSet &&
+	    !exclusive ) {
+		if ((inargv && verbose >= 0) || verbose > 0)
+			fprintf(stderr, Name ": %s can be activated in exclusive mode only.\n",
+				devname);
+		continue;
+	}
+
+
/* If we are this far, then we are commited to this device.
 * If the super_block doesn't exist, or doesn't match others,
 * then we cannot continue
diff -urN mdadm-2.3.1/ReadMe.c mdadm-2.3.1.exclusive/ReadMe.c
--- mdadm-2.3.1/ReadMe.c2006-02-06 05:09:35.0 +0100
+++ mdadm-2.3.1.exclusive/ReadMe.c  2006-02-13 22:27:26.0 +0100
@@ -147,6 +147,7 @@
 {"scan",  0, 0, 's'},
 {"force", 0, 0, 'f'},
 {"update",   1, 0, 'U'},
+{"exclusive", 0, 0, 'x'},
 
 /* Management */
 {add,   0, 0, 'a'},
diff -urN mdadm-2.3.1/config.c mdadm-2.3.1.exclusive/config.c
--- mdadm-2.3.1/config.c2005-12-09 06:00:47.0 +0100
+++ mdadm-2.3.1.exclusive/config.c  2006-02-13 22:23:02.0 +0100
@@ -286,6 +286,7 @@
mis.st = NULL;
mis.bitmap_fd = -1;
mis.name[0] = 0;
+   mis.exclusive = 0;
 
for (w=dl_next(line); w!=line; w=dl_next(w)) {
if (w[0] == '/') {
@@ -386,6 +387,8 @@
 				fprintf(stderr, Name ": auto type of \"%s\" ignored for %s\n",
 					w+5, mis.devname?mis.devname:"unlabeled-array");
}
+   } else if (strncasecmp(w

Re: Question: array locking, possible?

2006-02-13 Thread Luca Berra

On Mon, Feb 13, 2006 at 10:53:43PM +0100, Luca Berra wrote:


diff -urN mdadm-2.3.1/Assemble.c mdadm-2.3.1.exclusive/Assemble.c


please note that the patch was written while i was composing the email,
as a proof of concept; it should not be considered working (or even
compiling) code

L.



Re: Lilo append= , A suggestion .

2006-02-13 Thread Luca Berra

On Mon, Feb 13, 2006 at 09:12:42PM -0700, Mr. James W. Laferriere wrote:

Hello Neil  All ,
I'll bet I am going to get harassed over this , but ...

The present form (iirc) of the lilo append statement is

append=md=d0,/dev/sda,/dev/sdb

	I am wondering how difficult the below would be to code?
	This allows a (relatively) short string to be append'd
	instead of the sometimes large listing of devices.

append=md=d0,UUID=e9e0f605:9ed694c2:3e2002c9:0415c080

	Ok, I got my asbestos britches on.  Have at it ;-).
Tia ,  JimL

what about all the past threads about in-kernel autodetection?

L.



Re: heavy problem with raid initialisation

2006-02-10 Thread Luca Berra

On Fri, Feb 10, 2006 at 11:11:24AM +0100, Guillaume Rousse wrote:

Hello.

I'm using software raid with mdadm 1.7.0 on a mandrake linux 10.1, but
I'm facing heavy initialisation troubles. The first array /dev/md0 is
automatically created and launched at startup (though mdadm -As in init
scripts), but not the second array /dev/md1.

mdadm --examine --scan --config=partitions creates the second array as
/dev/.tmp.md1, which I can then assemble using an explicit mdadm -A
/dev/sda2 /dev/sdb2 command, but this is impractical and far from failproof :/



i think this issue was squashed in a newer version of mdadm.
can you try rebuilding the current cooker rpm on 10.1 and trying again?
i don't have a 10.1 lying around to test atm.

L.



Re: Question: array locking, possible?

2006-02-09 Thread Luca Berra

On Thu, Feb 09, 2006 at 10:28:58AM -0800, Stern, Rick (Serviceguard Linux) 
wrote:

There is more interest, just not vocal.

May want to look at LVM2 and its ability to use tagging to control enablement 
of VGs. This way it is not HW dependent.


I believe there is space in the md v1 superblock for a cluster/exclusive
flag; if not, the name field could be used
what is missing is an interface between mdadm and cmcld so mdadm can ask
cmcld permission to activate an array with the cluster/exclusive flag
set.

L.



Re: [klibc] Re: Exporting which partitions to md-configure

2006-02-07 Thread Luca Berra

On Mon, Feb 06, 2006 at 06:47:54PM -0800, H. Peter Anvin wrote:

Neil Brown wrote:
Requiring that mdadm.conf describes the actual state of all volumes 
would be an enormous step in the wrong direction.  Right now, the Linux 
md system can handle some very oddball hardware changes (such as on 
hera.kernel.org, when the disks not just completely changed names due to 
a controller change, but changed from hd* to sd*!)

DEVICE partitions
ARRAY /dev/md0 UUID=xxyy:zzyy:aabb:ccdd

would catch that



Dynamicity is a good thing, although it needs to be harnessed.

 kernel parameter md_root_uuid=xxyy:zzyy:aabb:ccdd...
This could be interpreted by an initramfs script to run mdadm
to find and assemble the array with that uuid.  The uuid of
each array is reasonably unique.

I could change mdassemble to allow accepting an uuid on the command line
and assemble a /dev/md0 with the specified uuid (at the moment it only
accepts a configuration file, which i thought was enough for
initrd/initramfs).
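
for comparison, a hedged sketch of the same thing with plain mdadm today
(the uuid is the placeholder used above, devices are illustrative):

  mdadm --assemble /dev/md0 --uuid=xxyy:zzyy:aabb:ccdd /dev/sda1 /dev/sdb1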

This, in fact is *EXACTLY* what we're talking about; it does require 
autoassemble.  Why do we care about the partition types at all?  The 
reason is that since the md superblock is at the end, it doesn't get 
automatically wiped if the partition is used as a raw filesystem, and so 
it's important that there is a qualifier for it.

I don't like using the partition type as a qualifier; there are people who do
not wish to partition their drives, there are systems not supporting
msdos-like partitions, heck even m$ is migrating away from those.

In any case, if that has to be done it should be done in mdadm, not
in a different script that is going to call mdadm (behaviour should be
consistent between mdadm invoked by initramfs and mdadm invoked on a
running system).

If the user wants to reutilize a device that was previously a member of
an md array he/she should use mdadm --zero-superblock to remove the
superblock.
I see no point in having a system that tries to compensate for users not
following correct procedures. sorry.
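
for reference, a hedged sketch of that procedure (device name
illustrative):

  # check whether a leftover md superblock is present
  mdadm --examine /dev/sdb1
  # if so, and the device is to be reused raw, wipe it explicitly
  mdadm --zero-superblock /dev/sdb1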

L.



Re: Raid 1 always degrades after a reboot.

2006-02-07 Thread Luca Berra

On Mon, Feb 06, 2006 at 03:08:07PM -0800, Hans Rasmussen wrote:

Hi all.

After every reboot, my brand new Raid1 array comes up degraded.  It's 
always /dev/sdb1 that is unavailable or removed.

...

Mandriva 2006 download edition, upgraded from Mandrake 9.1
Kernel 2.6.12-15mdk (not rebuilt)
MDADM version 00.90.01

this is the version of the md driver in the kernel; the mdadm shipped
with Mandriva 2006 is 1.12.0

could you please check if the array is stopped correctly at shutdown?
maybe temporarily removing the splash thing from lilo/grub will allow
you to see more clearly.
Q. does the SATA300 TX2Plus card support fakeraid?
in case it does and it is enabled, that could conflict with md.

L.



Re: [klibc] Re: Exporting which partitions to md-configure

2006-02-07 Thread Luca Berra

On Tue, Feb 07, 2006 at 07:46:59AM -0800, H. Peter Anvin wrote:

Luca Berra wrote:


I don't like using partition type as a qualifier, there is people who do
not wish to partition their drives, there are systems not supporting
msdos like partitions, heck even m$ is migrating away from those.



That's why we're talking about non-msdos partitioning schemes.


this still leaves whole disks


If the user wants to reutilize a device that was previously a member of
an md array he/she should use mdadm --zero-superblock to remove the
superblock.
I see no point in having a system that tries to compensate for users not
following correct procedures. sorry.


You don't?  That surprises me... making it harder for the user to have 
accidental data loss sounds like a very good thing to me.


making it harder for the user is a good thing, but please not at the
expense of usability

the only way i see a user can have data loss is if
- an md array is stopped
- two different filesystems are created on the component devices
- these filesystems are filled with data, but not to the point of
 damaging the superblock
- then the array is started again.

if only one device is removed using mdadm the event counter would
prevent the array from being assembled again.

there are a lot of easier ways for shooting yourself in the feet :)

if we really want to be paranoid we should modify mkXXXfs to refuse
creating a filesystem if the device has an md superblock on it. (lvm2
tools are already able to ignore devices with md superblocks on them,
no clue about EVMS)

L.


Re: [klibc] Re: Exporting which partitions to md-configure

2006-02-07 Thread Luca Berra

On Tue, Feb 07, 2006 at 08:55:21AM -0800, H. Peter Anvin wrote:

Luca Berra wrote:


making it harder for the user is a good thing, but please not at the
expense of usability



What's the usability problem?


if we fail to support all partitioning schemes, or do not support
non-partitioned devices.

if we manage to support all this without too much code bloat i'll shut
up.

L.



Re: hard drives with variable device names - mdadm raid assembly options setup

2005-12-31 Thread Luca Berra

On Sat, Dec 31, 2005 at 11:44:32AM +1100, Daniel Pittman wrote:

DEVICE /dev/hd*
DEVICE /dev/sd*


i really find
DEVICE partitions
to be more useful than shell patterns.

L.



Re: /proc/mdstat bug: 2.6.14.2

2005-12-01 Thread Luca Berra

On Tue, Nov 29, 2005 at 11:35:18AM -0800, Andrew Burgess wrote:

The time and speed display for resync is wrong, the recovery numbers are fine.
The resync is actually running at a few MB/sec.

md1 : active raid6 sdn1[8](S) sde1[9] sdq1[0] sdu1[6] sdo1[5] sdaa3[4] sdab1[2] 
sds1[1]
 1757815296 blocks level 6, 128k chunk, algorithm 2 [8/6] [UUU_UUU_]
 [>...................]  recovery =  3.6% (10616704/292969216) 
finish=840.3min speed=5597K/sec
 
md0 : active raid6 sdac2[0] sdz1[4] sdy1[2] sdx1[1] sdw1[3] sdv1[5] sdr2[6]

 1875299328 blocks level 6, 128k chunk, algorithm 2 [8/7] [UUU_]
 [======>.............]  resync = 33.1% (103563392/312549888) 
finish=1.5min speed=2288625K/sec

This is an amd64 x2 but running in single-processor mode because of all the
timer problems with dual cpus


do you have powernow modules loaded on this kernel? they might be
tricking your internal clock.

L.



Re: question regarding multipath Linux 2.6

2005-09-02 Thread Luca Berra

On Thu, Sep 01, 2005 at 02:51:44PM -0400, Jim Faulkner wrote:


Hello,

Recently my department had a SAN installed, and I am in the process of 
setting up one of the first Linux machines connected to it.  The machine 
is running Red Hat Enterprise AS4 (x86_64), which uses Linux kernel 
version 2.6.9-11.ELsmp.

giving more info about the infamous SAN would help :)


The SAN shows up twice in the kernel, as /dev/sdb and /dev/sdc.  /dev/sdb 
is inaccessible (I get a bunch of "Buffer I/O error on device sdb" kernel 
errors), but /dev/sdc works fine.  According to the administrator of the 

it probably is a cheapo storage with an Active/Passive storage
controller, you cannot use md to handle those.

He told me to use PowerPath, but I'd rather not have to reinstall or 

it has been a long time since i last saw powerpath on linux, but i am in
favour of ditching proprietary multipath solutions in favour of free ones.



what you want is multipath-tools http://christophe.varoqui.free.fr/
RH4 should already include a multipath-tools rpm.

Regards,
Luca



Re: Can uuid of raid array be changed?

2005-04-19 Thread Luca Berra
On Mon, Apr 18, 2005 at 08:05:22PM -0500, John McMonagle wrote:
Luca Berra wrote:
On Sun, Apr 17, 2005 at 05:04:13PM -0500, John McMonagle wrote:
Need to duplicate some computers that are using raid 1.
I was thinking of just adding an extra drive and then moving 
it to the new system. The only problem is the clones will all have 
the same uuids.  If at some later date the drives got mixed up I 
could see a possibilities for disaster.  Not exactly likely as the 
computers will be in different cities.

Is there a way to change the uuid if a raid array?
Is it really worth worrying about?
you can recreate the array, this will not damage existing data.
L.
Thanks
I'll try it.
I suspect I'll find out real quick, but do you need to do a 
--zero-superblock on all devices making up the raid arrays?
NO
Will this damage the lvm2 superblock info?
Probably a good idea to do a vgcfgbackup just to be safe..
NO
the idea is after you cloned the drive, create a new array with the
force flag and using as components the cloned disk and the magic word
missing, this will create a new degraded array and won't touch any
data.
you can then hotadd a new drive to this array, it will fill the slot
used by the missing keyword.
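a hedged sketch, assuming the cloned disk shows up as /dev/sda1 on the
new box and /dev/sdb1 is the blank drive:

  # new array, new uuid; 'missing' holds the empty slot, data is untouched
  mdadm --create /dev/md0 --force --level=1 --raid-devices=2 /dev/sda1 missing
  # then fill the empty slot
  mdadm /dev/md0 --add /dev/sdb1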
L.


Re: Can uuid of raid array be changed?

2005-04-18 Thread Luca Berra
On Sun, Apr 17, 2005 at 05:04:13PM -0500, John McMonagle wrote:
Need to duplicate some computers that are using raid 1.
I was thinking of just adding an extra drive and then moving it 
to the new system. The only problem is the clones will all have the same 
uuids.  If at some later date the drives got mixed up I could see a 
possibilities for disaster.  Not exactly likely as the computers will be 
in different cities.

Is there a way to change the uuid if a raid array?
Is it really worth worrying about?
you can recreate the array, this will not damage existing data.
L.


Re: interesting failure scenario

2005-04-04 Thread Luca Berra
On Mon, Apr 04, 2005 at 01:59:09AM +0400, Michael Tokarev wrote:
I just come across an interesting situation, here's the
scenario.
C'mon, there's plenty of ways for you to shoot yourself in the feet. :)
L.


Re: RAID1 and data safety?

2005-03-29 Thread Luca Berra
On Tue, Mar 29, 2005 at 01:29:22PM +0200, Peter T. Breuer wrote:
Neil Brown [EMAIL PROTECTED] wrote:
Due to the system crash the data on hdb is completely ignored.  Data
Neil - can you explain the algorithm that stamps the superblocks with
an event count, once and for all? (until further amendment :-).
IIRC it is updated at every event (start, stop, add, remove, fail etc...)
It goes without saying that sb's are not stamped at every write, and the
event count is not incremented at every write, so when and when?
the event count is not incremented at every write, but the dirty flag
is set, and it is cleared lazily after some idle time.
in older code it was set at array start and cleared only at stop.
so in case of a disk failure the other disks get updated about the
failure.
in case of a restart (crash) the array will be dirty and a coin tossed
to choose which mirror to use as an authoritative source (the coin is
biased, but it doesn't matter). At this point any possible parallel
reality is squashed out of existence.
L.


Re: [PATCH 1/2] md bitmap bug fixes

2005-03-23 Thread Luca Berra
On Tue, Mar 22, 2005 at 11:02:16AM +0100, Peter T. Breuer wrote:
Luca Berra [EMAIL PROTECTED] wrote:
If we want to do data-replication, access to the data-replicated device
should be controlled by the data replication process (*), md does not
guarantee this.
Well, if one writes to the md device, then md does guarantee this - but
I find it hard to parse the statement. Can you elaborate a little in
order to reduce my possible confusion?
I'll try
in a fault-tolerant architecture where we have two systems, each with a
local storage which is exposed to the other system via nbd or similar.
One node is active and writes data to an md device composed from the
local storage and the nbd device.
The other node is stand-by and ready to take the place of the former in
case it fails.
I assume the data replication is synchronous at the moment (the write system
call returns when io has been submitted to both the underlying devices) (*) 

we can have a series of failures which must be accounted for and dealt
with according to a policy that might be site specific.
A) Failure of the standby node
 A.1) the active is allowed to continue in the absence of a data replica
 A.2) disk writes from the active should return an error.
 we can configure this setting in advance.
B) Failure of the active node
 B.1) the standby node takes immediately ownership of data and resumes
 processing
 B.2) the standby node remains idle
C) communication failure between the two nodes (and we don't have an
external mechanism to arbitrate the split brain condition)
 C.1) both system panic and halt
 C.2) A1 + B2
 C.3) A2 + B2
 C.4) A1 + B1
 C.5) A2 + B1 (which hopefully will go to A2 itself)
D) communication failure between the two nodes (admitting we have an
external mechanism to arbitrate the split brain condition)
 D.1) A1 + B2
 D.2) A2 + B2
 D.3) B1 then A1
 D.4) B1 then A2
E) rolling failure (C, then B)
F) rolling failure (D, then B)
G) a failed nodes is restored
H) a node (re)starts while the other is failed
I) a node (re)starts during C
J) a node (re)starts during D
K) a node (re)starts during E
L) a node (re)starts during F
scenarios without sub-scenarios are left as an exercise to the reader,
or i might find myself losing a job :)
now evaluate all scenarios under the following drivers:
1) data availability above all others
2) replica of data above all others
3) data availability above replica, but data consistency above
availability
(*) if you got this far, add asynchronous replicas to the picture.
Regards,
Luca


Re: [PATCH 1/2] md bitmap bug fixes

2005-03-22 Thread Luca Berra
On Mon, Mar 21, 2005 at 02:58:56PM -0500, Paul Clements wrote:
Luca Berra wrote:
On Mon, Mar 21, 2005 at 11:07:06AM -0500, Paul Clements wrote:
All I'm saying is that in a split-brain scenario, typical cluster 
frameworks will make two (or more) systems active at the same time. This 
I sincerely hope not.
Perhaps my choice of wording was not the best? I probably should have 
said, there is no foolproof way to guarantee that two systems are not 
active. Software fails, human beings make mistakes, and surely even 
STONITH devices can be misconfigured or can fail (or cannot be used for 
one reason or another).
well, careful use of an arbitrator node, possibly in a different
location, helps avoid split-brain, and stonith is a requirement
At any rate, this is all irrelevant given the second part of that email 
reply that I gave. You still have to do the bitmap combining, regardless 
of whether two systems were active at the same time or not.
I still maintain that doing data-replication with md over nbd is a
painful and not very useful exercise.
If we want to do data-replication, access to the data-replicated device
should be controlled by the data replication process (*), md does not
guarantee this.
(*) i.e. my requirements could be that having a replicated transaction
is more important than completing the transaction itself, so i might
want to return a disk error in case the replica fails.
or to the contrary i might want data availability above all else, maybe
data does not change much.
or something in between: data availability above replication, but
data validity over availability. this is probably the most common
scenario, and the most difficult to implement correctly.
In any case it must be possible to control exactly which steps should be
automatically done in case of failure. and in case of rollback, with the
sane default would be die rather than modify any data, in case of
doubt.
L.


Re: [PATCH 1/2] md bitmap bug fixes

2005-03-21 Thread Luca Berra
On Mon, Mar 21, 2005 at 11:07:06AM -0500, Paul Clements wrote:
All I'm saying is that in a split-brain scenario, typical cluster 
frameworks will make two (or more) systems active at the same time. This 
I sincerely hope not.


Re: md autodetection

2005-03-21 Thread Luca Berra
On Mon, Mar 21, 2005 at 09:06:44AM +0100, Nils-Henner Krueger wrote:
I've got a few questions about raid autodetection at boot time.
i usually have a single answer: don't use it.
but i see you are stuck with rh, so...

To use a mirrored boot device, I have to use raid autodetection
to have the md devices running early enough?
yes
To have autodection recognise a disk device, it has to run after
the appropriate device driver, right?
yes
In my current setup, running RHEL3 with kernel 2.4.21, I have
the system installed on a scsi disk. The scsi driver ist loaded
was that an _is_ or an _isn't_ ???
as module and added to the initrd.
Currently raid autodection takes place long before the scsi
device driver is loaded:
redhat's mkinitrd reruns autodetection after module loading.

Is there a way to run raid autodection (again?) later in the
boot sequence to detect not only IDE devices but other things
like scsi, too?
you'd better use mdadm instead.
i even wrote a tool, mdassemble, which is shipped with mdadm and is
suited for replacing autodetection in initrds.
Last question:  :-)
Is it possible to make a distinction between some md-devices which
should be detected and started by raid autodetection and others which
should be left untouched?
autodetection only considers type fd partitions; mdadm and raidstart
don't actually care.



Re: [PATCH 1/2] md bitmap bug fixes

2005-03-18 Thread Luca Berra
On Fri, Mar 18, 2005 at 02:42:55PM +0100, Lars Marowsky-Bree wrote:
The problem is for multi-nodes, both sides have their own bitmap. When a
split scenario occurs, and both sides begin modifying the data, that
bitmap needs to be merged before resync, or else we risk 'forgetting'
that one side dirtied a block.
on a side note, i am wondering what the difference would be between using
this approach within the md driver and DRBD?
L.


Re: Problem with auto-assembly on Itanium

2005-03-10 Thread Luca Berra
On Thu, Mar 10, 2005 at 11:03:44AM +0100, Jimmy Hedman wrote:
On Wed, 2005-03-09 at 17:43 +0100, Luca Berra wrote:
On Wed, Mar 09, 2005 at 11:28:48AM +0100, Jimmy Hedman wrote:
Is there any way i can make this work? Could it be doable with mdadm in
a initrd?

mdassemble was devised for this purpose.
create an /etc/mdadm.conf with
echo DEVICE partitions > /etc/mdadm.conf
/sbin/mdadm -D -b /dev/md0 | grep '^ARRAY' >> /etc/mdadm.conf
copy the mdadm.conf and mdassemble to initrd
make linuxrc run mdassemble.
So there is no way of doing it the same way i386 does it, ie scanning
the partitions and assembling the raid by itself? Is this a bug on the
itanium (GPT partition scheme) or is this intentional?
if you mean the in-kernel autodetect junk, you should only be happy it
does not work on your system, so you are not tempted to use it.
even on i386 it is badly broken, and i won't return on the subject.
it has been discussed on this list to boredom.
L.
btw, you don't need to cc me. i read the list.
L.

