Re: Care and feeding of RAID?

2006-09-05 Thread Gordon Henderson
On Tue, 5 Sep 2006, Paul Waldo wrote:

> Gordon Henderson wrote:
> > On Tue, 5 Sep 2006, Steve Cousins wrote:
> [snip]
> > and my weekly badblocks script looks like:
> >
> > #!/bin/csh
> >
> > echo "`uname -n`: Badblocks test starting at [`date`]"
> >
> > foreach disk ( a c )
> >   foreach partition ( 1 2 3 5 6 )
> > echo -n "hd$disk${partition}: "
> > badblocks -c 128 /dev/hd$disk$partition
> >   end
> >   echo ""
> > end
> >
> > echo "`uname -n`: Badblocks test   ending at [`date`]"
> [snip]
>
> Maybe I'm missing something, but are these partitions mounted?  Here's what I
> get when I do this on a mounted partition:
>
> [EMAIL PROTECTED] ~]# badblocks -nsv /dev/md0
> /dev/md0 is mounted; it's not safe to run badblocks!

Do not use the -n option... (and -s won't be much use in a cron job, nor
-v, probably!) -n puts badblocks into non-destructive read-write mode, which
writes to the device and can clash with the filesystem cache on a mounted
volume...

By reading the underlying drives you won't trigger a raid array failure
should you see a bad sector, which might give you time to do something
about it. There were some emails on this list some time back (a year or 2, 3?)
about badblocking the md? device - I imagine it might not read every block
of every device unless it was a raid-0 array...
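
For anyone wanting to do the same thing in plain sh against SATA components, a
minimal sketch (the device names are placeholders and must match the actual
array members):

   #!/bin/sh
   # read-only badblocks pass over each component device; nothing is written,
   # so it is safe to run while the md array on top is assembled and mounted
   for dev in /dev/sda1 /dev/sdb1 /dev/sdc1
   do
       echo "$dev:"
       badblocks -c 128 "$dev"
   done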

Gordon


Re: Care and feeding of RAID?

2006-09-05 Thread Luca Berra

On Tue, Sep 05, 2006 at 02:29:48PM -0400, Steve Cousins wrote:
> Benjamin Schieder wrote:
> > On 05.09.2006 11:03:45, Steve Cousins wrote:
> > > Would people be willing to list their setup? Including such things as
> > > mdadm.conf file, crontab -l, plus scripts that they use to check the
> > > smart data and the array, mdadm daemon parameters and anything else that
> > > is relevant to checking and maintaining an array?
> >
> > Personally, I use this script from cron:
> > http://shellscripts.org/project/hdtest

nice race :)

> I am checking this out and I see that you are the writer of this script.
> I'm getting errors when it comes to lines 76 and 86-90 about the
> arithmetic symbols. This is on a Fedora Core 5 system with bash version
> 3.1.7(1).

That is because the smartctl output has changed and the grep above returns
no number.

> I weeded out the smartctl command and tried it manually with no luck on
> my SATA /dev/sd? drives.

Which command?

> What do you (or others) recommend for SATA drives?

smartmontools and a recent kernel just work. You can also schedule SMART
self-tests with smartmontools, so you don't need cron scripts.
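
For the SATA drives behind libata, a one-off check from the shell might look
like this (a sketch only; /dev/sda is a placeholder, and the -d ata flag is the
same one used in the smartd.conf examples elsewhere in this thread):

   smartctl -d ata -i /dev/sda        # identify info: confirms SMART is reachable at all
   smartctl -d ata -H -A /dev/sda     # overall health verdict plus the attribute table
   smartctl -d ata -t short /dev/sda  # start a short self-test in the background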

L.

--
Luca Berra -- [EMAIL PROTECTED]
   Communication Media & Services S.r.l.
/"\
\ / ASCII RIBBON CAMPAIGN
 X  AGAINST HTML MAIL
/ \


Re: Care and feeding of RAID?

2006-09-05 Thread Richard Scobie
This is a timely thread for me, as I am about to set up a software RAID
10 (a striped pair of mirrors) on 4 x 500GB SATA drives.


Is there anything to watch for if I don't partition the drives at all? Or is it
safer to make one partition slightly smaller than the full drive (suggestions
of how much are welcome), to allow for possible size discrepancies with
replacements?


Also, as this is RAID0 on top of RAID1, I am wondering if there are any
special steps that need to be taken when maintaining the array (adding,
removing, rebuilding etc.), compared with a "single layer" RAID?
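
For what it is worth, a minimal sketch of both ways of getting there (device
names and chunk size are placeholders, and md's native raid10 personality is
the single-layer alternative):

   # two mirrors...
   mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sda1 /dev/sdb1
   mdadm --create /dev/md1 --level=1 --raid-devices=2 /dev/sdc1 /dev/sdd1
   # ...striped together
   mdadm --create /dev/md2 --level=0 --chunk=64 --raid-devices=2 /dev/md0 /dev/md1

   # or, in one layer:
   mdadm --create /dev/md0 --level=10 --raid-devices=4 /dev/sd[abcd]1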


Regards,

Richard


Re: Care and feeding of RAID?

2006-09-05 Thread Steve Cousins



Rev. Jeffrey Paul wrote:
> On Tue, Sep 05, 2006 at 11:03:45AM -0400, Steve Cousins wrote:
> > These are SATA drives and except for the one machine that has a 3Ware
> > 8506 card in it I haven't been able to get SMART programs to do anything
> > with these drives.  How do others deal with this?
>
> I use the tw_cli program to check up on my 3ware stuff.

Hi Jeffrey,

Thanks.  I use tw_cli too and I have scripted a check to see if it 
degrades but this doesn't help with checking for disk problems before 
they happen which SMART should help with.  As it happens, smartctl works 
with 3Ware SATA drives.  It is my other SATA drives that I'm unable to 
monitor.


Steve




> It took me quite a bit of time to figure that one out.  I don't
> have any automated monitoring set up, but it'd be simple enough to
> script.  I check on the array every so often and run a verify every few
> months to see if it kicks a disk out (it hasn't yet).
>
> 0 [EMAIL PROTECTED]:~# tw_cli
> //datavibe> info
>
> Ctl   Model      Ports   Drives   Units   NotOpt   RRate   VRate   BBU
> -----------------------------------------------------------------------
> c0    8006-2LP   2       2        1       0        2       -       -
>
> //datavibe> info c0
>
> Unit  UnitType  Status  %Cmpl  Stripe  Size(GB)  Cache  AVerify  IgnECC
> -----------------------------------------------------------------------
> u0    RAID-1    OK      -      -       232.885   ON     -        -
>
> Port   Status   Unit   Size        Blocks      Serial
> -------------------------------------------------------------
> p0     OK       u0     232.88 GB   488397168   WD-WMAL718611
> p1     OK       u0     232.88 GB   488397168   WD-WMAL718619
>
> //datavibe>
>
> -j



--
__
 Steve Cousins, Ocean Modeling GroupEmail: [EMAIL PROTECTED]
 Marine Sciences, 452 Aubert Hall   http://rocky.umeoce.maine.edu
 Univ. of Maine, Orono, ME 04469Phone: (207) 581-4302




Re: Care and feeding of RAID?

2006-09-05 Thread Steve Cousins



Benjamin Schieder wrote:
> On 05.09.2006 11:03:45, Steve Cousins wrote:
> > Would people be willing to list their setup? Including such things as
> > mdadm.conf file, crontab -l, plus scripts that they use to check the
> > smart data and the array, mdadm daemon parameters and anything else that
> > is relevant to checking and maintaining an array?
>
> Personally, I use this script from cron:
> http://shellscripts.org/project/hdtest


Hi Benjamin,

I am checking this out and I see that you are the writer of this script.
I'm getting errors when it comes to lines 76 and 86-90 about the 
arithmetic symbols. This is on a Fedora Core 5 system with bash version 
3.1.7(1).   I weeded out the smartctl command and tried it manually with 
no luck on my SATA /dev/sd? drives.


What do you (or others) recommend for SATA drives?

Thanks,

Steve




Re: Care and feeding of RAID?

2006-09-05 Thread Rev. Jeffrey Paul
On Tue, Sep 05, 2006 at 11:03:45AM -0400, Steve Cousins wrote:
> 
> These are SATA drives and except for the one machine that has a 3Ware 
> 8506 card in it I haven't been able to get SMART programs to do anything 
> with these drives.  How do others deal with this? 
> 

I use the tw_cli program to check up on my 3ware stuff.

It took me quite a bit of time to figure that one out.  I don't
have any automated monitoring set up, but it'd be simple enough to
script.  I check on the array every so often and run a verify every few
months to see if it kicks a disk out (it hasn't yet).

0 [EMAIL PROTECTED]:~# tw_cli 
//datavibe> info

Ctl   Model      Ports   Drives   Units   NotOpt   RRate   VRate   BBU
-----------------------------------------------------------------------
c0    8006-2LP   2       2        1       0        2       -       -

//datavibe> info c0

Unit  UnitType  Status  %Cmpl  Stripe  Size(GB)  Cache  AVerify  IgnECC
-----------------------------------------------------------------------
u0    RAID-1    OK      -      -       232.885   ON     -        -

Port   Status   Unit   Size        Blocks      Serial
-------------------------------------------------------------
p0     OK       u0     232.88 GB   488397168   WD-WMAL718611
p1     OK       u0     232.88 GB   488397168   WD-WMAL718619
//datavibe>

-j

-- 

 Rev. Jeffrey Paul-datavibe- [EMAIL PROTECTED]
  aim:x736e65616b   pgp:0xD9B3C17D   phone:877-748-3467
   9440 0C7F C598 01CA 2F17  D098 0A3A 4B8F D9B3 C17D



Re: Care and feeding of RAID?

2006-09-05 Thread Paul Waldo

Gordon Henderson wrote:
> On Tue, 5 Sep 2006, Steve Cousins wrote:
> [snip]
>
> and my weekly badblocks script looks like:
>
> #!/bin/csh
>
> echo "`uname -n`: Badblocks test starting at [`date`]"
>
> foreach disk ( a c )
>   foreach partition ( 1 2 3 5 6 )
>     echo -n "hd$disk${partition}: "
>     badblocks -c 128 /dev/hd$disk$partition
>   end
>   echo ""
> end
>
> echo "`uname -n`: Badblocks test   ending at [`date`]"
>
> [snip]

Maybe I'm missing something, but are these partitions mounted?  Here's what I 
get when I do this on a mounted partition:


[EMAIL PROTECTED] ~]# badblocks -nsv /dev/md0
/dev/md0 is mounted; it's not safe to run badblocks!

If you are running RAID, is it safe to run badblocks on the underlying 
partition?


Re: Care and feeding of RAID?

2006-09-05 Thread Paul Waldo

Paul Waldo wrote:
> Hi all,
>
> I have a RAID6 array and I am wondering about care and feeding
> instructions :-)
>
> Here is what I currently do:
>    - daily incremental and weekly full backups to a separate machine
>    - run smartd tests (short once a day, long once a week)
>    - check the raid for bad blocks every week
>
> What else can I do to make sure the array keeps humming?  Thanks in advance!


What about bitmaps?  Nobody has mentioned them.  It is my understanding that 
you just turn them on with "mdadm /dev/mdX -b internal".  Any caveats for this?
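
For an array that already exists and is running, the man page points at the
--grow form; a rough sketch (/dev/md0 is a placeholder):

   mdadm --grow /dev/md0 --bitmap=internal   # add a write-intent bitmap in place
   cat /proc/mdstat                          # a "bitmap:" line should now show for md0

The usual trade-off cited is a small write-performance cost in exchange for
much faster resyncs after an unclean shutdown or a re-added member.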


Paul


Re: Care and feeding of RAID?

2006-09-05 Thread Paul Waldo

Steve Cousins wrote:
> Gordon Henderson wrote:
> > On Tue, 5 Sep 2006, Paul Waldo wrote:
> > > Hi all,
> > >
> > > I have a RAID6 array and I am wondering about care and feeding
> > > instructions :-)
> > >
> > > Here is what I currently do:
> > >    - daily incremental and weekly full backups to a separate machine
> > >    - run smartd tests (short once a day, long once a week)
> > >    - check the raid for bad blocks every week
> > >
> > > What else can I do to make sure the array keeps humming?  Thanks in
> > > advance!
> >
> > Stop fiddling with it :)
> >
> > I run similar stuff, but don't forget running mdadm in daemon mode to send
> > you an email should a drive fail. I also check each device individually,
> > rather than the array, although I don't know the value of doing this over
> > the SMART tests on modern drives...
>
> Would people be willing to list their setup? Including such things as
> mdadm.conf file, crontab -l, plus scripts that they use to check the
> smart data and the array, mdadm daemon parameters and anything else that
> is relevant to checking and maintaining an array?
>
> I'm running the mdmonitor script at startup and a sample mdadm.conf
> (one of 3 machines) looks like:
>
> MAILADDR [EMAIL PROTECTED]
> ARRAY /dev/md0 level=raid5 num-devices=3 UUID=39d07542:f3c97e69:fbb63d9d:64a052d3 devices=/dev/sdb1,/dev/sdc1,/dev/sdd1
>
> These are SATA drives and except for the one machine that has a 3Ware
> 8506 card in it I haven't been able to get SMART programs to do anything
> with these drives.  How do others deal with this?
>
> Thanks,
>
> Steve



Excellent idea, Steve.

In my crontab, I have this:
# Check RAID arrays for bad blocks once a week
30 2 * * Tue echo check >> /sys/block/md0/md/sync_action ; echo "Checking md0 bad blocks"
30 2 * * Wed echo check >> /sys/block/md1/md/sync_action ; echo "Checking md1 bad blocks"

I have this in my smartd.conf:
/dev/hda -H -m root -S on -o on -I 194 -s (S/../.././02|L/../../6/03)
/dev/hdc -H -m root -S on -o on -I 194 -s (S/../.././02|L/../../6/03)
/dev/hde -H -m root -S on -o on -I 194 -s (S/../.././02|L/../../6/03)
/dev/hdg -H -m root -S on -o on -I 194 -s (S/../.././02|L/../../6/03)
/dev/sda -d ata -H -m root -S on -o on -I 194 -s (S/../.././02|L/../../6/03)
/dev/sdb -d ata -H -m root -S on -o on -I 194 -s (S/../.././02|L/../../6/03)
/dev/sdc -d ata -H -m root -S on -o on -I 194 -s (S/../.././02|L/../../6/03)

My Fedora Core box has this in /etc/init.d/mdmonitor:
daemon --check --user=root mdadm ${OPTIONS}
where OPTIONS="--monitor --scan -f --pid-file=/var/run/mdadm/mdadm.pid"


I have no mdadm.conf.  My entire filesystem consists of md0 (/boot) and md1 (/).
I figure that if I have problems and need the file, it won't be available anyway.
If I am mistaken, please do let me know!
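
For what it is worth, the ARRAY lines can be regenerated from the superblocks
themselves, so a lost config file is recoverable from any rescue environment
that has mdadm (a sketch, nothing box-specific):

   mdadm --examine --scan   # ARRAY lines read back from the component superblocks
   mdadm --detail --scan    # the same information, taken from the running arrays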

Any other suggestions would be welcomed!

Paul


Re: Care and feeding of RAID?

2006-09-05 Thread Gordon Henderson
On Tue, 5 Sep 2006, Steve Cousins wrote:

> Would people be willing to list their setup? Including such things as
> mdadm.conf file, crontab -l, plus scripts that they use to check the
> smart data and the array, mdadm daemon parameters and anything else that
> is relevant to checking and maintaining an array?

I don't have any mdadm.conf files ... What am I missing? (I've always been
under the impression that after needing the /etc/raidtab file with the old
raidtools, you didn't need a config file as such under mdadm...  However,
I'm willing to be enlightened!)

For checking the smart stuff, I use the standard Debian packages and a
smartd.conf file typically looks like:

#DEVICESCAN

/dev/hda -d ata -o on -S on -a -m [EMAIL PROTECTED] -s (S/../.././04|L/../../1/20) -M daily -M test
/dev/hdc -d ata -o on -S on -a -m [EMAIL PROTECTED] -s (S/../.././04|L/../../1/20) -M daily
/dev/hde -d ata -o on -S on -a -m [EMAIL PROTECTED] -s (S/../.././04|L/../../1/20) -M daily
/dev/hdi -d ata -o on -S on -a -m [EMAIL PROTECTED] -s (S/../.././04|L/../../1/20) -M daily

The running mdadm in monitor mode looks like:

  /sbin/mdadm -F -i /var/run/mdadm.pid -m root -f -s

and my weekly badblocks script looks like:

#!/bin/csh

echo "`uname -n`: Badblocks test starting at [`date`]"

foreach disk ( a c )
  foreach partition ( 1 2 3 5 6 )
echo -n "hd$disk${partition}: "
badblocks -c 128 /dev/hd$disk$partition
  end
  echo ""
end

echo "`uname -n`: Badblocks test   ending at [`date`]"

I do loads of stuff with disk temperatures (when I can), etc. but that's
just for making pretty graphs I can point my customers at... (e.g.
http://lion.drogon.net/mrtg/diskTemp.html - see if you can spot when that data
centre upgraded their AC ;-)

Gordon


Re: Care and feeding of RAID?

2006-09-05 Thread Gordon Henderson
On Tue, 5 Sep 2006, Patrik Jonsson wrote:

> mtbf seems to have an exponential dependence on temperature, so it pays
> off to keep temp down. Exactly what temp you consider safe is
> individual, but my drives only occasionally go above 40C.

I had a pair (2 x Hitachi IDE 80GB) that ran in a sealed case at the top
of a lift-shaft for 2 years. They averaged 55C... I never got to see the
box after it was decommissioned...

Gordon


Re: checking md device parity (forced resync) - is it necessary?

2006-09-05 Thread Luca Berra

On Tue, Sep 05, 2006 at 02:00:03PM +0200, Tomasz Chmielewski wrote:

> # by default, run at 01:06 on the first Sunday of each month.
> 6 1 1-7 * 7 root [ -x /usr/share/mdadm/checkarray ] && /usr/share/mdadm/checkarray --cron --all --quiet
>
> However, it will run at 01:06, on 1st-7th day of each month, and on
> Sundays (Debian etch).

hihihi
day-of-month and day-of-week are OR-ed in crontab when both fields are restricted.
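
A common workaround (a sketch, not the actual Debian fix) is to restrict only
the weekday field and test the day of month inside the command, so the job
really does fire only on the first Sunday:

   6 1 * * 0 root [ -x /usr/share/mdadm/checkarray ] && [ "$(date +\%d)" -le 7 ] && /usr/share/mdadm/checkarray --cron --all --quiet

(The backslash before % is needed because % is special in crontab lines.)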


L.

--
Luca Berra -- [EMAIL PROTECTED]
   Communication Media & Services S.r.l.
/"\
\ / ASCII RIBBON CAMPAIGN
 X  AGAINST HTML MAIL
/ \


Re: Care and feeding of RAID?

2006-09-05 Thread Patrik Jonsson
On 05.09.2006 08:48:44, Paul Waldo wrote:
>> Hi all,
>>
>> I have a RAID6 array and I am wondering about care and feeding instructions :-)
>>
>> Here is what I currently do:
>>- daily incremental and weekly full backups to a separate machine
>>- run smartd tests (short once a day, long once a week)
>>- check the raid for bad blocks every week
>>
>> What else can I do to make sure the array keeps humming?  Thanks in advance!
>> 
Make sure the drives are adequately cooled. I use this nifty utility to
look at my drive temps:
http://martybugs.net/linux/hddtemp.cgi

MTBF seems to have an exponential dependence on temperature, so it pays
off to keep temperatures down. Exactly what temperature you consider safe is
an individual call, but my drives only occasionally go above 40C.
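
For a quick one-shot reading (device names are placeholders; hddtemp is the
utility linked above, and smartctl reads the same attribute directly):

   hddtemp /dev/sda /dev/sdb
   smartctl -d ata -A /dev/sda | grep -i temperature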

cheers,

/Patrik






Re: RAID5 producing fake partition table on single drive

2006-09-05 Thread Luca Berra

On Mon, Sep 04, 2006 at 01:55:52PM -0400, Bill Davidsen wrote:
> Doug Ledford wrote:
> > It's the mount program collecting possible LABEL= data on the partitions
> > listed in /proc/partitions, of which sde3 is outside the valid range for
> > the drive.
>
> May I belatedly say that this is sort-of a kernel issue, since
> /proc/partitions reflects invalid data? Perhaps a boot option like
> nopart=sda,sdb or similar would be in order?

I would move the partition detection code to user space completely, so it
could be run selectively on the drives that do happen to have a partition
table.

A compromise could be having mdadm (or the script that starts mdadm at
boot time) issue an ioctl(fd, BLKPG, ...) to make the kernel forget about
any partition table it might have misdetected.
L.

--
Luca Berra -- [EMAIL PROTECTED]
   Communication Media & Services S.r.l.
/"\
\ / ASCII RIBBON CAMPAIGN
 X  AGAINST HTML MAIL
/ \


Re: Care and feeding of RAID?

2006-09-05 Thread Mike Hardy


Steve Cousins wrote:

> MAILADDR [EMAIL PROTECTED]
> ARRAY /dev/md0 level=raid5 num-devices=3
> UUID=39d07542:f3c97e69:fbb63d9d:64a052d3
> devices=/dev/sdb1,/dev/sdc1,/dev/sdd1

If you list the devices explicitly, you're opening the possibility for
errors when the devices are re-ordered following insertion (or removal)
of any other SATA or SCSI (or USB storage) device.

I think what you want is a "DEVICE partitions" line accompanied by ARRAY
lines that have the UUID attribute you've already got in there.
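
In other words, something along these lines (a sketch built from the ARRAY
line quoted above, with the devices= clause dropped):

   DEVICE partitions
   MAILADDR [EMAIL PROTECTED]
   ARRAY /dev/md0 level=raid5 num-devices=3 UUID=39d07542:f3c97e69:fbb63d9d:64a052d3

mdadm --detail --scan prints ARRAY lines in roughly this form for the arrays
that are currently running.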

-Mike


Re: Care and feeding of RAID?

2006-09-05 Thread Benjamin Schieder
On 05.09.2006 11:03:45, Steve Cousins wrote:
> Would people be willing to list their setup? Including such things as 
> mdadm.conf file, crontab -l, plus scripts that they use to check the 
> smart data and the array, mdadm daemon parameters and anything else that 
> is relevant to checking and maintaining an array? 

Personally, I use this script from cron:
http://shellscripts.org/project/hdtest

0 3 * * * root /root/sbin/hdtest.sh -l /var/log/smart_ata-ST3250624A_4ND33CLT.log /dev/disk/by-id/ata-ST3250624A_4ND33CLT short
1 3 * * * root /root/sbin/hdtest.sh -l /var/log/smart_ata-ST3250624A_4ND33EJE.log /dev/disk/by-id/ata-ST3250624A_4ND33EJE short
2 3 * * * root /root/sbin/hdtest.sh -l /var/log/smart_ata-ST3250624A_4ND33ELA.log /dev/disk/by-id/ata-ST3250624A_4ND33ELA short

I have found that long tests slow the RAID down to the point where the
system becomes unusable.

My mdadm.conf is like this:
---
DEVICE partitions
ARRAY /dev/md/0 level=raid1 num-devices=3 UUID=3559ffcf:14eb9889:3826d6c2:c13731d7
ARRAY /dev/md/1 level=raid5 num-devices=3 UUID=649fc7cc:d4b52c31:240fce2c:c64686e7
ARRAY /dev/md/2 level=raid5 num-devices=3 UUID=9a3bf634:58f39e44:27ba8087:d5189766
   spares=1
ARRAY /dev/md/4 level=raid5 num-devices=3 UUID=d4799be3:5b157884:e38718c2:c05ab840
   spares=1
ARRAY /dev/md/5 level=raid5 num-devices=3 UUID=ca4a6110:4533d8d5:0e2ed4e1:2f5805b2
   spares=1

MAIL [EMAIL PROTECTED]
---

Also, I run

mdadm --monitor /dev/md/* --daemonise

from an init script.

Greetings,
Benjamin
-- 
 _  _ _   __   
| \| |___| |_| |_  __ _ __| |__
| .` / -_)  _| ' \/ _` / _| / /
|_|\_\___|\__|_||_\__,_\__|_\_\
| |  (_)_ _ _  ___ __
| |__| | ' \ || \ \ /
||_|_||_\_,_/_\_\
Play Nethack anywhere with an x86 computer:
http://www.crash-override.net/nethacklinux.html




Re: Care and feeding of RAID?

2006-09-05 Thread Steve Cousins

Gordon Henderson wrote:
> On Tue, 5 Sep 2006, Paul Waldo wrote:
> > Hi all,
> >
> > I have a RAID6 array and I am wondering about care and feeding instructions :-)
> >
> > Here is what I currently do:
> >    - daily incremental and weekly full backups to a separate machine
> >    - run smartd tests (short once a day, long once a week)
> >    - check the raid for bad blocks every week
> >
> > What else can I do to make sure the array keeps humming?  Thanks in advance!
>
> Stop fiddling with it :)
>
> I run similar stuff, but don't forget running mdadm in daemon mode to send
> you an email should a drive fail. I also check each device individually,
> rather than the array, although I don't know the value of doing this over
> the SMART tests on modern drives...

Would people be willing to list their setup? Including such things as 
mdadm.conf file, crontab -l, plus scripts that they use to check the 
smart data and the array, mdadm daemon parameters and anything else that 
is relevant to checking and maintaining an array? 

I'm running the mdmonitor script at startup and a sample mdadm.conf  
(one of 3 machines) looks like:


MAILADDR [EMAIL PROTECTED]
ARRAY /dev/md0 level=raid5 num-devices=3 
UUID=39d07542:f3c97e69:fbb63d9d:64a052d3 
devices=/dev/sdb1,/dev/sdc1,/dev/sdd1


These are SATA drives and except for the one machine that has a 3Ware 
8506 card in it I haven't been able to get SMART programs to do anything 
with these drives.  How do others deal with this? 


Thanks,

Steve


Re: RAID over Firewire

2006-09-05 Thread Bill Davidsen

Richard Scobie wrote:
> Bill Davidsen wrote:
> > It should work, but I don't like it... it leaves you with a lot of
> > exposure between backups.
> >
> > Unless your data change a lot, you might consider a good incremental
> > dump program to DVD or similar.
>
> Thanks. I have abandoned this option for various reasons, including
> people randomly unplugging the drives.
>
> Rsync to another machine is the current plan.


At one time I was evaluating doing RAID1 to an NBD on another machine,
using write-mostly to make it a one-way process. I had to redeploy the
hardware before I reached a conclusion, and it was with an older kernel,
so I simply throw it out for discussion.
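
For reference, a rough sketch of that kind of setup (hostname, port and device
names are placeholders; --write-mostly marks the network member so normal reads
are served from the local disk):

   nbd-client backuphost 2000 /dev/nbd0
   mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sda1 --write-mostly /dev/nbd0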


--
bill davidsen <[EMAIL PROTECTED]>
 CTO TMR Associates, Inc
 Doing interesting things with small computers since 1979



Re: Care and feeding of RAID?

2006-09-05 Thread Gordon Henderson
On Tue, 5 Sep 2006, Paul Waldo wrote:

> Hi all,
>
> I have a RAID6 array and I am wondering about care and feeding instructions :-)
>
> Here is what I currently do:
> - daily incremental and weekly full backups to a separate machine
> - run smartd tests (short once a day, long once a week)
> - check the raid for bad blocks every week
>
> What else can I do to make sure the array keeps humming?  Thanks in advance!

Stop fiddling with it :)

I run similar stuff, but don't forget running mdadm in daemon mode to send
you an email should a drive fail. I also check each device individually,
rather than the array, although I don't know the value of doing this over
the SMART tests on modern drives...

Gordon


Re: Care and feeding of RAID?

2006-09-05 Thread Benjamin Schieder
On 05.09.2006 08:48:44, Paul Waldo wrote:
> Hi all,
> 
> I have a RAID6 array and I am wondering about care and feeding instructions :-)
> 
> Here is what I currently do:
>- daily incremental and weekly full backups to a separate machine
>- run smartd tests (short once a day, long once a week)
>- check the raid for bad blocks every week
> 
> What else can I do to make sure the array keeps humming?  Thanks in advance!

The mdadm man-page has information about running mdadm from cron to check
for 'unusual' activity.
You may want to consider that. I run it as a daemon, personally.
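
For the cron variant, a one-shot scan instead of a long-running daemon would
be along these lines (a sketch; the schedule and the /etc/cron.d style line are
arbitrary):

   30 6 * * * root /sbin/mdadm --monitor --scan --oneshot

--oneshot makes mdadm check the arrays once, report anything degraded, and exit.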

Greetings,
Benjamin
-- 
#!/bin/sh #!/bin/bash #!/bin/tcsh #!/bin/csh #!/bin/kiss #!/bin/ksh
#!/bin/pdksh #!/usr/bin/perl #!/usr/bin/python #!/bin/zsh #!/bin/ash

Feel at home? Got some of them? Want to show some magic?

http://shellscripts.org




Care and feeding of RAID?

2006-09-05 Thread Paul Waldo

Hi all,

I have a RAID6 array and I am wondering about care and feeding instructions :-)

Here is what I currently do:
   - daily incremental and weekly full backups to a separate machine
   - run smartd tests (short once a day, long once a week)
   - check the raid for bad blocks every week

What else can I do to make sure the array keeps humming?  Thanks in advance!


Re: checking md device parity (forced resync) - is it necessary?

2006-09-05 Thread Tomasz Chmielewski

Neil Brown wrote:

(...)

> > Which starts a resync of drives. As one can imagine, resync of 800 GB on
> > a rather slow device (600 MHz ARM) can take 12 hours or so...
>
> I believe that was intended to be once a month, not once a day.
> Slight error in crontab.
>
> > So my question is: is this "daily forced resync" necessary?
>
> Daily is probably excessive, certainly on an array that size.
>
> Monthly is good.  Weekly might be justified on cheap (i.e. unreliable)
> drives and very critical data.
>
> With RAID, sleeping bad blocks can be bad.  If you hit one while
> recovering a failed drive, you have to put the piece back together by
> hand.
> A regular check can wake up those sleeping bad blocks.

Thanks a lot for the clarification.

> > When can one need to run a "daily forced resync", and in which
> > circumstances?
>
> As I said, I think the 'daily' is an error.  What exactly do you have
> in crontab??


Indeed, the crontab entry is wrong:

# by default, run at 01:06 on the first Sunday of each month.
6 1 1-7 * 7 root [ -x /usr/share/mdadm/checkarray ] && /usr/share/mdadm/checkarray --cron --all --quiet

However, it will run at 01:06, on 1st-7th day of each month, and on
Sundays (Debian etch).



--
Tomasz Chmielewski
http://wpkg.org


Re: large copy to mdadm array fails, marks readonly

2006-09-05 Thread Neil Brown
On Tuesday September 5, [EMAIL PROTECTED] wrote:
> hello all, first message to this group.
> 
> Ive recently created a 3 disk raid5 using mdadm. file copy to this
> array fails consistently after a few gigs. the array does not die
> completely, but rather it is marked readonly.

What kernel?

Have you tried running memtest?  It sounds a lot like a hardware
error, though there was a bug in 2.4.0 that could have caused this
sort of thing.

NeilBrown


Re: checking md device parity (forced resync) - is it necessary?

2006-09-05 Thread Neil Brown
On Tuesday September 5, [EMAIL PROTECTED] wrote:
> Lately I installed Debian on a Thecus n4100 machine.
> It's a 600 MHz ARM storage device, and has 4 x 400 GB drives.
> 
> I made Linux software RAID on these drives:
> - RAID-1  - ~1 GB for system (/)
> - RAID-1  - ~1 GB for swap
> - RAID-10 - ~798 GB for iSCSI storage
> 
> 
> I noticed that each day the device slows down; a quick investigation 
> discovered that Debian runs a "checkarray" script each night at 1 am 
> (via cron). The essence of "checkarray" script is basically this:
> 
> echo check > /sys/block/$dev/md/sync_action
> 
> Which starts a resync of drives. As one can imagine, resync of 800 GB on 
> a rather slow device (600 MHz ARM) can take 12 hours or so...
> 

I believe that was intended to be once a month, not once a day.
Slight error in crontab.

> 
> So my question is: is this "daily forced resync" necessary?

Daily is probably excessive, certainly on an array that size.

Monthly is good.  Weekly might be justified on cheap (i.e. unreliable)
drives and very critical data.

With RAID, sleeping bad blocks can be bad.  If you hit one while
recovering a failed drive, you have to put the piece back together by
hand.
A regular check can wake up those sleeping bad blocks.
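
For reference, a minimal sketch of scheduling such a check from cron and seeing
what it found (md0, the timing and the /etc/cron.d style line are placeholders):

   # 01:30 on the first day of each month: scrub md0
   30 1 1 * * root echo check > /sys/block/md0/md/sync_action

Progress shows up in /proc/mdstat while it runs, and afterwards
/sys/block/md0/md/mismatch_cnt holds a count of the inconsistencies found.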

> 
> Perhaps in some cases, yes, because someone wrote that tool which does 
> it daily.
> 
> On the other hand, if we consider Linux software RAID stable, such a 
> resync would be only needed in some rare situations.

It has little to do with the stability of Linux software RAID and a
lot to do with stability of modern disk drives.

> 
> When can one need to run a "daily forced resync", and in which 
> circumstances?

As I said, I think the 'daily' is an error.  What exactly do you have
in crontab??

NeilBrown


large copy to mdadm array fails, marks readonly

2006-09-05 Thread Alan Gibson

Hello all, first message to this group.

I've recently created a 3-disk RAID5 using mdadm. File copies to this
array fail consistently after a few gigs. The array does not die
completely, but rather it is marked read-only.

I created the array on 3 disks of equal partition size with:
sudo mdadm /dev/md0 --create -l5 --raid-devices=3 /dev/sda1 /dev/sdb1 /dev/sdc1

then formatted with:
mkfs.ext3 /dev/md0

badblocks shows no error on any of the drives.

[EMAIL PROTECTED]:~$ cat /proc/mdstat
Personalities : [raid5]
md0 : active raid5 sda1[0] sdb1[2] sdc1[1]
 234436352 blocks level 5, 64k chunk, algorithm 2 [3/3] [UUU]

unused devices: <none>

[EMAIL PROTECTED]:~$ dmesg
[17180879.388000] EXT3 FS on md0, internal journal
[17180879.392000] EXT3-fs: mounted filesystem with ordered data mode.
[17181702.872000] EXT3-fs error (device md0): ext3_new_block:
Allocating block in system zone - block = 38993924
[17181702.872000] Aborting journal on device md0.
[17181702.952000] __journal_remove_journal_head: freeing b_committed_data
[17181702.956000] EXT3-fs error (device md0) in ext3_prepare_write:
Journal has aborted
[17181702.96] ext3_abort called.
[17181702.96] EXT3-fs error (device md0): ext3_journal_start_sb:
Detected aborted journal
[17181702.96] Remounting filesystem read-only

I'm stumped, any ideas?

thanks much,
alan


checking md device parity (forced resync) - is it necessary?

2006-09-05 Thread Tomasz Chmielewski

Lately I installed Debian on a Thecus n4100 machine.
It's a 600 MHz ARM storage device, and has 4 x 400 GB drives.

I made Linux software RAID on these drives:
- RAID-1  - ~1 GB for system (/)
- RAID-1  - ~1 GB for swap
- RAID-10 - ~798 GB for iSCSI storage


I noticed that each day the device slows down; a quick investigation
discovered that Debian runs a "checkarray" script each night at 1 am
(via cron). The essence of the "checkarray" script is basically this:


echo check > /sys/block/$dev/md/sync_action

Which starts a resync of drives. As one can imagine, resync of 800 GB on 
a rather slow device (600 MHz ARM) can take 12 hours or so...



So my question is: is this "daily forced resync" necessary?

Perhaps in some cases, yes, because someone wrote that tool which does 
it daily.


On the other hand, if we consider Linux software RAID stable, such a
resync would only be needed in some rare situations.


When can one need to run a "daily forced resync", and in which 
circumstances?



--
Tomasz Chmielewski