Bug#556610: Please do incremental checks every night instead of a full monthly one

2014-01-07 Thread Sergey B Kirpichev
On Mon, Jan 06, 2014 at 03:14:14PM +1100, NeilBrown wrote:
> It is very unlikely to have a positive effect.

Well, at least one: we can simplify the incremental check
script drastically.

> If it has any effect, it will significantly slow down any check/repair etc
> that is happening.
>
> > I think it would be nice to end (not pause) the check once it has reached
> > sync_max.  Perhaps there are deep reasons why md's interface
> > doesn't work this way.  Neil, could you explain this a bit?
>
> There might be a reason to continue the resync.

Could you explain the reasons behind this interface?

> If you want to end the resync, then have some program wait for
> sync_completed to reach sync_max, then write 'idle' to 'sync_action'.

Yes, I know.  But this solution looks too ugly to be a good interface
for shell scripting.  That's why I asked the question above.

> If you (or someone here) want to write a general incremental check script
> then I think that is a great idea, but rather than treating it as a Debian
> thing, post the proposal to linux-r...@vger.kernel.org and get feedback and
> suggestions there and when it is ready we can include it in the upstream
> mdadm package.

Ok.


-- 
To UNSUBSCRIBE, email to debian-bugs-dist-requ...@lists.debian.org
with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org



Bug#556610: Please do incremental checks every night instead of a full monthly one

2014-01-05 Thread NeilBrown
On Wed, 25 Dec 2013 19:13:27 +0400 Sergey B Kirpichev skirpic...@gmail.com
wrote:

> > The main issue which all proposed solutions share is when
> > there's a large array, say, md0, and a small array, say,
> > md1, and both share the same set of underlying disks, so the md
> > subsystem will not check/repair them in parallel.  In this
> > situation, we will never check md1 if checking md0 takes
> > more time than we allow in a month (28 days).
>
> What do you think about the solution suggested above
> (set sync_force_parallel to 1 during cronjobs)?  This workaround
> is implemented in the updated (attached) patch.
>
> See also: http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=556610#74
>
> BTW, how bad is it in general to set
> sync_force_parallel to 1 by default?  (Cc'd to Neil Brown.)

It is very unlikely to have a positive effect.
If it has any effect, it will significantly slow down any check/repair etc
that is happening.

> I think it would be nice to end (not pause) the check once it has reached
> sync_max.  Perhaps there are deep reasons why md's interface
> doesn't work this way.  Neil, could you explain this a bit?

There might be a reason to continue the resync.
If you want to end the resync, then have some program wait for
sync_completed to reach sync_max, then write 'idle' to 'sync_action'.
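
For illustration, the wait-then-idle approach described above could be sketched in POSIX shell roughly as follows. This is only a sketch under stated assumptions: the function name end_check_at_sync_max and the default polling interval are hypothetical, and the attribute files are assumed to sit in the array's md sysfs directory (for example /sys/block/md0/md).

```shell
# Sketch only: end (rather than pause) a check once it has reached sync_max.
# $1 = the array's md sysfs directory, e.g. /sys/block/md0/md (assumed layout)
# $2 = polling interval in seconds (hypothetical default: 10)
end_check_at_sync_max()
{
  md_dir=$1
  interval=${2:-10}

  while :; do
    # sync_completed reads "none" when nothing is running,
    # otherwise "<done> / <total>" in sectors.
    read -r completed rest < "$md_dir/sync_completed"
    read -r limit < "$md_dir/sync_max"

    case $completed in
      none) return 0;;   # no check in progress, nothing to end
    esac
    case $limit in
      max) ;;            # no upper limit set: let the check run to the end
      *) [ "$completed" -ge "$limit" ] && break;;
    esac
    sleep "$interval"
  done

  # Writing 'idle' aborts the paused check instead of leaving it waiting.
  echo idle > "$md_dir/sync_action"
}
```

It might be called as `end_check_at_sync_max /sys/block/md0/md` right after a check has been started. The need for such a polling loop is exactly what makes this awkward to drive from cron.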

If you (or someone here) want to write a general incremental check script
then I think that is a great idea, but rather than treating it as a Debian
thing, post the proposal to linux-r...@vger.kernel.org and get feedback and
suggestions there and when it is ready we can include it in the upstream
mdadm package.

NeilBrown


 
> > I'll think about it all more.
>
> Any news?





Bug#556610: Please do incremental checks every night instead of a full monthly one

2013-12-25 Thread Sergey B Kirpichev
> The main issue which all proposed solutions share is when
> there's a large array, say, md0, and a small array, say,
> md1, and both share the same set of underlying disks, so the md
> subsystem will not check/repair them in parallel.  In this
> situation, we will never check md1 if checking md0 takes
> more time than we allow in a month (28 days).

What do you think about the solution suggested above
(set sync_force_parallel to 1 during cronjobs)?  This workaround
is implemented in the updated (attached) patch.

See also: http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=556610#74

BTW, how bad is it in general to set
sync_force_parallel to 1 by default?  (Cc'd to Neil Brown.)
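
As a rough illustration of setting sync_force_parallel only for the duration of a cron job, a wrapper along these lines could be used. It is a sketch, not the attached patch: the function name with_force_parallel is hypothetical, and the attribute is assumed to be found in the array's md sysfs directory.

```shell
# Sketch only: set sync_force_parallel to 1 while a command runs,
# then restore the previous value.
# $1 = the array's md sysfs directory, e.g. /sys/block/md1/md (assumed layout)
# remaining arguments = the command to run
with_force_parallel()
{
  md_dir=$1
  shift
  knob=$md_dir/sync_force_parallel

  read -r previous < "$knob"
  echo 1 > "$knob"
  rc=0
  "$@" || rc=$?
  echo "$previous" > "$knob"   # put back whatever was set before
  return "$rc"
}
```

For example: `with_force_parallel /sys/block/md1/md /usr/share/mdadm/checkarray --cron --idle --quiet md1`. Note that checkarray only queues the check, so in practice the setting would have to stay in place until the check is interrupted.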

I think it would be nice to end (not pause) the check once it has reached
sync_max.  Perhaps there are deep reasons why md's interface
doesn't work this way.  Neil, could you explain this a bit?

> I'll think about it all more.

Any news?
--- /etc/cron.d/mdadm.orig	2013-12-25 19:00:14.0 +0400
+++ /etc/cron.d/mdadm	2013-12-25 19:01:50.0 +0400
@@ -5,8 +5,7 @@
 # distributed under the terms of the Artistic Licence 2.0
 #
 
-# By default, run at 00:57 on every Sunday, but do nothing unless the day of
-# the month is less than or equal to 7. Thus, only run on the first Sunday of
-# each month. crontab(5) sucks, unfortunately, in this regard; therefore this
-# hack (see #380425).
-57 0 * * 0 root if [ -x /usr/share/mdadm/checkarray ] && [ $(date +\%d) -le 7 ]; then /usr/share/mdadm/checkarray --cron --all --idle --quiet; fi
+# By default, start (or continue unfinished checks) at 00:57
+# and stop (interrupt) checks at 01:57.
+57 0 * * * root [ -x /usr/share/mdadm/checkarray ] && /usr/share/mdadm/checkarray --cron --all --idle --quiet
+57 1 * * * root [ -x /usr/share/mdadm/checkarray ] && /usr/share/mdadm/checkarray --cron --all --idle --quiet --interrupt
--- /usr/share/mdadm/checkarray.orig	2013-01-24 17:26:51.0 +0400
+++ /usr/share/mdadm/checkarray	2013-12-25 18:58:56.0 +0400
@@ -27,10 +27,12 @@
  -a|--all	check all assembled arrays (ignores arrays in command line).
  -s|--status	print redundancy check status of devices.
  -x|--cancel	queue a request to cancel a running redundancy check.
+  --interrupt   queue a request to interrupt a running redundancy check.
  -i|--idle	perform check in a lowest scheduling class (idle)
  -l|--slow	perform check in a lower-than-standard scheduling class
  -f|--fast	perform check in higher-than-standard scheduling class
  --realtime	perform check in real-time scheduling class (DANGEROUS!)
+  --split n  check next 1/n'th part (n <= 28) of every specified device (override CHECK_SPLIT)
  -c|--cron	honour AUTOCHECK setting in /etc/default/mdadm.
  -q|--quiet	suppress informational messages
 		(use twice to suppress error messages too).
@@ -50,7 +52,7 @@
 }
 
 SHORTOPTS=achVqQsxilf
-LONGOPTS=all,cron,help,version,quiet,real-quiet,status,cancel,idle,slow,fast,realtime
+LONGOPTS=all,cron,help,version,quiet,real-quiet,status,cancel,interrupt,idle,slow,fast,realtime,split:
 
 eval set -- $(getopt -o $SHORTOPTS -l $LONGOPTS -n $PROGNAME -- "$@")
 
@@ -62,20 +64,31 @@
 action=check
 ionice=
 
-for opt in $@; do
-  case $opt in
--a|--all) all=1;;
--s|--status) action=status;;
--x|--cancel) action=idle;;
--i|--idle) ionice=idle;;
--l|--slow) ionice=low;;
--f|--fast) ionice=high;;
---realtime) ionice=realtime;;
--c|--cron) cron=1;;
--q|--quiet) quiet=$(($quiet+1));;
--Q|--real-quiet) quiet=$(($quiet+2));;	# for compatibility
+while true
+do
+  case $1 in
+-a|--all) all=1; shift;;
+-s|--status) action=status; shift;;
+-x|--cancel) action=cancel; shift;;
+--interrupt) action=interrupt; shift;;
+-i|--idle) ionice=idle; shift;;
+-l|--slow) ionice=low; shift;;
+-f|--fast) ionice=high; shift;;
+--realtime) ionice=realtime; shift;;
+--split) CHECK_SPLIT=$2; shift 2;;
+-c|--cron) cron=1; shift;;
+-q|--quiet) quiet=$(($quiet+1)); shift;;
+-Q|--real-quiet) quiet=$(($quiet+2)); shift;; # for compatibility
 -h|--help) usage; exit 0;;
 -V|--version) about; exit 0;;
+--) shift; break;;
+*) echo "$PROGNAME: E: invalid option: $1.  Try --help." >&2; exit 1;;
+  esac
+done
+
+for opt in $@
+do
+  case $opt in
 /dev/md/*|md/*) arrays=${arrays:+$arrays }md${opt#*md/};;
 /dev/md*|md*) arrays=${arrays:+$arrays }${opt#/dev/};;
 /sys/block/md*) arrays=${arrays:+$arrays }${opt#/sys/block/};;
@@ -99,6 +112,20 @@
   exit 0
 fi
 
+CHECK_SPLIT=${CHECK_SPLIT:-28}
+
+if [ $CHECK_SPLIT -gt 28 ]
+then
+  CHECK_SPLIT=28
+  echo "$PROGNAME: W: CHECK_SPLIT > 28, reset to 28." >&2
+fi
+
+if [ $CHECK_SPLIT -lt 1 ]
+then
+  CHECK_SPLIT=1
+  echo "$PROGNAME: W: CHECK_SPLIT < 1, reset to 1." >&2
+fi
+
 if [ ! -f /proc/mdstat ]; then
   [ $quiet -lt 2 ] && echo "$PROGNAME: E: MD subsystem not loaded, or /proc unavailable." >&2
   exit 2
@@ -159,10 +186,34 @@
 continue
   fi
 
+  chunk_size=$(cat $MDBASE/chunk_size)
+  # set one to safe value if raid level 

Bug#556610: Please do incremental checks every night instead of a full monthly one

2012-06-22 Thread Michael Tokarev

Ok. I reviewed the patches and proposed solutions, but
I can't commit/implement any of them so far.

The main issue which all proposed solutions share is when
there's a large array, say, md0, and a small array, say,
md1, and both share the same set of underlying disks, so the md
subsystem will not check/repair them in parallel.  In this
situation, we will never check md1 if checking md0 takes
more time than we allow in a month (28 days).

I'll think about it all more.

Thanks,

/mjt






Bug#556610: Please do incremental checks every night instead of a full monthly one

2012-06-22 Thread Sergey B Kirpichev
On Fri, Jun 22, 2012 at 07:51:27PM +0300, Michael Tokarev wrote:
> The main issue which all proposed solutions share is when
> there's a large array, say, md0, and a small array, say,
> md1, and both share the same set of underlying disks, so the md
> subsystem will not check/repair them in parallel.  In this
> situation, we will never check md1 if checking md0 takes
> more time than we allow in a month (28 days).

Yep.  See my last post.

> I'll think about it all more.

What do you think about the solution suggested above
(set sync_force_parallel to 1 during cronjobs)?

Another solution (besides the polling etc. mentioned above): don't delay
the check for the current array after it reaches the sync_max threshold.
Just *stop* it.






Bug#556610: Please do incremental checks every night instead of a full monthly one

2011-12-17 Thread Sergey B Kirpichev
Just to note, the above patch won't work properly on the squeeze kernel (that
is why you may need black magic here with watching the sync_completed
file, as Alice suggests).

This has been fixed in the kernel since commit:
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=c07b70ad32ed0a5ec9735cafb1aa10b3a2298b7d
The fix seems simple, but there is no chance of it entering squeeze, right?

Attached checkarray (fixed typo) and cron.d/mdadm patches.


checkarray.patch
Description: Binary data


mdadm-cron.patch
Description: Binary data


Bug#556610: Please do incremental checks every night instead of a full monthly one

2011-12-16 Thread Sergey B Kirpichev
Attached is a slightly fixed version of the above
patch: sync_min must be a multiple of chunk_size.
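
To illustrate the alignment requirement, a sector offset can be rounded down to a chunk boundary as in the sketch below. The function name align_to_chunk is hypothetical, and it assumes the offset is given in 512-byte sectors while chunk_size is reported in bytes, reading 0 for levels without striping (such as RAID1).

```shell
# Sketch only: round a sector offset down to a multiple of the chunk size.
# $1 = offset in 512-byte sectors (as used by sync_min/sync_max)
# $2 = chunk size in bytes (as read from the array's md/chunk_size)
align_to_chunk()
{
  sectors=$1
  chunk_sectors=$(( $2 / 512 ))
  if [ "$chunk_sectors" -le 0 ]; then
    # Assumption: no chunking (e.g. RAID1), so there is nothing to align to.
    echo "$sectors"
  else
    echo $(( sectors / chunk_sectors * chunk_sectors ))
  fi
}
```

With a 64 KiB chunk (128 sectors), `align_to_chunk 1000 65536` prints 896.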


checkarray.patch
Description: Binary data


Bug#556610: Please do incremental checks every night instead of a full monthly one

2011-12-06 Thread Sergey Kirpichev
tag 556610 +patch
thanks

Just a simpler version of
http://bugs.debian.org/cgi-bin/bugreport.cgi?msg=32;filename=checkarray.diff;att=2;bug=556610

The rough idea is to
1) set up a cron job on a regular basis, e.g. weekly:
--8<---
57 0 * * 0 root [ -x /usr/share/mdadm/checkarray ] && /usr/share/mdadm/checkarray --cron --all --quiet
57 6 * * 0 root [ -x /usr/share/mdadm/checkarray ] && /usr/share/mdadm/checkarray --cron --all --quiet --cancel
--->8---

2) Save sync_completed info to sync_min on --cancel:
--8<---
--- /usr/share/mdadm/checkarray	2011-12-06 04:41:09.0 +0400
+++ ./checkarray	2011-12-06 18:45:41.0 +0400
@@ -165,8 +165,10 @@
 
   case $action in
 idle)
+  completed=$(awk -F/ '{ if ($1 == "none") {print 0} else {print $1}}' /sys/block/$array/md/sync_completed)
   echo $action > $SYNC_ACTION_CTL
   [ $quiet -lt 1 ] && echo "$PROGNAME: I: cancel request queued for array $array." >&2
+  echo $completed > /sys/block/$array/md/sync_min
   ;;
 
 check)
--->8---

Of course, it's easy to dump the sync_completed state into
temporary files somewhere in /var/lib/mdadm/ so that it survives
a reboot.  I'm not sure if that is a good idea at all...
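
A minimal sketch of persisting the position across reboots might look like this. The function names and the state file naming (<array>.sync_min inside a state directory such as /var/lib/mdadm) are assumptions made for illustration only. Nothing like this is part of the patch.

```shell
# Sketch only: remember how far a check got, and restore it later.
# $1 = array name, e.g. md0
# $2 = the array's md sysfs directory, e.g. /sys/block/md0/md (assumed layout)
# $3 = state directory, e.g. /var/lib/mdadm (hypothetical location)
save_check_position()
{
  # sync_completed reads "none" or "<done> / <total>" in sectors.
  read -r completed rest < "$2/sync_completed"
  [ "$completed" = none ] && completed=0
  echo "$completed" > "$3/$1.sync_min"
}

restore_check_position()
{
  state=$3/$1.sync_min
  [ -r "$state" ] || return 0   # nothing saved: start from the beginning
  cat "$state" > "$2/sync_min"
}
```

save_check_position would run just before 'idle' is written to sync_action, and restore_check_position before the next 'check' is started.
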
#!/bin/sh
#
# checkarray -- initiates a check run of an MD array's redundancy information.
#
# Copyright © martin f. krafft madd...@debian.org
# distributed under the terms of the Artistic Licence 2.0
#
set -eu

PROGNAME=${0##*/}

about()
{
  echo "$PROGNAME -- MD array (RAID) redundancy checker tool"
  echo "Copyright © martin f. krafft madd...@debian.org"
  echo "Released under the terms of the Artistic Licence 2.0"
}

usage()
{
  about
  echo
  echo "Usage: $PROGNAME [options] [arrays]"
  echo
  echo "Valid options are:"
  cat <<-_eof | column -s\& -t
-a|--all & check all assembled arrays (check /proc/mdstat).
-s|--status & print redundancy check status of devices.
-x|--cancel & queue a request to cancel a running redundancy check.
-i|--idle & perform check in a lowest I/O scheduling class (idle).
-l|--slow & perform check in a lower-than-standard I/O scheduling class.
-f|--fast & perform check in higher-than-standard I/O scheduling class.
--realtime & perform check in real-time I/O scheduling class (DANGEROUS!).
-c|--cron & honour AUTOCHECK setting in /etc/default/mdadm.
-q|--quiet & suppress informational messages.
-Q|--real-quiet & suppress all output messages, including warnings and errors.
-h|--help & show this output.
-V|--version & show version information.
_eof
  echo
  echo "Examples:"
  echo "  $PROGNAME --all --idle"
  echo "  $PROGNAME --quiet /dev/md[123]"
  echo "  $PROGNAME -sa"
  echo "  $PROGNAME -x --all"
  echo
  echo "Devices can be specified in almost any format. The following are"
  echo "all equivalent:"
  echo "  /dev/md0, md0, /dev/md/0, /sys/block/md0"
  echo
  echo "The --all option overrides all arrays passed to the script."
  echo
  echo "You can also control the status of a check with /proc/mdstat ."
}

SHORTOPTS=achVqQsxilf
LONGOPTS=all,cron,help,version,quiet,real-quiet,status,cancel,idle,slow,fast,realtime

eval set -- $(getopt -o $SHORTOPTS -l $LONGOPTS -n $PROGNAME -- "$@")

arrays=''
cron=0
all=0
quiet=0
status=0
action=check
ionice=

for opt in $@; do
  case $opt in
    -a|--all) all=1;;
    -s|--status) action=status;;
    -x|--cancel) action=idle;;
    -i|--idle) ionice=idle;;
    -l|--slow) ionice=low;;
    -f|--fast) ionice=high;;
    --realtime) ionice=realtime;;
    -c|--cron) cron=1;;
    -q|--quiet) quiet=1;;
    -Q|--real-quiet) quiet=2;;
    -h|--help) usage; exit 0;;
    -V|--version) about; exit 0;;
    /dev/md/*|md/*) arrays=${arrays:+$arrays }md${opt#*md/};;
    /dev/md*|md*) arrays=${arrays:+$arrays }${opt#/dev/};;
    /sys/block/md*) arrays=${arrays:+$arrays }${opt#/sys/block/};;
    --) :;;
    *) echo "$PROGNAME: E: invalid option: $opt" >&2; usage >&2; exit 0;;
  esac
done

is_true()
{
  case ${1:-} in
    [Yy]es|[Yy]|1|[Tt]rue|[Tt]) return 0;;
    *) return 1;;
  esac
}

DEBIANCONFIG=/etc/default/mdadm
[ -r $DEBIANCONFIG ] && . $DEBIANCONFIG
if [ $cron = 1 ] && ! is_true ${AUTOCHECK:-false}; then
  [ $quiet -lt 1 ] && echo "$PROGNAME: I: disabled in $DEBIANCONFIG ." >&2
  exit 0
fi

if [ ! -f /proc/mdstat ]; then
  [ $quiet -lt 2 ] && echo "$PROGNAME: E: MD subsystem not loaded, or /proc unavailable." >&2
  exit 2
fi

if [ ! -d /sys/block ]; then
  [ $quiet -lt 2 ]  echo $PROGNAME: E: 

Bug#556610: Please do incremental checks every night instead of a full monthly one

2009-11-17 Thread martin f krafft
also sprach Goswin von Brederlow goswin-...@web.de [2009.11.17.0558 +0100]:
> Neil Brown recently explained on the linux-raid ML that one can do
> partial checks on a raid array:
>
> | If you first read from 'sync_completed' and store that value,
> | then before starting a new 'check', write the value to
> | sync_max, then you get exactly what you are asking for, all
>
> I assume he meant sync_min here.
>
> | easily done in a shell script.
> | You can also set 'sync_max' if you like, thus you could e.g.
> | quite easily have a cron job that scrubs 1/28th of the array each
> | night based on the day of the month.
>
> I think it would be a good idea to change the default check to run
> like this, a little every day or week with /etc/default/mdadm saying
> which of the two.

I like the idea but won't have the time to implement this anytime
soon. Patches welcome.

-- 
 .''`.   martin f. krafft madd...@d.o  Related projects:
: :'  :  proud Debian developer   http://debiansystem.info
`. `'`   http://people.debian.org/~madduck    http://vcs-pkg.org
  `-  Debian - when you have better things to do than fixing systems




Bug#556610: Please do incremental checks every night instead of a full monthly one

2009-11-16 Thread Goswin von Brederlow
Package: mdadm
Version: 3.0-2
Severity: wishlist

Hi,

Neil Brown recently explained on the linux-raid ML that one can do
partial checks on a raid array:

| If you first read from 'sync_completed' and store that value,
| then before starting a new 'check', write the value to
| sync_max, then you get exactly what you are asking for, all

I assume he meant sync_min here.

| easily done in a shell script.
| You can also set 'sync_max' if you like, thus you could e.g.
| quite easily have a cron job that scrubs 1/28th of the array each
| night based on the day of the month.

I think it would be a good idea to change the default check to run
like this, a little every day or week with /etc/default/mdadm saying
which of the two.
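
As a rough sketch of the arithmetic behind "1/28th of the array each night", the sync_min/sync_max pair for a given slice could be computed as below. The function name slice_bounds is hypothetical, and sizes are assumed to be in sectors. Aligning the results to the chunk size is left out here.

```shell
# Sketch only: print "<sync_min> <sync_max>" for one slice of an array.
# $1 = array size in sectors
# $2 = slice number, 1..n (e.g. the day of the month)
# $3 = number of slices (defaults to 28)
slice_bounds()
{
  total=$1
  slice=$2
  n=${3:-28}
  echo "$(( total * (slice - 1) / n )) $(( total * slice / n ))"
}
```

When deriving the slice from the date, note that `date +%d` prints a leading zero ("08"), which shell arithmetic would treat as an invalid octal number. Something like `day=$(date +%d); day=${day#0}` avoids that, and days 29 to 31 would have to be skipped or mapped onto an existing slice.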

MfG
Goswin

-- System Information:
Debian Release: squeeze/sid
  APT prefers unstable-i386
  APT policy: (1001, 'unstable-i386'), (500, 'unstable')
Architecture: amd64 (x86_64)

Kernel: Linux 2.6.29.4-frosties-2 (SMP w/4 CPU cores)
Locale: LANG=C, LC_CTYPE=de_DE (charmap=ISO-8859-1)
Shell: /bin/sh linked to /bin/bash

Versions of packages mdadm depends on:
ii  debconf                       1.5.27     Debian configuration management sy
ii  libc6                         2.10.1-2   GNU C Library: Shared libraries
ii  lsb-base                      3.2-23     Linux Standard Base 3.2 init scrip
ii  makedev                       2.3.1-89   creates device files in /dev
ii  udev                          0.141-1    /dev/ and hotplug management daemo

Versions of packages mdadm recommends:
ii  exim4-daemon-heavy [mail-tran 4.69-11    Exim MTA (v4) daemon with extended
ii  module-init-tools             3.9-2      tools for managing Linux kernel mo

mdadm suggests no packages.

-- debconf information excluded


