Bug#526398: /etc/init.d/checkroot.sh: can cause serious data corruption if booting on battery power

2009-05-01 Thread peter green
Would a compromise be possible? Something along the lines of doing
urgent stuff (journal replays, checks of unclean unjournaled
filesystems) but skipping the checks forced by the n days/mounts since
last check limit when on battery? Does fsck have any options that would
allow this, or would they have to be added?






--
To UNSUBSCRIBE, email to debian-bugs-rc-requ...@lists.debian.org
with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org



Bug#526398: /etc/init.d/checkroot.sh: can cause serious data corruption if booting on battery power

2009-05-01 Thread Zygo Blaxell
On Fri, May 01, 2009 at 06:35:03PM +0100, peter green wrote:
> Would a compromise be possible? Something along the lines of doing
> urgent stuff (journal replays, checks of unclean unjournaled
> filesystems) but skipping the checks forced by the n days/mounts since
> last check limit when on battery? Does fsck have any options that would
> allow this, or would they have to be added?

That would be nice, but out of scope of this bug.  I'd like the data-loss
issues fixed first, then worry about enhancing life for ext3 users with
disks that are too big for their batteries.  ;-)

AFAICT the exceptional case is ext3, because all the other filesystems
I've used on Linux either have journals (and they work, so they don't need
fsck), or they don't (so they do need fsck).  Only ext3 (or more precisely
e2fsck) has this strange mode where it does fsck's at random even though
it has no evidence to believe they're necessary.

A simple solution to the ext3-specific part of the problem
is to have the Debian installer use 'tune2fs -c0 -i0' after
'mke2fs -j' when installing on laptops.  This eliminates ext3's
fscking-at-random-and-inconvenient-times behavior.  Users who are
concerned about running fsck on battery power should be directed to
run this command, instead of breaking initscripts for everyone else.
This solution would be a bug for the installer and/or the documentation
of rcS, and implementing the ext3-specific solution doesn't remove the
need to fix this bug (#526398) on initscripts.
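
For illustration only, the tune2fs step could be wrapped as below. This
is a sketch, not what the installer does: the function name, the DRY_RUN
variable, and the /dev/sdXN placeholder are all made up for the example.
Only the 'tune2fs -c 0 -i 0' invocation comes from the text above.

```shell
#!/bin/sh
# Hypothetical helper: turn off ext2/ext3 periodic checks on one device.
# Usage: disable_periodic_fsck DEVICE
disable_periodic_fsck() {
    dev=$1
    if [ -z "$dev" ]; then
        echo "usage: disable_periodic_fsck DEVICE" >&2
        return 2
    fi
    # -c 0 disables the mount-count trigger, -i 0 the time-interval trigger.
    # DRY_RUN (an assumption of this sketch) defaults to 1, so the command
    # is only printed unless the caller explicitly sets DRY_RUN=0.
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "tune2fs -c 0 -i 0 $dev"
    else
        tune2fs -c 0 -i 0 "$dev"
    fi
}
```

Calling 'disable_periodic_fsck /dev/sdXN' with the default DRY_RUN only
prints the command, which seems the safer default for something that
rewrites superblock settings.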

There are some valid points in the ext3 documentation about the need to
do fsck's at regular intervals to check for past failures to maintain
data integrity in the filesystem; however, those apply equally to all
filesystems, and arguably should be implemented by the initscripts for all
local read/write filesystem types that have fsck tools--not buried in the
ext3-specific tools themselves.  Also, fsck only makes the filesystem
usable to store new data--it doesn't restore any of the data you've
probably lost if your storage subsystem has these sorts of problems.
I don't think this is a problem that initscripts should try to solve
beyond alerting the user to the fact that their storage subsystem is
lossy and needs serious debugging.

Ideally there should be a fsck-detection tool (possibly a flag supported
by modified versions of the existing fsck tools) which can, given a
filesystem, tell initscripts which of three states the filesystem is in:

1.  The filesystem is known to have errors, or lacks journalling
and was not cleanly umounted, and *must* be checked.  When
initscripts detects the machine is on battery power, it could
prompt the user for various options:

a.  check the filesystem and boot normally

b.  power off immediately

c.  (with root password) boot without checking filesystem

d.  run sulogin

e.  after a timeout, proceed with one of the above options
chosen by a config file (e.g. /etc/default/rcS).

2.  The filesystem is not known to have errors, but a policy
limit (mount count or days since last fsck) has been reached.
initscripts would not check the filesystem if on battery power
in this case, based on the assumption that the user will reboot
later on AC power to perform the advised fsck.

3.  The filesystem is known to be cleanly umounted (or recoverable
from journal) and no fsck is recommended.  initscripts may
always run fsck in this case, since fsck should exit quickly
(if it doesn't, that's either a bug in the fsck package or
the fsck-detection tool).  Some filesystems may not be able
to determine quickly if they were cleanly umounted, so maybe
initscripts might skip fsck entirely for those filesystems.

Of course, if /fastboot or /forcefsck are present, initscripts would
honor those as it does now.
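
A rough sketch of the decision table described above, assuming the
hypothetical detection tool has already reduced the filesystem to one of
three state names. The names must_check, policy_due and clean, and the
function itself, are invented for this example; no such tool or
interface exists today, and /fastboot and /forcefsck handling is left
out.

```shell
#!/bin/sh
# Hypothetical: map (filesystem state, AC power) to an initscripts action.
# Usage: decide_fsck STATE ON_AC
#   STATE  is must_check | policy_due | clean   (invented names)
#   ON_AC  is yes | no
# Prints one of: check, prompt, skip
decide_fsck() {
    state=$1
    on_ac=$2
    case "$state" in
        must_check)
            # State 1: errors known, or unjournalled and unclean.
            # On battery, ask the user instead of silently skipping.
            if [ "$on_ac" = yes ]; then echo check; else echo prompt; fi
            ;;
        policy_due)
            # State 2: only a mount-count/interval limit was reached.
            if [ "$on_ac" = yes ]; then echo check; else echo skip; fi
            ;;
        clean)
            # State 3: fsck should exit quickly, so always run it.
            echo check
            ;;
        *)
            echo "decide_fsck: unknown state: $state" >&2
            return 1
            ;;
    esac
}
```

The "prompt" result stands for the menu of options a. through e. under
state 1; how that menu is presented is not sketched here.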

Note that the above proposal requires work either to modify the various
filesystem-specific fsck tools packages so that initscripts can use
them, or to implement the currently ext3-specific maximum mounts/days
behavior in initscripts itself.  Both of those are wishlist items, but
this bug (#526398) is a critical bug.





Bug#526398: [Pkg-sysvinit-devel] Bug#526398: /etc/init.d/checkroot.sh: can cause serious data corruption if booting on battery power

2009-05-01 Thread Henrique de Moraes Holschuh
On Fri, 01 May 2009, peter green wrote:
> Would a compromise be possible? Something along the lines of doing

[...]

> urgent stuff (journal replays, checks of unclean unjournaled
> filesystems) but skipping the checks forced by the n days/mounts since
> last check limit when on battery? Does fsck have any options that would
> allow this, or would they have to be added?

They'd have to be added.  Fsck can't do it :-(  And adding anything to
fsck is _not_ a very easy or very fast process... for one, you need to
add it to the generic wrapper, to all filesystem-specific fsck's that
need it, and you need to make sure nothing will complain about it
instead of fsck'ing...

So, that feature will have to be ripped out.  People who don't want
filesystems being tested while on battery should:

1. Configure the filesystems to NOT ask for testing after n mounts or
x days since last fsck.

*and*

2. Not boot the laptop on battery when it has dirty filesystems in the
first place.

Ideas on how to make (1) be easily accessible to most users are
welcome.  Scripts to emulate the fsck after n mounts or x time by
doing a forced fsck (which could be subject to not on battery
constraints without any risks to the data) are also welcome.
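
As a starting point for such a script, here is a minimal sketch of the
"after n mounts" half. It assumes the "Mount count:" field printed by
'tune2fs -l' is still updated on each mount after the built-in limit has
been disabled; the function name and the threshold argument are
hypothetical, and the days-since-last-check half is not covered.

```shell
#!/bin/sh
# Hypothetical: decide whether a script-driven forced fsck is due.
# Reads 'tune2fs -l DEVICE' style output on stdin.
# Usage: tune2fs -l DEVICE | mount_count_due THRESHOLD
# Returns 0 if due, 1 if not due, 2 if no usable count was found.
mount_count_due() {
    threshold=$1
    # Take the number after "Mount count:"; the ^ anchor keeps
    # "Maximum mount count:" from matching.
    count=$(awk -F: '/^Mount count:/ { gsub(/[ \t]/, "", $2); print $2; exit }')
    case "$count" in
        ''|*[!0-9]*) return 2 ;;
    esac
    [ "$count" -ge "$threshold" ]
}
```

A caller could combine this with an AC power test and only then request
the forced check, so that skipping it on battery never affects a
filesystem that actually needs repair.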

-- 
  One disk to rule them all, One disk to find them. One disk to bring
  them all and in the darkness grind them. In the Land of Redmond
  where the shadows lie. -- The Silicon Valley Tarot
  Henrique Holschuh






Bug#526398: /etc/init.d/checkroot.sh: can cause serious data corruption if booting on battery power

2009-04-30 Thread Zygo Blaxell
Package: initscripts
Version: 2.86.ds1-61
Severity: critical
File: /etc/init.d/checkroot.sh
Justification: causes serious data loss

I was rather horrified to watch my laptop boot with a dirty root
filesystem mounted read/write.  Upon further investigation, I discovered
that checkroot.sh and checkfs.sh are hardcoded to bypass filesystem
checks if AC power is not present.  This makes no sense.  

If a journalling filesystem has errors, it should not be mounted
read/write until those errors are corrected.  Non-journalling filesystems
always need fsck if they are umounted uncleanly, so they shouldn't be
mounted read/write without checking and possible correction either.
Both cases require fsck before mounting regardless of the power source.

Failing to fsck in either case can cause serious data loss, especially
if the filesystem's metadata falsely indicates occupied space is free
and the system is used for some time.  This can lead to duplicate
allocations between filesystem metadata and user data, which leads to
data loss, security problems, unintentional data disclosure, and worse.
Recovery from errors of this kind is nearly impossible without a good set
of backups handy.  Serious problems can remain undetected for sufficiently
long periods of time that backups get corrupted as well.

The problem is even worse for laptops that are only rebooted due to
crashes, and only crash in the field while running on battery power.
Such machines may never run fsck until the corruption is sufficiently
bad that the machine is unusable.

I would propose that the battery power status should only be tested
in checkroot.sh and checkfs.sh if a configuration setting explicitly
permits it.  For example, a variable FSCKONBATTERY might be added to
/etc/default/rcS with these options:

yes - check filesystems regardless of battery status (ignore
on_ac_power entirely).  This should be the default.

no - don't check filesystems when on_ac_power returns false.
This is the current behavior.
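
A minimal sketch of how the scripts might consult such a variable.
FSCKONBATTERY is only the name proposed above, not an existing setting,
and ON_AC_POWER_CMD is invented here so the power test can be swapped
out for testing. The sketch assumes on_ac_power exits with status 1 when
the machine is on battery, and treats any other status (on AC, unknown,
or command missing) as "go ahead and check".

```shell
#!/bin/sh
# Hypothetical: return 0 if filesystems should be checked, 1 to skip.
should_fsck() {
    case "${FSCKONBATTERY:-yes}" in
        no)
            # Old behavior: skip, but only when the power test positively
            # reports battery (assumed exit status 1).
            status=0
            "${ON_AC_POWER_CMD:-on_ac_power}" >/dev/null 2>&1 || status=$?
            [ "$status" -ne 1 ]
            ;;
        *)
            # Proposed default: ignore the power source entirely.
            return 0
            ;;
    esac
}
```

Anything other than an explicit "no" falls through to checking, so a
typo in the setting fails in the safe direction.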

The system should not corrupt data by default, which is why the default
I propose above is different from the current behavior.  

Installed systems which are upgrading from legacy versions of initscripts
might preserve the old behavior in accordance with the principle of least
surprise, but all new systems should be installed with the default set
as above.

I would argue that unexpected data corruption is a much bigger surprise
than fscks on battery, but other bugs filed against this package suggest
people actually prefer the broken behavior, and these people would
probably complain if we fixed it for them.




-- System Information:
Debian Release: 5.0.1
  APT prefers stable
  APT policy: (500, 'stable'), (189, 'testing'), (179, 'unstable'), (1, 'experimental')
Architecture: amd64 (x86_64)

Kernel: Linux 2.6.28.4-zb64 (SMP w/4 CPU cores; PREEMPT)
Locale: LANG=C, LC_CTYPE=C (charmap=ANSI_X3.4-1968)
Shell: /bin/sh linked to /bin/bash

Versions of packages initscripts depends on:
ii  debianutils     2.30         Miscellaneous utilities specific t
ii  e2fsprogs       1.41.3-1     ext2/ext3/ext4 file system utiliti
ii  libc6           2.9-4        GNU C Library: Shared libraries
ii  lsb-base        3.2-20       Linux Standard Base 3.2 init scrip
ii  mount           2.13.1.1-1   Tools for mounting and manipulatin
ii  sysvinit-utils  2.86.ds1-61  System-V-like utilities

Versions of packages initscripts recommends:
ii  psmisc          22.6-1       Utilities that use the proc filesy

initscripts suggests no packages.

-- no debconf information


