Bug#515653: uptimed: on brutal reboot, all records are lost
On Tue, Jan 18, 2011 at 8:28 AM, Laurent Bonnaud laurent.bonn...@inpg.fr wrote:
> Hi,
>
> I also experienced this bug several times on a laptop that sometimes
> fails to resume from suspend and with an ext4 filesystem.
>
> Here is a patch that should fix the problem:
>
> --- urec.c~ 2009-01-02 00:46:00.0 +0100
> +++ urec.c 2011-01-18 08:07:28.886203152 +0100
> @@ -263,6 +263,7 @@
>                         if ((max > 0) && (++i >= max)) break;
>                 }
>         }
> +       fflush(f);
>         fclose(f);
>         rename(FILE_RECORDS, FILE_RECORDS".old");
>         rename(FILE_RECORDS".tmp", FILE_RECORDS);

DESCRIPTION
       The fclose() function flushes the stream pointed to by fp (writing
       any buffered output data using fflush(3)) and closes the underlying
       file descriptor.

For the record, as has been explained before ( http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=536823#29 ), what's needed here to definitively fix this problem is added logic that checks, upon opening the database, that it's not empty and, if it is, discards it and uses the backup one.

For what it's worth, this behaviour is expected: when fclose() is hit, the data may still reside in the VFS cache. On journaled filesystems, under the most usual setups, only the metadata may actually be flushed. When a crash occurs, the journal will restore filesystem consistency by either removing (a case covered by the use of the backup file) or zeroing out (XFS will typically do that) files that are in a state inconsistent with the journal.

The only way to prevent this would be to add an fsync() before every fclose(), which would force the data to be synced to disk. But then, I suggest reading the fsync(2) manpage to understand the implications in terms of performance impact and constant wakeup of hard drives for users who spin down their drives. That overall impact would be totally unacceptable for a bug that's caused by abnormal operation of the software. (NB: no, a crash or a power loss is not normal use of the system.)

It's a common misconception that filesystems, and especially journaled ones, should be crash-proof. They're not.

HTH

T-Bone

--
Thibaut VARENE
http://www.parisc-linux.org/~varenet/
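For readers following the fflush()/fsync() distinction made above, here is a minimal sketch of what "an fsync() before every fclose()" could look like. It is not uptimed's code: save_durably() and its parameters are hypothetical names chosen for illustration, and the sketch assumes a POSIX system. It pays exactly the disk-wakeup cost described in the message above on every save.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper, not part of uptimed: write `text` to `tmp_path`,
 * force it to stable storage, then rename it over `path`.
 * Returns 0 on success, -1 on failure. */
int save_durably(const char *path, const char *tmp_path, const char *text)
{
    assert(path != NULL && tmp_path != NULL && text != NULL);

    FILE *f = fopen(tmp_path, "w");
    if (f == NULL)
        return -1;

    int ok = fputs(text, f) != EOF;
    /* fflush() only moves data from the stdio buffer into the kernel. */
    ok = ok && fflush(f) == 0;
    /* fsync() asks the kernel to write the file's data to the device. */
    ok = ok && fsync(fileno(f)) == 0;
    /* Always close the stream, even if an earlier step failed. */
    ok = (fclose(f) == 0) && ok;

    if (!ok) {
        remove(tmp_path);
        return -1;
    }
    /* Replace the live file only once the new contents are on disk. */
    return rename(tmp_path, path) == 0 ? 0 : -1;
}
```

A call such as save_durably("records", "records.tmp", "35228:1234767249:Linux 2.6.25-2-amd64\n") would leave either the old or the new file contents in place after a crash, as long as the filesystem honours fsync(). The sketch does not fsync() the containing directory, so the rename itself may still be lost.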
Bug#515653: uptimed: on brutal reboot, all records are lost
Hi,

I also experienced this bug several times on a laptop that sometimes fails to resume from suspend and with an ext4 filesystem.

Here is a patch that should fix the problem:

--- urec.c~ 2009-01-02 00:46:00.0 +0100
+++ urec.c 2011-01-18 08:07:28.886203152 +0100
@@ -263,6 +263,7 @@
                        if ((max > 0) && (++i >= max)) break;
                }
        }
+       fflush(f);
        fclose(f);
        rename(FILE_RECORDS, FILE_RECORDS".old");
        rename(FILE_RECORDS".tmp", FILE_RECORDS);

--
Laurent Bonnaud.
http://www.lis.inpg.fr/pages_perso/bonnaud/
Bug#515653: uptimed: on brutal reboot, all records are lost
Package: uptimed
Version: 1:0.3.16-2
Severity: important

Hello,
I was in need to reboot brutally my box (due to a series of oops on kernel avoiding any action), but when system came up again, all uptime records were lost :(

This is really really annoying...

Thanks,
Sandro

-- System Information:
Debian Release: 5.0
  APT prefers unstable
  APT policy: (500, 'unstable'), (1, 'experimental')
Architecture: amd64 (x86_64)

Kernel: Linux 2.6.25-2-amd64 (SMP w/4 CPU cores)
Locale: LANG=en_US.UTF-8, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/bash

Versions of packages uptimed depends on:
ii  debconf [debconf-2.0]  1.5.24      Debian configuration management sy
ii  libc6                  2.7-18      GNU C Library: Shared libraries
ii  libuptimed0            1:0.3.16-2  Library for uptimed

uptimed recommends no packages.

uptimed suggests no packages.

-- debconf information:
  uptimed/mail/do_mail: Never
  uptimed/mail/address: r...@localhost
  uptimed/interval: 60
  uptimed/mail/milestones_info:
  uptimed/maxrecords: 50
Bug#515653: uptimed: on brutal reboot, all records are lost
severity 515653 normal
tags 515653 moreinfo
thanks

On Mon, Feb 16, 2009 at 7:03 PM, Sandro Tosi mo...@debian.org wrote:
> Package: uptimed
> Version: 1:0.3.16-2
> Severity: important
>
> Hello,
> I was in need to reboot brutally my box (due to a series of oops on
> kernel avoiding any action), but when system came up again, all uptime
> records were lost :(

I don't get it. This was supposed to be fixed with the current version of uptimed.

What filesystem are you using on /var?

Were there any error messages when uptimed started?

What's the content of /var/spool/uptimed/records* ?

Thanks
Bug#515653: uptimed: on brutal reboot, all records are lost
Please don't remove the BTS from the CC-list. Bug report information must be recorded.

On 16 Feb 09, at 19:26, Sandro Tosi wrote:
> On Mon, Feb 16, 2009 at 19:19, Thibaut VARENE vare...@debian.org wrote:
>> What filesystem are you using on /var?
>
> xfs

By default, XFS will zero out inconsistent files after an unclean mount. If that's what happened, it's likely uptimed tried to use the (invalid) content of the file instead of its backup, which is why you didn't see anything in the log (when it uses its backup database it prints a message).

Radek, I don't really know what to do about this. Working around filesystem issues is gonna be a burden. Adding supplementary checks to assert the validity of the file being read doesn't look really straightforward; do you have any suggestion about this?

The way I see it, we could bail out in urec.c:231, and 1) give feedback on failure to read a record entry, while 2) falling back to the backup database on such a failure... Of course, if the backup db is also damaged, we're doomed.

>> What's the content of /var/spool/uptimed/records* ?
>
> $ for file in /var/spool/uptimed/records* ; do echo --- $file --- ; cat $file ; done
> --- /var/spool/uptimed/records ---
> 35228:1234767249:Linux 2.6.25-2-amd64
> 5978:1234802550:Linux 2.6.25-2-amd64
> --- /var/spool/uptimed/records.old ---
> 35228:1234767249:Linux 2.6.25-2-amd64
> 5918:1234802550:Linux 2.6.25-2-amd64

That's ok.

--
Thibaut VARÈNE
http://www.parisc-linux.org/~varenet/
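The "bail out, report, fall back to the backup" idea from the message above can be sketched roughly as follows. This is an illustration only, not uptimed's actual reader: count_valid_records() and pick_records_file() are hypothetical names, and the "utime:btime:sys" line format is assumed from the records dump quoted in this thread.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: count lines that parse as "utime:btime:sys".
 * Returns -1 if the file cannot be opened or any line is malformed
 * (for example a zero-filled file), and 0 for an empty file. */
int count_valid_records(const char *path)
{
    assert(path != NULL);

    FILE *f = fopen(path, "r");
    if (f == NULL)
        return -1;

    char line[512], sys[256];
    unsigned long utime, btime;
    int n = 0;
    while (fgets(line, sizeof line, f) != NULL) {
        /* Bail out on the first entry that does not parse. */
        if (sscanf(line, "%lu:%lu:%255[^\n]", &utime, &btime, sys) != 3) {
            n = -1;
            break;
        }
        n++;
    }
    fclose(f);
    return n;
}

/* Returns the file to load from, or NULL when both are unusable
 * and the caller has to start from scratch. */
const char *pick_records_file(const char *primary, const char *backup)
{
    if (count_valid_records(primary) > 0)
        return primary;
    fprintf(stderr, "%s is empty or damaged, trying backup\n", primary);
    if (count_valid_records(backup) > 0)
        return backup;
    fprintf(stderr, "%s is unusable too, starting from scratch\n", backup);
    return NULL;
}
```

Treating an empty primary file the same as a damaged one is a design choice made here so that a truncated database also triggers the fallback; a fresh install with no records yet would simply get NULL and start from scratch.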
Bug#515653: uptimed: on brutal reboot, all records are lost
On Mon, Feb 16, 2009 at 20:02, Thibaut VARÈNE vare...@debian.org wrote:
> Please don't remove the BTS from the CC-list. Bug report information must be
> recorded.

Didn't mean to remove the BTS address: I clicked on Reply instead of Reply to all by mistake.

> On 16 Feb 09, at 19:26, Sandro Tosi wrote:
>> On Mon, Feb 16, 2009 at 19:19, Thibaut VARENE vare...@debian.org wrote:
>>> What filesystem are you using on /var?
>>
>> xfs
>
> By default, XFS will zero out inconsistent files after an unclean mount. If
> that's what happened, it's likely uptimed tried to use the (invalid) content
> of the file instead of its backup, which is why you didn't see anything in
> the log (when it uses its backup database it prints a message).

That might be what happened.

> Radek, I don't really know what to do about this. Working around filesystem
> issues is gonna be a burden. Adding supplementary checks to assert the
> validity of the file being read doesn't look really straightforward; do you
> have any suggestion about this?
>
> The way I see it, we could bail out in urec.c:231, and 1) give feedback on
> failure to read a record entry, while 2) falling back to the backup database
> on such a failure... Of course, if the backup db is also damaged, we're
> doomed.

Falling back on the backup seems a smart move; of course, in case both are corrupted, starting from scratch is the only option at hand.

Regards,
--
Sandro Tosi (aka morph, morpheus, matrixhasu)
My website: http://matrixhasu.altervista.org/
Me at Debian: http://wiki.debian.org/SandroTosi