Re: fsck_ufs out of swapspace

2011-12-20 Thread Kostik Belousov
On Tue, Dec 20, 2011 at 09:51:43AM +1100, Peter Jeremy wrote:
> On 2011-Dec-19 22:27:49 +0100, Michiel Boland <bolan...@xs4all.nl> wrote:
> > Problem solved - it was indeed an endian thing.
> > The problem is that fsck uses a real_dev_bsize variable that is declared long,
> > but the DIOCGSECTORSIZE ioctl takes an u_int argument.
>
> To be accurate, this isn't an endian problem, it's a general problem
> of passing a pointer to an incorrectly sized object.  The bug is
> masked on amd64 & iA64 because real_dev_bsize is statically allocated
> and therefore initialised to zero.  This means the failure to assign
> the top 32 bits in the ioctl doesn't affect the final result.
>
> > A PR has been submitted.
>
> sparc64/163460 for the record.  Thank you for tracking that down.

The easier fix is to change the type of real_dev_bsize. I used long only
because the other variables holding the sector size are long, but there
is not much reason to use long there.

Peter, would you please retest SU+J on non-512-byte sectors with the
attached patch?

diff --git a/sbin/fsck_ffs/fsck.h b/sbin/fsck_ffs/fsck.h
index 8091d0f..4e30a7e 100644
--- a/sbin/fsck_ffs/fsck.h
+++ b/sbin/fsck_ffs/fsck.h
@@ -268,7 +268,7 @@ char    snapname[BUFSIZ];   /* when doing snapshots, the name of the file */
 char   *cdevname;          /* name of device being checked */
 long   dev_bsize;          /* computed value of DEV_BSIZE */
 long   secsize;            /* actual disk sector size */
-long   real_dev_bsize;
+u_int  real_dev_bsize;     /* actual disk sector size, not overriden */
 char   nflag;              /* assume a no response */
 char   yflag;              /* assume a yes response */
 int    bkgrdflag;          /* use a snapshot to run on an active system */
diff --git a/sbin/fsck_ffs/suj.c b/sbin/fsck_ffs/suj.c
index ec8b5ab..b784519 100644
--- a/sbin/fsck_ffs/suj.c
+++ b/sbin/fsck_ffs/suj.c
@@ -206,7 +206,7 @@ opendisk(const char *devnam)
            &real_dev_bsize) == -1)
                real_dev_bsize = secsize;
        if (debug)
-               printf("dev_bsize %ld\n", real_dev_bsize);
+               printf("dev_bsize %u\n", real_dev_bsize);
 }
 
 /*




Re: fsck_ufs out of swapspace

2011-12-20 Thread Peter Holm
On Tue, Dec 20, 2011 at 11:48:33AM +0200, Kostik Belousov wrote:
> On Tue, Dec 20, 2011 at 09:51:43AM +1100, Peter Jeremy wrote:
> > On 2011-Dec-19 22:27:49 +0100, Michiel Boland <bolan...@xs4all.nl> wrote:
> > > Problem solved - it was indeed an endian thing.
> > > The problem is that fsck uses a real_dev_bsize variable that is declared long,
> > > but the DIOCGSECTORSIZE ioctl takes an u_int argument.
> >
> > To be accurate, this isn't an endian problem, it's a general problem
> > of passing a pointer to an incorrectly sized object.  The bug is
> > masked on amd64 & iA64 because real_dev_bsize is statically allocated
> > and therefore initialised to zero.  This means the failure to assign
> > the top 32 bits in the ioctl doesn't affect the final result.
> >
> > > A PR has been submitted.
> >
> > sparc64/163460 for the record.  Thank you for tracking that down.
>
> The easier fix is to change the type of real_dev_bsize. I used long only
> because the other variables holding the sector size are long, but there
> is not much reason to use long there.
>
> Peter, would you please retest SU+J on non-512-byte sectors with the
> attached patch?

No problems seen while testing on both i386 and amd64 with a malloc MD
disk, sector size of 4k and SUJ.

- Peter


___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org


Re: fsck_ufs out of swapspace

2011-12-19 Thread Michiel Boland

Problem solved - it was indeed an endian thing.
The problem is that fsck uses a real_dev_bsize variable that is declared long, 
but the DIOCGSECTORSIZE ioctl takes an u_int argument.


A PR has been submitted.

Cheers
Michiel


Re: fsck_ufs out of swapspace

2011-12-19 Thread Peter Jeremy
On 2011-Dec-19 22:27:49 +0100, Michiel Boland <bolan...@xs4all.nl> wrote:
> Problem solved - it was indeed an endian thing.
> The problem is that fsck uses a real_dev_bsize variable that is declared long,
> but the DIOCGSECTORSIZE ioctl takes an u_int argument.

To be accurate, this isn't an endian problem, it's a general problem
of passing a pointer to an incorrectly sized object.  The bug is
masked on amd64 & iA64 because real_dev_bsize is statically allocated
and therefore initialised to zero.  This means the failure to assign
the top 32 bits in the ioctl doesn't affect the final result.

> A PR has been submitted.

sparc64/163460 for the record.  Thank you for tracking that down.

-- 
Peter Jeremy




fsck_ufs out of swapspace

2011-12-17 Thread Michiel Boland
FreeBSD 9.0-PRERELEASE locked up while doing some heavy I/O and failed to shut
down properly, so I had to power-cycle. After it came back up it said:


Starting file system checks:
** SU+J Recovering /dev/ada0a
** Reading 33554432 byte journal from inode 4.
swap_pager: out of swap space
swap_pager_getswapspace(16): failed
pid 67 (fsck_ufs), uid 0, was killed: out of swap space
fsck: /dev/ada0a: Killed: 9
Script /etc/rc.d/fsck running
Unknown error; help!
ERROR: ABORTING BOOT (sending SIGTERM to parent)!

The only way to continue was to do a full fsck (with no journal).

This is a Sun Blade 100 (sparc64) with 768M of RAM.
So the fsck is taking up all of this? That can't be right.

What can I do to troubleshoot this further?

Cheers
Michiel


Re: fsck_ufs out of swapspace

2011-12-17 Thread Paul Mather
On Dec 17, 2011, at 3:36 PM, Michiel Boland wrote:

> FreeBSD 9.0-PRERELEASE locked up while doing some heavy I/O and failed to shut
> down properly, so I had to power-cycle. After it came back up it said:
>
> Starting file system checks:
> ** SU+J Recovering /dev/ada0a
> ** Reading 33554432 byte journal from inode 4.
> swap_pager: out of swap space
> swap_pager_getswapspace(16): failed
> pid 67 (fsck_ufs), uid 0, was killed: out of swap space
> fsck: /dev/ada0a: Killed: 9
> Script /etc/rc.d/fsck running
> Unknown error; help!
> ERROR: ABORTING BOOT (sending SIGTERM to parent)!
>
> The only way to continue was to do a full fsck (with no journal).
>
> This is a Sun Blade 100 (sparc64) with 768M of RAM.
> So the fsck is taking up all of this? That can't be right.
>
> What can I do to troubleshoot this further?


FWIW, I had this happen to me several weeks ago on FreeBSD/powerpc64 9-CURRENT. 
 I had to get the machine up and running so I simply abandoned use of SU+J and 
went back to using just UFS+SU.  (Not very helpful, I know, but there you go.)  
I figure it is likely to be some kind of endianness problem in the SU+J code, 
given the lack of complaints on FreeBSD/i386 and FreeBSD/amd64.

Cheers,

Paul.

PS: The system I was using is an Apple Xserve G5 with 4 GB of RAM and 5 GB of 
swap space.  As you say, surely fsck can't be using that much memory...