Re: Reason why "nocache" option is not displayed in "mount"?

2024-03-10 Thread Kirk McKusick
> Date: Sun, 10 Mar 2024 19:21:54 +0200
> From: Konstantin Belousov 
> To: Kirk McKusick 
> Cc: curr...@freebsd.org
> Subject: Re: Reason why "nocache" option is not displayed in "mount"?
> 
> On Sun, Mar 10, 2024 at 01:53:05AM +, Kirk McKusick wrote:
>> The issue has to do with how flags are defined in mount.h.
>> Specifically there are the flags that are externally visible
>> (prefixed with MNT_) and those that are for internal use
>> (prefixed with MNTK_, the K standing for KERNEL). If it
>> is desirable to have MNTK_NULL_NOCACHE visible, then it
>> should be renamed to MNT_NULL_CACHE, added to MNT_VISFLAGMASK,
>> and listed in MNTOPT_NAMES. It probably belongs in the set
>> described as `Flags set by internal operations, but visible
>> to the user.' With this change, it will be displayed by
>> the mount command and show up in the statfs flags.
> 
> There is no MNTK_NULL_NOCACHE flag in mnt_kern_flags.
> 
> When userspace communicates the "cache" or "nocache" option to the
> VFS_MOUNT() op for nullfs, it passes plain C string using the nmount(2)
> system call. The strings are explicitly queried by nullfs_mount(), mixed
> with the "default" sysctl, and then the nullfs-mount specific data flag
> is set, in mp->mnt_data.null_flag.
> 
> There is no space in the struct statfs for ABI extension.
> The getfsstat(2) system call cannot report arbitrary fs-specific options.
> 
> If somebody wants to uniformilly report fs-specific options, instead of
> scattered fs-specific hacks like MNT_SOFTDEP/MNT_GJOURNAL (UFS) and
> nfsstat -m (nfsclient), then some extension for nmount(2) is due,
> say MNT_QUERY_OP, which should be passed down to VFS_MOUNT() and back.

As you note there are some filesystem specific flags already in
mnt_flag that get copied to the statfs f_flags field. My point is
that the NOCACHE flag could be moved to mnt_flag and made visible
in the f_flags field. While it is currently specific to nullfs, it
might be useful to implement it in other filesystems.
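
A minimal sketch of the visible-flag idea discussed above. It
assumes a single flag word and made-up flag values; every EX_ name
is hypothetical, and the real definitions live in sys/mount.h. Only
bits covered by the visibility mask are reported to userland, and
only flags that have a name entry can be printed.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical values; the real flags are defined in sys/mount.h. */
#define EX_MNT_RDONLY      0x0001ULL  /* user-visible flag */
#define EX_MNT_NOCACHE     0x0002ULL  /* hypothetical visible "nocache" */
#define EX_MNTK_INTERNAL   0x0100ULL  /* stand-in for a kernel-only flag */
#define EX_MNT_VISFLAGMASK (EX_MNT_RDONLY | EX_MNT_NOCACHE)

struct ex_optname {
	uint64_t    flag;
	const char *name;
};

/* Name table in the spirit of MNTOPT_NAMES (entries are illustrative). */
static const struct ex_optname ex_optnames[] = {
	{ EX_MNT_RDONLY,  "read-only" },
	{ EX_MNT_NOCACHE, "nocache" },
};

/* Keep only the bits that may be shown to userland. */
static uint64_t
ex_visible_flags(uint64_t mnt_flag)
{
	return (mnt_flag & EX_MNT_VISFLAGMASK);
}

/* Write the names of the set flags into buf, separated by ", ". */
static char *
ex_flag_names(uint64_t f_flags, char *buf, size_t len)
{
	size_t i, used = 0;
	int n;

	assert(buf != NULL && len > 0);
	buf[0] = '\0';
	for (i = 0; i < sizeof(ex_optnames) / sizeof(ex_optnames[0]); i++) {
		if ((f_flags & ex_optnames[i].flag) == 0)
			continue;
		n = snprintf(buf + used, len - used, "%s%s",
		    used != 0 ? ", " : "", ex_optnames[i].name);
		if (n < 0 || (size_t)n >= len - used)
			break;	/* output truncated; stop here */
		used += (size_t)n;
	}
	return (buf);
}
```

In this sketch a flag outside the mask, such as EX_MNTK_INTERNAL,
never reaches the name table, which mirrors why an internal flag is
not displayed.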

Kirk McKusick



Re: Reason why "nocache" option is not displayed in "mount"?

2024-03-09 Thread Kirk McKusick
The issue has to do with how flags are defined in mount.h.
Specifically there are the flags that are externally visible
(prefixed with MNT_) and those that are for internal use
(prefixed with MNTK_, the K standing for KERNEL). If it
is desirable to have MNTK_NULL_NOCACHE visible, then it
should be renamed to MNT_NULL_CACHE, added to MNT_VISFLAGMASK,
and listed in MNTOPT_NAMES. It probably belongs in the set
described as `Flags set by internal operations, but visible
to the user.' With this change, it will be displayed by
the mount command and show up in the statfs flags.

Kirk McKusick



Re: fsck strange output

2021-01-26 Thread Kirk McKusick
> From: Rozhuk Ivan 
> Date: Tue, 26 Jan 2021 17:08:41 +0300
> To: Kirk McKusick 
> Cc: freebsd-current@freebsd.org
> Subject: Re: fsck strange output
> 
> On Mon, 25 Jan 2021 15:40:12 -0800
> Kirk McKusick  wrote:
> 
> 
>> Please try this patch to fsck_ffs and see if it fixes your problem.
>> 
>>  Kirk McKusick
>> 
>> =-=-=
>> 
>> *** sbin/fsck_ffs/inode.c.orig   2021-01-07 15:04:04.969086284 -0800
>> --- sbin/fsck_ffs/inode.c        2021-01-25 15:29:06.404803358 -0800
>> ***************
>> *** 611,618 ****
>>           sizeof(struct ufs1_dinode) : sizeof(struct ufs2_dinode));
>>       readpercg = inosused / fullcnt;
>>       partialcnt = inosused % fullcnt;
>> !     partialsize = partialcnt * ((sblock.fs_magic == FS_UFS1_MAGIC) ?
>> !         sizeof(struct ufs1_dinode) : sizeof(struct ufs2_dinode));
>>       if (partialcnt != 0) {
>>               readpercg++;
>>       } else {
>> --- 611,619 ----
>>           sizeof(struct ufs1_dinode) : sizeof(struct ufs2_dinode));
>>       readpercg = inosused / fullcnt;
>>       partialcnt = inosused % fullcnt;
>> !     partialsize = fragroundup(&sblock,
>> !         partialcnt * ((sblock.fs_magic == FS_UFS1_MAGIC) ?
>> !         sizeof(struct ufs1_dinode) : sizeof(struct ufs2_dinode)));
>>       if (partialcnt != 0) {
>>               readpercg++;
>>       } else {
> 
> 
> https://github.com/rozhuk-im/freebsd/commit/5e8bfa01830e2b6ecb88e572064c6fffe5a2df2d
> (if I apply correct :) )
> 
> With this patch - seems no errors, thanks!

Thanks for your testing. It has also corrected the same problem in
Peter Holm's test suite. So, I have committed it as 8c22cf9. I will
also ensure that it gets MFC'ed into 13.0.
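
For readers following the patch, here is a small sketch of the
arithmetic it changes: the size of the final, partial inode read is
rounded up to a fragment boundary. The helper names are
hypothetical, and the sketch assumes the fragment size is a power
of two; it is not the fsck_ffs code itself.

```c
#include <assert.h>
#include <stdint.h>

/* Round size up to the next multiple of fsize (a power of two). */
static uint64_t
ex_round_up_to_frag(uint64_t size, uint64_t fsize)
{
	assert(fsize != 0 && (fsize & (fsize - 1)) == 0);
	return ((size + fsize - 1) & ~(fsize - 1));
}

/* Bytes to read for the trailing partial group of inodes. */
static uint64_t
ex_partial_read_size(uint64_t partialcnt, uint64_t dinode_size,
    uint64_t fsize)
{
	return (ex_round_up_to_frag(partialcnt * dinode_size, fsize));
}
```

As an illustration with assumed numbers, five 256-byte inodes
occupy 1280 bytes, which becomes one 4096-byte fragment after
rounding, while 17 such inodes (4352 bytes) round up to 8192.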

Kirk McKusick
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: fsck strange output

2021-01-25 Thread Kirk McKusick
> From: Rozhuk Ivan 
> Date: Mon, 25 Jan 2021 23:29:33 +0300
> To: freebsd-current@freebsd.org
> Cc: Rozhuk Ivan 
> Subject: fsck strange output
> 
> Hi!
> 
> I am on fresh 13 and on auto fsck got:
> 
> Jan 25 23:14:13 3des kernel: Starting file system checks:
> Jan 25 23:14:13 3des kernel: /dev/gptid/81241708-8948-11e9-b1ae-049226c061d6: CANNOT READ BLK: 11072
> Jan 25 23:14:13 3des kernel: /dev/gptid/81241708-8948-11e9-b1ae-049226c061d6: UNEXPECTED SOFT UPDATE INCONSISTENCY; RUN fsck MANUALLY.
> Jan 25 23:14:13 3des kernel: File system preen failed, trying fsck -y -T ffs:-R,-r -T ufs:-R,-r
> Jan 25 23:14:13 3des kernel: ** /dev/gptid/81241708-8948-11e9-b1ae-049226c061d6
> Jan 25 23:14:13 3des kernel: ** Last Mounted on /
> Jan 25 23:14:13 3des kernel: ** Root file system
> Jan 25 23:14:13 3des kernel: ** Phase 1 - Check Blocks and Sizes
> Jan 25 23:14:13 3des kernel: 
> Jan 25 23:14:13 3des kernel: CANNOT READ BLK: 11072
> Jan 25 23:14:13 3des kernel: UNEXPECTED SOFT UPDATE INCONSISTENCY
> Jan 25 23:14:13 3des kernel: 
> Jan 25 23:14:13 3des kernel: CONTINUE? yes
> Jan 25 23:14:13 3des kernel: 
> Jan 25 23:14:13 3des kernel: THE FOLLOWING DISK SECTORS COULD NOT BE READ:
> Jan 25 23:14:13 3des kernel: 
> Jan 25 23:14:13 3des kernel: CANNOT READ BLK: 5129280
> Jan 25 23:14:13 3des kernel: UNEXPECTED SOFT UPDATE INCONSISTENCY
> Jan 25 23:14:13 3des kernel: 
> Jan 25 23:14:13 3des kernel: CONTINUE? yes
> Jan 25 23:14:13 3des kernel: 
> Jan 25 23:14:13 3des kernel: THE FOLLOWING DISK SECTORS COULD NOT BE READ:
> Jan 25 23:14:13 3des kernel: 
> Jan 25 23:14:13 3des kernel: CANNOT READ BLK: 6411520
> Jan 25 23:14:13 3des kernel: UNEXPECTED SOFT UPDATE INCONSISTENCY
> Jan 25 23:14:13 3des kernel: 
> Jan 25 23:14:13 3des kernel: CONTINUE? yes
> Jan 25 23:14:13 3des kernel: 
> Jan 25 23:14:13 3des kernel: THE FOLLOWING DISK SECTORS COULD NOT BE READ:
> Jan 25 23:14:13 3des kernel: 
> Jan 25 23:14:13 3des kernel: CANNOT READ BLK: 7693888
> Jan 25 23:14:13 3des kernel: UNEXPECTED SOFT UPDATE INCONSISTENCY
> Jan 25 23:14:13 3des kernel: 
> Jan 25 23:14:13 3des kernel: CONTINUE? yes
> 
> 
> Disk is 100% alive, got same on other HW.
> fsck -y - have no this strange problem with reading.
> 
> Is it OK "CANNOT READ BLK ..." ?
> 
> 
> From my rc.conf:
> fsck_y_enable="YES"   # Set to YES to do fsck -y if the initial preen fails.
> fsck_y_flags="-T ffs:-R,-r -T ufs:-R,-r" # Additional flags for fsck -y
> background_fsck="NO"  # Attempt to run fsck in the background where possible.

Please try this patch to fsck_ffs and see if it fixes your problem.

Kirk McKusick

=-=-=

*** sbin/fsck_ffs/inode.c.orig  2021-01-07 15:04:04.969086284 -0800
--- sbin/fsck_ffs/inode.c   2021-01-25 15:29:06.404803358 -0800
***************
*** 611,618 ****
            sizeof(struct ufs1_dinode) : sizeof(struct ufs2_dinode));
        readpercg = inosused / fullcnt;
        partialcnt = inosused % fullcnt;
!       partialsize = partialcnt * ((sblock.fs_magic == FS_UFS1_MAGIC) ?
!           sizeof(struct ufs1_dinode) : sizeof(struct ufs2_dinode));
        if (partialcnt != 0) {
                readpercg++;
        } else {
--- 611,619 ----
            sizeof(struct ufs1_dinode) : sizeof(struct ufs2_dinode));
        readpercg = inosused / fullcnt;
        partialcnt = inosused % fullcnt;
!       partialsize = fragroundup(&sblock,
!           partialcnt * ((sblock.fs_magic == FS_UFS1_MAGIC) ?
!           sizeof(struct ufs1_dinode) : sizeof(struct ufs2_dinode)));
        if (partialcnt != 0) {
                readpercg++;
        } else {


Re: r359627 is panicked with 'softdep_setup_blkfree: not free'

2020-04-09 Thread Kirk McKusick
> Date: Fri, 10 Apr 2020 11:17:39 +0900 (JST)
> To: freebsd-current@freebsd.org
> Cc: mckus...@mckusick.com
> Subject: Re: r359627 is panicked with 'softdep_setup_blkfree: not free'
> From: Masachika ISHIZUKA 
> 
>>>>>>>>>>I'm using r359627M. (r359627 with mount_udf2).
>>>>>>>>>>It is panicked with 'softdep_setup_blkfree: not free'.
>>>> 
>>>> Could you please try this workaround to see if it clears your panic:
>>>> 
>   [snip]
>>>> 
>>>> +  journaltrunc = 0;
>> 
>> Thanks for checking and sorry it did not help.
>> 
>> If you can bisect to find the commit that caused the problem to show
>> up or if you find a way to trigger the panic that would be very helpful
>> in tracking it down.
> 
>   Hi.
> 
>   I applied the above patch to r359627M and occured panic.
>   This was r359668 was not applied.
> 
>   I made buildkernel on r359766M and this looks work fine.
>   Thank you very much for your effort.
> 
> # The panic was occured to open/close tabs on firefox.
> -- 
> Masachika ISHIZUKA

Sounds like r359766M solved the problem. Thanks for your report.
Glad to hear that everything is back to working.

Kirk McKusick


Re: r359627 is panicked with 'softdep_setup_blkfree: not free'

2020-04-08 Thread Kirk McKusick
> Date: Wed, 08 Apr 2020 09:53:04 +0900 (JST)
> To: freebsd-current@freebsd.org
> Subject: Re: r359627 is panicked with 'softdep_setup_blkfree: not free'
> From: Masachika ISHIZUKA 
> 
> >>>>>I'm using r359627M. (r359627 with mount_udf2).
> >>>>>It is panicked with 'softdep_setup_blkfree: not free'.
> > 
> > Recently I've started observing issues like this too. For example, today
> 
>   As it was good working on r359197M, I'm booted old kernel(r359197M).
> % uname -UKa
> FreeBSD carrot.ish.org 13.0-CURRENT FreeBSD 13.0-CURRENT #1 r359197M: Sun Mar 
> 22 12:34:40 JST 2020 
> ishiz...@okra.ish.org:/usr/altlocal/freebsd-current/obj/usr/altlocal/freebsd-current/src/amd64.amd64/sys/GENERIC
>   amd64 1300084 1300088
> 
>   Although it is mismatch with kernel version and userland version,
> it's good working.
>   I'll use this old kernel for a while.
> -- 
> Masachika ISHIZUKA

Could you please try this workaround to see if it clears your panic:

*** sys/ufs/ffs/ffs_inode.c Wed Apr  8 21:57:31 2020
--- sys/ufs/ffs/ffs_inode.c.new Wed Apr  8 22:33:36 2020
***************
*** 244,249 ****
--- 244,250 ----
        needextclean = 0;
        softdeptrunc = 0;
        journaltrunc = DOINGSUJ(vp);
+       journaltrunc = 0;
        if (journaltrunc == 0 && DOINGSOFTDEP(vp) && length == 0)
                softdeptrunc = !softdep_slowdown(vp);
        extblocks = 0;

Thanks,

Kirk McKusick


Re: [PATCH] Typo in hastd.8 man page

2019-12-21 Thread Kirk McKusick
> Date: Sat, 21 Dec 2019 12:03:47 -0800
> From: Steve Kargl 
> To: freebsd-current@freebsd.org
> Subject: [PATCH] Typo in hastd.8 man page
> 
> Path should explain the issue.
> 
> Index: hastd.8
> ===
> --- hastd.8   (revision 355983)
> +++ hastd.8   (working copy)
> @@ -44,7 +44,7 @@
>  daemon is responsible for managing highly available GEOM providers.
>  .Pp
>  .Nm
> -allows the transpaent storage of data on two physically separated machines
> +allows the transparent storage of data on two physically separated machines
>  connected over a TCP/IP network.
>  Only one machine (cluster node) can actively use storage provided by
>  .Nm .
> 
> -- 
> Steve

Fixed in -r355995, thanks.

Kirk McKusick


Re: fsync: giving up on dirty, umount -f fails

2019-10-24 Thread Kirk McKusick
> From: "Bjoern A. Zeeb" 
> To: "FreeBSD Current" 
> Subject: fsync: giving up on dirty, umount -f fails
> Date: Thu, 24 Oct 2019 07:58:39 +
> 
> Hi,
> 
> I am archiving some old disks and while trying to umount [-f] them I am 
> getting errors and I basically cannot get rid of the mount anymore 
> without rebooting.  This is on a HEAD from mid-end-August (around 
> r351518M).
> 
> Given there is a lot of work going on at the moment to deal with 
> “disks dropping out by error” and not to panic I was just wondering 
> if this is something to address as well?   Somehow umount -f should be 
> able to succeed (in the future)?
> 
> 
> fsync: giving up on dirty (error = 5) 
> g_vfs_done():da0s2g[READ(offset=4666441728, length=16384)]error = 5
> 0xf803533b81e0: tag devfs, type VCHR
>  usecount 1, writecount 0, refcount 1661 rdev 0xf8015372a800
>  flags (VI_ACTIVE)
>  v_object 0xf80365537c00 ref 0 pages 8340 cleanbuf 1561 dirtybuf 97
>  lock type devfs: EXCL by thread 0xf80006a57000 (pid 26526, 
>  umount, tid 100091)
>  dev da0s2g
> 
> /bz

In the above example the unmount is failing because it is getting
back EIO for one of its dirty buffers. Thus it is not able to get
everything written out, so refuses to do the unmount.

What we are working on is implementing a `very forcible'
unmount (which I would love to specify using `umount -F', but
regrettably -F is already in use to specify an alternate fstab file).
A very forcible unmount says to simply abandon dirty buffers that
it cannot write.  In the event of a disk dying, that would be all
of the dirty buffers.
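
A toy model of the difference described above, under the
assumption that each buffer carries a fixed write result. All names
are hypothetical and this is not the kernel's buffer code: a forced
unmount still fails when a dirty buffer cannot be written, while a
very forcible one abandons such buffers.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* One cached buffer; write_error is what a write attempt would return. */
struct ex_buf {
	bool dirty;
	int  write_error;	/* 0 on success, else an errno value */
};

/*
 * Try to clean every dirty buffer before unmounting.
 * Returns 0 if the unmount may proceed, else the last write error.
 */
static int
ex_flush_for_unmount(struct ex_buf *bufs, size_t n, bool very_forcible)
{
	size_t i;
	int error = 0;

	assert(bufs != NULL || n == 0);
	for (i = 0; i < n; i++) {
		if (!bufs[i].dirty)
			continue;
		if (bufs[i].write_error == 0 || very_forcible) {
			/* Written out, or abandoned when very forcible. */
			bufs[i].dirty = false;
			continue;
		}
		error = bufs[i].write_error;	/* buffer stays dirty */
	}
	return (error);
}
```

With one buffer that always returns EIO, the first mode reports
EIO and leaves that buffer dirty; the second mode returns 0 after
discarding it.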

Kirk McKusick


Re: CURRENT: supeblock hash failure - CURRENT wrecking disks

2019-10-13 Thread Kirk McKusick
> Date: Wed, 7 Aug 2019 10:37:29 +0200
> From: "O. Hartmann" 
> To: freebsd-current 
> Subject: CURRENT: supeblock hash failure - CURRENT wrecking disks
> 
> Hello,
> 
> Today I ran into a ctastrophy with r350671. After installing a fresh
> compiled system and rebooted the box, UEFI loader dropped a bunch
> of errors, like some hex numbers stating, that a hash/superblock
> has is wrong and then the booting stopped at the OK loader prompt.
> 
> Rebooting the machine with the FreeBSD-13-CURRENT image from 1st
> August 2019 and trying to fsck the filesystem(s) on the boot SSD
> (UFS2, journaling and trim on), lots of unresolved block errors
> occured. But that didn't help much.  Further, after several checks,
> I saw some commits to the ffs code recently adn tried to restore a
> copy of the superblock of each filesystem (in contrary to the man
> page for fsck_ufs, the first backup superblock resides in 192, not
> 160!). But things then get even worse, it seems the whole /boot
> structure is corrupted, the loader can not find the recent kernel
> and kernel.old is crashing.
> 
> What's wrong here :-(
> 
> The box in question has been setup 6 weeks ago with FreeBSD 13-CURRENT
> natively. It is now a wreck. Other systems running CURRENT (as of
> the most recent revision as of today) were partially installed as
> 12-STABLE/12-CURRENT and "moved on" to what is now 13-CURRENT. They
> do not(!) indicate such problems reported.
> 
> Either I hit the crap installing a new system whilst there was a
> problem, or something really strange happened.
> 
> The bad thing is that kernel.old exits/dies with an exception and
> /boot/kernel/ seems to be completely corrupted. Tomorrow I try to
> install a prepared pkg tar arcive FreeBSD-kernel from a CURRENT pkg
> base and hope this will fix the issue.
> 
> Regards,
> 
> oh

The boot code checks the superblock hash and reports if it is wrong,
but ignores the error and continues to try and boot. The reason to
continue is to allow the system to come up so that the superblock
check hash can be fixed by running fsck. So your filesystem had
something more seriously wrong than just a bad superblock hash if
it could not be booted.

The fix in r350671 was to recompute the superblock check hash in a
place that I had missed earlier. I discovered the error when someone
reported getting superblock check hash errors when booting. But that
error did not cause their system to be unbootable for the reasons
that I explained in the previous paragraph.

If the filesystem started on 12-stable, then moving to 13 would not
have enabled superblock check hashes. They are only enabled when you
run fsck manually and explicitly answer yes to the request to add
superblock check hashes; running fsck -y will not add them.
Filesystems created on 13 will get superblock check hashes. But if you
boot a 13 filesystem using a 12-stable kernel, they will be disabled and
left disabled even if you boot the filesystem on 13 again.
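
The rules in the paragraph above can be restated as a tiny state
model. This is only an illustration: the types, function names, and
the use of a plain release number are assumptions, not the UFS
implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Whether superblock check hashes are enabled on a filesystem. */
struct ex_fs {
	bool sb_hash_enabled;
};

enum ex_fsck_mode {
	EX_FSCK_DASH_Y,			/* fsck -y: never adds them */
	EX_FSCK_INTERACTIVE_YES,	/* user answers yes to the request */
	EX_FSCK_INTERACTIVE_NO		/* user declines the request */
};

/* Filesystems created on 13 or later start with hashes enabled. */
static void
ex_newfs(struct ex_fs *fs, int release)
{
	assert(fs != NULL);
	fs->sb_hash_enabled = (release >= 13);
}

/* An older kernel disables the hashes; a newer one leaves them alone. */
static void
ex_boot(struct ex_fs *fs, int release)
{
	assert(fs != NULL);
	if (release < 13)
		fs->sb_hash_enabled = false;
}

/* Only an explicit yes in interactive mode on 13 enables them. */
static void
ex_fsck(struct ex_fs *fs, int release, enum ex_fsck_mode mode)
{
	assert(fs != NULL);
	if (release >= 13 && mode == EX_FSCK_INTERACTIVE_YES)
		fs->sb_hash_enabled = true;
}
```

Walking a filesystem through newfs on 13, a boot on 12, and a boot
on 13 again leaves the hashes disabled in this model until fsck is
run interactively and answered with yes.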

Thanks for pointing out the error on the fsck_ufs manual page. The first
backup superblock moved from 160 to 192 when the default block size was
raised from 16K to 32K. I have corrected the page in r350682.
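
The move from 160 to 192 can be reproduced with a short
calculation. The sketch below rests on assumptions that are not
stated in this thread: a UFS2 primary superblock at byte offset
65536, 8192 bytes reserved for it, 512-byte sectors, a fragment
size of one eighth of the block size, and a layout rule recalled
from newfs (round the end of the primary superblock up to a whole
block). Treat it as a worked example, not as the newfs source.

```c
#include <assert.h>
#include <stdint.h>

#define EX_SBLOCK_UFS2	65536	/* assumed offset of primary superblock */
#define EX_SBLOCKSIZE	8192	/* assumed bytes reserved for it */
#define EX_DEV_BSIZE	512	/* sector size used for reporting */

static uint64_t
ex_howmany(uint64_t x, uint64_t y)
{
	return ((x + y - 1) / y);
}

static uint64_t
ex_roundup(uint64_t x, uint64_t y)
{
	return (ex_howmany(x, y) * y);
}

/* Sector number of the first backup superblock. */
static uint64_t
ex_first_backup_sb(uint64_t bsize, uint64_t fsize)
{
	uint64_t frag, sblkno;

	assert(fsize != 0 && bsize >= fsize && bsize % fsize == 0);
	frag = bsize / fsize;		/* fragments per block */
	/* First fragment past the primary superblock, block aligned. */
	sblkno = ex_roundup(
	    ex_howmany(EX_SBLOCK_UFS2 + EX_SBLOCKSIZE, fsize), frag);
	return (sblkno * fsize / EX_DEV_BSIZE);
}
```

Under these assumptions, 16K blocks with 2K fragments give 40
fragments, or sector 160, and 32K blocks with 4K fragments give 24
fragments, or sector 192.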

Kirk McKusick


Re: CURRENT: supeblock hash failure - CURRENT wrecking disks

2019-10-13 Thread Kirk McKusick
> To: Enji Cooper 
> cc: "O. Hartmann" ,
> freebsd-current , mckus...@mckusick.com
> Subject: Re: CURRENT: supeblock hash failure - CURRENT wrecking disks
> From: "Poul-Henning Kamp" 
> 
> In message <39fb31e6-a8ec-484c-b297-39c19a787...@gmail.com>, Enji Cooper writes:
> 
> There is an "interesting" failure-mechanism when you move a disk
> between 13/current and older systems which do not support ufs-hashes.
> 
> It will be prudent to make 11 and 12 clear the "use hashes" flags
> in the superblocks of all filesystems they mount R/W, to limit
> the amount havoc this will cause when people start playing with 13.
> 
> -- 
> Poul-Henning Kamp   | UNIX since Zilog Zeus 3.20
> p...@freebsd.org | TCP/IP since RFC 956
> FreeBSD committer   | BSD since 4.3-tahoe
> Never attribute to malice what can adequately be explained by incompetence.

Both stable-11 and stable-12 clear the "use hashes" flags. If the disk
is moved back to a 13-head system they remain disabled until reenabled
by running fsck in interactive mode and requesting that they be enabled.

Kirk McKusick


Re: Checking out the CSRG repository?

2019-06-19 Thread Kirk McKusick
> From: Alan Somers 
> Date: Wed, 19 Jun 2019 14:12:21 -0600
> Subject: Checking out the CSRG repository?
> To: FreeBSD CURRENT 
> 
> Does anybody know how to check out a local copy of the CSRG
> repository?  I can view it with ViewVC, but I would really like local
> access.  It doesn't seem to be available on the usual repo.FreeBSD.org
> or svn.FreeBSD.org.
> 
> $ svn checkout https://svn.FreeBSD.org/csrg csrg
> svn: E170013: Unable to connect to a repository at URL
> 'https://svn.freebsd.org/csrg'
> svn: E175009: The XML response contains invalid XML
> svn: E130003: Malformed XML: no element found at line 1
> 
> $ svn co svn+ssh://asom...@repo.freebsd.org/csrg csrg
> svn: E170013: Unable to connect to a repository at URL
> 'svn+ssh://asom...@repo.freebsd.org/csrg'
> svn: E210005: No repository found in 'svn+ssh://asom...@repo.freebsd.org/csrg'
> 
> -Alan

You can browse the history at http://svnweb.freebsd.org/csrg/

The repository is also available via FTP:

 ftp://ftp.freebsd.org/pub/FreeBSD/development/CSRG/csrg_svn.tbz

Hope this helps,

Kirk McKusick


Re: Panic in getblkx() booting from disc1.iso in Qemu VM

2018-12-20 Thread Kirk McKusick
Thanks Rebecca for the report and Mark for the analysis of the problem.
This should be fixed in -r342290.

Kirk McKusick

=-=-=

From:Kirk McKusick 
Date:Fri, 21 Dec 2018 01:09:25 + (UTC)
To:  src-committ...@freebsd.org, svn-src-...@freebsd.org,
 svn-src-h...@freebsd.org
Subject: svn commit: r342290 - head/sys/kern

Author: mckusick
Date: Fri Dec 21 01:09:25 2018
New Revision: 342290
URL: https://svnweb.freebsd.org/changeset/base/342290

Log:
  Some filesystems (like cd9660 and ext3) require that VFS_STATFS()
  be called before VFS_ROOT() is called. Move the call for VFS_STATFS()
  so that it is done after VFS_MOUNT(), but before VFS_ROOT().
  This change actually improves the robustness of the mount system
  call because it returns an error rather than failing silently
  when VFS_STATFS() returns failure.
  
  Reported by:  Rebecca Cran 
  Sponsored by: Netflix

Modified:
  head/sys/kern/vfs_mount.c

Modified: head/sys/kern/vfs_mount.c
==
--- head/sys/kern/vfs_mount.c   Thu Dec 20 22:39:58 2018(r342289)
+++ head/sys/kern/vfs_mount.c   Fri Dec 21 01:09:25 2018(r342290)
@@ -895,6 +895,7 @@ vfs_domount_first(
 */
error1 = 0;
if ((error = VFS_MOUNT(mp)) != 0 ||
+   (error1 = VFS_STATFS(mp, &mp->mnt_stat)) != 0 ||
(error1 = VFS_ROOT(mp, LK_EXCLUSIVE, &newdp)) != 0) {
if (error1 != 0) {
error = error1;
@@ -916,7 +917,6 @@ vfs_domount_first(
vfs_freeopts(mp->mnt_opt);
mp->mnt_opt = mp->mnt_optnew;
*optlist = NULL;
-   (void)VFS_STATFS(mp, &mp->mnt_stat);
 
/*
 * Prevent external consumers of mount options from reading mnt_optnew.
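
The control flow that the change above sets up can be sketched in
isolation. The stub operations and the struct below are
hypothetical stand-ins for VFS_MOUNT(), VFS_STATFS(), and
VFS_ROOT(); the point is only that the statfs step now sits in the
short-circuit chain, so its failure is returned and the root step
is skipped.

```c
#include <assert.h>
#include <stddef.h>

/* Canned return values for the three steps, plus a call marker. */
struct ex_ops {
	int mount_rv;
	int statfs_rv;
	int root_rv;
	int root_called;
};

static int
ex_mount(struct ex_ops *o)
{
	return (o->mount_rv);
}

static int
ex_statfs(struct ex_ops *o)
{
	return (o->statfs_rv);
}

static int
ex_root(struct ex_ops *o)
{
	o->root_called = 1;
	return (o->root_rv);
}

/* Mount, then statfs, then root; stop at the first failure. */
static int
ex_domount_first(struct ex_ops *o)
{
	int error, error1;

	assert(o != NULL);
	error1 = 0;
	if ((error = ex_mount(o)) != 0 ||
	    (error1 = ex_statfs(o)) != 0 ||
	    (error1 = ex_root(o)) != 0) {
		if (error1 != 0)
			error = error1;
		return (error);
	}
	return (0);
}
```

In this sketch a statfs failure of 5 makes ex_domount_first()
return 5 without ever calling ex_root().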



Re: enabling inode hashes results in kernel panics

2018-12-13 Thread Kirk McKusick
As the person responsible for adding the inode check-hashes, sorry
for the problems that they are causing you.

Gary, I may want your crash dumps and the core.txt files, but let
me do some preliminary investigation of fsck to see if I can figure
out why it is failing to fix the inode check-hashes.

Are you running with soft updates or journaled soft updates?

Is the problem that fsck is not finding that there are bad check-hashes,
or is fsck finding them and not fixing them? Since fsck and fsdb share
the same code for reading and updating inodes, it is odd that fsdb
fixes the check-hashes, but the same code running in fsck does not.

Kirk McKusick


Re: CFT: TRIM Consolodation on UFS/FFS filesystems

2018-08-20 Thread Kirk McKusick
From: Kirk McKusick 
To: FreeBSD Current ,
FreeBSD Filesystems 
Subject: CFT: TRIM Consolodation on UFS/FFS filesystems
Date: Mon, 20 Aug 2018 12:40:56 -0700

Oops, forgot that attachments get stripped. Below are the diffs for
gathering statistics. Sorry to those of you on Gmail for whom they
will be mangled.

Kirk McKusick

=-=-=

Index: sbin/mount/mount.c
===
--- sbin/mount/mount.c  (revision 338054)
+++ sbin/mount/mount.c  (working copy)
@@ -686,6 +686,18 @@ prmount(struct statfs *sfp)
for (i = 0; i < sizeof(sfp->f_fsid); i++)
printf("%02x", ((u_char *)&sfp->f_fsid)[i]);
}
+   if (sfp->f_trim_total != 0 || sfp->f_trim_total_blks != 0)
+   (void)printf(", TRIM: total %ju total blocks %ju",
+   (uintmax_t)sfp->f_trim_total,
+   (uintmax_t)sfp->f_trim_total_blks);
+   if (sfp->f_trim_inflight != 0 || sfp->f_trim_inflight_blks != 0)
+   (void)printf(", TRIM: inflight %ju inflight blocks %ju",
+   (uintmax_t)sfp->f_trim_inflight,
+   (uintmax_t)sfp->f_trim_inflight_blks);
+   if (sfp->f_pendingblks != 0 || sfp->f_pendingfiles != 0)
+   (void)printf(", pending blocks %ju, pending files %ju",
+   (uintmax_t)sfp->f_pendingblks,
+   (uintmax_t)sfp->f_pendingfiles);
}
(void)printf(")\n");
 }
Index: sys/sys/mount.h
===
--- sys/sys/mount.h (revision 338054)
+++ sys/sys/mount.h (working copy)
@@ -85,7 +85,13 @@ struct statfs {
uint64_t f_asyncwrites; /* count of async writes since mount */
uint64_t f_syncreads;   /* count of sync reads since mount */
uint64_t f_asyncreads;  /* count of async reads since mount */
-   uint64_t f_spare[10];   /* unused spare */
+   uint64_t f_trim_total;  /* count of TRIM ops since mount */
+   uint64_t f_trim_total_blks; /* count of TRIM blocks since mount */
+   uint64_t f_trim_inflight;   /* count of TRIM ops in progress */
+   uint64_t f_trim_inflight_blks;  /* count of TRIM blocks in progress */
+   int64_t  f_pendingblks; /* pending free blocks */
+   int64_t  f_pendingfiles;/* pending free nodes */
+   uint64_t f_spare[4];/* unused spare */
uint32_t f_namemax; /* maximum filename length */
uid_t f_owner;  /* user that mounted the filesystem */
fsid_tf_fsid;   /* filesystem id */
Index: sys/ufs/ffs/ffs_vfsops.c
===
--- sys/ufs/ffs/ffs_vfsops.c(revision 338081)
+++ sys/ufs/ffs/ffs_vfsops.c(working copy)
@@ -1398,7 +1398,13 @@ ffs_statfs(mp, sbp)
sbp->f_bsize = fs->fs_fsize;
sbp->f_iosize = fs->fs_bsize;
sbp->f_blocks = fs->fs_dsize;
+   sbp->f_pendingblks = dbtofsb(fs, fs->fs_pendingblocks);
+   sbp->f_pendingfiles = fs->fs_pendinginodes;
UFS_LOCK(ump);
+   sbp->f_trim_total = ump->um_trim_total;
+   sbp->f_trim_total_blks = ump->um_trim_total_blks;
+   sbp->f_trim_inflight = ump->um_trim_inflight;
+   sbp->f_trim_inflight_blks = ump->um_trim_inflight_blks;
sbp->f_bfree = fs->fs_cstotal.cs_nbfree * fs->fs_frag +
fs->fs_cstotal.cs_nffree + dbtofsb(fs, fs->fs_pendingblocks);
sbp->f_bavail = freespace(fs, fs->fs_minfree) +
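
One detail of the mount.h hunk above is that six 64-bit fields
replace six of the ten spare slots, so the structure keeps its size
and the fields after the spare area keep their offsets. The cut-down
structures below are hypothetical (only a few neighbouring fields
are reproduced), but they let a compiler confirm that arithmetic.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Cut-down "before" layout: ten spare 64-bit slots. */
struct ex_statfs_old {
	uint64_t f_asyncreads;
	uint64_t f_spare[10];
	uint32_t f_namemax;
};

/* Cut-down "after" layout: six new fields and four spare slots. */
struct ex_statfs_new {
	uint64_t f_asyncreads;
	uint64_t f_trim_total;
	uint64_t f_trim_total_blks;
	uint64_t f_trim_inflight;
	uint64_t f_trim_inflight_blks;
	int64_t  f_pendingblks;
	int64_t  f_pendingfiles;
	uint64_t f_spare[4];
	uint32_t f_namemax;
};

/* Checked at compile time: same size, same offset for later fields. */
_Static_assert(sizeof(struct ex_statfs_old) ==
    sizeof(struct ex_statfs_new), "structure size changed");
_Static_assert(offsetof(struct ex_statfs_old, f_namemax) ==
    offsetof(struct ex_statfs_new, f_namemax), "f_namemax moved");
```

If a seventh field were added without shrinking f_spare further,
both assertions would fail at compile time.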



CFT: TRIM Consolodation on UFS/FFS filesystems

2018-08-20 Thread Kirk McKusick
I have recently added TRIM consolidation support for the UFS/FFS
filesystem. This feature consolidates large numbers of TRIM commands
into a much smaller number of commands covering larger blocks of
disk space. It is best described by the commit message:

  Author: mckusick
  Date: Sun Aug 19 16:56:42 2018
  New Revision: 338056
  URL: https://svnweb.freebsd.org/changeset/base/338056

  Log:
Add consolodation of TRIM / BIO_DELETE commands to the UFS/FFS filesystem.

When deleting files on filesystems that are stored on flash-memory
(solid-state) disk drives, the filesystem notifies the underlying
disk of the blocks that it is no longer using. The notification
allows the drive to avoid saving these blocks when it needs to
flash (zero out) one of its flash pages. These notifications of
no-longer-being-used blocks are referred to as TRIM notifications.
In FreeBSD these TRIM notifications are sent from the filesystem
to the drive using the BIO_DELETE command.

Until now, the filesystem would send a separate message to the drive
for each block of the file that was deleted. Each Gigabyte of file
size resulted in over 3000 TRIM messages being sent to the drive.
This burst of messages can overwhelm the drive's task queue causing
multiple second delays for read and write requests.

This implementation collects runs of contiguous blocks in the file
and then consolodates them into a single BIO_DELETE command to the
drive. The BIO_DELETE command describes the run of blocks as a
single large block being deleted. Each Gigabyte of file size can
result in as few as two BIO_DELETE commands and is typically less
than ten.  Though these larger BIO_DELETE commands take longer to
run, they do not clog the drive task queue, so read and write
commands can intersperse effectively with them.

Though this new feature has been throughly reviewed and tested, it
is being added disabled by default so as to minimize the possibility
of disrupting the upcoming 12.0 release. It can be enabled by running
``sysctl vfs.ffs.dotrimcons=1''. Users are encouraged to test it.
If no problems arise, we will consider requesting that it be enabled
by default for 12.0.

Reviewed by:  kib
Tested by:Peter Holm
Sponsored by: Netflix

This support is off by default, but I am hoping that I can get enough
testing to establish that it (a) works and (b) is helpful, so that it
will be reasonable to have it turned on by default in 12.0. The cutoff
for turning it on by default in 12.0 is September 19th. So I am
requesting your testing feedback in the near term. Please let me know
if you have managed to use it successfully (or not) and also if it
made any performance difference (good or bad).

To enable TRIM consolidation either use `sysctl vfs.ffs.dotrimcons=1'
or just set the `dotrimcons' variable in sys/ufs/ffs/ffs_alloc.c to 1.

Everything you need to test TRIM consolidation is obtained by setting
the above sysctl. However, if you want to collect statistics on how
effectively the TRIM consolidation is working, the attached diff will
allow you to easily get statistics on how the TRIM is going. Compile your
kernel and the mount command. Note that if you do not do a buildworld,
you will need to copy /sys/sys/mount.h to /usr/include/sys/mount.h to
get the patched mount command to compile. Then run `mount -v'
(or `mount -v | grep /mnt' to get just the statistics for /mnt).

Removing a 30MB file without TRIM consolidation:
/dev/md0 on /mnt (ufs, local, writes: sync 10 async 482, reads: sync 7 async 0, fsid d43f795b6a7d34fb, TRIM: total 952 total blocks 7616)

While removing the same file with TRIM consolidation:
/dev/md0 on /mnt (ufs, local, writes: sync 10 async 482, reads: sync 7 async 0, fsid d43f795b6a7d34fb, TRIM: total 3 total blocks 7616)
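
In the two outputs above, the same 7616 blocks are covered by 952
TRIMs (8 blocks each) in the first case and by 3 in the second. The
general idea of merging contiguous blocks into runs can be sketched
as follows. This is a generic illustration with hypothetical names,
assuming the block numbers arrive sorted and without duplicates; it
is not the code in ffs_alloc.c.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One contiguous run of blocks: [start, start + len). */
struct ex_run {
	uint64_t start;
	uint64_t len;
};

/*
 * Merge sorted block numbers into contiguous runs.
 * Returns the number of runs stored (at most maxruns).
 */
static size_t
ex_coalesce(const uint64_t *blocks, size_t n, struct ex_run *runs,
    size_t maxruns)
{
	size_t i, nruns = 0;

	assert(blocks != NULL || n == 0);
	assert(runs != NULL || maxruns == 0);
	for (i = 0; i < n; i++) {
		/* Extend the current run if this block follows it. */
		if (nruns > 0 && runs[nruns - 1].start +
		    runs[nruns - 1].len == blocks[i]) {
			runs[nruns - 1].len++;
			continue;
		}
		if (nruns == maxruns)
			break;		/* no room for another run */
		runs[nruns].start = blocks[i];
		runs[nruns].len = 1;
		nruns++;
	}
	return (nruns);
}
```

Each resulting run would correspond to one delete request covering
len blocks, so the blocks 10, 11, 12, 20, 21, 40 need three
requests instead of six.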

It also tracks pending blocks and pending files. These numbers are only
printed out when they are non-zero. Here is an example running with soft
updates right after a file has been rm'ed, but its blocks not yet released:
/dev/md0 on /mnt (ufs, local, soft-updates, writes: sync 2 async 251, reads: sync 5 async 0, fsid 303f795b1be0c459, pending blocks 7616, pending files 1)

Finally it tracks inflight BIO_DELETEs and total blocks represented by
those inflight BIO_DELETEs. These numbers are also only printed out when
they are non-zero. These statistics let you see how much of a backlog
of BIO_DELETEs you have backed up at/in the disk drive and you can track
how quickly they drain.

Kirk McKusick



Re: kernel: failed: cg 5, cgp: 0xd11ecd0d != bp: 0x63d3ff1d

2018-02-23 Thread Kirk McKusick
> From: "Chris H" <bsd-li...@bsdforge.com>
> Reply-To: bsd-li...@bsdforge.com
> To: "FreeBSD Current" <freebsd-current@freebsd.org>
> Subject: kernel: failed: cg 5, cgp: 0xd11ecd0d != bp: 0x63d3ff1d
> Date: Mon, 19 Feb 2018 14:18:15 -0800
> 
> I'm seeing a number of messages like the following:
> kernel: failed: cg 5, cgp: 0xd11ecd0d != bp: 0x63d3ff1d
> 
> and was wondering if it's anything to be concerned with, or whether
> fsck(8) is fixing them.
> This began to happen when the power went out on a new install:
> FreeBSD dns0 12.0-CURRENT FreeBSD 12.0-CURRENT #0: Wed Dec 13 06:07:59 PST 
> 2017
> root@dns0:/usr/obj/usr/src/amd64.amd64/sys/DNS0 amd64
> which hadn't yet been hooked up to the UPS.
> I performed an fsck in single user mode upon power-up, which ended with the
> mount points being marked CLEAN. I was asked if I wanted to use the JOURNAL.
> I answered Y.
> FWIW the systems are UFS2 (ffs), have gpart labels, and were newfs'd thusly:
> newfs -U -j
> 
> Thank you for all your time, and consideration.
> 
> --Chris

This problem should have been fixed with this commit:

r328914 | mckusick | 2018-02-05 16:19:46 -0800 (Mon, 05 Feb 2018)

You need to update your kernel to get the fix.

Kirk McKusick


Re: VM images for 12.0-CURRENT showing checksum failed messages

2017-10-18 Thread Kirk McKusick
> Date: Wed, 18 Oct 2017 16:40:22 +
> From: Glen Barber <g...@freebsd.org>
> To: John Baldwin <j...@freebsd.org>
> Cc: freebsd-current@freebsd.org, David Boyd <david.boy...@twc.com>,
> "mckus...@mckusick.com" <mckus...@mckusick.com>
> Subject: Re: VM images for 12.0-CURRENT showing checksum failed messages
> 
> On Wed, Oct 18, 2017 at 09:28:40AM -0700, John Baldwin wrote:
>> On Wednesday, October 18, 2017 03:01:55 PM Glen Barber wrote:
>>> On Wed, Oct 18, 2017 at 07:49:00AM -0700, John Baldwin wrote:
>>>> On Tuesday, October 17, 2017 11:57:44 AM David Boyd wrote:
>>>>> The FreeBSD-12.0-CURRENT-amd64-20171012-r324542.vmdk image displays
>>>>> many checksum failed messages when booted. (see attachment).
>>>>>
>>>>> I think this started about 20170925.
>>>>>
>>>>> I have VirtualBox VM's running 10.4-STABLE, 11.1-STABLE and 12.0-
>>>>> CURRENT.
>>>>>
>>>>> Only the 12.0-CURRENT image exhibits this behavior.
>>>>>
>>>>> This is easily fixed by "fsck -y /" in single-user mode during the boot
>>>>> process.
>>>>>
>>>>> I can test any updates at almost any time.
>>>>
>>>> I wonder if the tool creating the snapshot images wasn't updated to
>>>> generate cg checksums when creating the initial filesystem.  Glen,
>>>> do you know which tool (makefs or something else?) is used to
>>>> generate the UFS filesystem in VM images for snapshots?
>>>> (In this case it appears to be a .vmdk image)
>>>>
>>>
>>> mkimg(1) is used.
>>
>> Does makefs generate the UFS image fed into mkimg or does mkimg generate the
>> UFS partition itself?
> 
> Sorry, I may have understated a bit.
> 
> First, mdconfig(8) is used to create a md(4)-backed disk, onto which
> newfs(8) is run, followed by the installworld/installkernel targets.
> 
> Next, mkimg(1) is used to feed the resultant md(4)-based .img
> filesystem (after umount(8)) to create the final output image.
> 
> Glen

Glen,

Can you try running fsck on the md(4) disk after you do the unmount to
see if it finds any problems (`fsck /dev/md0')? If that comes up clean
(as it should), then I can investigate what it is about mkimg that causes
problems. If fsck finds problems, then there is an issue in the base UFS
infrastructure.

Kirk McKusick


Re: The futur of the roff toolchain

2017-05-23 Thread Kirk McKusick
Thanks for all your work on this project. As I still use roff for
our book and for many of my presentations, it is a topic of interest
to me. That said, I am fine with roff dropping out of base as I can
easily enough bring it in from ports. And I am curious to try using
heirloom doctools on our book to see if it works. We do some pretty
evil things with diversions, so I can easily believe that it will
not work. But it would be great if it does work, because the groff
in base has some bugs that are annoying to work around.

~Kirk


Re: Followup on packaging base with pkg(8)

2016-05-19 Thread Kirk McKusick
Glen,

I realize that you have put an enormous amount of effort into
getting the packaging of base with pkg(8) into the 11.0 release
and am sorry to hear that it needs to be delayed. But having
watched the mailing lists during these efforts I realize that
it is a much more difficult problem than it would at first
appear to be. Thank you for your efforts to date and I look
forward to the transition (hopefully in the 11.1 release) as
I believe it will be a huge step forward.

Kirk McKusick


Re: panic "ffs_checkblk: bad block" on recent -head kernels

2015-12-03 Thread Kirk McKusick
> Date: Thu, 3 Dec 2015 23:47:52 +0100
> From: Mateusz Guzik <mjgu...@gmail.com>
> To: Rick Macklem <rmack...@uoguelph.ca>
> Cc: FreeBSD Current <freebsd-current@freebsd.org>
> Subject: Re: panic "ffs_checkblk: bad block" on recent -head kernels
> 
> On Thu, Dec 03, 2015 at 05:08:27PM -0500, Rick Macklem wrote:
>> Hi,
>> 
>> I get a fairly reproducible panic when doing a full kernel build
>> on a 256Mbyte single core i386 when running recent kernels from -head.
>> 
>> The panic is "ffs_checkblk: bad block ..". I don't actually have the
>> block # (although I think it's just 0xfff, given the backtrace),
>> because it runs off the screen. (I looked up the message via the debugger
>> from the first arg. to panic.)
>> 
>> Here's the backtrace without all the numbers:
>> panic(c14f4b55, , , 0, 64,...)
>> ffs_checkblk(, 8000, fff9c, , c4a02454,...)
>> ffs_reallocblks
>> VOP_REALLOCBLKS_APV
>> cluster_write
>> ffs_write
>> VOP_WRITE_APV
>> vn_write
>> vn_io_fault_doio
>> vn_io_fault1
>> vn_io_fault
>> dofilewrite
>> kern_writev
>> sys_write
>> syscall
>> 
>> It doesn't happen on a kernel dated Sep. 30, but does happen on a Nov. 30 one.
>> (I was away from home, so I didn't upgrade kernels for 2 months.)
>> 
>> I am slowly doing a binary search for the first kernel rev. where it occurs,
>> but since each build takes hours, it's going to take a while;-).
>> 
>> At this point, it doesn't appear to happen on r289278 (just before jeff@'s
>> buffer cache patch).
>> With kernels between r289279-->r290480, I get into the "R" state that
>> was fixed by r290481 before I get a crash.
>> I tried reverting r289405 and r290047 from a recent kernel and the crashes
>> still occurred, so it doesn't appear to be these commits.
>> 
>> I am currently testing r290481 to see if the crash occurs for this rev.
>> 
>> If anyone has some insight into which commit might cause this,
>> please let me know.
> 
> Well, did it crash with r291460 or later?
> 
> If so, try the kernel just before that and if that helps, try:
> 
> diff --git a/sys/kern/vfs_subr.c b/sys/kern/vfs_subr.c
> index ff37de8..0ad6ef7 100644
> --- a/sys/kern/vfs_subr.c
> +++ b/sys/kern/vfs_subr.c
> @@ -2783,6 +2783,7 @@ _vdrop(struct vnode *vp, bool locked)
> vp->v_op = NULL;
>  #endif
> bzero(&vp->v_un, sizeof(vp->v_un));
> +   vp->v_lasta = vp->v_clen = vp->v_cstart = vp->v_lastw = 0;
> vp->v_iflag = 0;
> vp->v_vflag = 0;
> bo->bo_flag = 0;
> 
> -- 
> Mateusz Guzik 

I concur with trying this suggestion. Starting with r291460 these
fields were no longer zeroed when allocating the vnode. So you may
have some residual values in there that are causing trouble.
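
As a rough illustration of that failure mode, the userland sketch below
models an object that is recycled without being zeroed on allocation, so
that clustering hints from its previous use survive unless they are
cleared on release, which is what the proposed patch does in _vdrop().
This is not the kernel code: struct fake_vnode, vn_alloc() and
vn_release() are hypothetical names invented for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the vnode clustering fields named in the patch. */
struct fake_vnode {
	long v_cstart;	/* start block of the current cluster */
	long v_lasta;	/* last allocated block */
	long v_lastw;	/* last written block */
	int  v_clen;	/* length of the current cluster */
	int  in_use;
};

/* One-slot pool, so every allocation recycles the same object. */
static struct fake_vnode pool[1];

/* Allocation deliberately does NOT zero the object (assumed behaviour). */
struct fake_vnode *
vn_alloc(void)
{
	assert(!pool[0].in_use);
	pool[0].in_use = 1;
	return (&pool[0]);
}

/* Release the object; optionally reset the hints as the patch does. */
void
vn_release(struct fake_vnode *vp, int clear_hints)
{
	assert(vp != NULL && vp->in_use);
	if (clear_hints)
		vp->v_lasta = vp->v_clen = vp->v_cstart = vp->v_lastw = 0;
	vp->in_use = 0;
}
```

With clear_hints set to 0, a stale v_cstart left by the previous user is
still visible after the next vn_alloc(); with it set to 1, the fields
start out at zero again.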

Kirk McKusick


Re: Kernel crash during heavy disk access

2013-07-10 Thread Kirk McKusick
 Date: Tue, 9 Jul 2013 18:29:01 -0700
 Subject: Re: Kernel crash during heavy disk access
 From: Adrian Chadd adr...@freebsd.org
 To: Benjamin Kaduk b...@freebsd.org, Jeff Roberson j...@freebsd.org,
 Kirk McKusick mckus...@mckusick.com
 Cc: Eric Camachat eric.camac...@gmail.com, curr...@freebsd.org
 
 Well, best to tell kirk and jeffr.
 
 Jeffr wrote the journaling stuff.
 
 .. but I thought they knew there's still problems?
 
 -adrian

Jeff has fixed all the journaling issues for which we have some way
of reproducing them. We do still have some reports that there are
problems but only a vague description and nothing that we can use
to reproduce them on our systems.

One of the inherent characteristics of any type of journaling is that
once it thinks that it has fixed something, it never goes back and
checks it again later. So, if there is some inconsistency that gets
into your filesystem through media error or an earlier journaling bug,
it will stay there and continue to plague you until a full fsck is
run to clean it up. So, if you are getting filesystem related crashes,
the first thing you should do is a full (fsck -f) check to make sure
that you are starting from a clean state. After that, if you find that
the journaling is not keeping it consistent, please send Jeff and me
a report of what you are doing, what problems it creates, and, most
importantly, a transcript of a run of `fsck_ffs -d', first using the
journal and then a second time with a full check (fsck_ffs -f -d),
so that we can try to analyse what is going wrong. Note that you
need to run fsck_ffs explicitly because the fsck front end will not
pass the -d (debug output) flag through to fsck_ffs.

Kirk McKusick

 On 9 July 2013 17:48, Benjamin Kaduk b...@freebsd.org wrote:
 On Tue, 9 Jul 2013, Adrian Chadd wrote:

 On 9 July 2013 09:24, Eric Camachat eric.camac...@gmail.com wrote:

 On Mon, 2013-07-08 at 23:05 -0700, Adrian Chadd wrote:

 Hi,

 Try doing a full, non-journal fsck.

 -adrian


 Thank you, it fixed the problem!
 Does it mean journal didn't work?


 Yup :(


 So, you are going to tell Kirk about it?

 -Ben


Re: A PRIV_* flag for /dev/mem?

2013-06-16 Thread Kirk McKusick
 Date: Sat, 15 Jun 2013 17:23:50 -0600
 From: Jamie Gritton ja...@freebsd.org
 To: FreeBSD Current freebsd-current@FreeBSD.org
 CC: Kirk McKusick mckus...@mckusick.com,
 Konstantin Belousov kostik...@gmail.com,
 Alexander Leidinger netch...@freebsd.org,
 Pawel Jakub Dawidek p...@freebsd.org,
 Robert Watson rwat...@freebsd.org
 Subject: Re: A PRIV_* flag for /dev/mem?
 
 On 05/20/13 16:56, Kirk McKusick wrote:
 I pointed Robert and Pawel at your discussion on creating a new
 PRIV_KMEM and adding a check for it in memopen(). I am of the opinion
 that this is a good idea, but I am hoping that one of Robert or Pawel
 will comment since they are much more active in this area.
 
 I suppose it's safe to say further comment isn't forthcoming. So with
 one vote for and one against (or at least questioning), I'll humbly
 leave it up to myself to be the tie-breaker :-).
 
 Here's a proposed patch. I separate kmem access into read and write, as
 I saw other similar splits in the priv list. Perhaps that's overkill,
 and I can use a single PRIV_KMEM instead of PRIV_KMEM_READ and
 PRIV_KMEM_WRITE.
 
 Perhaps this is an overreach, because PRIV_KMEM_READ is used where the
 default isn't root privilege: the file permission and expected usage are
 group kmem gets to read /dev/[k]mem. I'm not about to go hard-coding a
 gid into the kernel, so it seems the proper thing to do (not included in
 the patch) would be to allow PRIV_KMEM_READ by default. I thought there
 might already be such cases where the default is to allow, but no: this
 would be the first default-allow permission. So perhaps the best answer
 is not worry about that one, and only add PRIV_KMEM_WRITE (leaving reads
 controlled by file permission alone as they are now).
 
 - Jamie

With the fix for the error noted by Kostik, I concur with your
proposed change.

Kirk McKusick


Re: A PRIV_* flag for /dev/mem?

2013-05-20 Thread Kirk McKusick
I pointed Robert and Pawel at your discussion on creating a new
PRIV_KMEM and adding a check for it in memopen(). I am of the opinion
that this is a good idea, but I am hoping that one of Robert or Pawel
will comment since they are much more active in this area.

Kirk McKusick


Re: [PATCH] Fix sbin/fsdb/fsdbutil.c for r247212

2013-03-02 Thread Kirk McKusick
 Date: Sun, 24 Feb 2013 22:41:21 +0300
 Subject: Re: [PATCH] Fix sbin/fsdb/fsdbutil.c for r247212
 From: Sergey Kandaurov pluk...@freebsd.org
 To: David Wolfskill da...@catwhisker.org
 Cc: curr...@freebsd.org, Kirk McKusick mckus...@freebsd.org
 
 On 24 February 2013 19:25, David Wolfskill da...@catwhisker.org wrote:
 On Sun, Feb 24, 2013 at 07:05:34AM -0800, David Wolfskill wrote:
 ...hine was:
 Simple patch attached; world is still building, but at least it got
 through the make dependencies phase this time.
 ...

 That was incomplete, as it didn't (also) address the change to
 getdatablk().

 The attached patch actually made it through buildworld.

 Note that it is entirely possible that I erred in specifying
 BT_UNKNOWN for the additional type argument.
 
 Hi David.
 
 Thank you for the proposed fix. I committed it with r247234.
 I'm not sure regarding BT_UNKNOWN value either. Well..  at least
 it should be no worse than it is now, and it should fix the build.
 I have not found any (regressive) changes in fsdb -d `blocks' output.
 
 -- 
 wbr,
 pluknet

Sorry, I am bad about keeping up with my mckus...@freebsd.org email.
I do need to watch it right after making commits. I also had no idea
that sbin/fsdb shared code with sbin/fsck_ffs. I really do need to
get back in the habit of buildworlds before doing any commits.

All that said, the changes that you have made are correct. The type
is only used for collecting statistics. So, if you do not know the
type, using BT_UNKNOWN is correct. If there is ever a desire to
collect type-of-I/O statistics in fsdb then that choice will need
to be revisited. But, I doubt that type-of-I/O statistics are ever
likely to be interesting in fsdb.

Kirk McKusick


Re: SU+J on 9.1-RC2 ISO

2012-12-05 Thread Kirk McKusick
 Date: Sun, 04 Nov 2012 21:13:36 +0900 (JST)
 To: freebsd-sta...@freebsd.org
 Subject: Re: SU+J on 9.1-RC2 ISO
 From: HATANO Tomomi hata...@infolab.ne.jp
 Cc: j...@koitsu.org, b.smee...@ose.nl, fnwhiteh...@freebsd.org,
 freebsd-current@freebsd.org
 
 Hi all.
 
 The point is:
 
 There is completely no way to take a snapshot of SU+J partition
 unless modify one's kernel.
 
 Whether some issue still exist or not,
 how about enabling snapshoting SU+J partition
 through sysctl variable?
 
 Would you mind to see patch attached?
 
 1. Taking a snapshot of SU+J partition is controlled through sysctl variable.
 
 2. Default to disable.
One who want to enable it should set the variable manually.
 
 3. The default value in bsdinstall(8) may be left as is.
 --
 HATANO Tomomi.
 
 --- src/sys/ufs/ffs/ffs_snapshot.c.orig   2012-11-04 11:01:58.0 +0900
 +++ src/sys/ufs/ffs/ffs_snapshot.c2012-11-04 11:13:32.0 +0900
 @@ -182,8 +182,10 @@
   */
  int dopersistence = 0;
  
 -#ifdef DEBUG
  #include <sys/sysctl.h>
 +int snapsuj = 0;
 +SYSCTL_INT(_debug, OID_AUTO, snapsuj, CTLFLAG_RW, &snapsuj, 0, "");
 +#ifdef DEBUG
  SYSCTL_INT(_debug, OID_AUTO, dopersistence, CTLFLAG_RW, &dopersistence, 0, "");
  static int snapdebug = 0;
  SYSCTL_INT(_debug, OID_AUTO, snapdebug, CTLFLAG_RW, &snapdebug, 0, "");
 @@ -230,7 +232,7 @@
* At the moment, journaled soft updates cannot support
* taking snapshots.
*/
 - if (MOUNTEDSUJ(mp)) {
 + if (MOUNTEDSUJ(mp) && (snapsuj == 0)) {
   vfs_mount_error(mp, "%s: Snapshots are not yet supported when "
   "running with journaled soft updates", fs->fs_fsmnt);
   return (EOPNOTSUPP);
 

Snapshots are disabled when using SU+J for a reason. That reason is
that the journal rollback when a snapshot is active on a filesystem
DOES NOT WORK. It leaves your filesystem with duplicate blocks that can
only be removed by manually running fsck and correcting the duplicate
block entries by hand. If you need to use snapshots, then run with SU
and not SU+J. When journal rollback properly handles snapshots, snapshots
on SU+J will be enabled.

Kirk McKusick


Re: FS hang when creating snapshots on a UFS SU+J setup

2012-01-10 Thread Kirk McKusick
 Date: Mon, 9 Jan 2012 18:30:51 +0100
 From: Yamagi Burmeister li...@yamagi.org
 To: j...@freebsd.org, mckus...@freebsd.org
 Cc: freebsd-current@freebsd.org, br...@bryce.net
 Subject: Re: FS hang when creating snapshots on a UFS SU+J setup
 
 Hello,
 
 I'm sorry to bother you, but you may not be aware of this thread and
 this problem. We are several people experiencing deadlocks, kernel
 panics and other problems when creating snapshots on file systems
 with SU+J. It would be nice to get some feedback, e.g. how can we
 help debugging and / or fixing this problem.
 
 Thank you,
 Yamagi

The first step in debugging is to find out if the problem is SU+J
specific. To find out, turn off SU+J but leave SU. This change
is done by running:

umount filesystem
tunefs -j disable filesystem
mount filesystem
cd filesystem
rm .sujournal

You may want to run `fsck -f' on the filesystem while you have
it unmounted just to be sure that it is clean. Then run your
snapshot request to see if it still fails. If it works, then
we have narrowed the problem down to something related to SU+J.
If it fails then we have a broader issue to deal with.

If you wish to go back to using SU+J after the test, you can
reenable SU+J by running:

umount filesystem
tunefs -j enable filesystem
mount filesystem

When responding to me, it is best to use my mckus...@mckusick.com
email as I tend to read it more regularly.

Kirk McKusick


Re: dogfooding over in clusteradm land

2011-12-28 Thread Kirk McKusick
Rather than changing BKVASIZE, I would try running the cvs2svn
conversion on a 16K/2K filesystem and see if that sorts out the
problem. If it does, it tells us that doubling the main block
size and reducing the number of buffers by half is the problem.
If that is the problem, then we will have to increase the KVM
allocated to the buffer cache.
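
The arithmetic behind that suspicion can be sketched as follows. This is
a deliberate simplification: it assumes a fixed amount of buffer-cache
KVM carved into block-sized buffers and that the comparison is between
16K and 32K blocks. The function name and the 64MB figure used below are
made up for illustration and are not taken from the kernel.

```c
#include <assert.h>
#include <stdint.h>

/* Number of whole buffers of bsize bytes that fit in kva_bytes of KVM. */
uint64_t
buffers_that_fit(uint64_t kva_bytes, uint32_t bsize)
{
	assert(bsize > 0);
	return (kva_bytes / bsize);
}
```

Under those assumptions, 64MB holds 4096 buffers of 16K but only 2048
buffers of 32K, so doubling the block size halves the number of buffers
unless the KVM allocation grows with it.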

Kirk McKusick


Re: panic: ffs_blkfree_cg: freeing free block

2011-10-29 Thread Kirk McKusick
 Date: Fri, 28 Oct 2011 11:16:59 +0200
 From: deeptec...@gmail.com deeptec...@gmail.com
 To: freebsd-current@freebsd.org
 Subject: panic: ffs_blkfree_cg: freeing free block
 
 A panic occurred while I was ``rm -rf''ing a large filedirectory tree
 (that I just created with untar) on an old drive that I have not used
 for a long time. Unfortunately I'm not 100% sure that the filesystem
 was clean when I mounted it today. Could that result in such a panic?
 
 I don't have the intermediate object files for the kernel; now I'm
 building the kernel again (from the appropriate, exact sources). That
 shouldn't harm debugging, should it? Meanwhile, I'll take any debug
 info requests, which I'll attempt to address shortly.

This panic happens when the free-block bitmap is corrupted. That can
happen due to:

1) An unclean filesystem being mounted (though you should get a warning
   when you attempt to do this).

2) Bit-rot on the disk that is not checked for before mounting. This is
   typically only an issue for a disk that has been offline for a long time.

3) Write errors to the disk.

There have been no changes to the code that manages the filesystem bitmaps
in nearly three decades, so a software cause of this panic is
unlikely to have been recently introduced. Hence, I would not spend a lot
of time trying to get a backtrace, etc.
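
For readers unfamiliar with the check that fires here, the following
userland sketch shows the general idea: a free map in which a set bit
means "free", and a release routine that refuses to free a block that is
already marked free. It assumes nothing about the real FFS structures;
struct freemap and blk_free() are hypothetical, and where this sketch
returns -1 the kernel panics instead.

```c
#include <assert.h>
#include <stdint.h>

#define NBLOCKS 64

/* Hypothetical in-memory free map: a set bit means the block is free. */
struct freemap {
	uint8_t bits[NBLOCKS / 8];
};

static int
is_free(const struct freemap *m, unsigned blk)
{
	return ((m->bits[blk / 8] >> (blk % 8)) & 1);
}

/*
 * Mark blk as free. Returns 0 on success, or -1 if the map already
 * says the block is free, which means the map and the caller disagree.
 */
int
blk_free(struct freemap *m, unsigned blk)
{
	assert(blk < NBLOCKS);
	if (is_free(m, blk))
		return (-1);
	m->bits[blk / 8] |= (uint8_t)(1u << (blk % 8));
	return (0);
}
```

A single flipped bit in the map, whether from an unclean mount, bit-rot
or a bad write, is enough to make a later and otherwise legitimate free
of that block trip the check.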

Kirk McKusick


Re: Nasty non-recursive lockmgr panic on softdep only enabled UFS partition when filesystem full

2011-05-04 Thread Kirk McKusick
 Date: Tue, 3 May 2011 22:40:26 -0700
 Subject: Nasty non-recursive lockmgr panic on softdep only enabled UFS
  partition when filesystem full
 From: Garrett Cooper yaneg...@gmail.com
 To: Jeff Roberson j...@freebsd.org,
 Marshall Kirk McKusick mckus...@mckusick.com
 Cc: FreeBSD Current freebsd-current@freebsd.org
 
 Hi Jeff and Dr. McKusick,
 Ran into this panic when /usr ran out of space doing a make
 universe on amd64/r221219 (it took ~15 minutes for the panic to occur
 after the filesystem ran out of space -- wasn't quite sure what it was
 doing at the time):
 
 ...
 
 Let me know what other commands you would like for me to run in kgdb.
 Thanks,
 -Garrett

You did not indicate whether you are running an 8.X system or a 9-current
system. It would be helpful to know that.

Jeff thinks that there may be a potential race in the locking code for
softdep_request_cleanup. If so, this patch for 9-current should fix it:

Index: ffs_softdep.c
===
--- ffs_softdep.c   (revision 221385)
+++ ffs_softdep.c   (working copy)
@@ -11380,7 +11380,8 @@
continue;
}
MNT_IUNLOCK(mp);
-   if (vget(lvp, LK_EXCLUSIVE | LK_INTERLOCK, curthread)) {
+   if (vget(lvp, LK_EXCLUSIVE | LK_NOWAIT | LK_INTERLOCK,
+   curthread)) {
MNT_ILOCK(mp);
continue;
}

If you are running an 8.X system, hopefully you will be able to apply it.
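
The effect of adding LK_NOWAIT can be illustrated outside the kernel. In
the sketch below, which uses C11 atomics rather than lockmgr and whose
names (struct item, cleanup_pass()) are hypothetical, the cleanup loop
tries to take each item without blocking and simply skips any item that
is already held, much as the patched loop continues when vget() fails
instead of sleeping on the vnode lock.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

struct item {
	atomic_flag busy;	/* set while some thread holds the item */
	int cleaned;
};

/* Non-blocking acquire: returns 1 if we got the item, 0 if it is held. */
static int
try_lock(struct item *it)
{
	return (!atomic_flag_test_and_set(&it->busy));
}

static void
unlock(struct item *it)
{
	atomic_flag_clear(&it->busy);
}

/* Clean every item we can take without waiting; return how many. */
int
cleanup_pass(struct item *items, size_t n)
{
	int done = 0;

	assert(items != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		if (!try_lock(&items[i]))
			continue;	/* held elsewhere: skip, do not sleep */
		items[i].cleaned = 1;
		done++;
		unlock(&items[i]);
	}
	return (done);
}
```

The trade-off is that a busy item is left for a later pass, in exchange
for the cleanup thread never sleeping on a lock while it may be holding
resources that the lock owner is waiting for.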

Kirk McKusick


Re: How a full fsck screwed up my SU+J filesystem

2010-12-01 Thread Kirk McKusick
 Date: Wed, 1 Dec 2010 16:27:48 +0200
 From: Kostik Belousov kostik...@gmail.com
 To: Peter Holm p...@freebsd.org
 Cc: Garrett Cooper yaneg...@gmail.com,
 Marshall Kirk McKusick mckus...@mckusick.com, curr...@freebsd.org
 Subject: Re: How a full fsck screwed up my SU+J filesystem
 
 On Wed, Dec 01, 2010 at 12:00:08PM +0100, Peter Holm wrote:
  On Wed, Dec 01, 2010 at 01:28:06AM -0800, Garrett Cooper wrote:
  
   So... I was doing a portmaster -af today because vlc stopped playing
   audio (for some reason ... I kind of went on a pkg_cutleaves rampage
   and probably deinstalled too much stuff), and the machine hardlocked
   during an upgrade. I did a soft reboot and saw messages along the
   lines of your journal and filesystem mount time mismatched; running
   a full fsck. I figured ok, sure... and let it do it's thing.
   Problem was that it pruned a lot of stuff from my /usr partition --
   including the .sujournal !!! So now it's stuck at Mounting local
   file systems: stating:
   
   Failed to find journal.   Use tunefs to create one
   Failed to start journal: 2
   
   (I assume the 2 means ENOENT). All of the above were printf(9)'s
   from the kernel.
   
   Now the machine won't continue in multiuser mode (doesn't respond
   to interrupts, no panic, etc). Going into ddb, I don't see anything
   in info_threads (just a bunch of references to sched_switch, a few
   to fork_trampoline, cpustop_handler, and kdb_enter). I'm going to
   try and massage the machine back to life from single user mode, but
   the fact that this died in this way (i.e. .sujournal getting nuked
   by a full fsck) is a bit disheartening for SU+J :(... It would be
   nice if at least the fsck aborted before going and nuking the
   journal :/... (or at the very least if the file wasn't removable --
   i.e. SF_NOUNLINK).
   
   Here's to hoping I can resuscitate the filesystem...
   
   Thanks,
   -Garrett
 
  Thank you for reporting this.
 
  I was able to reproduce the problem by:
 
  tunefs -j enable /dev/md5a
  mount /dev/md5a /mnt
  chflags 0 /mnt/.sujournal
  rm -f /mnt/.sujournal
  umount /mnt
  mount /dev/md5a /mnt
 
  The mount(1) is now stuck in mntref.
 
  http://people.freebsd.org/~pho/stress/log/kostik404.txt
 
  A sequence of tunefs -j disable + tunefs -j enable should get
  you going.
 
 The action is of the category do not do it then for sure.
 
 The problem in kostik404 is due to ffs_mount() did not cleaned up
 the vnodes instantiated during the mount. Activating softdep journal
 instantiates at least root vnode, and a journal vnode, if found. The
 following patch fixed it for me.
 
 diff --git a/sys/ufs/ffs/ffs_vfsops.c b/sys/ufs/ffs/ffs_vfsops.c
 index 94951e4..72f40da 100644
 --- a/sys/ufs/ffs/ffs_vfsops.c
 +++ b/sys/ufs/ffs/ffs_vfsops.c
 @@ -928,6 +928,7 @@ ffs_mountfs(devvp, mp, td)
   if ((fs->fs_flags & FS_DOSOFTDEP) &&
   (error = softdep_mount(devvp, mp, fs, cred)) != 0) {
   free(fs->fs_csp, M_UFSMNT);
 + ffs_flushfiles(mp, FORCECLOSE, td);
   goto out;
   }
   if (fs->fs_snapinum[0] != 0)
 

Thanks all: Garrett for the report, Peter for the way to reproduce
the problem, and Kostik for a fix. I have copied Jeff so that he can
confirm that Kostik's fix is the appropriate thing to do. And I will
take a look at fsck to see if I can make it a bit more paranoid about
removing .sujournal.

Kirk McKusick


HEADS UP: 64-bit quotas going in to head today

2010-05-06 Thread Kirk McKusick
Dag-Erling Smørgrav and I have been working on updating the
FFS quota system to support both traditional 32-bit and new 64-bit
quotas (for those of you who want to put 2+TB quotas on your users).

By default quotas are not compiled into the kernel. To include them
in your kernel configuration you need to specify:

options QUOTA   # Enable FFS quotas

If you are already running with the current 32-bit quotas, they
should continue to work just as they have in the past. If you
wish to convert to using 64-bit quotas, use `quotacheck -c 64';
if you wish to revert from 64-bit quotas back to 32-bit quotas,
use `quotacheck -c 32'.

There is a new library of functions to simplify the use of the
quota system; see `man quotafile' for details. If your application
currently uses quotactl(2) directly, it is highly recommended that
you convert it to use the quotafile interface.
Note that existing binaries will continue to work.

The new quota system has been heavily tested; however, wider use
inevitably finds new issues. If you encounter any problems with
quotas, please email me directly as well as posting on current, as
I all too often miss list email and emailing me directly will
ensure the quickest response.

Special thanks to John Kozubik of rsync.net for getting me
interested in pursuing 64-bit quota support and for funding
part of my development time on this project.

Kirk McKusick


Re: -current lockup (how to diagnose?)

2003-12-03 Thread Kirk McKusick
I am guessing that some of the recent locking changes are causing
the problem.  Unfortunately I am on the road now through Jan 4th,
so will not be in a position to look at it. Hopefully one of the
folks working on getting the SMP pushed down through the filesystem
(Jeff Roberson, John Baldwin, or Alan Cox) will have some idea what
broke recently. I would try looking at which process holds the
buffer lock that the find is trying to get. You can usually unravel
the chain of locks to eventually find what pair of events led to
the deadlock. It definitely helps to have DEBUG_LOCKS compiled into
your kernel.

Kirk McKusick


Re: panic: initiate_write_inodeblock_ufs2: already started

2003-12-03 Thread Kirk McKusick
This error happens if things are not properly locked. As per my
previous message, I am not able to look at it now, but am hoping
that resolving some of the other races will solve this as well.

Kirk McKusick


Re: HEADS-UP new statfs structure

2003-11-14 Thread Kirk McKusick
 Date: Fri, 14 Nov 2003 08:33:06 +
 From: Matt Smith [EMAIL PROTECTED]
 To: Marco Wertejuk [EMAIL PROTECTED]
 Cc: Kirk McKusick [EMAIL PROTECTED], [EMAIL PROTECTED]
 Subject: Re: HEADS-UP new statfs structure
 X-ASK-Info: Whitelist match
 
 Marco Wertejuk wrote:
  Just for a short note: cfsd (ports/security/cfs) should be 
  recompiled as well after those statfs changes.
  
 
 And mail/postfix and devel/gnomevfs2 (ones's i've found so far)
 
 postfix did this every time it received a mail until I recompiled it:
 
 pid 4049 (smtpd), uid 1003: exited on signal 11
 
 And gnomevfs was something I saw in another headsup. There are bound to 
 be others, I'm just keeping an eye on my /var/log/messages to see if 
 anything else sig 11 or 12's! So far so good though.
 
 Matt.

This is why we make this change now so that it will be in place
for the masses when 5.2 is released :-)

Kirk McKusick


HEADS-UP new statfs structure

2003-11-12 Thread Kirk McKusick
The statfs structure was updated on Nov 11th with 64-bit fields
to allow accurate reporting of multi-terabyte filesystem sizes.

You should build and boot a new kernel BEFORE doing a `make world'
as the new kernel will know about binaries using the old statfs
structure, but an old kernel will not know about the new system
calls that support the new statfs structure. Running an old kernel
after a `make world' will cause programs such as `df' that do a
statfs system call to fail with a bad system call.
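
To see why wider fields are needed for multi-terabyte filesystems, here
is a small worked check. It assumes, purely for illustration, a signed
32-bit block counter and 1K blocks; the actual field types and block
sizes of the old structure are not stated in this message, and
fits_in_32bit_count() is a hypothetical helper.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Returns 1 if a filesystem of `bytes` bytes, counted in blocks of
 * `bsize` bytes, can be represented in a signed 32-bit block counter.
 */
int
fits_in_32bit_count(uint64_t bytes, uint32_t bsize)
{
	assert(bsize > 0);
	return (bytes / bsize <= (uint64_t)INT32_MAX);
}
```

Under those assumptions a 1TB filesystem (2^30 blocks of 1K) still fits,
but a 2TB one (2^31 blocks) does not, which is the kind of size a 64-bit
field reports without truncation.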

Kirk McKusick


Re: runningbufspace related lock-ups with md(4)/UFS/SU (PATCH ?)

2003-11-03 Thread Kirk McKusick
 To: Kirk McKusick [EMAIL PROTECTED]
 From: Brian F. Feldman [EMAIL PROTECTED]
 Date: Thu, 23 Oct 2003 15:46:53 -0400
 Cc: [EMAIL PROTECTED]
 Subject: Re: runningbufspace related lock-ups with md(4)/UFS/SU (PATCH ?) 
 
 Kirk McKusick [EMAIL PROTECTED] wrote:
  I have been able to reproduce your hang on my system and your suggested
  fix does prevent it. I am going to run some more buffer starvation-type
  tests on it this week and if they do not cause other problems, I will
  put in your suggested fix.
 
 Thanks, Kirk; seems everyone who's been able to reproduce it can't do so 
 anymore when the synchers are disallowed from waiting on runningbufspace 
 (a couple extra people testing it that haven't spoken up on the list).
 
 -- 
 Brian Fundakowski Feldman   \'[ FreeBSD ]''\
[EMAIL PROTECTED]   \  The Power to Serve! \
  Opinions expressed are my own.   \,,\

I have put in your suggested patch to avoid the runningbufspace
related lock-ups with md(4)/UFS/SU.

Kirk McKusick

=-=-=-=-=-=

From: Kirk McKusick [EMAIL PROTECTED]
Date: Mon, 3 Nov 2003 22:30:01 -0800 (PST)
To: [EMAIL PROTECTED], [EMAIL PROTECTED], [EMAIL PROTECTED]
Subject: cvs commit: src/sys/kern vfs_bio.c
X-FreeBSD-CVS-Branch: HEAD
Sender: [EMAIL PROTECTED]

mckusick2003/11/03 22:30:01 PST

  FreeBSD src repository

  Modified files:
sys/kern vfs_bio.c 
  Log:
  Allow the bufdaemon and update daemon processes to skip the
  waitrunningbufspace() calls so that they are always able to
  proceed and clean up buffer space.
  
  Submitted by:   Brian Fundakowski Feldman [EMAIL PROTECTED]
  
  Revision  ChangesPath
  1.420 +9 -5  src/sys/kern/vfs_bio.c


Re: runningbufspace related lock-ups with md(4)/UFS/SU (PATCH ?)

2003-10-23 Thread Kirk McKusick
I have been able to reproduce your hang on my system and your suggested
fix does prevent it. I am going to run some more buffer starvation-type
tests on it this week and if they do not cause other problems, I will
put in your suggested fix.

Kirk McKusick

=-=-=-=-=-=

To: [EMAIL PROTECTED]
From: Brian Fundakowski Feldman [EMAIL PROTECTED]
Mime-Version: 1.0
Date: Thu, 16 Oct 2003 15:32:58 -0400
Cc: [EMAIL PROTECTED]
Subject: runningbufspace related lock-ups with md(4)/UFS/SU (PATCH ?)

I'm having problems where the entire system is locking up when using a MD 
UFS+SoftUpdates partition.  I can simply dd if=/dev/zero of=/mnt/foo and in 
a couple tries it will lock up.  When it locks up, buf_daemon (or if that is 
patched against, syncer) is calling waitrunningbufspace() from a non-B_ASYNC 
buf call.  Because of this, the md(4) (md0) thread is stuck in ufs 
waiting to receive a lock on the vnode that one of the syncer/flusher 
daemons has locked, waiting for bufspace to run down.  The user program 
causing the problem is still stuck in wdrain because it's also waiting for 
waitrunningbufspace() to return.  In short, everything wants to try to 
reduce the amount of outstanding buffer space, but nothing moves forward 
while GEOM/md(4)/what have you are waiting for the daemons to let go of the 
vnode so they can write out data.
Does this scenario make sense?  I have fixed it here using the following 
very simple patch, which disables the implicit waitrunningbufspace() calls
so the daemons can't get stuck there.

diff -r1.412 vfs_bio.c
73a74,75
> static struct proc *bufdaemonproc;
> 
889c891,893
<       waitrunningbufspace();
---
>       if (curthread->td_proc != bufdaemonproc &&
>           curthread->td_proc != updateproc)
>               waitrunningbufspace();
2038,2039d2041
< 
< static struct proc *bufdaemonproc;
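
The guard this patch adds can be sketched outside the kernel as a
simple predicate: every process waits for outstanding write I/O to
drain except the two flushing daemons, which must always be able to
make progress. This is only an illustration under stated assumptions;
struct proc here is a stand-in type and
should_wait_runningbufspace() is a hypothetical name, not a function
in vfs_bio.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for the kernel's struct proc; only pointer identity matters. */
struct proc {
	int p_pid;
};

/*
 * Return true if the caller should block until running buffer space
 * drains. The buffer daemon and the update (syncer) daemon are exempt,
 * because they are the processes that free the space others wait for.
 */
bool
should_wait_runningbufspace(const struct proc *cur,
    const struct proc *bufdaemonproc, const struct proc *updateproc)
{
	assert(cur != NULL);
	return (cur != bufdaemonproc && cur != updateproc);
}
```

An ordinary writer such as dd still waits under this rule; only the
daemons skip the wait, which breaks the cycle described above.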

-- 
Brian Fundakowski Feldman   \'[ FreeBSD ]''\
   [EMAIL PROTECTED]   \  The Power to Serve! \
 Opinions expressed are my own.   \,,\


Re: file system (UFS2) consistancy after -current crash? (fwd)

2003-10-03 Thread Kirk McKusick
Date: Fri, 03 Oct 2003 05:03:34 -0600
From: Aaron Wohl [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Subject: file system (UFS2) consistancy after -current crash?

After recent crashes I've been getting soft updates inconsistencies.
Directories in which a file has recently been renamed have neither
the old file nor the new file.  fsck -y recovers the inode and drops
it in lost+found.

I was under the impression that atomic rename() synced all the way
to the disk before returning?

Does softupdate enabled/disable have any bearing on this?

The disks themselves are a RAID5 on an Adaptec 5400S.  We have had
some problems recently with aac (the 5400s driver) related crashes
we have been working with Scott Long on.  I was wondering if maybe
rename is only syncing as far as the raid controller memory?

The problem that we have been having with many of the RAID
systems is that they give an I/O completion interrupt after
they copy the change into their memory, but before the I/O
is completed to the disk. Since the filesystem uses the I/O
completion interrupt as an indication that the change is on
disk, it proceeds to the next step. If the RAID ultimately
fails to get the data to the disk, inconsistencies arise.
This problem can arise whether or not soft updates are being
used, but because soft updates makes individual changes over 
a longer time period (potentially up to a minute rather than
the few milliseconds of 2-3 synchronous writes), it is more
likely to be apparent after a crash. None of this is helped by
a journalling filesystem as the RAID lies about writing the
log so you may not have it available to do a rollback after
a crash. As we discovered with IDE disks, disabling the write
cache enable feature causes a massive performance hit, so in
practice that does not seem like a viable strategy. What does
work is to use tag-queueing. Unfortunately tag-queueing is
found primarily in SCSI systems, though it is starting to
show up in the high-end IDE disks.

Kirk McKusick


mksnap_ffs, snapshot issues, again

2003-08-23 Thread Kirk McKusick
Robert Watson forwarded your posting to me as I am not as current
on current as I should be.

-- Forwarded message --
 Date: Mon, 18 Aug 2003 22:38:47 +0200
 From: Branko F. Gračnar [EMAIL PROTECTED]
 To: [EMAIL PROTECTED]
 Subject: mksnap_ffs, snapshot issues, again
 
 I have 900G array on a promise sx6000 controller
 
 This is freshly formatted filesystem (newfs -L export -O 2 -U -g 48000 -i 2048 -m 0 
 -o space /dev/pst0s2d)
 
 # df -i /export
 /dev/pst0s2d 778742004 216194 778525810   0%   2 4451592920%   /export
 
 # mount | grep export
 /dev/pst0s2d on /export (ufs, local, soft-updates)
 
 let's try to create a snapshot of empty filesystem
 
 # cd /export
 # mksnap_ffs /export aaa.snap
 
 ... after 30 minutes ... snapshot was not created (!!! On a empty
 filesystem !!!)... Ok, long snapshot creation would be fine if it
 would not hang all processes, which would like to do something on
 /export (ls /export for example.). Filesystem cannot be unmounted.
 mksnap_ffs process cannot be killed. Reboot and foreground fsck
 helps.

 This is 5.1-RELEASE (without patches, with a custom kernel: I just took the generic
 kernel and removed unneeded stuff).
 
 Any ideas why this is happening? As I mentioned before, this prevents background
 fsck from getting its job done (the machine hangs).
 
 
 I would really like to solve this issue
 
 Brane

 Discussion -

Paul Saab kindly arranged a machine (tank.freebsd.org) with a 2Tb
disk array on it for me to test. I enclose a copy of the `sysctl kern'
output at the end of this message. I first ran my own test which
involved creating a default configuration filesystem, taking a
snapshot, and removing the snapshot. The scripted result is below.
It shows that it takes 48 minutes to create the snapshot and 15
minutes to remove it. But importantly, it shows that the filesystem
is only locked down and inaccessible for 0.042 seconds of that 48
minutes. The problem is that the 77,000 indirect blocks needed by
the snapshot do not fit in the 300 kernel buffers allotted to it.
So, every indirect block needs to be read and written approximately
three times. Just to be sure that there was not something weird about
your configuration, I also ran the same set of tests using your
newfs parameters. Other than creating more cylinder groups the
result (e.g., running time) was about the same.

But, to get to the problem that you are having with accessing your
filesystem. The problem is that although the filesystem is only
locked briefly, the snapshot file is locked for the entire 48 minutes.
Thus, if you touch the snapshot file (by for example doing a stat
on it), then the process doing the stat will hang for 48 minutes.
The next process to try and touch the snapshot will lock /export
while it waits for the lock on the snapshot to clear. And at that
point you are hosed for 48 minutes on all access to /export :-(
So, I think that the best solution for you would be to try creating
a hidden directory for the snapshot file, e.g., create a /export/.snap
directory mode 700 owned by root, then create the snapshot as say
/export/.snap/snap1. This way, it will be out of the way of all
snoopy programs except those walking the filetree as root.

Kirk McKusick
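
The hidden-directory suggestion above can be sketched in C as follows.
This is a minimal illustration, not code from mksnap_ffs: the helper
name make_snap_dir() is hypothetical, and it only creates the
directory and sets its mode; running it as root (so that root owns the
directory) is left to the caller.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>

/*
 * Create <mountpoint>/.snap with mode 0700 and store its path in `path'.
 * Returns 0 on success (including when the directory already exists)
 * and -1 on error.
 */
int
make_snap_dir(const char *mountpoint, char *path, size_t pathlen)
{
	int n;

	assert(mountpoint != NULL && path != NULL);
	n = snprintf(path, pathlen, "%s/.snap", mountpoint);
	if (n < 0 || (size_t)n >= pathlen) {
		errno = ENAMETOOLONG;
		return (-1);
	}
	if (mkdir(path, 0700) == -1 && errno != EEXIST)
		return (-1);
	/* mkdir(2) applies the umask, so set the final mode explicitly. */
	return (chmod(path, 0700));
}
```

With the directory in place, the snapshot would then be created inside
it, for example as /export/.snap/snap1, so that unprivileged programs
walking /export never reach the locked snapshot file.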

 Results of my test -

Script started on Fri Aug 22 17:18:34 2003

tank# newfs /dev/twed0
/dev/twed0: 2097152.0MB (4294967292 sectors) block size 16384, fragment size 2048
using 11413 cylinder groups of 183.77MB, 11761 blks, 23552 inodes.
super-block backups (for fsck -b #) at:
 160, 376512, 752864, 1129216, 1505568, 1881920, 2258272, 2634624, 3010976,
 3387328, 3763680, 4140032, 4516384, 4892736, 5269088, 5645440, 6021792,
 6398144, 6774496, 7150848, 7527200, 7903552, 8279904, 8656256, 9032608,
 9408960, 9785312, 10161664, 10538016, 10914368, 11290720, 11667072, 12043424,
 12419776, 12796128, 13172480, 13548832, 13925184, 14301536, 14677888,
 15054240, 15430592, 15806944, 16183296, 16559648, 16936000, 17312352,
 17688704, 18065056, 18441408, 18817760, 19194112, 19570464, 19946816,
 20323168, 20699520, 21075872, 21452224, 21828576, 22204928, 22581280,

  etc, etc, etc 

 4283638624, 4284014976, 4284391328, 4284767680, 4285144032, 4285520384,
 4285896736, 4286273088, 4286649440, 4287025792, 4287402144, 4287778496,
 4288154848, 4288531200, 4288907552, 4289283904, 4289660256, 4290036608,
 4290412960, 4290789312, 4291165664, 4291542016, 4291918368, 4292294720,
 4292671072, 4293047424, 4293423776, 4293800128, 4294176480, 4294552832,
 4294929184

tank# dumpfs /dev/twed0 | head -22
magic   19540119 (UFS2) timeSat Aug 23 01:18:55 2003
superblock location 65536   id  [ 3f47236f d612c37d ]
ncg 11413   size1073741823  blocks  1039959213
bsize   16384   shift   14  mask0xc000
fsize   2048shift   11  mask0xf800
frag8   shift   3   fsbtodb 2
minfree 8%  optim   timesymlinklen 120

Re: mksnap_ffs, snapshot issues, again

2003-08-23 Thread Kirk McKusick
To: Kirk McKusick [EMAIL PROTECTED]
cc: Branko F. Gračnar [EMAIL PROTECTED],
Paul Saab [EMAIL PROTECTED],
Robert Watson [EMAIL PROTECTED],
[EMAIL PROTECTED]
Subject: Re: mksnap_ffs, snapshot issues, again 
From: Poul-Henning Kamp [EMAIL PROTECTED]
In-Reply-To: Your message of Sat, 23 Aug 2003 01:32:38 PDT.
Date: Sat, 23 Aug 2003 11:01:28 +0200
X-ASK-Info: Whitelist match

In message [EMAIL PROTECTED],
Kirk McKusick writes:

But, to get to the problem that you are having with accessing your
filesystem. The problem is that although the filesystem is only
locked briefly, the snapshot file is locked for the entire 48 minutes.
Thus, if you touch the snapshot file (by for example doing a stat
on it), then the process doing the stat will hang for 48 minutes.

Isn't there some way we can loosen this aspect up ?

Either by having stat know about it and return approximate info or
simply by failing ? (I presume that making the sleep interruptible
would break all sorts of standards)

-- 
Poul-Henning Kamp   | UNIX since Zilog Zeus 3.20
[EMAIL PROTECTED] | TCP/IP since RFC 956
FreeBSD committer   | BSD since 4.3-tahoe

The race to the root problem in general could be largely solved
by changing lookup (VOP_LOOKUP really) to release the lock that
it holds on the directory before blocking on the next component
in the case where it is doing a lookup without intent to create.
If we did this, then a single locked node would have lookups
pile up on itself, but could not cascade to the root. A related
change would be to do an interruptible locking request on the
node so that if one did an `ls -l foo' where foo was say a
locked snapshot, it would be possible to interrupt it.

~Kirk


Re: 4.8-RC / 5-CURRENT UFS1 interoperability problem

2003-03-16 Thread Kirk McKusick
Date: Thu, 6 Mar 2003 17:21:00 +0300 (MSK)
From: Maxim Konovalov [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Cc: [EMAIL PROTECTED]
Subject: 4.8-RC / 5-CURRENT UFS1 interoperability problem

Hello,

In short, there is a problem using the same UFS1 filesystem under
-stable and -current. Please look at an attached typescript for
details.

I also noticed wrong superblock information:

[EMAIL PROTECTED] ~]$ df /spare
Filesystem  1K-blocks    Used    Avail Capacity  Mounted on
/dev/ad0s2a  22520288 -125476 20844144    -1%    /spare

Is it a known bug?

-- 
Maxim Konovalov, [EMAIL PROTECTED], [EMAIL PROTECTED]

Executive summary: you need to run `fsck -f -p' whenever you switch
to or from a 4.X (stable) and a 5.X (current) kernel. The reason
is that the UFS1 superblock summary information is maintained in
different parts of the superblock on these two systems. Neither
system maintains the summary information used by the other. There
is no risk of trashing your filesystem if you fail to run the fsck,
but the information reported by `df' will be wrong until you run
the fsck.

Kirk McKusick

=-=-=-=-=-=-=

golf# uname -a
FreeBSD golf.macomnet.net 4.8-PRERELEASE FreeBSD 4.8-PRERELEASE #19: Thu Feb 27 
13:33:49 GMT 2003 [EMAIL PROTECTED]:/usr/obj/usr/src/sys/GENERIC  i386
golf# fsck /dev/ad0s2a
** /dev/ad0s2a
** Last Mounted on /mnt
** Phase 1 - Check Blocks and Sizes
** Phase 2 - Check Pathnames
** Phase 3 - Check Connectivity
** Phase 4 - Check Reference Counts
** Phase 5 - Check Cyl groups
3 files, 3 used, 5630069 free (21 frags, 703756 blocks, 0.0% fragmentation)
golf# mount /dev/ad0s2a /mnt
golf# mount | grep mnt
/dev/ad0s2a on /mnt (ufs, local, soft-updates)
golf# exit
exit

- clean reboot

golf# uname -a
FreeBSD golf.macomnet.net 5.0-CURRENT FreeBSD 5.0-CURRENT #6: Wed Feb 19 10:01:22 MSK 
2003 [EMAIL PROTECTED]:/usr/obj/usr/src/sys/GOLF5  i386
golf# fsck /dev/ad0s2a
** /dev/ad0s2a
** Last Mounted on /mnt
** Phase 1 - Check Blocks and Sizes
** Phase 2 - Check Pathnames
** Phase 3 - Check Connectivity
** Phase 4 - Check Reference Counts
** Phase 5 - Check Cyl groups
SUMMARY INFORMATION BAD
SALVAGE? [yn] y

SUMMARY BLK COUNT(S) WRONG IN SUPERBLK
SALVAGE? [yn] y

94155 files, 220230 used, 5409842 free (15210 frags, 674329 blocks, 0.3% fragmentation)

* FILE SYSTEM WAS MODIFIED *
golf# mount /dev/ad0s2a /mnt
golf# mount | grep mnt
/dev/ad0s2a on /mnt (ufs, local, nodev, noexec, nosuid, soft-updates)
golf# exit

To Unsubscribe: send mail to [EMAIL PROTECTED]
with unsubscribe freebsd-current in the body of the message


Re: BOOT2_UFS=UFS1_ONLY works for today's current

2003-02-23 Thread Kirk McKusick
From: David Syphers [EMAIL PROTECTED]
To: Kirk McKusick [EMAIL PROTECTED]
Subject: Re: BOOT2_UFS=UFS1_ONLY works for today's current
Date: Sun, 23 Feb 2003 14:49:52 -0600
Cc: [EMAIL PROTECTED]

On Sunday 23 February 2003 11:10 am, Richard Arends wrote:
 On Sun, 23 Feb 2003, David Syphers wrote:
  I added BOOT2_UFS=UFS2_ONLY to my make.conf, and my buildworld still
  dies in boot2. I'm trying to upgrade from a Feb. 19 -current
  (because it's crashing all the time, and I need to enable debugging
  stuff). Is there a fix, or would other information be helpful?

 Same problem over here. I reverted back the last commit on
 /usr/src/sys/ufs/ffs/fs.h in my source tree and that fixed the
 build. Of course, this is a workaround !!

Okay, I've verified that the problem is due to rev. 1.39 of
/usr/src/sys/ufs/ffs/fs.h. Peter Wemm pointed out that the problem
is not the commit, but gcc's bad handling of 64-bit operations.
Nonetheless, this commit does break world for a lot of people...
is there some official solution? The make.conf line only works for
UFS1 - if it's set to UFS2, buildworld still fails. (Am I correct
in assuming a 5.0-R install defaults to UFS2?)

-David

-- 
http://www.seektruth.org

Astronomy and Astrophysics Center
The University of Chicago

I have committed the following fix which reverts to using the
previous broken version of cgbase in ufsread.c. It will work fine
provided that your filesystem is smaller than 1.5Tb.

Kirk McKusick

Index: ufsread.c
===================================================================
RCS file: /usr/ncvs/src/sys/boot/common/ufsread.c,v
retrieving revision 1.9
diff -c -r1.9 ufsread.c
*** ufsread.c   2002/12/14 19:39:44 1.9
--- ufsread.c   2003/02/24 04:44:50
***************
*** 28,33 ****
--- 28,35 ----
  
  #include <ufs/ufs/dinode.h>
  #include <ufs/ffs/fs.h>
+ #undef cgbase
+ #define cgbase(fs, c)   ((ufs2_daddr_t)((fs)->fs_fpg * (c)))
  
  /*
   * We use 4k `virtual' blocks for filesystem data, whatever the actual
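
To show why this version of cgbase is limited to smaller filesystems:
in the macro above the cast is applied to the finished product, so the
multiplication itself happens at the width of its operands. The C
sketch below assumes both operands are 32-bit signed integers and
compares that against widening before the multiply. The function names
are hypothetical, and the sketch does not try to reproduce the exact
size limit quoted above.

```c
#include <assert.h>
#include <stdint.h>

/* Widen first, then multiply: the product cannot overflow 64 bits here. */
int64_t
cgbase_wide(int32_t fpg, int32_t cg)
{
	assert(fpg >= 0 && cg >= 0);
	return ((int64_t)fpg * (int64_t)cg);
}

/*
 * Would a 32-bit multiplication of the same operands still give the
 * right answer? It does only while the true product fits in 32 bits.
 */
int
cgbase_fits_32bit(int32_t fpg, int32_t cg)
{
	return (cgbase_wide(fpg, cg) <= (int64_t)INT32_MAX);
}
```

With 94088 fragments per group, cylinder group 11412 is still within
range, while a hypothetical group 30000 would need a fragment address
of 2822640000, which no longer fits.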



Re: Reboot(8) when fsck_ufs is running ?

2003-02-20 Thread Kirk McKusick
Date: Sat, 15 Feb 2003 00:50:01 +0100 (CET)
From: Martin Blapp [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Cc: Kirk McKusick [EMAIL PROTECTED]
Subject: Reboot(8) when fsck_ufs is running ?

Hi all,

I don't know what the behaviour should be, but when I try
to reboot a box which has fsck_ufs running, it doesn't
reboot and I have to powercycle it. It also looks like it
just hangs.

Do you experience the same on your side? Shouldn't we
abort the fsck_ufs and reboot?

Martin

Assuming that you are running fsck_ufs as part of a background
fsck, the problem is probably that the fsck_ufs is in the midst
of creating a snapshot. At the moment, snapshot creation is not
interruptible, so the reboot is waiting for it to finish. I am
presently investigating a bug which causes snapshots of filesystems
bigger than about 250Gb to hang the kernel due to buffer starvation.

Kirk McKusick




Re: strange dump/restore behaviour

2003-02-19 Thread Kirk McKusick
To: [EMAIL PROTECTED]
Cc: [EMAIL PROTECTED]
Subject: strange dump/restore behaviour
From: Dag-Erling Smorgrav [EMAIL PROTECTED]
Date: Thu, 09 Jan 2003 16:41:10 +0100

This happened while copying data over to a new disk (mounted on /mnt
and /mnt/usr; the original disk has only one partition).  The machine
was in single-user mode, but / was mounted read-write due to restore's
insistence on placing temporary files in /tmp (I found out later that
it respects TMPDIR, though the man page doesn't mention it).

root@dsa /mnt# dump -0Laf- / | restore -rf-
  DUMP: Date of this level 0 dump: Thu Jan  9 16:11:42 2003
  DUMP: Date of last level 0 dump: the epoch
  DUMP: Dumping snapshot of /dev/da0a (/) to standard output
  DUMP: mapping (Pass I) [regular files]
  DUMP: mapping (Pass II) [directories]
  DUMP: estimated 1838856 tape blocks.
  DUMP: dumping (Pass III) [directories]
  DUMP: dumping (Pass IV) [regular files]
warning: ./usr: File exists
expected next file 4, got 3
[...]

I can imagine that the file that caused the warning message was one of
restore's temporary files, but a) I've never seen this before, and b)
isn't -L supposed to prevent just that?

DES
-- 
Dag-Erling Smorgrav - [EMAIL PROTECTED]

Sorry for the slow response. I tend to get behind on my freebsd.org
email.

The warning comes about because you had already created /mnt/usr.
Since you were doing a full restore, you are getting a warning
that the usr directory already exists when restore tries to create
it. It complains again about finding an already existing inode (3,
which was presumably the usr directory in the original dump).
Neither of these is a problem, and neither affected your restore.

Kirk McKusick
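
The first warning corresponds to the ordinary EEXIST case of mkdir(2).
The C sketch below shows one way a program can treat an already
existing directory as a warning rather than a failure. It is not taken
from restore's source; create_dir_warn_if_exists() is a hypothetical
helper written only to illustrate the behaviour described above.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>

/*
 * Try to create a directory. Returns 1 if it was created, 0 if it
 * already existed (a warning is printed and processing can continue),
 * and -1 on any other error.
 */
int
create_dir_warn_if_exists(const char *path)
{
	assert(path != NULL);
	if (mkdir(path, 0755) == 0)
		return (1);
	if (errno == EEXIST) {
		fprintf(stderr, "warning: %s: %s\n", path, strerror(errno));
		return (0);
	}
	return (-1);
}
```

Calling it twice on the same path creates the directory the first time
and only warns the second time.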




Re: INVARIANTS-related fs panic on alpha

2003-02-14 Thread Kirk McKusick
I have tried running my test machine out of filesystem space
(repeatedly) and have not been able to get this panic. I will
keep running that test in the hopes that it will show up. In
the meantime, if you can come up with an example that reliably
triggers it, that would be most helpful.

Kirk McKusick

=-=-=-=-=-=

Date: Fri, 14 Feb 2003 15:54:13 -0800
From: Kris Kennaway [EMAIL PROTECTED]
To: Kris Kennaway [EMAIL PROTECTED]
Cc: [EMAIL PROTECTED], [EMAIL PROTECTED], [EMAIL PROTECTED],
   [EMAIL PROTECTED]
Subject: Re: INVARIANTS-related fs panic on alpha

On Sat, Jan 25, 2003 at 12:12:34AM -0800, Kris Kennaway wrote:
 One of the alpha package clients panicked with this.  It was under
 very high load at the time (25 simultaneous package builds):
 
 fatal kernel trap:
 
 trap entry = 0x2 (memory management fault)
 faulting va= 0xdeadc0dedeadc0e6
 type   = access violation
 cause  = store instruction
 pc = 0xfc53453c
 ra = 0xfc53b2a8
 sp = 0xfe001da15b30
 curthread  = 0xfc003e33b930
 pid = 3, comm = g_up
 
 Stopped at  add_to_worklist+0xac:   stq a0,0x8(t0) 0xdeadc0dedeadc0e6 a0=0xfc0035deb200,t0=0xdeadc0dedeadc0de
 db trace
 add_to_worklist() at add_to_worklist+0xac
 handle_written_inodeblock() at handle_written_inodeblock+0x5e8
 softdep_disk_write_complete() at softdep_disk_write_complete+0xac
 bufdone() at bufdone+0x19c
 bufdonebio() at bufdonebio+0x1c
 biodone() at biodone+0x28
 g_dev_done() at g_dev_done+0xd8
 biodone() at biodone+0x28
 g_io_schedule_up() at g_io_schedule_up+0x4c
 g_up_procbody() at g_up_procbody+0x9c
 fork_exit() at fork_exit+0x100
 exception_return() at exception_return
 --- root of call graph ---
 db

I'm still getting this (on i386 and alpha).  I believe it is related
to a filesystem becoming full.  Can someone please investigate?

Kris




Re: dump -L and privilege

2003-01-31 Thread Kirk McKusick
From: Jun Kuriyama [EMAIL PROTECTED]
To: Kirk McKusick [EMAIL PROTECTED]
Cc: [EMAIL PROTECTED]
Subject: Re: dump -L and privilege
In-Reply-To: [EMAIL PROTECTED]
X-ASK-Info: Whitelist match

Is this enough?

-r-sr-x---  1 root  operator  5750 Jan 31 22:13 mksnap_ffs

o Should use filesystem device name rather than mountpoint?
o Should use the group of device rather than operator?


-- 
Jun Kuriyama [EMAIL PROTECTED] // IMG SRC, Inc.
 [EMAIL PROTECTED] // FreeBSD Project

The mount command needs the mountpoint, not the device name, so
the device name would only be needed if we want to use the group
of the device rather than operator. I argue that we should use
operator rather than the group of the device because the purpose
of this command is to allow the dump program, run by people in 
the operator group, to take snapshots.

At any rate, I have cleaned up the program and provided a Makefile 
and manual page (see below). The only semantic change that I made
to your program was to do the `chown' before doing the `chmod' so
as not to open a brief hole that would allow members of the default
(wheel) group to get read access to the snapshot.

Kirk McKusick
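
The ordering argument above (change the group before widening the
mode) can be sketched in C as follows. This is an illustration under
stated assumptions and not the code in the archive below:
publish_snapshot() is a hypothetical helper, and the target group is
passed in by the caller rather than looked up by name.

```c
#include <assert.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Make a file readable by its owner and one group. The group is
 * changed first, so the file is never group-readable while it still
 * belongs to its original (default) group.
 */
int
publish_snapshot(const char *path, gid_t group)
{
	assert(path != NULL);
	/* Step 1: change only the group; (uid_t)-1 leaves the owner as is. */
	if (chown(path, (uid_t)-1, group) == -1)
		return (-1);
	/* Step 2: only now grant read access to that group (mode 0440). */
	return (chmod(path, S_IRUSR | S_IRGRP));
}
```

Doing the two calls in the opposite order would leave a short window
in which members of the original group could open the file.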

# This is a shell archive.  Save it in a file, remove anything before
# this line, and then unpack it by entering sh file.  Note, it may
# create directories; files and directories will be owned by you and
# have default permissions.
#
# This archive contains:
#
#   mksnap_ffs/Makefile
#   mksnap_ffs/mksnap_ffs.8
#   mksnap_ffs/mksnap_ffs.c
#
mkdir mksnap_ffs
echo x - mksnap_ffs/Makefile
sed 's/^X//' >mksnap_ffs/Makefile << 'END-of-mksnap_ffs/Makefile'
X# $FreeBSD$
X
XPROG=  mksnap_ffs
XMAN=   mksnap_ffs.8
X
X.if defined(NOSUID)
XBINMODE=550
X.else
XBINMODE=4550
XBINOWN=root
X.endif
XBINGRP=operator
X
X.include <bsd.prog.mk>
END-of-mksnap_ffs/Makefile
echo x - mksnap_ffs/mksnap_ffs.8
sed 's/^X//' >mksnap_ffs/mksnap_ffs.8 << 'END-of-mksnap_ffs/mksnap_ffs.8'
X.\"
X.\" Copyright (c) 2003 Networks Associates Technology, Inc.
X.\" All rights reserved.
X.\"
X.\" This software was developed for the FreeBSD Project by Marshall
X.\" Kirk McKusick and Network Associates Laboratories, the Security
X.\" Research Division of Network Associates, Inc. under DARPA/SPAWAR
X.\" contract N66001-01-C-8035 ("CBOSS"), as part of the DARPA CHATS
X.\" research program.
X.\"
X.\" Redistribution and use in source and binary forms, with or without
X.\" modification, are permitted provided that the following conditions
X.\" are met:
X.\" 1. Redistributions of source code must retain the above copyright
X.\"    notice, this list of conditions and the following disclaimer.
X.\" 2. Redistributions in binary form must reproduce the above copyright
X.\"    notice, this list of conditions and the following disclaimer in the
X.\"    documentation and/or other materials provided with the distribution.
X.\" 3. The names of the authors may not be used to endorse or promote
X.\"    products derived from this software without specific prior written
X.\"    permission.
X.\"
X.\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
X.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
X.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
X.\" ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
X.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
X.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
X.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
X.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
X.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
X.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
X.\" SUCH DAMAGE.
X.\"
X.\" $FreeBSD$
X.\"
X.Dd January 19, 2003
X.Dt MKSNAP_FFS 8
X.Os
X.Sh NAME
X.Nm mksnap_ffs
X.Nd take a filesystem snapshot
X.Sh SYNOPSIS
X.Nm
X.Ar mountpoint
X.Ar snapshot_name
X.Sh DESCRIPTION
XThe
X.Nm
Xcommand creates a snapshot named
X.Ar snapshot_name
Xon the filesystem mounted at
X.Ar mountpoint .
XThe
X.Ar snapshot_name
Xargument must be contained within the filesystem mounted at
X.Ar mountpoint .
X.Pp
XThe group ownership of the file is set to
X.Dq operator ;
Xthe owner of the file remains
X.Dq root .
XThe mode of the snapshot is set to be readable by the owner
Xor members of the
X.Dq operator
Xgroup.
X.Sh SEE ALSO
X.Xr chmod 2 ,
X.Xr chown 8 ,
X.Xr mount_ffs 8
X.Sh HISTORY
XA
X.Nm
Xcommand first appeared in
X.Fx 5.0 .
END-of-mksnap_ffs/mksnap_ffs.8
echo x - mksnap_ffs/mksnap_ffs.c
sed 's/^X//' >mksnap_ffs/mksnap_ffs.c << 'END-of-mksnap_ffs/mksnap_ffs.c'
X/*
X * Copyright (c) 2003 Networks Associates Technology, Inc.
X * All rights reserved.
X *
X * This software was developed for the FreeBSD Project by Marshall
X * Kirk McKusick

Re: dump -L and privilege

2003-01-30 Thread Kirk McKusick
Date: Fri, 31 Jan 2003 02:24:00 +0200
From: Giorgos Keramidas [EMAIL PROTECTED]
To: Garrett Wollman [EMAIL PROTECTED]
Cc: Kirk McKusick [EMAIL PROTECTED],
[EMAIL PROTECTED]
Subject: Re: dump -L and privilege
X-ASK-Info: Confirmed by User

On 2003-01-30 15:52, Garrett Wollman [EMAIL PROTECTED] wrote:
 On Wed, 29 Jan 2003 18:17:31 -0800,
   Kirk McKusick [EMAIL PROTECTED] said:
  The other alternative would be to
  create a setuid-to-root program that would take a snapshot and
  chown it to the user that does dumps.

 I think this would actually be a useful feature for more than just
 dumps.  I might want to allow some users (say, those in group
 `operator') to be able to create snapshots on their own, without
 allowing arbitrary mounting privileges.

Do normal permissions apply for the files included in a snapshot?

It would be horrible from a security standpoint if any user could use
a setuid program to snapshot filesystems, mount the snapshot to places
of their own, and read random files from the mounted snapshot.

/knee jerk reaction

- Giorgos

By default snapshots are mode 400 owned by root, so normal users
cannot access them. The setuid program is proposing to make them
mode 440 group operator which would let anyone in the operator
group read them. This is the same level of permission given to
disks, so is neither more nor less secure than regular disks.
If the snapshot is mounted, then the same filesystem permissions
are enforced as would be enforced for the mounted disk except
that the mount must be done read-only, so nothing in the snapshot
can be moved, deleted, or changed.

Kirk McKusick
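
The difference between mode 400 and mode 440 described above can be
checked with a simplified version of the classic Unix read-permission
test. The C sketch below is an assumption-laden illustration: it
ignores the superuser override, supplementary groups, and ACLs;
may_read() is a hypothetical helper; and the numeric IDs in the usage
note are made up.

```c
#include <assert.h>
#include <stdbool.h>
#include <sys/stat.h>
#include <sys/types.h>

/*
 * Simplified read check: the owner is judged by the owner bit, a
 * member of the file's group by the group bit, everyone else by the
 * other bit. The first matching class decides the outcome.
 */
bool
may_read(mode_t mode, uid_t owner, gid_t group, uid_t uid, gid_t gid)
{
	if (uid == owner)
		return ((mode & S_IRUSR) != 0);
	if (gid == group)
		return ((mode & S_IRGRP) != 0);
	return ((mode & S_IROTH) != 0);
}
```

For a user 1001 whose group is 5 (standing in for operator), a
root-owned file with mode 0400 is unreadable, while the same file with
mode 0440 and group 5 is readable; a user outside that group is
refused in both cases.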




Re: dump -L and privilege

2003-01-29 Thread Kirk McKusick
Date: Fri, 17 Jan 2003 09:08:09 +0900
From: Jun Kuriyama [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Cc: Current [EMAIL PROTECTED]
Subject: dump -L and privilege
X-ASK-Info: Confirmed by User

I'm trying to use dump -L option to dump with snapshot on
-current/RELENG_5_0 family.

I found dump -L needs writable permission to the device (that's
reasonable because it *writes* snapshot file).  But when I try to dump
by operator group, it's impossible to dump with -L option (target
device has root:operator and crw-r-----).

This behavior is understandable.  But in actual backup operations,
what should we do?  I'd like to hear what you thought in design.

(1) Do dump as root with -L option.
(2) Do chmod g+w for device.
(3) Other ideas?


-- 
Jun Kuriyama [EMAIL PROTECTED] // IMG SRC, Inc.
 [EMAIL PROTECTED] // FreeBSD Project

Sorry for the slow reply. I am just back from several weeks of
travel and am trying to get caught up on my email.

You have raised an important point here. By default (that is when
vfs.usermount == 0) only root is allowed to do mounts. Since dump -L
needs to do a snapshot, that can only be done by a root process.
I see two possible solutions to the problem. The first would be
to change the default to vfs.usermount == 1 and then have dump -L
create the snapshot in a directory owned by operator (or by
whatever user runs the dumps). Then the snapshot could be created,
used, and deleted by that user. The other alternative would be to
create a setuid-to-root program that would take a snapshot and
chown it to the user that does dumps. This setuid program could
then be invoked by dump -L to create a snapshot for it. I favor
the first approach, but there may be security issues of which
I am unaware that make that a bad choice. Perhaps we could get
someone like Robert Watson to comment on these choices.

Kirk McKusick




Re: panic: ffs_blkfree: freeing free block

2002-12-18 Thread Kirk McKusick
I corrected a botched patch last night. Make sure that you are running
with version 1.56 2002/12/18 07:19:41 of ufs/ffs/ffs_snapshot.c.

Kirk McKusick

=-=-=-=-=

Date: Wed, 18 Dec 2002 11:43:25 +0100
From: Aurelien Nephtali [EMAIL PROTECTED]
To: Kirk McKusick [EMAIL PROTECTED]
Cc: [EMAIL PROTECTED]
Subject: Re: panic: ffs_blkfree: freeing free block
X-ASK-Info: Confirmed by User

 I introduced a bug to snapshots on 11/30/02 which did not get fixed
 until 12/15/02 which caused background fsck to (silently) fail to fix
 certain filesystem problems. If you ran background fsck on a system
 between 11/30 and 12/15 and then ran background fsck again on a system
 after that date, the earlier missed corruption causes the panic that
 you have seen. Once fixed on a post 12/15 system, it should not recur.
 You can avoid the panic by running `fsck -f -p' on all your system
 after upgrading to a post 12/15 system. If you find continued
 evidence of trouble after following the above procedures, please
 send me mail.

   Kirk McKusick

I rebuilt a brand-new system and the problem is still there :/.

uname -a:
FreeBSD nebula.wanadoo.fr 5.0-CURRENT FreeBSD 5.0-CURRENT #0: Wed Dec 18 10:45:30 CET 
2002 [EMAIL PROTECTED]:/usr/obj/usr/src/sys/NEBULA  i386

I've also attached a new dump which matches the new system.

-- Aurelien

Script started on Wed Dec 18 11:36:02 2002
nebula# gdb -k
GNU gdb 5.2.1 (FreeBSD)
Copyright 2002 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type show copying to see the conditions.
There is absolutely no warranty for GDB.  Type show warranty for details.
This GDB was configured as i386-undermydesk-freebsd.
(kgdb) symbol-file kernel.debug.7
Reading symbols from kernel.debug.7...done.
(kgdb) exec-file kernel.7
(kgdb) core-file vmcore.7
panic: from debugger
panic messages:
---
panic: ffs_blkfree: freeing free block
panic: from debugger
Uptime: 50s
Dumping 123 MB
ata0: resetting devices ..
done
 16 32 48 64 80 96 112
---
#0  doadump () at /usr/src/sys/kern/kern_shutdown.c:232
232 dumping++;
(kgdb) bt
#0  doadump () at /usr/src/sys/kern/kern_shutdown.c:232
#1  0xc021c37e in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:364
#2  0xc021c5c3 in panic () at /usr/src/sys/kern/kern_shutdown.c:517
#3  0xc013c212 in db_panic () at /usr/src/sys/ddb/db_command.c:450
#4  0xc013c192 in db_command (last_cmdp=0xc03a0a00, cmd_table=0x0,
    aux_cmd_tablep=0xc039b53c, aux_cmd_tablep_end=0xc039b540)
    at /usr/src/sys/ddb/db_command.c:346
#5  0xc013c2a6 in db_command_loop () at /usr/src/sys/ddb/db_command.c:472
#6  0xc013ef9a in db_trap (type=3, code=0) at /usr/src/sys/ddb/db_trap.c:72
#7  0xc0335d42 in kdb_trap (type=3, code=0, regs=0xc850da4c)
    at /usr/src/sys/i386/i386/db_interface.c:166
#8  0xc0346b2f in trap (frame=
      {tf_fs = 24, tf_es = 16, tf_ds = 16, tf_edi = -1061658176,
       tf_esi = 256, tf_ebp = -934225256, tf_isp = -934225288, tf_ebx = 0,
       tf_edx = 0, tf_ecx = -1069390144, tf_eax = 18, tf_trapno = 3,
       tf_err = 0, tf_eip = -1070374940, tf_cs = 8, tf_eflags = 646,
       tf_esp = -1069984152, tf_ss = -1070066283})
    at /usr/src/sys/i386/i386/trap.c:603
#9  0xc0337558 in calltrap () at {standard input}:98
#10 0xc021c5ab in panic (fmt=3D0x0) at /usr/src/sys/kern/kern_shutdown.c:50=
3=0D
#11 0xc02d8f0a in ffs_blkfree (fs=3D0xc18f3000, devvp=3D0xc191dce4, bno=3D1=
088, =0D
size=3D16384, inum=3D1088) at /usr/src/sys/ufs/ffs/ffs_alloc.c:1771=0D
#12 0xc02e843f in indir_trunc (freeblks=3D0xc1b37500, dbn=3D4288, level=3D0=
, lbn=3D12, =0D
countp=3D0xc850dc10) at /usr/src/sys/ufs/ffs/ffs_softdep.c:2600=0D
#13 0xc02e7ee5 in handle_workitem_freeblocks (freeblks=3D0xc1b37500, flags=
=3D0)=0D
---Type return to continue, or q return to quit---=0D
at /usr/src/sys/ufs/ffs/ffs_softdep.c:2466=0D
#14 0xc02e510a in process_worklist_item (matchmnt=3D0x0, flags=3D0)=0D
at /usr/src/sys/ufs/ffs/ffs_softdep.c:742=0D
#15 0xc02e4e50 in softdep_process_worklist (matchmnt=3D0x0)=0D
at /usr/src/sys/ufs/ffs/ffs_softdep.c:621=0D
#16 0xc026f89e in sched_sync () at /usr/src/sys/kern/vfs_subr.c:1751=0D
#17 0xc0208c64 in fork_exit (callout=3D0xc026f5b0 sched_sync, arg=3D0x0, =
=0D
frame=3D0x0) at /usr/src/sys/kern/kern_fork.c:872=0D
(kgdb) quit=0D
=0D=1B[m=1B[27m=1B[Jnebula# =1B[K=1B[61C/var/crash=1B[71De=08exit=0D=0D

Script done on Wed Dec 18 11:36:32 2002
=0D

--LQksG6bCIzRHxTLp--

--BwCQnh7xodEAoBMC
Content-Type: application/pgp-signature

Re: panic: ffs_blkfree: freeing free block

2002-12-18 Thread Kirk McKusick
I have managed to panic my system on a hard reboot and now believe
that I have found the problem on which you are faulting. I have
checked in a fix to the head of the tree (sys/ufs/ffs/ffs_snapshot.c
version 1.57). Let me know if it fixes your problem.

Kirk McKusick




Re: Data corruption in soft updates?

2002-12-17 Thread Kirk McKusick
Please send me a `dumpfs /usr | head -50' output of the filesystem
under the current system. Then clean it up with fsck and run the
same command again. Finally, boot up under the old kernel and
get the output both before and after fsck cleaning. What I am
looking for is changes in the reported size of the filesystem
because that size getting out of sync is what is causing these problems.
The basic deal is that the old UFS1 superblock stored the filesystem
size in a 32-bit field. The new UFS1 superblock stores the filesystem
size in a new (previously unused) 64-bit field. When you mount a
UFS1 filesystem on a new kernel, it copies the 32-bit size field
to the 64-bit field. At that point the filesystem size is in both
places and should work equally well on old or new kernels. However,
it does not update the 64-bit size field on any of the alternate
superblocks. So, somehow, your using an alternate and copying it into
the standard location is losing the update done for the size field.
I am not sure how that is happening, but I am hoping to catch where
in your work with the alternates it happens so that I can cover
that hole.
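
The size-field handling described above can be sketched as follows. This is
only an illustration under stated assumptions: the structure holds nothing
but the two size fields, and the names `sketch_sb`, `old_size`, `size`,
`sketch_upgrade_size` and `sketch_sizes_in_sync` are hypothetical, not the
kernel's actual definitions.

```c
#include <stdint.h>

/* Hypothetical, simplified superblock: only the two size fields. */
struct sketch_sb {
	int32_t old_size;	/* 32-bit size read by old kernels */
	int64_t size;		/* 64-bit size read by new kernels (0 if never set) */
};

/*
 * Model of the mount-time step for UFS1 on a new kernel: copy the
 * 32-bit size into the 64-bit field so both kernels see the same size.
 */
static void
sketch_upgrade_size(struct sketch_sb *sb)
{
	sb->size = sb->old_size;
}

/* Returns 1 if the two size fields agree, 0 if they are out of sync. */
static int
sketch_sizes_in_sync(const struct sketch_sb *sb)
{
	return (sb->size == (int64_t)sb->old_size);
}
```

In this model an alternate superblock that was never upgraded still has a
zero 64-bit size, so copying it over the primary would discard the update.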

Kirk McKusick

=-=-=-=-=-=

Date: Tue, 17 Dec 2002 12:14:12 -0800 (PST)
From: Nate Lawson [EMAIL PROTECTED]
To: Kirk McKusick [EMAIL PROTECTED]
cc: [EMAIL PROTECTED]
Subject: Re: Data corruption in soft updates? 
In-Reply-To: [EMAIL PROTECTED]
X-ASK-Info: Whitelist match

On Mon, 9 Dec 2002, Kirk McKusick wrote:
 It appears that you are getting all those errors (BAD block)
 because fsck thinks that your filesystem is smaller than it
 really is. If you do a dumpfs on the filesystem and check
 the size (about line 5), I expect that you will find that
 all those bad blocks exceed that size. It might be interesting
 to check one or more of the alternate blocks to see if they
 have a different size. If so, using an alternate should help.
 If not, then the question is why all those out of range blocks 
 were allocated.

I booted an older kernel (Dec. 4) and ran fsck_ffs -b 32.  It repaired a
few simple errors (summary info bad).  I then copied the alt sblock to the
default location with dd.  I reran fsck to make sure the sblock was copied
correctly and it came up clean.  Everything was fine.

I rebooted into multiuser with the old kernel and everything worked fine.  
I did a full buildkernel with srcs as of yesterday at 5 pm without any bad
block messages.  But after rebooting with that new kernel, it tried to
correct the sblockloc again and my system started having the same problem
again.  uname and dmesg is below.

-Nate

FreeBSD 5.0-CURRENT #1: Mon Dec 16 18:05:56 PST 2002

/: correcting fs_sblockloc from 4 to 8192
bad block 1553167, ino 386832
/usr: optimization changed from TIME to SPACE
bad block 1553152, ino 387421
pid 42 (syncer), uid 0 inumber 387421 on /usr: bad block
bad block 1551181, ino 383169
pid 42 (syncer), uid 0 inumber 383169 on /usr: bad block
bad block 1632087, ino 383281
pid 42 (syncer), uid 0 inumber 383281 on /usr: bad block
bad block 1616355, ino 383200
pid 42 (syncer), uid 0 inumber 383200 on /usr: bad block
bad block 1623472, ino 383200
pid 42 (syncer), uid 0 inumber 383200 on /usr: bad block
bad block 1551227, ino 383200
pid 42 (syncer), uid 0 inumber 383200 on /usr: bad block
bad block 1552592, ino 383200
pid 42 (syncer), uid 0 inumber 383200 on /usr: bad block
bad block 1555160, ino 383200
pid 42 (syncer), uid 0 inumber 383200 on /usr: bad block
bad block 1555208, ino 383200
pid 42 (syncer), uid 0 inumber 383200 on /usr: bad block
bad block 1550776, ino 383200
pid 42 (syncer), uid 0 inumber 383200 on /usr: bad block
bad block 1551208, ino 383198
pid 42 (syncer), uid 0 inumber 383198 on /usr: bad block
bad block 1551209, ino 383241
pid 42 (syncer), uid 0 inumber 383241 on /usr: bad block
bad block 1553153, ino 387219
pid 42 (syncer), uid 0 inumber 387219 on /usr: bad block
bad block 1552704, ino 389415
pid 42 (syncer), uid 0 inumber 389415 on /usr: bad block
bad block 1552707, ino 390100
pid 42 (syncer), uid 0 inumber 390100 on /usr: bad block
bad block 1639665, ino 391119
pid 42 (syncer), uid 0 inumber 391119 on /usr: bad block
bad block 1553170, ino 39
pid 42 (syncer), uid 0 inumber 39 on /usr: bad block
bad block 1553431, ino 391118
pid 42 (syncer), uid 0 inumber 391118 on /usr: bad block
bad block 1553405, ino 391122
pid 42 (syncer), uid 0 inumber 391122 on /usr: bad block





Re: panic: ffs_blkfree: freeing free block

2002-12-17 Thread Kirk McKusick
 Date: Mon, 16 Dec 2002 22:42:07 -0600
 From: Dan Nelson [EMAIL PROTECTED]
 To: Aurelien Nephtali [EMAIL PROTECTED]
 Cc: [EMAIL PROTECTED]
 Subject: Re: panic: ffs_blkfree: freeing free block
 
 In the last episode (Dec 16), Aurelien Nephtali said:
  Hi,
  
  I got a panic today which occurred during a background fsck, after a
  hard-reboot of the system. The dump from gdb is attached and I can,
  of course, provide more info if needed.
 
 Me too.  My info attached as well; almost identical stack trace.
 Kernel was built from sources cvsupped just after 2002/12/15 17:41:07
 PST.  (Why in the heck are all the timestamps in commitlogs in PST??)
 
 -- 
   Dan Nelson
   [EMAIL PROTECTED]

I introduced a bug to snapshots on 11/30/02 which did not get fixed 
until 12/15/02 which caused background fsck to (silently) fail to fix
certain filesystem problems. If you ran background fsck on a system
between 11/30 and 12/15 and then ran background fsck again on a system
after that date, the earlier missed corruption causes the panic that
you have seen. Once fixed on a post 12/15 system, it should not recur.
You can avoid the panic by running `fsck -f -p' on all your filesystems
after upgrading to a post 12/15 system. If you find continued
evidence of trouble after following the above procedures, please
send me mail.

Kirk McKusick




Re: backgroud fsck is still locking up system (fwd)

2002-12-17 Thread Kirk McKusick
Date: Mon, 9 Dec 2002 11:19:13 -0800
From: Brooks Davis [EMAIL PROTECTED]
To: Kirk McKusick [EMAIL PROTECTED]
Cc: Brooks Davis [EMAIL PROTECTED], Nate Lawson [EMAIL PROTECTED],
   Archie Cobbs [EMAIL PROTECTED], [EMAIL PROTECTED]
Subject: Re: backgroud fsck is still locking up system (fwd)

On Fri, Dec 06, 2002 at 05:52:38PM -0800, Kirk McKusick wrote:
 Adding a two minute delay before starting background fsck
 sounds like a very good idea to me. Please send me your
 suggested change.

Here it is.  As written it doesn't add the delay, but you can change
etc/defaults/rc.conf to do that if desired.

-- Brooks

I have added your suggested change to -current (6.0). I decided to
set the default startup delay to sixty seconds as that seems to be
enough time to let the initial system startup settle down. If this
change proves to be popular, it can be considered for MFC'ing to 5.0.

Kirk McKusick

=-=-=-=-=-=

From: Kirk McKusick [EMAIL PROTECTED]
Date: Tue, 17 Dec 2002 23:21:31 -0800 (PST)
To: [EMAIL PROTECTED], [EMAIL PROTECTED]
Subject: cvs commit: src/etc rc src/etc/defaults rc.conf src/etc/rc.d
 bgfsck src/share/man/man5 rc.conf.5
X-FreeBSD-CVS-Branch: HEAD

mckusick    2002/12/17 23:21:31 PST

  Modified files:
    etc                  rc
    etc/defaults         rc.conf
    etc/rc.d             bgfsck
    share/man/man5       rc.conf.5
  Log:
  Delay an optional amount of time after booting before starting a
  background fsck. The delay defaults to sixty seconds to allow
  large applications such as the X server to start before disk I/O
  bandwidth is monopolized by fsck.

  Submitted by:   Brooks Davis [EMAIL PROTECTED]
  Sponsored by:   DARPA & NAI Labs.

  Revision  Changes    Path
  1.165     +1 -0      src/etc/defaults/rc.conf
  1.324     +8 -2      src/etc/rc
  1.3       +13 -2     src/etc/rc.d/bgfsck
  1.168     +5 -0      src/share/man/man5/rc.conf.5




Re: Panic with recent CURRENT (1 hour ago)

2002-12-15 Thread Kirk McKusick
There was a problem with snapshots that led to incomplete checking
by background fsck which in turn could lead to the problem that you
were seeing (i.e., repeated failures until fsck was run manually).
This problem was fixed with version 1.54 of ufs/ffs/ffs_snapshot.c
which was checked in on Dec 14, 2002. Please verify that you are
running with this version. If you still see this problem after
updating to it, please contact me directly so I can try to work out
more of the details.

Kirk McKusick

=-=-=-=-=-=

Date: Sat, 14 Dec 2002 21:47:20 +0100 (CET)
From: Martin Blapp [EMAIL PROTECTED]
To: Kirk McKusick [EMAIL PROTECTED]
cc: [EMAIL PROTECTED]
Subject: Panic with recent CURRENT (1 hour ago)
X-ASK-Info: Confirmed by User

Hi Kirk,

Panic message was: Block already free. I had to fsck -y manually,
but nothing special was found and fixed. The machine rebooted over
and over and always panicked at the same place. This shouldn't happen,
I guess.

#10 0xc02f055b in panic (fmt=0x0) at /usr/src/sys/kern/kern_shutdown.c:503
#11 0xc03f9a8a in ffs_blkfree (fs=0xcc27b800, devvp=0xcc284384, bno=370040,
size=16384, inum=1400)
at /usr/src/sys/ufs/ffs/ffs_alloc.c:1771
#12 0xc0408fcf in indir_trunc (freeblks=0xcc586500, dbn=1481056, level=0,
lbn=45068, countp=0xe6928c10)
at /usr/src/sys/ufs/ffs/ffs_softdep.c:2603
#13 0xc0408f94 in indir_trunc (freeblks=0xcc586500, dbn=1480064, level=1,
lbn=4108, countp=0xe6928c10)
at /usr/src/sys/ufs/ffs/ffs_softdep.c:2599
#14 0xc0408a75 in handle_workitem_freeblocks (freeblks=0xcc586500, flags=0) at
/usr/src/sys/ufs/ffs/ffs_softdep.c:2469
#15 0xc0405c9a in process_worklist_item (matchmnt=0x0, flags=0) at
/usr/src/sys/ufs/ffs/ffs_softdep.c:745
#16 0xc04059e0 in softdep_process_worklist (matchmnt=0x0) at
/usr/src/sys/ufs/ffs/ffs_softdep.c:624
#17 0xc034337e in sched_sync () at /usr/src/sys/kern/vfs_subr.c:1749
#18 0xc02dcc14 in fork_exit (callout=0xc0343090 sched_sync, arg=0x0,
frame=0x0) at /usr/src/sys/kern/kern_fork.c:872

Martin

Martin Blapp, [EMAIL PROTECTED] [EMAIL PROTECTED]
--
ImproWare AG, UNIXSP  ISP, Zurlindenstrasse 29, 4133 Pratteln, CH
Phone: +41 061 826 93 00: +41 61 826 93 01
PGP: finger -l [EMAIL PROTECTED]
PGP Fingerprint: B434 53FC C87C FE7B 0A18 B84C 8686 EF22 D300 551E
--






Re: Data corruption in soft updates?

2002-12-09 Thread Kirk McKusick
Date: Mon, 9 Dec 2002 18:04:03 -0800 (PST)
From: Nate Lawson [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
cc: [EMAIL PROTECTED]
Subject: Data corruption in soft updates?
X-ASK-Info: Whitelist match

I rebuilt my kernel with today's current + the acpica-20021122
patch and rebooted.  I use ufs1, no acls or special options
other than SU (installed with DP1).  Everything booted fine
with some errors from acpi but as booting proceeded, I
started getting kernel messages of bad inode.  I quickly
rebooted to single user and ran fsck and got a huge set of
errors.  See this partial log (600KB gzipped):

   http://www.root.org/~nate/fsck.gz

I didn't touch all those files (just booted and started
getting errors) so I don't want to say yes to deleting
them.  Do I have to newfs/reinstall?  Should I try using a
superblock backup?

-Nate

It appears that you are getting all those errors (BAD block)
because fsck thinks that your filesystem is smaller than it
really is. If you do a dumpfs on the filesystem and check
the size (about line 5), I expect that you will find that
all those bad blocks exceed that size. It might be interesting
to check one or more of the alternate blocks to see if they
have a different size. If so, using an alternate should help.
If not, then the question is why all those out of range blocks 
were allocated.
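
As a rough illustration of why an understated filesystem size makes valid
blocks look BAD, here is a minimal sketch of a range check. The function
name `sketch_block_in_range` is hypothetical, and both arguments are
assumed to be expressed in the same units; this is not fsck's actual code.

```c
#include <stdint.h>

/*
 * Hypothetical range check: a block number is acceptable only if it
 * lies inside the filesystem size recorded in the superblock.
 */
static int
sketch_block_in_range(int64_t blkno, int64_t fs_size)
{
	return (blkno >= 0 && blkno < fs_size);
}
```

With an illustrative block number of 1553167, the check fails when the
recorded size is 1500000 and passes when it is 2000000, even though the
block itself has not changed.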

Kirk McKusick




Re: backgroud fsck is still locking up system (fwd)

2002-12-08 Thread Kirk McKusick
Date: Sat, 7 Dec 2002 11:07:23 -0800 (PST)
From: Nate Lawson [EMAIL PROTECTED]
To: Archie Cobbs [EMAIL PROTECTED]
cc: [EMAIL PROTECTED], [EMAIL PROTECTED]
Subject: Re: backgroud fsck is still locking up system (fwd)
X-ASK-Info: Whitelist match

On Fri, 6 Dec 2002, Archie Cobbs wrote:
 Julian Elischer wrote:
  I put a copy of / in /usr
  then from the fixit, I mounted /usr as / and ran growfs from there..
  the trick is to not do it while / is mounted.
 
 / wasn't mounted yet when I ran growfs:
 
   I ran growfs after booting single user mode but before mounting
   any disks.. perhaps that caused it to not work.
 
 But it was the root partition and I was running in single user mode.
 If that's a problem then the growfs man page should say so, or maybe
 it should be more clear about what is meant by mounted.

growfs won't work with any mounted fs (even ro) because it needs to
quiesce kernel file ops and you can't do that from usermode (yet).  I
wonder if there might be some clever way to abuse snapshots to have this
same effect (i.e. keep an open handle to the underlying fs cdev for growfs
to use and then mount a snapshot of the fs over its own mountpoint for
procs to use.)
 
 In any case, running it from the fixit floppy didn't work either
 (got a core dump), but that may be because it was already screwed up.
 
 So at minimum, there's a documentation bug (IMHO).

I assume the superblock changes between 4 and 5 changed the ability to use
4.x growfs on 5.x ufs partitions.  Also, does growfs need to be updated
for ufs2?

-Nate

I have made the structural changes to growfs to make it work for
UFS2, however, I have not done more than cursory testing. I would
appreciate it if someone could try running it on various UFS2
filesystems to see if it works properly.

Kirk McKusick




Re: backgroud fsck is still locking up system (fwd)

2002-12-08 Thread Kirk McKusick
In theory the MNT_RELOAD command should reload all the filesystem
metadata properly though this feature has not been tested with
growfs. If anyone has the time to try it out and report back any
problems, that would be appreciated.

Kirk McKusick

=-=-=-=-=

From: Archie Cobbs [EMAIL PROTECTED]
Subject: Re: backgroud fsck is still locking up system (fwd)
In-Reply-To: [EMAIL PROTECTED]
To: Bruce Evans [EMAIL PROTECTED]
Date: Sun, 8 Dec 2002 17:03:43 -0800 (PST)
CC: Archie Cobbs [EMAIL PROTECTED],
   Kirk McKusick [EMAIL PROTECTED],
   Julian Elischer [EMAIL PROTECTED], [EMAIL PROTECTED],
   Thomas-Henning von Kamptz [EMAIL PROTECTED]
X-ASK-Info: Whitelist match

Bruce Evans wrote:
   Er, it should be obvious that growfs can't reasonably work on the mounted
   partitions.  growfs.1 doesn't exist, but growfs.8 already has the warning
   in a general form:
  
  Currently growfs can only enlarge unmounted file systems.  Do not
try enlarging a mounted file system, your system may panic and you will
not be able to use the file system any longer...
 
  Well, I suspected that it might not work... but I would disagree that it
  was *obvious* that it would not work. This was before mount had been
  run, so / was supposedly mounted (?) read-only.
 
 Perhaps the unobvious point is that fsck could work.  If the mount is r/w,
 then neither growfs nor fsck can even open the partition r/w.  fsck somehow
 works in the case of a r/o root, but growfs apparently doesn't.  I think
 fsck depends on no other processes making (significant) vfs syscalls
 on the same partition while it is running (even r/o ones might be harmful
 if they caused reads of metadata which might be inconsistent).  Then when
 fsck has finished it calls mount(... MNT_RELOAD...) to sync the metadata.
 growfs doesn't do this, and even if it did it is not clear that it does
 all the necessary syncing (growfs may change more or different metadata).
 However, I think it does most of the necessary things.

FYI, I submitted a bug/enhancement request to summarize this..

  http://www.freebsd.org/cgi/query-pr.cgi?pr=46110

-Archie

P.S. Why does submitting a bug now generate an email response from
 (and who the heck is) ThinkHost Support ??

__
Archie Cobbs * Packet Design * http://www.packetdesign.com




Re: backgroud fsck is still locking up system (fwd)

2002-12-07 Thread Kirk McKusick
Thanks for reminding me about your userland change to background
fsck. I have tried it out and concur that it is the right approach
until we manage to get the general solution in the kernel. I
suggest that you propose it to release engineering and if approved
check it in.

Kirk McKusick

=-=-=-=-=-=

To: Kirk McKusick [EMAIL PROTECTED]
cc: Brooks Davis [EMAIL PROTECTED], Nate Lawson [EMAIL PROTECTED],
   Archie Cobbs [EMAIL PROTECTED], [EMAIL PROTECTED]
Subject: Re: backgroud fsck is still locking up system (fwd) 
In-Reply-To: Your message of Fri, 06 Dec 2002 17:52:38 PST.
 [EMAIL PROTECTED] 
Date: Sat, 07 Dec 2002 14:26:39 +
From: Ian Dowse [EMAIL PROTECTED]
X-ASK-Info: Whitelist match

In message [EMAIL PROTECTED], Kirk McKusick writes:
Adding a two minute delay before starting background fsck
sounds like a very good idea to me. Please send me your
suggested change.

BTW, I've been using a fsck_ffs modification for a while now that
does something like the disabled kernel I/O slowdown, but from
userland. It seems to help quite a lot in leaving some disk bandwidth
for other processes. Waiting a while before starting the fsck seems
like a good idea anyway though. Patch below (I think I posted an
earlier version of this before).

Ian

Index: fsutil.c
===
RCS file: /dump/FreeBSD-CVS/src/sbin/fsck_ffs/fsutil.c,v
retrieving revision 1.19
diff -u -r1.19 fsutil.c
--- fsutil.c	27 Nov 2002 02:18:57 -0000	1.19
+++ fsutil.c	4 Dec 2002 02:16:28 -0000
@@ -40,6 +40,7 @@
 #endif /* not lint */
 
 #include <sys/param.h>
+#include <sys/time.h>
 #include <sys/types.h>
 #include <sys/sysctl.h>
 #include <sys/disklabel.h>
@@ -62,7 +63,13 @@
 
 #include "fsck.h"
 
+static void slowio_start(void);
+static void slowio_end(void);
+
 long	diskreads, totalreads;	/* Disk cache statistics */
+struct timeval slowio_starttime;
+int slowio_delay_usec = 10000;	/* Initial IO delay for background fsck */
+int slowio_pollcnt;
 
 int
 ftypeok(union dinode *dp)
@@ -350,10 +357,15 @@
 
 	offset = blk;
 	offset *= dev_bsize;
+	if (bkgrdflag)
+		slowio_start();
 	if (lseek(fd, offset, 0) < 0)
 		rwerror("SEEK BLK", blk);
-	else if (read(fd, buf, (int)size) == size)
+	else if (read(fd, buf, (int)size) == size) {
+		if (bkgrdflag)
+			slowio_end();
 		return (0);
+	}
 	rwerror("READ BLK", blk);
 	if (lseek(fd, offset, 0) < 0)
 		rwerror("SEEK BLK", blk);
@@ -463,6 +475,39 @@
 	idesc.id_blkno = blkno;
 	idesc.id_numfrags = frags;
 	(void)pass4check(&idesc);
+}
+
+/* Slow down IO so as to leave some disk bandwidth for other processes */
+void
+slowio_start()
+{
+
+	/* Delay one in every 8 operations by 16 times the average IO delay */
+	slowio_pollcnt = (slowio_pollcnt + 1) & 7;
+	if (slowio_pollcnt == 0) {
+		usleep(slowio_delay_usec * 16);
+		gettimeofday(&slowio_starttime, NULL);
+	}
+}
+
+void
+slowio_end()
+{
+	struct timeval tv;
+	int delay_usec;
+
+	if (slowio_pollcnt != 0)
+		return;
+
+	/* Update the slowdown interval. */
+	gettimeofday(&tv, NULL);
+	delay_usec = (tv.tv_sec - slowio_starttime.tv_sec) * 1000000 +
+	    (tv.tv_usec - slowio_starttime.tv_usec);
+	if (delay_usec < 64)
+		delay_usec = 64;
+	if (delay_usec > 1000000)
+		delay_usec = 1000000;
+	slowio_delay_usec = (slowio_delay_usec * 63 + delay_usec) >> 6;
 }
 
 /*
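
The averaging step at the end of the patch can be looked at in isolation.
The sketch below assumes the update is a running average with weight 1/64
and that samples are clamped first; the names `sketch_update_delay`,
`SKETCH_MIN_USEC` and `SKETCH_MAX_USEC` are hypothetical, and the
one-second upper bound in particular is an assumption.

```c
/* Assumed clamp bounds in microseconds (the upper bound is a guess). */
#define SKETCH_MIN_USEC	64
#define SKETCH_MAX_USEC	1000000

/*
 * Fold one measured I/O time into the running average with weight 1/64:
 * new_avg = (63 * old_avg + sample) / 64, using a shift for the divide.
 */
static int
sketch_update_delay(int avg_usec, int sample_usec)
{
	if (sample_usec < SKETCH_MIN_USEC)
		sample_usec = SKETCH_MIN_USEC;
	if (sample_usec > SKETCH_MAX_USEC)
		sample_usec = SKETCH_MAX_USEC;
	return ((avg_usec * 63 + sample_usec) >> 6);
}
```

Because each sample carries only 1/64 of the weight, a single very slow
read moves the average a little rather than stalling fsck for a long time.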




Re: backgroud fsck is still locking up system (fwd)

2002-12-06 Thread Kirk McKusick
From: Archie Cobbs [EMAIL PROTECTED]
Subject: Re: backgroud fsck is still locking up system (fwd)
In-Reply-To: [EMAIL PROTECTED]
To: Nate Lawson [EMAIL PROTECTED]
Date: Fri, 6 Dec 2002 10:57:13 -0800 (PST)
CC: Kirk McKusick [EMAIL PROTECTED],
   Archie Cobbs [EMAIL PROTECTED], [EMAIL PROTECTED]
X-ASK-Info: Whitelist match

Nate Lawson wrote:
  Does the background fsck process continue to run, or does the whole
  system come to a halt? If the fsck process continues to run, what 
  happens when it eventually finishes? Is the system still dead, or 
  does it come back to life? If the system does not come back to life
  can you get me the output of `ps axl'? If not, can you break into
  the debugger and get a ps output? (You will need to have the DDB
  option specified in your config file).
 
 Sorry for butting in.  I think Archie is referring to bg fsck gaining
 an unfair share of cpu due to it running due to IO completions. Last I
 heard, we were waiting until after 5.0 to experiment with scheduler
 changes to make it more fair.  I have not seen any hard locks or other
 problems with bg fsck after your commit.

I'm actually seeing something different. The box becomes unresponsive
(except for virtual console changes and CTRL-ALT-ESC) but there's no
disk activity. It never recovers.

Reproduced it again just now. After pulling the plug and rebooting
I didn't touch the box.  It booted normally, started background
fsck, and the HDD light was blinking as expected. After about 10
seconds, rather suddenly the HDD light stopped blinking.  At this
point it was pretty dead.  Broke into the debugger and it showed a
similar 'ps' output to what I previously posted.

-Archie

Your ps shows fsck_ufs and the syncer process both blocked on nbufbs.
That means the system has blocked them from running because it feels
that there are too many dirty buffers. What you are probably experiencing
is that you have a relatively small memory machine which has a rather
low threshold for blocking on dirty buffers. All the dirty buffers
in your system are held by the indirect blocks of the snapshot and
thus the bufdaemon cannot push them out. That task can only be done
by the syncer, which is also blocked. Could you please run the following
commands on your system and send me the results:

sysctl vfs.lodirtybuffers
sysctl vfs.hidirtybuffers
sysctl vfs.numdirtybuffers

both before and after the lockup. If you cannot run this command after
the lockup, the global variable names are:

lodirtybuffers
hidirtybuffers
numdirtybuffers

If my hypothesis is correct, that will let me tweak the thresholds on
dirty buffers to get a solution.

Kirk McKusick




Re: UFS1 created by 5.0 is incompatible with 4.0's?

2002-12-06 Thread Kirk McKusick
Date: Fri, 6 Dec 2002 18:06:03 +0200
From: Ruslan Ermilov [EMAIL PROTECTED]
To: Petr Holub [EMAIL PROTECTED], Matt Dillon [EMAIL PROTECTED]
Cc: [EMAIL PROTECTED], [EMAIL PROTECTED]
Subject: Re: UFS1 created by 5.0 is incompatible with 4.0's?
X-ASK-Info: Whitelist match

On Fri, Dec 06, 2002 at 01:59:11PM +0100, Petr Holub wrote:
 Hi,

  While testing the 4.0 -> 5.0 upgrade path, I've created (under
  5.0) a UFS1 partition and installed 4.0 onto it.  After booting
  the 4.0 from it, kernel complained about ``numdirs is zero, try
  using an alternate superblock'' for / partition -- I've tried
  what it suggests (by fsck -b 32, etc.) but the result was always
  the same -- the file system was marked dirty and only read-only
  usable.  After rebooting in 5.0, this file system was similarly
  unusable.  Is this a bug or a feature?

 I've discussed this issue with Poul-Henning Kamp. You need fsck
 from at least 4.7.

Is this handled by the fsck/setup.c,v 1.17.2.4 commit?

: revision 1.17.2.4
: date: 2002/06/24 05:10:41;  author: dillon;  state: Exp;  lines: +26 -56
: MFC 1.30.  Check only the fields we know should be the same between the
: primary and alternate superblocks, so fsck doesn't barf on new features
: added to UFS in later releases.
:
: Submitted by: mckusick


Cheers,
--
Ruslan Ermilov  Sysadmin and DBA,
[EMAIL PROTECTED]   Sunbay Software AG,
[EMAIL PROTECTED]  FreeBSD committer,
+380.652.512.251Simferopol, Ukraine

http://www.FreeBSD.org  The Power To Serve
http://www.oracle.com   Enabling The Information Age

If the 1.17.2.4 commit does not solve your problem, try the following
patch that I made to the 5.0 fsck. If it solves your problem, then it
should probably be MFC'ed.

Kirk McKusick


Index: sbin/fsck_ffs/setup.c
===
RCS file: /usr/ncvs/src/sbin/fsck_ffs/setup.c,v
retrieving revision 1.41
diff -c -r1.41 setup.c
*** setup.c	2002/11/27 02:18:57	1.41
--- setup.c	2002/12/04 23:13:18
***************
*** 258,269 ****
  		    (unsigned)(sizeof(struct inostatlist) * (sblock.fs_ncg)));
  		goto badsb;
  	}
! 	numdirs = sblock.fs_cstotal.cs_ndir;
  	dirhash = numdirs;
- 	if (numdirs == 0) {
- 		printf("numdirs is zero, try using an alternate superblock\n");
- 		goto badsb;
- 	}
  	inplast = 0;
  	listmax = numdirs + 10;
  	inpsort = (struct inoinfo **)calloc((unsigned)listmax,
--- 258,265 ----
  		    (unsigned)(sizeof(struct inostatlist) * (sblock.fs_ncg)));
  		goto badsb;
  	}
! 	numdirs = MAX(sblock.fs_cstotal.cs_ndir, 128);
  	dirhash = numdirs;
  	inplast = 0;
  	listmax = numdirs + 10;
  	inpsort = (struct inoinfo **)calloc((unsigned)listmax,
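
The effect of the changed line can be seen in a small stand-alone sketch.
The macro `SKETCH_MAX` is a local stand-in for the `MAX()` used in the
patch, and `sketch_numdirs` is a hypothetical helper, not a function in
fsck.

```c
/* Local stand-in for the MAX() macro used in the patch. */
#define SKETCH_MAX(a, b)	((a) > (b) ? (a) : (b))

/*
 * Pick the directory count used to size fsck's tables. A zero count in
 * the superblock is replaced by a floor of 128 instead of being fatal.
 */
static long
sketch_numdirs(long cs_ndir)
{
	return (SKETCH_MAX(cs_ndir, 128L));
}
```

A superblock that reports zero directories therefore yields 128 here, and
the later table sizing of numdirs + 10 still gets a usable value.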




Re: backgroud fsck is still locking up system (fwd)

2002-12-06 Thread Kirk McKusick
The loss of files under soft updates is possible if your editor
fails to fsync the new file before unlinking the old file. The
`vi' editor always does an `fsync' after writing the new copy and
before removing the old copy. I have not checked with other editors
such as emacs to see if they properly use fsync. Note that there
is also a vulnerability without soft updates, it is just that the
window of vulnerability is shorter. So, editors should always do
fsync's, it is just more critical if you are using soft updates (or
journalling for that matter).
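
For illustration, here is a minimal sketch of one common way an
application can order these steps: write the new copy to a temporary file,
fsync it, and only then replace the old file with rename(2). This is a
swapped-in pattern for the sake of the example, not a description of what
vi does internally; the function name `sketch_safe_replace` and its
parameters are hypothetical.

```c
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Replace `path` with `data` so that the new contents are on stable
 * storage before the old copy goes away. `tmppath` must be on the same
 * filesystem as `path`. Returns 0 on success, -1 on failure.
 */
static int
sketch_safe_replace(const char *path, const char *tmppath, const char *data)
{
	size_t len = strlen(data);
	int fd;

	fd = open(tmppath, O_WRONLY | O_CREAT | O_TRUNC, 0644);
	if (fd < 0)
		return (-1);
	/* Write the new copy and force it to disk first. */
	if (write(fd, data, len) != (ssize_t)len || fsync(fd) != 0) {
		close(fd);
		unlink(tmppath);
		return (-1);
	}
	if (close(fd) != 0) {
		unlink(tmppath);
		return (-1);
	}
	/* Only now drop the old copy by renaming the new one over it. */
	if (rename(tmppath, path) != 0) {
		unlink(tmppath);
		return (-1);
	}
	return (0);
}
```

The key point is the ordering: the fsync completes before anything removes
or replaces the old file.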

The main reason for not using soft updates on the root filesystem
was because of the delay between removing files and having the
space show up. The result was that world installs on the root
filesystem often failed if the root was nearly full (as is so
often the case). That problem has now been fixed in 5.0 with a
callback to soft updates if a filesystem full error is about to
be generated. When called back, soft updates expedites the freeing
of space so that the new allocation can succeed. So, the primary
reason for not using soft updates on the root is now fixed. If
however, mainline editors are not doing fsync's, then there is
still a good reason not to use soft updates on the root filesystem.

Kirk McKusick

=-=-=-=-=

From: Archie Cobbs [EMAIL PROTECTED]
Subject: Re: backgroud fsck is still locking up system (fwd)
In-Reply-To: [EMAIL PROTECTED]
To: Dan Nelson [EMAIL PROTECTED]
Date: Fri, 6 Dec 2002 11:28:52 -0800 (PST)
CC: [EMAIL PROTECTED], [EMAIL PROTECTED]
X-ASK-Info: Whitelist match

Dan Nelson wrote:
  Why does softupdates not get enabled on / , by default on the
  install?
 
 Softupdates updates on-disk structures in the background, and
 background fsck cannot relink unreferenced files into lost+found, so
 you run the risk of losing both the original and backup copies of
 important files in case of a sudden reboot.  Imagine you edited
 /etc/rc.conf, saved it, and 5 seconds later the system panic'ed.
 Because the default metadata flush time is 28 seconds, there's a pretty
 good chance that neither the new file or the original is in /etc after
 a reboot.  I got bit by this three times before I learned my lesson.  I

I don't understand this.. presumably vi updates the file contents by
opening and writing into the file; why would this cause the file's
directory entry to disappear?

On the other hand, if you do mv rc.conf.new rc.conf then you are
supposedly guaranteed that the file exists in some form; see rename(2).

In any case, you seem to be implying that with respect to modifying
files just before a system crash:

(a) Softupdates is more 'dangerous' than non-softupdates
(b) Background fsck is more 'dangerous' than normal fsck

Is this really true? I thought if anything the reverse of (a) would be true.

-Archie

__
Archie Cobbs * Packet Design * http://www.packetdesign.com

To Unsubscribe: send mail to [EMAIL PROTECTED]
with unsubscribe freebsd-current in the body of the message



Re: backgroud fsck is still locking up system (fwd)

2002-12-06 Thread Kirk McKusick
From: Archie Cobbs [EMAIL PROTECTED]
Subject: Re: backgroud fsck is still locking up system (fwd)
In-Reply-To: [EMAIL PROTECTED]
To: Kirk McKusick [EMAIL PROTECTED]
Date: Fri, 6 Dec 2002 13:01:20 -0800 (PST)
CC: Archie Cobbs [EMAIL PROTECTED], Nate Lawson [EMAIL PROTECTED],
   [EMAIL PROTECTED]
X-ASK-Info: Whitelist match

Kirk McKusick wrote:
> by the syncer who is also blocked. Could you please run the following
> command on your system and send me the results:
>
>   sysctl vfs.lodirtybuffers
>   sysctl vfs.hidirtybuffers
>   sysctl vfs.numdirtybuffers
>
> both before and after the lockup. If you cannot run this command after
> the lockup, the global variable names are:
>
>   lodirtybuffers
>   hidirtybuffers
>   numdirtybuffers

Before (system running normally):

vfs.lodirtybuffers: 126
vfs.hidirtybuffers: 252
vfs.numdirtybuffers: 0

After:

vfs.lodirtybuffers: 126
vfs.hidirtybuffers: 252
vfs.numdirtybuffers: 445
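
One way to read the two sets of numbers above, as a hedged sketch: it assumes `lodirtybuffers` and `hidirtybuffers` act as low- and high-water marks for `numdirtybuffers`, which is what the names suggest. The struct, enum, and function names are hypothetical.

```c
/* Counter values as reported by the three sysctls. */
struct dirtybuf_counters {
	int lodirtybuffers;	/* assumed low-water mark */
	int hidirtybuffers;	/* assumed high-water mark */
	int numdirtybuffers;	/* current number of dirty buffers */
};

enum dirtybuf_state { DIRTY_OK, DIRTY_ABOVE_LOW, DIRTY_ABOVE_HIGH };

/* Classify the current dirty-buffer count against the two marks. */
enum dirtybuf_state
classify_dirtybufs(const struct dirtybuf_counters *c)
{
	if (c->numdirtybuffers > c->hidirtybuffers)
		return (DIRTY_ABOVE_HIGH);
	if (c->numdirtybuffers > c->lodirtybuffers)
		return (DIRTY_ABOVE_LOW);
	return (DIRTY_OK);
}
```

With the values reported here, the "before" sample (0 dirty buffers) is below both marks, while the "after" sample (445) is well above the high mark of 252.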

-Archie

__
Archie Cobbs * Packet Design * http://www.packetdesign.com

OK, it looks like my hypothesis on having a small number of buffers 
and running out of them is the problem. I enclose below a patch which
should check for the problem arising and help to mitigate it. I
would appreciate you dropping it into your kernel and seeing if
it solves your problem. The fix is not ideal, but merely to see
if it solves this problem. If it does, I will figure out how to
do it properly. Thanks for your help.

Kirk McKusick

Index: sys/buf.h
===================================================================
RCS file: /usr/ncvs/src/sys/sys/buf.h,v
retrieving revision 1.138
diff -c -r1.138 buf.h
*** sys/buf.h   2002/08/30 04:04:37 1.138
--- sys/buf.h   2002/12/06 21:44:25
***************
*** 468,473 ****
--- 468,474 ----
  caddr_t   kern_vfs_bio_buffer_alloc(caddr_t v, long physmem_est);
  void  bufinit(void);
  void  bwillwrite(void);
+ int   checkdirtybufs(struct vnode *);
  int   buf_dirty_count_severe(void);
  void  bremfree(struct buf *);
  int   bread(struct vnode *, daddr_t, int, struct ucred *, struct buf **);
Index: kern/vfs_bio.c
===================================================================
RCS file: /usr/ncvs/src/sys/kern/vfs_bio.c,v
retrieving revision 1.342
diff -c -r1.342 vfs_bio.c
*** kern/vfs_bio.c  2002/11/23 19:10:30 1.342
--- kern/vfs_bio.c  2002/12/06 21:44:35
***************
*** 1114,1119 ****
--- 1114,1137 ----
  }
  
  /*
+  * Check to see if a vnode holds too many dirty buffers. If it does,
+  * flush it.
+  */
+ int
+ checkdirtybufs(struct vnode *vp)
+ {
+ 	struct buf *bp;
+ 	int dirtycnt = 0, error = 0;
+ 	struct thread *td = curthread;
+ 
+ 	TAILQ_FOREACH(bp, &vp->v_dirtyblkhd, b_vnbufs)
+ 		dirtycnt++;
+ 	if (dirtycnt > lodirtybuffers)
+ 		error = VOP_FSYNC(vp, td->td_ucred, MNT_NOWAIT, td);
+ 	return (error);
+ }
+ 
+ /*
   * Return true if we have too many dirty buffers.
   */
  int
Index: ufs/ffs/ffs_balloc.c
===================================================================
RCS file: /usr/ncvs/src/sys/ufs/ffs/ffs_balloc.c,v
retrieving revision 1.39
diff -c -r1.39 ffs_balloc.c
*** ufs/ffs/ffs_balloc.c    2002/10/22 01:14:25 1.39
--- ufs/ffs/ffs_balloc.c    2002/12/06 21:49:56
***************
*** 295,300 ****
--- 295,301 ----
  			if (bp->b_bufsize == fs->fs_bsize)
  				bp->b_flags |= B_CLUSTEROK;
  			bdwrite(bp);
+ 			checkdirtybufs(vp);
  		}
  	}
  	/*
***************
*** 335,340 ****
--- 336,342 ----
  			if (bp->b_bufsize == fs->fs_bsize)
  				bp->b_flags |= B_CLUSTEROK;
  			bdwrite(bp);
+ 			checkdirtybufs(vp);
  		}
  		*bpp = nbp;
  		return (0);
***************
*** 756,761 ****
--- 758,764 ----
  			if (bp->b_bufsize == fs->fs_bsize)
  				bp->b_flags |= B_CLUSTEROK;
  			bdwrite(bp);
+ 			checkdirtybufs(vp);
  		}
  	}
  	/*
***************
*** 796,801 ****
--- 799,805 ----
  			if (bp->b_bufsize == fs->fs_bsize)
  				bp->b_flags |= B_CLUSTEROK;
  			bdwrite(bp);
+ 			checkdirtybufs(vp);
  		}
  		*bpp = nbp;
  		return (0);
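
The counting logic of checkdirtybufs() in the patch above can be tried outside the kernel. The following is a userland sketch only: `fake_buf`, `fake_vnode`, and `fake_fsync` are hypothetical stand-ins for `struct buf`, `struct vnode`, and VOP_FSYNC(), and the threshold is passed in rather than read from `lodirtybuffers`.

```c
#include <stddef.h>
#include <sys/queue.h>

/* Hypothetical stand-ins for the kernel's struct buf and struct vnode. */
struct fake_buf {
	TAILQ_ENTRY(fake_buf) b_vnbufs;
};
TAILQ_HEAD(fake_buflist, fake_buf);

struct fake_vnode {
	struct fake_buflist v_dirtyblkhd;	/* dirty buffers of this vnode */
	int flushes;				/* times a flush was requested */
};

/* Stand-in for VOP_FSYNC(): only records that a flush was requested. */
static int
fake_fsync(struct fake_vnode *vp)
{
	vp->flushes++;
	return (0);
}

/*
 * Count the dirty buffers on the vnode and request a flush once the
 * count exceeds the threshold, mirroring the shape of checkdirtybufs().
 */
int
check_dirty(struct fake_vnode *vp, int threshold)
{
	struct fake_buf *bp;
	int dirtycnt = 0, error = 0;

	TAILQ_FOREACH(bp, &vp->v_dirtyblkhd, b_vnbufs)
		dirtycnt++;
	if (dirtycnt > threshold)
		error = fake_fsync(vp);
	return (error);
}
```

Note that the walk is linear in the number of dirty buffers on every call, which fits the patch's own description of itself as a diagnostic rather than a final fix.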




Re: backgroud fsck is still locking up system (fwd)

2002-12-06 Thread Kirk McKusick
I suggest that we drag Thomas-Henning von Kamptz into this
discussion as he was one of the main authors of growfs. He
is copied on my reply.

Kirk McKusick

=-=-=-=-=-=

From: Archie Cobbs [EMAIL PROTECTED]
Subject: Re: backgroud fsck is still locking up system (fwd)
In-Reply-To: [EMAIL PROTECTED]
To: Julian Elischer [EMAIL PROTECTED]
Date: Fri, 6 Dec 2002 14:52:24 -0800 (PST)
CC: [EMAIL PROTECTED], [EMAIL PROTECTED]
X-ASK-Info: Whitelist match

Julian Elischer wrote:
> most systems follow / with their swap region..
>
> you can boot from fixit, or picoBSD floppy
> and use disklabel -e to extend the root partition
> then you can use growfs to add the new space to your root fs.

Hmm.. I tried that and it didn't seem to work.

The disklabel change was successful, but growfs didn't seem to
expand the root partition any.. df(1) still shows it as 50M.

I ran growfs after booting single user mode but before mounting
any disks.. perhaps that caused it to not work.

Since that didn't work, I booted a 4.7-REL fixit floppy and tried
to run growfs from there, but then that growfs core dumped:

Program terminated with signal 11, Segmentation fault.
#0  0x804c089 in updclst (block=-874) at growfs.c:2335
2335		setbit(cg_clustersfree(acg), block);
(gdb) list
2330			return;
2331		}
2332		/*
2333		 * update cluster allocation map
2334		 */
2335		setbit(cg_clustersfree(acg), block);
2336
(gdb) where
#0  0x804c089 in updclst (block=-874) at growfs.c:2335
#1  0x8049584 in updjcg (cylno=2, utime=1039185218, fsi=4, fso=3, Nflag=0)
    at growfs.c:862
#2  0x8048280 in growfs (fsi=4, fso=3, Nflag=0) at growfs.c:219
#3  0x804beb2 in main (argc=2, argv=0xbfbff7a4) at growfs.c:2213
#4  0x8048135 in _start ()

Notice block=-874 which indicates something is weird or corrupted.
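
To illustrate why a negative block number is fatal at that line, here is a hedged sketch of a bounds-checked bit set. It is not the growfs code; `checked_setbit` and its parameters are hypothetical, and it only shows that an index such as -874 would have to be rejected before touching the map.

```c
#include <limits.h>
#include <stddef.h>

/*
 * Set bit `block` in `map`, which holds `nbits` valid bits.
 * Returns 0 on success or -1 if `block` lies outside the map, instead
 * of writing to memory before or after it.
 */
int
checked_setbit(unsigned char *map, long nbits, long block)
{
	if (block < 0 || block >= nbits)
		return (-1);
	map[block / CHAR_BIT] |= (unsigned char)(1u << (block % CHAR_BIT));
	return (0);
}
```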

So now I've got extra space in the partition which (apparently) is
not being used and I can't seem to get at it (see below).

Plus I have a sneaking suspicion that I've screwed up something,
but there's nothing in the growfs man page that indicates what I
did was wrong.

FYI, this is a test machine so it's OK if it gets hosed.

-Archie

__
Archie Cobbs * Packet Design * http://www.packetdesign.com

$ disklabel ad0s1
# /dev/ad0s1c:
type: ESDI
disk: ad0s1
label: 
flags:
bytes/sector: 512
sectors/track: 63
tracks/cylinder: 255
sectors/cylinder: 16065
cylinders: 1860
sectors/unit: 29896902
rpm: 3600
interleave: 1
trackskew: 0
cylinderskew: 0
headswitch: 0   # milliseconds
track-to-track seek: 0  # milliseconds
drivedata: 0 

8 partitions:
#        size   offset    fstype   [fsize bsize bps/cpg]
  a:   204800        0    4.2BSD     1024  8192 32768   # (Cyl.    0 - 12*)
  b:   164608   204800      swap                        # (Cyl.   12*- 22*)
  c: 29896902        0    unused        0     0         # (Cyl.    0 - 1860*)
  e:    40960   369408    4.2BSD     1024  8192    16   # (Cyl.   22*- 25*)
  f: 29486534   410368    4.2BSD     1024  8192    16   # (Cyl.   25*- 1860*)
$ df
Filesystem  1K-blocks    Used    Avail Capacity  Mounted on
/dev/ad0s1a     49583   36751     8866    81%    /
devfs               1       1        0   100%    /dev
/dev/ad0s1f  14289643 2794938 10351534    21%    /usr
/dev/ad0s1e     19815    3555    14675    20%    /var
procfs              4       4        0   100%    /proc
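
The disklabel and df output can be cross-checked with a little arithmetic. This sketch assumes the `a` partition size reads as 204800 sectors of 512 bytes; the helper names are hypothetical.

```c
#include <stdint.h>

#define SECTOR_SIZE	512	/* bytes/sector from the disklabel output */

/* Convert a disklabel size in sectors to 1K blocks, the unit df reports. */
int64_t
sectors_to_kblocks(int64_t sectors)
{
	return (sectors * SECTOR_SIZE / 1024);
}

/* True if a partition starting at `offset` ends where the next begins. */
int
partitions_adjacent(int64_t offset, int64_t size, int64_t next_offset)
{
	return (offset + size == next_offset);
}
```

Under that reading the `a` partition holds 102400 1K blocks (100M), while df reports 49583 for the filesystem on it. df excludes filesystem overhead, so the figures would never match exactly, but a gap of roughly half the partition is consistent with the filesystem not having been grown.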





Re: backgroud fsck is still locking up system (fwd)

2002-12-06 Thread Kirk McKusick
From: Archie Cobbs [EMAIL PROTECTED]
Subject: Re: backgroud fsck is still locking up system (fwd)
In-Reply-To: [EMAIL PROTECTED]
To: Kirk McKusick [EMAIL PROTECTED]
Date: Fri, 6 Dec 2002 15:23:36 -0800 (PST)
CC: Archie Cobbs [EMAIL PROTECTED], Nate Lawson [EMAIL PROTECTED],
   [EMAIL PROTECTED]
X-ASK-Info: Whitelist match

Kirk McKusick wrote:
> OK, it looks like my hypothesis on having a small number of buffers
> and running out of them is the problem. I enclose below a patch which
> should check for the problem arising and help to mitigate it. I
> would appreciate you dropping it into your kernel and seeing if
> it solves your problem. The fix is not ideal, but merely to see
> if it solves this problem. If it does, I will figure out how to
> do it properly. Thanks for your help.

Yep, that fixes it. Now I just get the usual sluggishness while the
background fsck runs (which is not too bad), but it eventually
finishes and then all is well.

Thanks,
-Archie

__
Archie Cobbs * Packet Design * http://www.packetdesign.com

Thanks for verifying that the idea works. I will attempt to figure
out how to do it correctly and submit a proposed fix.

Kirk McKusick




Re: backgroud fsck is still locking up system (fwd)

2002-12-06 Thread Kirk McKusick
Adding a two minute delay before starting background fsck
sounds like a very good idea to me. Please send me your
suggested change.

Kirk McKusick

=-=-=-=-=

Date: Fri, 6 Dec 2002 10:44:45 -0800
From: Brooks Davis [EMAIL PROTECTED]
To: Nate Lawson [EMAIL PROTECTED]
Cc: Kirk McKusick [EMAIL PROTECTED],
   Archie Cobbs [EMAIL PROTECTED], [EMAIL PROTECTED]
Subject: Re: backgroud fsck is still locking up system (fwd)
X-ASK-Info: Confirmed by User

On Fri, Dec 06, 2002 at 10:27:10AM -0800, Nate Lawson wrote:
> On Thu, 5 Dec 2002, Kirk McKusick wrote:
> > Does the background fsck process continue to run, or does the whole
> > system come to a halt? If the fsck process continues to run, what
> > happens when it eventually finishes? Is the system still dead, or
> > does it come back to life? If the system does not come back to life
> > can you get me the output of `ps axl'? If not, can you break into
> > the debugger and get a ps output? (You will need to have the DDB
> > option specified in your config file).
>
> Sorry for butting in.  I think Archie is referring to bg fsck gaining an
> unfair share of cpu due to it running due to IO completions.  Last I
> heard, we were waiting until after 5.0 to experiment with scheduler
> changes to make it more fair.  I have not seen any hard locks or other
> problems with bg fsck after your commit.

My experience is that, at least with my laptop (which has a very slow
disk), bg fsck works OK, but starting applications for the first time
while fsck is running is _very_ painful.  Even getty seems to have a
hard time.  I've found that adding a two minute delay before the fsck is
sufficient to allow the system to finish starting up and for me to load X
and my main applications which lets me work while bg fsck is running.  I
posted a patch to add an optional delay in the rc scripts a while ago,
but Kirk was going to re-enable the priority stuff soon so I didn't
pursue it.  If there's interest, I'll regenerate it and repost it.

-- Brooks

Any statement of the form X is the one, true Y is FALSE.
PGP fingerprint 655D 519C 26A7 82E7 2529  9BF0 5D8E 8BE9 F238 1AD4




Re: backgroud fsck is still locking up system (fwd)

2002-12-05 Thread Kirk McKusick
Date: Thu, 5 Dec 2002 15:22:27 -0800 (PST)
From: Archie Cobbs [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Subject: backgroud fsck is still locking up system

Just rebuilt -current this morning. Background fsck is still
causing a soft lockup. I thought the conclusion was we were
going to disable it for 5.0.

Not trying to rush anyone, just pointing out that this
still needs to be done..

-Archie

__
Archie Cobbs * Packet Design * http://www.packetdesign.com

What do you mean by background fsck causing a soft lockup?
Is it failing? Is it deadlocking the system? Do you have a
specific test case that shows the problem? Needless to say
it is working fine on my system and on my regression tests.
The only problem that I am having with 5.0 as of last night
is getting login to work on my console.

Kirk McKusick




Re: backgroud fsck is still locking up system (fwd)

2002-12-05 Thread Kirk McKusick
Does the background fsck process continue to run, or does the whole
system come to a halt? If the fsck process continues to run, what 
happens when it eventually finishes? Is the system still dead, or 
does it come back to life? If the system does not come back to life
can you get me the output of `ps axl'? If not, can you break into
the debugger and get a ps output? (You will need to have the DDB
option specified in your config file).

Kirk McKusick

=-=-=-=-=-=

From: Archie Cobbs [EMAIL PROTECTED]
Subject: Re: backgroud fsck is still locking up system (fwd)
In-Reply-To: [EMAIL PROTECTED]
To: Kirk McKusick [EMAIL PROTECTED]
Date: Thu, 5 Dec 2002 16:22:20 -0800 (PST)
CC: Archie Cobbs [EMAIL PROTECTED], Robert Watson [EMAIL PROTECTED],
   [EMAIL PROTECTED]
X-ASK-Info: Confirmed by User

Kirk McKusick wrote:
> > Just rebuilt -current this morning. Background fsck is still
> > causing a soft lockup. I thought the conclusion was we were
> > going to disable it for 5.0.
>
> What do you mean by background fsck causing a soft lockup?
> Is it failing? Is it deadlocking the system? Do you have a
> specific test case that shows the problem? Needless to say
> it is working fine on my system and on my regression tests.
> The only problem that I am having with 5.0 as of last night
> is getting login to work on my console.

What happens is that at first I can login, but the system seems
slow. I then got as far as running 'top' but it never refreshed its
display and subsequently all keystrokes were ignored. Changing
virtual terminals still works OK, but they are effectively dead too.
I'm imagining processes getting stuck on some lock one by one.

Top did get as far as showing the background fsck process, which
had a priority of -6 or something.

The previous time it didn't even spit out a login prompt, but
that may just be due to experimental noise.

For me, it appears easy to reproduce...

1. Boot -current system
2. Pull the power cable out
3. Put the power cable back in
4. Let the box boot; it notes background fsck
5. Login and try to do something

I can give you more details about my system separately if you like.

Thanks,
-Archie

__
Archie Cobbs * Packet Design * http://www.packetdesign.com




newfs chokes, cores, dies if inode density too high; patch attached

2002-11-30 Thread Kirk McKusick
Date: Fri, 1 Nov 2002 00:43:38 +
From: Ceri Davies [EMAIL PROTECTED]
To: David Wolfskill [EMAIL PROTECTED]
Cc: [EMAIL PROTECTED], [EMAIL PROTECTED]
Subject: newfs chokes, cores,  dies if inode density too high;
 patch attached
In-Reply-To: [EMAIL PROTECTED]

I don't have time to test this right now, but see also PR bin/30959.

Ceri
-- 
you can't see when light's so strong
you can't see when light is gone


Better late than never, this bug has been fixed.

From: Kirk McKusick [EMAIL PROTECTED]
Date: Sat, 30 Nov 2002 10:28:26 -0800 (PST)
To: [EMAIL PROTECTED], [EMAIL PROTECTED]
Subject: cvs commit: src/sbin/newfs mkfs.c newfs.c

mckusick    2002/11/30 10:28:26 PST

  Modified files:
sbin/newfs   mkfs.c newfs.c 
  Log:
  Add some more checks to newfs so that it will not build filesystems
  that the kernel will refuse to mount. Specifically it now enforces
  the MAXBSIZE blocksize limit. This update also fixes a problem where
  newfs could segment fault if the selected fragment size was too large.
  
  PR: bin/30959
  Submitted by:   Ceri Davies [EMAIL PROTECTED]
  Sponsored by:   DARPA & NAI Labs.
  
  Revision  Changes    Path
  1.66      +24 -14    src/sbin/newfs/mkfs.c
  1.66      +5 -1      src/sbin/newfs/newfs.c
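
As a rough illustration of the kind of checks the commit log describes, here is a sketch of block and fragment size validation. Only the existence of a MAXBSIZE limit comes from the log; the constant values and the remaining checks are assumptions, and `check_fs_sizes` is a hypothetical name, not a function in newfs.

```c
#define SKETCH_MINBSIZE	4096	/* assumed minimum block size */
#define SKETCH_MAXBSIZE	65536	/* assumed value of the kernel's MAXBSIZE */
#define SKETCH_MAXFRAG	8	/* assumed maximum fragments per block */

static int
is_power_of_two(long v)
{
	return (v > 0 && (v & (v - 1)) == 0);
}

/*
 * Return 0 if the block/fragment size pair looks mountable, or a
 * negative code naming the first check that failed.
 */
int
check_fs_sizes(long bsize, long fsize)
{
	if (!is_power_of_two(bsize) || !is_power_of_two(fsize))
		return (-1);		/* sizes must be powers of two */
	if (bsize < SKETCH_MINBSIZE || bsize > SKETCH_MAXBSIZE)
		return (-2);		/* block size out of range */
	if (fsize > bsize)
		return (-3);		/* fragment larger than block */
	if (bsize / fsize > SKETCH_MAXFRAG)
		return (-4);		/* too many fragments per block */
	return (0);
}
```

Rejecting bad sizes up front, rather than letting later arithmetic run on them, is the general idea behind refusing to build a filesystem the kernel would not mount.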




Re: UFS Snapshot deadlock

2002-11-30 Thread Kirk McKusick
Your deadlock should now be fixed.

Kirk McKusick

=-=-=-=-=

From: Kirk McKusick [EMAIL PROTECTED]
Date: Fri, 29 Nov 2002 23:27:12 -0800 (PST)
To: [EMAIL PROTECTED], [EMAIL PROTECTED]
Subject: cvs commit: src/sys/ufs/ffs ffs_snapshot.c
X-FreeBSD-CVS-Branch: HEAD

mckusick    2002/11/29 23:27:12 PST

  Modified files:
sys/ufs/ffs  ffs_snapshot.c 
  Log:
  Fix two deadlocks in snapshots:
  
  1) Release the snapshot file lock while suspending the system. Otherwise
 a process trying to read the lock may block on its containing directory
 preventing the suspension from completing. Thanks to Sean Kelly
 [EMAIL PROTECTED] for finding this deadlock.
  
  2) Replace some bdwrite's with bawrite's so as not to fill all the
 buffers with dirty data. The buffers could not be cleaned as the
 snapshot vnode was locked hence the system could deadlock when
 making snapshots of really massive filesystems. Thanks to
 Hidetoshi Shimokawa [EMAIL PROTECTED] for figuring
 this out.
  
  Sponsored by:   DARPA & NAI Labs.
  
  Revision  Changes    Path
  1.51      +7 -2      src/sys/ufs/ffs/ffs_snapshot.c

=-=-=-=-=-=

Date: Wed, 30 Oct 2002 03:57:52 -0600
From: Sean Kelly [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Subject: UFS Snapshot deadlock

While playing with UFS snapshots on a UFS2 filesystem I mounted
specifically for this purpose, I encountered a little problem. It seems I
have processes deadlocked on each other.

Steps to repeat:
/# mount /dev/ad2a /mnt ; cd /mnt
/dev/ad2a on /mnt (ufs, local, soft-updates, multilabel) # UFS2
/mnt# cd /mnt; mount -u -o snapshot /mnt/snapshot /mnt

*switch vtys*

/# cd /mnt; ls -l
*ls deadlocks*
*I get bored and ^C the mount on the other vty about 30 minutes later*
/mnt# ls 
*this ls deadlocks too*

For the record, /mnt was a new filesystem. It had *nothing* in it. No
directories or anything.

So now, I've got these:
  UID   PID  PPID CPU PRI NI   VSZ  RSS MWCHAN STAT  TT       TIME COMMAND
    0  1133   669   0  -4  0   692  548 ufs    D+    v1    0:00.00 ls
 1001   939   856   0  -4  0   696  560 ufs    D+    v2    0:00.00 ls -l
    0   937     1   0  -4  0   560  336 ufs    D     v1    0:00.65 mount -u -o snapshot /mnt/snapshot /mnt


Now for some numbers.

db> trace 937
mi_switch(c71aab60,50,c03375c6,c7,c03ad2f8) at mi_switch+0x158
msleep(c75098dc,c03a9358,50,c034f732,0) at msleep+0x3b4
acquire(c75098dc,140,600,e6,3a9) at acquire+0xa7
lockmgr(c75098dc,1010002,c7509818,c71aab60,e5b076a8) at lockmgr+0x2f7
vop_stdlock(e5b076c4,e5b076e0,c021e306,e5b076c4,0) at vop_stdlock+0x2c
ufs_vnoperate(e5b076c4,0,c033dd28,e5b076e0,c01ba4a5) at ufs_vnoperate+0x18
vn_lock(c7509818,10002,c71aab60,815,c7509818) at vn_lock+0xd6
vget(c7509818,2,c71aab60,470,0) at vget+0xd6
ffs_sync(c74c5400,1,c726a780,c71aab60,c74f1000) at ffs_sync+0x126
vfs_write_suspend(c74c5400,c74ffcb8,d351f08c,1,c2c06e80) at vfs_write_suspend+0x70
ffs_snapshot(c74c5400,bfbffd1d,70,c033990d,252) at ffs_snapshot+0xa48
ffs_mount(c74c5400,c745ce80,bfbff000,e5b07bf0,c71aab60) at ffs_mount+0x548
vfs_mount(c71aab60,c6d2b780,c745ce80,101,bfbff000) at vfs_mount+0x85e
mount(c71aab60,e5b07d14,c03590ba,409,4) at mount+0xb8
syscall(2f,2f,2f,bfbfeffc,bfbff9f4) at syscall+0x22e
Xint0x80_syscall() at Xint0x80_syscall+0x1d

db> trace 939
mi_switch(c74260d0,50,c03375c6,c7,1cc) at mi_switch+0x158
msleep(c74ffd7c,c03a9688,50,c034f732,0) at msleep+0x3b4
acquire(c74ffd7c,140,600,e6,3ab) at acquire+0xa7
lockmgr(c74ffd7c,1010002,c74ffcb8,c74260d0,e5bfd83c) at lockmgr+0x2f7
vop_stdlock(e5bfd858,e5bfd874,c021e306,e5bfd858,246) at vop_stdlock+0x2c
ufs_vnoperate(e5bfd858,246,0,c74f1000,0) at ufs_vnoperate+0x18
vn_lock(c74ffcb8,10002,c74260d0,7f,3) at vn_lock+0xd6
vget(c74ffcb8,10002,c74260d0,7f,c74260d0) at vget+0xd6
ufs_ihashget(c74cce00,3,2,e5bfd98c,e5bfd8f0) at ufs_ihashget+0xd2
ffs_vget(c74c5400,3,2,e5bfd98c,e5bfd994) at ffs_vget+0x44
ufs_lookup(e5bfdac0,e5bfdafc,c0207a24,e5bfdac0,e5bfdc3c) at ufs_lookup+0xdae
ufs_vnoperate(e5bfdac0,e5bfdc3c,e5bfdc50,3ab,c74260d0) at ufs_vnoperate+0x18
vfs_cache_lookup(e5bfdb70,e5bfdb9c,c020bd39,e5bfdb70,c7509818) at vfs_cache_lookup+0x2e4
ufs_vnoperate(e5bfdb70,c7509818,e5bfdc50,e5bfdb5c,c74260d0) at ufs_vnoperate+0x18
lookup(e5bfdc28,0,c033d6ad,a4,c74260d0) at lookup+0x309
namei(e5bfdc28,c03ade38,c03ade10,c03b42a0,0) at namei+0x1e0
lstat(c74260d0,e5bfdd14,c03590ba,409,2) at lstat+0x52
syscall(2f,2f,2f,80d3200,80d1040) at syscall+0x22e
Xint0x80_syscall() at Xint0x80_syscall+0x1d
--- syscall (190, FreeBSD ELF32, lstat), eip = 0x805838b, esp = 0xbfbff3dc, ebp = 0xbfbff468 ---

db> trace 1133
mi_switch(c6d31680,50,c03375c6,c7,2) at mi_switch+0x158
msleep(c75098dc,c03a9358,50,c034f732,0) at msleep+0x3b4
acquire(c75098dc,140,600,e6,46d) at acquire+0xa7
lockmgr(c75098dc,1030002,c7509818,c6d31680,e3887ad0) at lockmgr+0x2f7
vop_stdlock(e3887aec,e3887b08,c021e306,e3887aec,0) at vop_stdlock+0x2c
ufs_vnoperate(e3887aec,0,c033e1ac,360,c01e3af0) at ufs_vnoperate+0x18

Re: corrupted UFS2 label after ffs_vfsops.c,v 1.198

2002-11-30 Thread Kirk McKusick
Date: Sat, 30 Nov 2002 00:44:10 +0100 (CET)
From: Michael Reifenberger [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Cc: FreeBSD-Current [EMAIL PROTECTED]
Subject: corrupted UFS2 label after ffs_vfsops.c,v 1.198

Hi,
after cvsupping a kernel with the mentioned version of ffs_vfsops.c
I tried to upgrade my kernel from a -current that was a few weeks old.
After that I'm no longer able to mount or fsck a UFS2 formatted
disk. My dmesg is attached.

Trying fsck_ffs /dev/da0s1a gives:
(nihil)(root) # fsck_ffs /dev/da0s1a
** /dev/da0s1a
Cannot find file system superblock

LOOK FOR ALTERNATE SUPERBLOCKS? [yn] y

Fließkommafehler
(floating point error in german)

Any possible alternate superblock given with -b gives
a fp-error also.

How to resolve this?

Bye!

Michael Reifenberger
^.*Plaut.*$, IT, R/3 Basis, GPS

Once you have upgraded your fsck to the current version, it will only
check converted UFS2 filesystems. To convert your UFS2 filesystem,
simply mount it with your new kernel. Once you have done that, you
will be able to unmount it and run the new fsck. Similarly, if you
have an older kernel (vintage last four months of -current) then it
will back-convert your UFS2 filesystems every time you run it and thus
you will have to forward convert before fsck will run on it again.

Kirk McKusick




Trashed Disk Labels

2002-11-29 Thread Kirk McKusick
If you have updated your kernel sources on or after Nov 27th,
and are running with ufs/ffs/ffs_vfsops.c version 1.197, this
message applies to you.

I have had a report of a disk label getting trashed after booting
up to a kernel with the new UFS2 superblock format. I have just
checked in an update to ufs/ffs/ffs_vfsops.c (version 1.198) that
explicitly checks to make sure that it will not trash your disk
label. I highly recommend that you update to this version, even if
you are only running with UFS1 filesystems. 

Kirk McKusick




Re: Trashed Disk Labels

2002-11-29 Thread Kirk McKusick
Date: Fri, 29 Nov 2002 14:53:06 -0500 (EST)
From: Wesley Morgan [EMAIL PROTECTED]
To: Kirk McKusick [EMAIL PROTECTED]
cc: [EMAIL PROTECTED]
Subject: Re: Trashed Disk Labels
X-ASK-Info: Confirmed by User

On Fri, 29 Nov 2002, Kirk McKusick wrote:

> I have had a report of a disk label getting trashed after booting
> up to a kernel with the new UFS2 superblock format. I have just
> checked in an update to ufs/ffs/ffs_vfsops.c (version 1.198) that
> explicitly checks to make sure that it will not trash your disk
> label. I highly recommend that you update to this version, even if
> you are only running with UFS1 filesystems.
>
>   Kirk McKusick

Great! Any tools available to extract my var/db/pkg dirs from this image
of my trashed UFS2 filesystem? :

What seems to work is to boot from CD-ROM, use

disklabel -r -w disk auto

to reinstall the default disklabel, then

disklabel -B disk

to put back the bootstrap. At that point your existing filesystems
should all come back. This of course assumes that you used the
original default partition sizes. If not, you will need to
figure them out and edit up an appropriate disk label.

Kirk McKusick




Re: Trashed Disk Labels

2002-11-29 Thread Kirk McKusick
Date: Sat, 30 Nov 2002 17:43:53 +1100 (EST)
From: Bruce Evans [EMAIL PROTECTED]
X-X-Sender: [EMAIL PROTECTED]
To: Kirk McKusick [EMAIL PROTECTED]
cc: [EMAIL PROTECTED]
Subject: Re: Trashed Disk Labels
In-Reply-To: [EMAIL PROTECTED]
X-ASK-Info: Whitelist match

On Fri, 29 Nov 2002, Kirk McKusick wrote:

> I have had a report of a disk label getting trashed after booting
> up to a kernel with the new UFS2 superblock format. I have just
> checked in an update to ufs/ffs/ffs_vfsops.c (version 1.198) that
> explicitly checks to make sure that it will not trash your disk
> label. I highly recommend that you update to this version, even if
> you are only running with UFS1 filesystems.

Labels should be write protected, but this seems to have been broken
by GEOM.

Bruce

Disk labels certainly used to be write protected. Not sure when that
stopped, but it certainly would have been useful in this recent context.

Kirk McKusick




Re: Update to UFS2 Superblock Format

2002-11-29 Thread Kirk McKusick
You will have to ask Poul-Henning Kamp, but I do not believe that
he has yet put together a bootstrap for the i386 platform that can 
boot from a UFS2 filesystem. As such, I believe that you are
required to have a UFS1 root on the i386 at this time. I have
copied Poul-Henning Kamp so that he can correct me if I am incorrect
on this point.

Kirk McKusick

=-=-=-=-=-=

Date: Fri, 29 Nov 2002 22:57:12 -0800
To: Kirk McKusick [EMAIL PROTECTED], [EMAIL PROTECTED]
From: Manfred Antar [EMAIL PROTECTED]
Subject: Re: Update to UFS2 Superblock Format
Cc: Robert Watson [EMAIL PROTECTED], [EMAIL PROTECTED]
In-Reply-To: [EMAIL PROTECTED]
X-ASK-Info: Confirmed by User

At 09:11 PM 11/24/2002 -0800, Kirk McKusick wrote:
On Tuesday Nov 26th I plan to make an update to the UFS2
superblock. It will not affect UFS1 filesystems so should
be generally transparent to most -current users. For those
using UFS2 filesystems, the new kernel will update the
superblock to the new format the first time that your UFS2
filesystem is mounted read-write. Once updated it will not
be able to be mounted by older kernels unless the `zapsb'
program (see below) is run to revert it to the old format.

The only really noticeable problem arises when you are booting
from a UFS2 root partition. Here, you must follow these
steps:

1) boot new kernel
2) mount -u /
3) install new bootstrap

Once the new kernel has converted the filesystem format for the
root partition, the old bootstrap will no longer recognize it, so
if you do not have a new bootstrap, you will no longer be able to
boot from it. Note that you cannot update to the new bootstrap
until the filesystem has been converted as the new bootstrap will
not recognize the old superblock format. Again, this change will
only affect you if you are using a UFS2 filesystem as your root
filesystem.

The changes that I plan to apply can be viewed at:

http://www.freebsd.org/~mckusick/UFS2_update.diffs

The program `zapsb.c' that reverts a UFS2 filesystem to its
previous state can be found at:

http://www.freebsd.org/~mckusick/zapsb.c

If this change is going to cause you undue hardship, please
send me mail ([EMAIL PROTECTED]).

Kirk McKusick


Kirk
With a kernel and system current as of Thurs night.
I did a dump of / , /var , /usr filesystems.
I did a disklabel -B da0s1
I did a make release and booted off the cdrom.
went into the fixit mode and did newfs -O2 /dev/da0s1a (root) /dev/da0s1e (/var) 
/dev/da0s1f (/usr)
I then did a restore of the file systems.
when i reboot somehow the bootstrap bypasses /boot/loader
Here is what I see on the screen

/boot.config -P
Invalid format

FreeBSD/i386/UFS1 BOOT
Default: 0:da(0,a)/kernel
boot:
WARNING: loader(8) metadata is missing!

I have a current kernel in the / directory so it boots that
and I get to the:
mountroot> and do
mountroot> ufs:da0s1a

I guess what I need to know is how to install the UFS2 bootblocks
Thanks
Manfred
==
||  [EMAIL PROTECTED]   ||
||  Ph. (415) 681-6235  ||
== 





Re: Update to UFS2 Superblock Format

2002-11-29 Thread Kirk McKusick
Date: Fri, 29 Nov 2002 23:16:51 -0800
To: Kirk McKusick [EMAIL PROTECTED]
From: Manfred Antar [EMAIL PROTECTED]
Subject: Re: Update to UFS2 Superblock Format 
Cc: [EMAIL PROTECTED], Robert Watson [EMAIL PROTECTED],
[EMAIL PROTECTED], Poul-Henning Kamp [EMAIL PROTECTED]
In-Reply-To: [EMAIL PROTECTED]
X-ASK-Info: Whitelist match

At 11:11 PM 11/29/2002 -0800, Kirk McKusick wrote:
You will have to ask Poul-Henning Kamp, but I do not believe that
he has yet put together a bootstrap for the i386 platform that can 
boot from a UFS2 filesystem. As such, I believe that you are
required to have a UFS1 root on the i386 at this time. I have
copied Poul-Henning Kamp so that he can correct me if I am incorrect
on this point.

Kirk McKusick

Ah
No wonder, I tried editing the /sys/boot/i386/boot2/Makefile
to enable UFS2 bootblock but then disklabel complained that
boot2 was too big. I will have to revert to UFS1
Thanks
Manfred

==
||  [EMAIL PROTECTED]   ||
||  Ph. (415) 681-6235  ||
== 

You have hit upon the exact problem. UFS2 has a much bigger area
reserved for the boot block, but the programs that set up disk labels
and boot blocks don't know about it yet so assume that they have to
cram into the much smaller UFS1 boot-block area.

Kirk McKusick




Re: UFS-2 partition destroyed by change

2002-11-28 Thread Kirk McKusick
I appreciate the offer to go through the whole upgrade
process again, but I don't think it is necessary. If there
were going to be many anguished folks that had to go through
it, I would have played out all the scenarios and made sure
they worked. The point of doing this change now was to fix
problems with UFS2 before most people had deployed it. From
here on out, I promise not to introduce major breakage :-)

Kirk McKusick




Re: UFS-2 partition destroyed by change

2002-11-27 Thread Kirk McKusick
If I understand correctly, you ran a new fsck on a UFS-2 that
had not yet been mounted by your new kernel. Thus you had a
version of fsck that expected a converted UFS-2 filesystem
that you had not yet converted (by mounting with the new kernel).
I would have expected it to fail with a bad superblock magic
number. Had you mounted it before running the new fsck, all would
have been well. I am at a loss to explain why fsck did not gag
and refuse to check it though.

Kirk McKusick
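
The "bad superblock magic number" failure expected in the reply above can be pictured with a small sketch. The two constants are assumed to match FS_UFS1_MAGIC and FS_UFS2_MAGIC in the FFS headers; the enum and function are hypothetical, and the sketch does not model the difference between the old and new UFS2 superblock formats, which the message does not spell out.

```c
#include <stdint.h>

/* Values assumed to match FS_UFS1_MAGIC and FS_UFS2_MAGIC in fs.h. */
#define SKETCH_UFS1_MAGIC	0x011954
#define SKETCH_UFS2_MAGIC	0x19540119

enum sb_kind { SB_UNKNOWN, SB_UFS1, SB_UFS2 };

/* Classify a superblock by its magic number; anything else is rejected. */
enum sb_kind
classify_superblock(int32_t magic)
{
	switch (magic) {
	case SKETCH_UFS1_MAGIC:
		return (SB_UFS1);
	case SKETCH_UFS2_MAGIC:
		return (SB_UFS2);
	default:
		return (SB_UNKNOWN);
	}
}
```

A checker built this way stops at SB_UNKNOWN instead of interpreting the rest of the block, which is the behaviour the reply says it would have expected from fsck.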

=-=-=-=-=

Date: Thu, 28 Nov 2002 06:01:36 + (GMT)
From: Daniel Flickinger [EMAIL PROTECTED]
To: FreeBSD-CURRENT [EMAIL PROTECTED]
Cc: Kirk McKusick [EMAIL PROTECTED]
Subject: UFS-2 partition destroyed by change
X-ASK-Info: Confirmed by User

I only had one UFS-2 partition, the backup root
partition on da1a. After McKusick's notice of change:

  Message-ID: [EMAIL PROTECTED]

for 26 Nov, I installed the kernel and world sliced at
1200 GMT 27 Nov.

As a matter of principle, I _always_ run fsck -y from
single user at reboot of a new world (which means every
day now) even though I have not had a crash --pardon me
for too many years of BSD, but habits stick

da1a was shredded; only lost+found:

  p1:da1a #535- ll lost+found/
  total 8
  0 br-xrw--wT  1 root  wheel    0,   0 Jan  1  1970 #00455
  0 br-xrw--wT  1 root  wheel    0,   0 Jan  1  1970 #00561
  0 br-xr-xr-x  1 root  wheel    0,   0 Jan  1  1970 #00813
  0 br-xr-x--t  1 root  wheel    0,   0 Jan  1  1970 #00865
  8 d-wSr-x--T  2 root  wheel      8192 Jan  1  1970 #01031

The directory is empty. No pipers for Last Post, but a
rather good sendoff of 80MB to bit heaven. No other
partition even whimpered and nothing really lost
since it was a duplicate of da0a.

I was about to convert the remaining 9 partitions to
UFS-2 when I read Kirk's notice and decided to wait. I'll
rebuild the da1a partition with UFS-2 (new and improved
version?) and see what happens tomorrow morning with the
1200 GMT 28 Nov slice.

If I do a 'disklabel -B da1' (I have a pair of
dangerously dedicated 9G 160 SCSIs), I presume that
/boot/mbr is now the correct new UFS-2 boot record? My
intention is to convert all partitions one-by-one,
except da0a, to UFS-2, and then 'boot -s' from da1 and
'dd' da0a since the disks are siamese twins.


  --
  Sanity is the Playground for the Unimaginative





Re: Update to UFS2 Superblock Format

2002-11-25 Thread Kirk McKusick
Date: Mon, 25 Nov 2002 01:08:30 -0800 (PST)
From: Julian Elischer [EMAIL PROTECTED]
To: Kirk McKusick [EMAIL PROTECTED]
cc: [EMAIL PROTECTED], Robert Watson [EMAIL PROTECTED],
[EMAIL PROTECTED]
Subject: Re: Update to UFS2 Superblock Format 
In-Reply-To: [EMAIL PROTECTED]
X-ASK-Info: Whitelist match

On Sun, 24 Nov 2002, Kirk McKusick wrote:

> Some of these fields could usefully be made unsigned others not
> (for example fs_pendingblocks and fs_pendinginodes). So just
> going through and making everything unsigned is not the right
> approach. I will make a pass through and consider changing some
> of these fields once the tree opens back up, but not at this
> point in time when we are trying to keep changes to a minimum
> and do not have time for extensive testing.
>
>    Kirk McKusick

I'm not in a hurry. It's just something that I thought should be
considered eventually.

BTW how can fs_pendingblocks and fs_pendinginodes be -ve?

In theory they should never go negative. But if an inconsistency
occurs (for example a crash and remount before background fsck
has run) the accounting can get out of whack and the numbers go
negative. We check for this happening and take corrective action.
If they were changed to unsigned, we would miss the negative
transition and instead suddenly think that we had a huge amount
of pending space to free. So this is an example where changing
them to unsigned would break existing code.
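
The point can be illustrated with a minimal sketch. This is not the
kernel code: pending_adjust_signed() and pending_adjust_unsigned() are
hypothetical names, and resetting the counter to zero is only an assumed
stand-in for whatever corrective action the kernel takes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper, not taken from the FFS sources: apply a delta to
 * a signed pending-block counter.  If the accounting has gone wrong and
 * the result is negative, the check catches it and resets the counter.
 * Returns 1 if a correction was needed, 0 otherwise.
 */
int
pending_adjust_signed(int64_t *pending, int64_t delta)
{
	assert(pending != NULL);
	*pending += delta;
	if (*pending < 0) {
		*pending = 0;		/* assumed corrective action */
		return (1);
	}
	return (0);
}

/*
 * The same update on an unsigned counter.  There is no negative value
 * to test for: subtracting more than is recorded wraps around to a huge
 * number, which looks like a large amount of pending space.
 */
uint64_t
pending_adjust_unsigned(uint64_t pending, uint64_t freed)
{
	return (pending - freed);
}
```

With a counter of 3 and 5 blocks released, the signed version reports
the inconsistency, while the unsigned version silently yields a value
close to 2^64.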

Kirk McKusick




Update to UFS2 Superblock Format

2002-11-24 Thread Kirk McKusick
On Tuesday Nov 26th I plan to make an update to the UFS2
superblock. It will not affect UFS1 filesystems so should
be generally transparent to most -current users. For those
using UFS2 filesystems, the new kernel will update the
superblock to the new format the first time that your UFS2
filesystem is mounted read-write. Once updated it will not
be able to be mounted by older kernels unless the `zapsb'
program (see below) is run to revert it to the old format.

The only really noticeable problem arises when you are booting
from a UFS2 root partition. Here, you must take the following
steps:

1) boot new kernel
2) mount -u /
3) install new bootstrap

Once the new kernel has converted the filesystem format for the
root partition, the old bootstrap will no longer recognize it, so
if you do not have a new bootstrap, you will no longer be able to
boot from it. Note that you cannot update to the new bootstrap
until the filesystem has been converted as the new bootstrap will
not recognize the old superblock format. Again, this change will
only affect you if you are using a UFS2 filesystem as your root
filesystem.

The changes that I plan to apply can be viewed at:

http://www.freebsd.org/~mckusick/UFS2_update.diffs

The program `zapsb.c' that reverts a UFS2 filesystem to its
previous state can be found at:

http://www.freebsd.org/~mckusick/zapsb.c

If this change is going to cause you undue hardship, please
send me mail ([EMAIL PROTECTED]).

Kirk McKusick




Re: Update to UFS2 Superblock Format

2002-11-24 Thread Kirk McKusick
Some of these fields could usefully be made unsigned others not
(for example fs_pendingblocks and fs_pendinginodes). So just
going through and making everything unsigned is not the right
approach. I will make a pass through and consider changing some
of these fields once the tree opens back up, but not at this
point in time when we are trying to keep changes to a minimum
and do not have time for extensive testing.

Kirk McKusick

=-=-=-=-=

Date: Sun, 24 Nov 2002 21:28:38 -0800 (PST)
From: Julian Elischer [EMAIL PROTECTED]
To: Kirk McKusick [EMAIL PROTECTED]
cc: [EMAIL PROTECTED], Robert Watson [EMAIL PROTECTED], [EMAIL PROTECTED]
Subject: Re: Update to UFS2 Superblock Format
In-Reply-To: [EMAIL PROTECTED]
X-ASK-Info: Whitelist match

I do have one question re: UFS2, not specifically about this change
however..

I notice that the fields of the disk structure are signed.
Wouldn't it make more sense at this early stage to declare them as
unsigned?
For example, take this snippet from struct fs:
int64_t  fs_size;               /* number of blocks in fs */
int64_t  fs_dsize;              /* number of data blocks in fs */
ufs2_daddr_t fs_csaddr;         /* blk addr of cyl grp summary area */
int64_t  fs_pendingblocks;      /* blocks in process of being freed */
int32_t  fs_pendinginodes;      /* inodes in process of being freed */
int32_t  fs_snapinum[FSMAXSNAP];/* list of snapshot inode numbers */
int32_t  fs_avgfilesize;        /* expected average file size */
int32_t  fs_avgfpdir;           /* expected # of files per directory */
int32_t  fs_save_cgsize;        /* save real cg size to use fs_bsize */
int32_t  fs_sparecon32[27];     /* reserved for future constants */
int32_t  fs_contigsumsize;      /* size of cluster summary array */
int32_t  fs_maxsymlinklen;      /* max length of an internal symlink */
int32_t  fs_old_inodefmt;       /* format of on-disk inodes */
u_int64_t fs_maxfilesize;       /* maximum representable file size */
int64_t  fs_qbmask;             /* ~fs_bmask for use with 64-bit size */
int64_t  fs_qfmask;             /* ~fs_fmask for use with 64-bit size */
int32_t  fs_state;              /* validate fs_clean field */
int32_t  fs_old_postblformat;   /* format of positional layout tables */
int32_t  fs_old_nrpos;          /* number of rotational positions */


How can any of these values be meaningfully -ve?

Making them signed just gives fsck a harder time to check the values.
(as we saw this week).


I have run a system with many of these made unsigned and it made
no difference to the system. It was binary compatible too,
i.e. it mounted existing filesystems with no problems.

julian







Re: Stupid UFS2 questions...

2002-10-18 Thread Kirk McKusick
Date: Fri, 18 Oct 2002 23:06:53 +0200 (CEST)
From: BOUWSMA Beery [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Subject: Stupid UFS2 questions...

[IPv6-only address above; strip the obvious for IPv4-only mail replies]

In trying to track down a panic I had while mounting a newly-created
UFS2 filesystem, I noted that the `newfs' k0de had changed somewhat
from -stable to -current.  Specifically, that which determines the
value of `sbsize' which I'm guessing should be no larger than 8192
else mounts cause panics.

Here are the relevant lines from the last time I built -stable (mkfs.c):
 547         sblock.fs_sbsize = fragroundup(&sblock, sizeof(struct fs));
 548         if (sblock.fs_sbsize > SBSIZE)
 549                 sblock.fs_sbsize = SBSIZE;

If I'm not mistaken, this will give an upper limit of effectively 8192
to fs_sbsize, which does not appear to be the case with -current:

As seen in the RCS file just CVSup'ed (sbin/newfs/mkfs.c,v):
 840                 errx(31, "calloc failed");
 841         sblock.fs_sbsize = fragroundup(&sblock, sizeof(struct fs));
 842         sblock.fs_minfree = minfree;
 843         sblock.fs_maxbpg = maxbpg;

There is no other reference to sbsize in the HEAD branch.

Now, as soon as I patched the build I did half a month ago as follows:
 386         if (fscs == NULL)
 387                 errx(31, "calloc failed");
 388         sblock.fs_sbsize = fragroundup(&sblock, sizeof(struct fs));
 389         /* XXX HACKHACKHACK */
 390         if (sblock.fs_sbsize > SBLOCKSIZE)
 391                 sblock.fs_sbsize = SBLOCKSIZE;
 392         sblock.fs_minfree = minfree;

that is, to match how -stable does this, I can create a filesystem with
fragment sizes larger than 8192 bytes (UFS2) which I can successfully
mount under -current, which, without this hack, would panic my machine.
`dumpfs' shows the value for sbsize no larger than 8192, while for the
problem filesystems it was larger than 8192, as large as the fragment size.

Thus the question:  Is this the Right Thing[tm] to do?

Your fix is exactly the right thing to do. I have put it into -current.
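
As a rough illustration of the round-then-clamp logic discussed above,
here is a self-contained sketch. round_to_frag() is a simplified
stand-in for fragroundup() that takes the fragment size directly, and
compute_sbsize() is a hypothetical wrapper; neither is the mkfs.c code.

```c
#include <assert.h>
#include <stdint.h>

#define SBLOCKSIZE 8192		/* upper limit on fs_sbsize, per the message */

/*
 * Simplified stand-in for fragroundup(): round size up to a multiple
 * of the fragment size.  fsize must be a power of two.
 */
int32_t
round_to_frag(int32_t size, int32_t fsize)
{
	assert(fsize > 0 && (fsize & (fsize - 1)) == 0);
	return ((size + fsize - 1) & ~(fsize - 1));
}

/* Round the structure size to a fragment boundary, then clamp it. */
int32_t
compute_sbsize(int32_t struct_fs_size, int32_t fsize)
{
	int32_t sbsize = round_to_frag(struct_fs_size, fsize);

	if (sbsize > SBLOCKSIZE)
		sbsize = SBLOCKSIZE;
	return (sbsize);
}
```

With an illustrative structure size of 1376 bytes, a 1024-byte fragment
gives 2048, while a 16384-byte fragment would give 16384 without the
clamp and 8192 with it.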

Second question:  I have a drive where I first tried to create an ill-
fated UFS2 filesystem, because of the above panic which I had not yet
researched, so I gave up and created a UFS1 filesystem thereupon, and
filled it up.

It *seems* that I can mount this disk under -current and probably access
the UFS1 files within, but what was really weird was the `df' output
from this disk.  Said disk is 100% full under -stable, but -current
claims it is 0% full.  Sorry I don't have the actual outputs from this
command, but is it possible that the presence of the UFS2 superblock
is confusing -current when there's a UFS1 superblock and filesystem
present, and if -current is looking first for a UFS2 superblock and
finding one, is it possible to tell `mount' that I really want a UFS1
filesystem mount, and any remnants of UFS2 should be ignored?  According
to ufs/ffs/fs.h, the UFS1 superblock is at 8k while UFS2 is 64k from the
front, so apparently the UFS2 superblock that I initially created still
remains and confuses `df' and perhaps other things that I haven't tried
yet, as it didn't get wiped when I created the UFS1 filesystem.

So it seems.  Which makes one wonder, if there are three superblocks
at three locations present, which to believe?  And how to nuke the
unwanted one(s)?


Insight appreciated.  Thanks.

barry bouwsma

In general you can move UFS1 filesystems back and forth between
-stable and -current. However, you must run an `fsck -f -p'
using the local version (e.g., the -stable fsck on a -stable
system, and the -current fsck on a -current system) before
using the UFS1 filesystem. The reason is that -stable and
-current record free block information in different parts of
the superblock (32-bit counters for -stable and 64-bit counters
for -current) and do not maintain the alternate counter locations.
The local fsck will recalculate and correct the counters that
the local system uses.

Unless your blocksize was bigger than 64K, you would have
overwritten the UFS2 superblock with UFS1 inode blocks when you
created the UFS1 filesystem. I had originally put code in to
stomp out all other possible superblock locations when creating
a filesystem in newfs, but got in trouble as I ended up stomping
on boot block information that UFS2 filesystems place where the
UFS1 superblock used to reside. So, I deleted that code. Hope
this helps.

Kirk McKusick




Re: -current panic in suser_cred()

2002-06-24 Thread Kirk McKusick

I have put a fix in /sys/ufs/ufs/ufs_inode.c (on freefall) which
should solve this panic.

Kirk McKusick

=-=-=-=

From: Wesley Morgan [EMAIL PROTECTED]
Date: Mon, 24 Jun 2002 18:04:07 -0400 (EDT)
Subject: -current panic in suser_cred()
To: [EMAIL PROTECTED]

At some point between 20 Jun and (by my best guess) 22 Jun
there has been a problem introduced somewhere... How much
more vague can you get? :)...  File creation works fine,
but attempting to rm causes a panic.  config and dmesg (of
a non-panicking kernel) are attached, panic message and gdb
stuff below... Hope it's enough info to get a fix in the works!

Fatal trap 12: page fault while in kernel mode
fault virtual address   = 0x4
fault code  = supervisor read, page not present
instruction pointer = 0x8:0xc019249c
stack pointer   = 0x10:0xdb467b4c
frame pointer   = 0x10:0xdb467b50
code segment= base 0x0, limit 0xf, type 0x1b
= DPL 0, pres 1, def32 1, gran 1
processor eflags= interrupt enabled, resume, IOPL = 0
current process = 145 (rm)
panic: from debugger

#0  0xc019614b in doadump ()
#1  0xc01965db in boot (howto=260)
#2  0xc01967fb in panic ()
#3  0xc0139452 in db_panic ()
#4  0xc01393d2 in db_command (last_cmdp=0xc02fd2e0, cmd_table=0xc02fd100,
aux_cmd_tablep=0xc02f4c7c, aux_cmd_tablep_end=0xc02f4c80)
#5  0xc01394e6 in db_command_loop ()
#6  0xc013c07a in db_trap (type=12, code=0) at ../../../ddb/db_trap.c:76
#7  0xc0298dfe in kdb_trap (type=12, code=0, regs=0xdb467b0c)
at ../../../i386/i386/db_interface.c:214
#8  0xc02a9153 in trap_fatal (frame=0xdb467b0c, eva=4)
#9  0xc02a8e62 in trap_pfault (frame=0xdb467b0c, usermode=0, eva=4)
#10 0xc02a885a in trap (frame=
  {tf_fs = -1013055464, tf_es = 196624, tf_ds = 16, tf_edi = -1, tf_esi
= -1012546560, tf_ebp = -616137904, tf_isp = -616137928, tf_ebx = 0, tf_edx
= 0, tf_ecx = -1012854016, tf_eax = 1, tf_trapno = 12, tf_err = 0, tf_eip =
-1072094052, tf_cs = 8, tf_eflags = 66050, tf_esp = -1012854016, tf_ss =  
-616137864})at ../../../i386/i386/trap.c:659

--- begin interesting stuff ---

#11 0xc019249c in suser_cred (cred=0x0, flag=0)
#12 0xc025dab5 in chkiq (ip=0xc3a5c400, change=4294967295, cred=0x0,
flags=0)
#13 0xc025b57f in ufs_inactive (ap=0xdb467be0)
at ../../../ufs/ufs/ufs_inode.c:132
#14 0xc0263a08 in ufs_vnoperate (ap=0xdb467be0)
#15 0xc01e01e5 in vput (vp=0xc3a59c00)
#16 0xc01e77c4 in unlink (td=0xc393c41c, uap=0xdb467d10)
#17 0xc02a948a in syscall (frame=
  {tf_fs = 47, tf_es = 47, tf_ds = 47, tf_edi = -1077936696, tf_esi = 0,
tf_ebp = -1077936776, tf_isp = -616137356, tf_ebx = -1077936553, tf_edx =
-1077936508, tf_ecx = 47, tf_eax = 10, tf_trapno = 12, tf_err = 2, tf_eip =
134524795, tf_cs = 31, tf_eflags = 582, tf_esp = -1077936916, tf_ss = 47})
#18 0xc029a57d in syscall_with_err_pushed () at {standard input}:128
#19 0x0804839a in ?? ()
#20 0x08048145 in ?? ()





Re: you should probably track current@ these days...

2002-06-22 Thread Kirk McKusick

Date: Sat, 22 Jun 2002 07:49:17 -0700 (PDT)
From: David Wolfskill [EMAIL PROTECTED]
To: [EMAIL PROTECTED], [EMAIL PROTECTED]
Subject: UFS2, superblocks, and UFS compatibility

I had noted phk's message about Kirk's commitatron being
readied for action; I was, however, slightly surprised by
some of the results of yesterday's (daily) build of -CURRENT.

I had expected (and a quick, ex post facto, look at Kirk's
commit message confirms) that the intent was to maintain
compatibility with UFS1.

What surprised me was that after building yesterday's
-CURRENT successfully (actually, without incident at all),
I poked around a bit, then re-booted to -STABLE.  (I'm
tracking both -STABLE and -CURRENT on the same physical
disk, using different slices.  Although I build -CURRENT
daily, and do some reality checks, I still do the bulk
of the work with the machines in question -- such as anything
that updates the CVS repository -- in -STABLE.  This probably
reflects a rather conservative nature on my part.)  The
surprise was that -STABLE's fsck was rather expressively
unhappy about the file systems that had been mounted while
running -CURRENT.

In particular, STABLE's fsck claimed that the superblock
for each such file system was corrupt.

I found (empirically) that running fsck (and allowing it
to recover the superblock from the backup superblock at 32)
for each of these file systems, then rebooting, made STABLE
much less unhappy.  :-}

The -CURRENT code appears to be able to use a UFS1 file
system OK -- I was able to boot back to -CURRENT again in
preparation for building today's -CURRENT -- but it appears
to me (and I haven't looked at the code to verify this)
that something in the superblock is getting updated in a
way that isn't completely compatible with UFS1, at least
if the file system is updated.

If this is intended, mention of it in UPDATING might be
useful.  If it's not, I'll be happy to help narrow down
where things go awry and test the results of (proposed)
patches.  (Whether they are patches to -CURRENT, -STABLE,
or both.)

Cheers,
david
-- 
David H. Wolfskill  [EMAIL PROTECTED]
Trying to support or use Microsoft products makes about as much sense
as painting the outside of a house with watercolors.

My hope was that you would be able to switch painlessly between
new and old systems. To make this work, I made a change to fsck
on April 7th:

RCS file: /usr/ncvs/src/sbin/fsck_ffs/setup.c,v:

revision 1.30
date: 2002/04/07 05:16:33;  author: mckusick;  state: Exp;  lines: +25 -61
When checking the alternate superblock, we used to copy any fields
that might have changed, then did a byte-by-byte comparison with
the alternate. If any unused fields got used, they had to be added
to the exception list. Such changes caused too many false alarms.
So, I have changed the comparison algorithm to compare a selected
set of fields that are not expected to change. This new algorithm
causes far fewer false hits and still does a good job of detecting
problems when they have really occurred. In particular, this change
should ease the transition to kernels supporting UFS2 which make
some significant changes to the superblock.

Sponsored by:   DARPA, NAI Labs
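
The difference between the two comparison strategies can be sketched as
follows. struct mini_sb is a hypothetical, heavily reduced superblock,
and the choice of "stable" fields is an assumption made for
illustration; the log message does not list the fields that setup.c
actually compares, and the old code also copied known-volatile fields
before its byte-by-byte comparison, which is omitted here.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical reduced superblock; the real struct fs is much larger. */
struct mini_sb {
	int32_t	sblkno;		/* layout field: not expected to change */
	int32_t	bsize;		/* layout field: not expected to change */
	int32_t	fsize;		/* layout field: not expected to change */
	int32_t	ncg;		/* layout field: not expected to change */
	int64_t	time;		/* changes on every update */
	int32_t	spare[4];	/* "unused" space that newer code may use */
};

/* Old style: any differing byte counts as a mismatch. */
int
sb_differs_bytewise(const struct mini_sb *sb, const struct mini_sb *alt)
{
	return (memcmp(sb, alt, sizeof(*sb)) != 0);
}

/* New style: compare only fields that should never change. */
int
sb_differs_selected(const struct mini_sb *sb, const struct mini_sb *alt)
{
	assert(sb != NULL && alt != NULL);
	return (sb->sblkno != alt->sblkno || sb->bsize != alt->bsize ||
	    sb->fsize != alt->fsize || sb->ncg != alt->ncg);
}
```

If a newer kernel starts using one of the spare words, the bytewise
check raises a false alarm against the alternate superblock, while the
selected-field check still passes and still catches a changed block
size.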

This was supposed to get MFC'ed back to 4.X, though I am not sure if
that ever happened. Because of the breakup of fsck into fsck and fsck_ffs
I am not sure how one goes back and makes changes to what used to be
fsck/setup.c. Anyway, if your fsck_ffs is running with a copy of setup.c
that predates this change, then it will bitch about the superblock being
corrupted and recover by using the first alternate. You can avoid the
bitching by using `fsck -b16 ...' to override the integrity check.
Given that you have had the problem, I expect that others will as well,
so I will make a note in the UPDATING notes to suggest the use of
`fsck -b16 ...' when going back to using filesystems on 4.X systems.

Kirk McKusick




Re: UFS2 related message?

2002-06-22 Thread Kirk McKusick

The problem with running out of inodes is now fixed and checked in
on freefall. You need to pick up /sys/ufs/ffs/ffs_alloc.c revision
1.94 or later. For those that care, the log entry describing the
problem:


revision 1.94
date: 2002/06/22 21:24:58;  author: mckusick;  state: Exp;  lines: +2 -2
This patch fixes a problem whereby filesystems that ran
out of inodes in a cylinder group would fail to check for
free inodes in other cylinder groups. This bug was introduced
in the UFS2 code merge two days ago.

An inode is allocated by calling ffs_valloc which calls
ffs_hashalloc to do the filesystem scan. Ffs_hashalloc
walks around the cylinder groups calling its passed allocator
(ffs_nodealloccg in this case) until the allocator returns a
non-zero result. The bug is that ffs_hashalloc expects the
passed allocator function to return a 64-bit ufs2_daddr_t.
When allocating inodes, it calls ffs_nodealloccg which was
returning a 32-bit ino_t. The ffs_hashalloc code checked
a 64-bit return value and usually found random non-zero bits in
the high 32-bits so decided that the allocation had succeeded
(in this case in the only cylinder group that it checked).
When the result was passed back to ffs_valloc it looked at
only the bottom 32-bits, saw zero and declared the system
out of inodes. But ffs_hashalloc had really only checked
one cylinder group.

The fix is to change ffs_nodealloccg to return 64-bit results.
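
The width mismatch can be modelled in a few lines. Calling a function
through a pointer of the wrong type is undefined behaviour in C and
cannot be demonstrated portably, so this sketch instead passes in the
64 bits the caller would have read back: `raw' stands for a value whose
low 32 bits hold the real result and whose high 32 bits are leftovers.
Both function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* 64-bit test, as the ffs_hashalloc side is described as doing. */
int
hashalloc_thinks_success(uint64_t raw)
{
	return (raw != 0);
}

/* 32-bit view, as the ffs_valloc side is described as doing. */
int
valloc_thinks_success(uint64_t raw)
{
	uint32_t ino = (uint32_t)raw;	/* keep only the low 32 bits */

	return (ino != 0);
}
```

For a raw value such as 0xdeadbeef00000000 the first test says the
allocation succeeded, so the scan stops after one cylinder group, and
the second says no inode was returned, which matches the "out of
inodes" symptom.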

Sponsored by:   DARPA & NAI Labs.
Submitted by:   Poul-Henning Kamp [EMAIL PROTECTED]
Reviewed by:    Maxime Henrion [EMAIL PROTECTED]


Kirk McKusick

=-=-=-=-=-=

To: [EMAIL PROTECTED]
Subject: UFS2 related message?
Date: Sat, 22 Jun 2002 20:45:01 +0900
From: Munehiro Matsuda [EMAIL PROTECTED]

Hello all,

After the import of UFS2 patch into -current, I get the following messages.

pid 397 (perl), uid 123 inumber 682496 on /home: out of inodes
pid 397 (perl), uid 123 inumber 682496 on /home: out of inodes
pid 397 (perl), uid 123 inumber 682496 on /home: out of inodes

Is it related to UFS2 in any way?

FYI, here's what got with my disks.

% df -i
Filesystem  1K-blocks    Used   Avail Capacity  iused   ifree %iused  Mounted on
/dev/ad0s2a    254063   91341  142397    39%     2615   60871    4%   /
devfs               1       1       0   100%        0       0  100%   /dev
/dev/ad0s3e   7185161 4473874 2136475    68%   227116 1574354   13%   /home
/dev/ad0s2f   2787666 1668475  896178    65%   176288  522078   25%   /usr
/dev/ad0s2e    254063   10456  223282     4%     1653   61833    3%   /var
procfs              4       4       0   100%        1       0  100%   /proc
linprocfs           4       4       0   100%        1       0  100%   /usr/compat/linux/proc
/dev/ad0s1    3663652 2542176 1121476    69%        0       0  100%   /dos
%

Thanks in advance,
  Haro
=--
   _ _Munehiro (haro) Matsuda
 -|- /_\  |_|_|   Business Incubation Dept., Kubota Corp.
 /|\ |_|  |_|_|   1-3 Nihonbashi-Muromachi 3-Chome
  Chuo-ku Tokyo 103-8310, Japan
  Tel: +81-3-3245-3318  Fax: +81-3-3245-3315
  Email: [EMAIL PROTECTED]




Re: Unconnected files problem

2002-02-11 Thread Kirk McKusick

I have (finally) found and fixed this problem. You need to get
version 1.107 or later of /sys/ufs/ffs/ffs_softdep.c (2002/02/07).

Kirk McKusick

=-=-=-=-=-=

Date: Tue, 28 Aug 2001 14:02:24 +0200
From: Ollivier Robert [EMAIL PROTECTED]
To: FreeBSD Current Users' list [EMAIL PROTECTED]
Cc: [EMAIL PROTECTED], [EMAIL PROTECTED]
Subject: Unconnected files problem

I have a script that generates index for all my mail messages (using
glimpse). Sometimes, the disk is full because it has some rather big
temporary files (and I have a lot of mail).

It seems that we may have a softupdate-related (that's a guess from me)
problem because some of these temporary files end up unconnected to any
directory, but the link count is still one and they still take space. The last
time fsck ran on the filesystem, it gave me back more than 6 (!!)
fragments (cf. the following):

-=-=-
Aug 23 12:21:38 caerdonn root: /dev/da0s1g: Reclaimed: 0 directories, 22 files, 60424 fragments
Aug 23 12:21:38 caerdonn root: /dev/da0s1g: 10295 files, 387087 used, 73408 free (1048 frags, 9045 blocks, 0.2% fragmentation)
-=-=-

lsof doesn't show them so they're not open by any process.

The mtime of the files are exactly when the glimpseindex command is run. We
know that SU has some issues when a filesystem is full but this is quite a
problem because as you can see below, I'm losing a lot of space till the
next reboot...

UNREF FILE I=1081  OWNER=roberto MODE=100600
SIZE=523 MTIME=Aug 28 00:46 2001 
CLEAR? no

UNREF FILE  I=18498  OWNER=roberto MODE=100600
SIZE=230665 MTIME=Aug 26 08:05 2001 
RECONNECT? no

CLEAR? no

UNREF FILE  I=18508  OWNER=roberto MODE=100600
SIZE=11225707 MTIME=Aug 23 20:02 2001 
RECONNECT? no

CLEAR? no

UNREF FILE  I=18530  OWNER=roberto MODE=100600
SIZE=28322748 MTIME=Aug 24 20:09 2001 
RECONNECT? no

CLEAR? no

UNREF FILE  I=18573  OWNER=roberto MODE=100600
SIZE=28326193 MTIME=Aug 25 20:09 2001 
RECONNECT? no

CLEAR? no

UNREF FILE  I=18575  OWNER=roberto MODE=100600
SIZE=18684173 MTIME=Aug 24 20:08 2001 
RECONNECT? no

CLEAR? no

UNREF FILE  I=19204  OWNER=roberto MODE=100600
SIZE=13771800 MTIME=Aug 26 08:05 2001 
RECONNECT? no

CLEAR? no

UNREF FILE  I=19353  OWNER=roberto MODE=100600
SIZE=18679309 MTIME=Aug 25 20:08 2001 
RECONNECT? no

CLEAR? no

** Phase 5 - Check Cyl groups
10223 files, 446324 used, 74595 free (1019 frags, 9197 blocks, 0.2% fragmentation)

fsdb (inum: 2) inode 19353
current inode: regular file
I=19353 MODE=100600 SIZE=18679309
MTIME=Aug 25 20:08:18 2001 [0 nsec]
CTIME=Aug 25 20:08:18 2001 [0 nsec]
ATIME=Aug 25 20:08:11 2001 [0 nsec]
OWNER=roberto GRP=staff LINKCNT=1 FLAGS=0 BLKCNT=8ec0 GEN=4c2a6c10

-- 
Ollivier ROBERT  -=-  Eurocontrol EEC/ITM  -=-  [EMAIL PROTECTED]
FreeBSD caerdonn.eurocontrol.fr 5.0-CURRENT #46: Wed Jan  3 15:52:00 CET 2001




Re: Unconnected files problem

2002-02-10 Thread Kirk McKusick

I have (finally) found and fixed this problem. You need to get
version 1.107 or later of /sys/ufs/ffs/ffs_softdep.c (2002/02/07).

Kirk McKusick

=-=-=-=-=-=

Date: Tue, 28 Aug 2001 14:02:24 +0200
From: Ollivier Robert [EMAIL PROTECTED]
To: FreeBSD Current Users' list [EMAIL PROTECTED]
Cc: [EMAIL PROTECTED], [EMAIL PROTECTED]
Subject: Unconnected files problem

I have a script that generates index for all my mail messages (using
glimpse). Sometimes, the disk is full because it has some rather big
temporary files (and I have a lot of mail).

It seems that we may have a softupdate-related (that's a guess from me)
problem because some of these temporary files end up unconnected to any
directory, but the link count is still one and they still take space. The last
time fsck ran on the filesystem, it gave me back more than 6 (!!)
fragments (cf. the following):

-=-=-
Aug 23 12:21:38 caerdonn root: /dev/da0s1g: Reclaimed: 0 directories, 22 files, 60424 fragments
Aug 23 12:21:38 caerdonn root: /dev/da0s1g: 10295 files, 387087 used, 73408 free (1048 frags, 9045 blocks, 0.2% fragmentation)
-=-=-

lsof doesn't show them so they're not open by any process.

The mtime of the files are exactly when the glimpseindex command is run. We
know that SU has some issues when a filesystem is full but this is quite a
problem because as you can see below, I'm losing a lot of space till the
next reboot...

UNREF FILE I=1081  OWNER=roberto MODE=100600
SIZE=523 MTIME=Aug 28 00:46 2001 
CLEAR? no

UNREF FILE  I=18498  OWNER=roberto MODE=100600
SIZE=230665 MTIME=Aug 26 08:05 2001 
RECONNECT? no

CLEAR? no

UNREF FILE  I=18508  OWNER=roberto MODE=100600
SIZE=11225707 MTIME=Aug 23 20:02 2001 
RECONNECT? no

CLEAR? no

UNREF FILE  I=18530  OWNER=roberto MODE=100600
SIZE=28322748 MTIME=Aug 24 20:09 2001 
RECONNECT? no

CLEAR? no

UNREF FILE  I=18573  OWNER=roberto MODE=100600
SIZE=28326193 MTIME=Aug 25 20:09 2001 
RECONNECT? no

CLEAR? no

UNREF FILE  I=18575  OWNER=roberto MODE=100600
SIZE=18684173 MTIME=Aug 24 20:08 2001 
RECONNECT? no

CLEAR? no

UNREF FILE  I=19204  OWNER=roberto MODE=100600
SIZE=13771800 MTIME=Aug 26 08:05 2001 
RECONNECT? no

CLEAR? no

UNREF FILE  I=19353  OWNER=roberto MODE=100600
SIZE=18679309 MTIME=Aug 25 20:08 2001 
RECONNECT? no

CLEAR? no

** Phase 5 - Check Cyl groups
10223 files, 446324 used, 74595 free (1019 frags, 9197 blocks, 0.2% fragmentation)

fsdb (inum: 2) inode 19353
current inode: regular file
I=19353 MODE=100600 SIZE=18679309
MTIME=Aug 25 20:08:18 2001 [0 nsec]
CTIME=Aug 25 20:08:18 2001 [0 nsec]
ATIME=Aug 25 20:08:11 2001 [0 nsec]
OWNER=roberto GRP=staff LINKCNT=1 FLAGS=0 BLKCNT=8ec0 GEN=4c2a6c10

-- 
Ollivier ROBERT  -=-  Eurocontrol EEC/ITM  -=-  [EMAIL PROTECTED]
FreeBSD caerdonn.eurocontrol.fr 5.0-CURRENT #46: Wed Jan  3 15:52:00 CET 2001




Re: fsck setting d_ino == 0 (was Re: filesystem errors)

2001-08-24 Thread Kirk McKusick

To: Kirk McKusick [EMAIL PROTECTED]
cc: [EMAIL PROTECTED], [EMAIL PROTECTED],
   Ollivier Robert [EMAIL PROTECTED],
   Mikhail Teterin [EMAIL PROTECTED]
Subject: fsck setting d_ino == 0 (was Re: filesystem errors) 
In-Reply-To: Your message of Sat, 28 Jul 2001 12:48:54 PDT.
 [EMAIL PROTECTED] 
Date: Wed, 22 Aug 2001 01:21:03 +0100
From: Ian Dowse [EMAIL PROTECTED]

In message [EMAIL PROTECTED],
Kirk McKusick writes:
FFS will never set a directory ino == 0 at a location other
than the first entry in a directory, but fsck will do so to
get rid of an unwanted entry. The readdir routines know to
skip over an ino == 0 entry no matter where in the directory
it is found, so applications will never see such entries.
It would be a fair amount of work to change fsck to `do the
right thing', as the checking code is given only the current
entry with which to work. I am of the opinion that you
should simply accept that mid-directory block ino == 0 is
acceptable rather than trying to `fix' the problem.
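
The skipping behaviour described above can be sketched with a small
directory-block walker. struct mini_dirent is a simplified stand-in for
struct direct (the real structure also carries a type byte and the
name), and count_live_entries() is a hypothetical helper, not the
actual readdir code.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Simplified entry header; the name bytes would follow it on disk. */
struct mini_dirent {
	uint32_t d_ino;		/* 0 means "no entry here" */
	uint16_t d_reclen;	/* bytes from this header to the next */
	uint16_t d_namlen;	/* length of the name that follows */
};

/*
 * Count the live entries in a directory block.  Every record is
 * stepped over using d_reclen, and records with d_ino == 0 are skipped
 * wherever they appear.  Returns -1 if a record length is malformed.
 */
int
count_live_entries(const unsigned char *blk, size_t blksize)
{
	struct mini_dirent de;
	size_t loc = 0;
	int live = 0;

	assert(blk != NULL);
	while (loc + sizeof(de) <= blksize) {
		memcpy(&de, blk + loc, sizeof(de));
		if (de.d_reclen < sizeof(de) || de.d_reclen > blksize - loc)
			return (-1);
		if (de.d_ino != 0)
			live++;
		loc += de.d_reclen;
	}
	return (live);
}
```

A block holding a live entry, a zeroed-out entry in the middle, and
another live entry is reported as two entries; the caller never sees
the one that fsck cleared.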

Bleh, well I guess not too surprisingly, there is a case in
ufs_direnter() (ufs_lookup.c) where the kernel does the wrong thing
when a mid-block entry has d_ino == 0. The result can be serious
directory corruption, and the bug has been there since the Lite/2
merge:

# fetch http://www.maths.tcd.ie/~iedowse/FreeBSD/dirbug_img.gz
Receiving dirbug_img.gz (6745 bytes): 100%
6745 bytes transferred in 0.0 seconds (4.69 MBps)
# gunzip dirbug_img.gz
# mdconfig -a -t vnode -f dirbug_img
md0
# fsck_ffs /dev/md0
** /dev/md0
** Last Mounted on /mnt
** Phase 1 - Check Blocks and Sizes
** Phase 2 - Check Pathnames
** Phase 3 - Check Connectivity
** Phase 4 - Check Reference Counts
** Phase 5 - Check Cyl groups
20 files, 1 used, 2638 free (14 frags, 328 blocks, 0.5% fragmentation)
# mount /dev/md0 /mnt
# touch /mnt/ff12
# umount /mnt
# fsck_ffs /dev/md0
** /dev/md0
** Last Mounted on /mnt
** Phase 1 - Check Blocks and Sizes
** Phase 2 - Check Pathnames
DIRECTORY CORRUPTED  I=2  OWNER=root MODE=40755
SIZE=512 MTIME=Aug 21 22:28 2001 
DIR=/

SALVAGE? [yn]

The bug is that when compressing directory blocks, the code trusts
the DIRSIZ() macro to calculate the amount of data to be bcopy'd
when moving a directory entry. If d_ino is zero, DIRSIZ() cannot
be trusted, so random bytes in unused portions of the directory
determine how much gets copied. I think it is very unlikely in
practice for the value returned by DIRSIZ() to be harmful, but fsck
certainly doesn't check it so this bug can be triggered after other
types of corruption have been repaired by fsck.
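
As a rough userland illustration of the problem described above (not the kernel code itself), the sketch below computes the size of a directory entry from its name length and shows why that figure is meaningless once d_ino is zero. The struct is a simplified stand-in for the new-format on-disk entry header, and dirsiz(), live_size() and space_free() are hypothetical helper names chosen for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for the fixed header of an on-disk FFS entry. */
struct dirent_hdr {
	uint32_t d_ino;     /* inode number; 0 marks an unused entry */
	uint16_t d_reclen;  /* total length of this record */
	uint8_t  d_type;    /* file type */
	uint8_t  d_namlen;  /* length of the name that follows */
};

/*
 * Size a live entry needs: the 8-byte header plus the NUL-terminated
 * name, rounded up to a multiple of 4.  The result depends on
 * d_namlen, which is only meaningful when d_ino != 0.
 */
static size_t
dirsiz(const struct dirent_hdr *ep)
{
	return (sizeof(*ep) + (((size_t)ep->d_namlen + 1 + 3) & ~(size_t)3));
}

/* Bytes that must be kept when compacting: none for an unused entry. */
static size_t
live_size(const struct dirent_hdr *ep)
{
	return (ep->d_ino != 0 ? dirsiz(ep) : 0);
}

/* Space that can be reclaimed from this record. */
static size_t
space_free(const struct dirent_hdr *ep)
{
	size_t dsize = live_size(ep);

	assert(dsize <= ep->d_reclen);
	return (ep->d_reclen - dsize);
}
```

With a stale d_namlen of 200 in a 16-byte unused record, dirsiz() alone would report 212 bytes, far more than the record holds, while live_size() reports 0 and the whole record counts as free space.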

I just found this while looking for a dirhash bug - the dirhash
code didn't check for d_ino == 0 when compressing directories,
so it would freak when it couldn't find the entry to move. The
patch below should fix both these issues, and it makes it clearer
that DIRSIZ() is not used when d_ino == 0.

Any comments welcome. The patch is a bit larger than it needs to
be, but that directory compression code is so hard to understand
that I think it is worth clarifying it slightly :-)

Ian

The compaction code started out deeply nested and highly
conditional. I was very happy to get it down to one for loop
with single nested conditionals. That being said, it is still
pretty hard to follow. Anyway, I agree with your change. It is
amazing to me that that bug has been present since the day the
code was written (1983) and has not been noticed until now.

Kirk


Index: ufs_lookup.c
===
RCS file: /FreeBSD/FreeBSD-CVS/src/sys/ufs/ufs/ufs_lookup.c,v
retrieving revision 1.52
diff -u -r1.52 ufs_lookup.c
--- ufs_lookup.c	2001/08/18 03:08:48	1.52
+++ ufs_lookup.c	2001/08/21 23:59:09
@@ -869,26 +869,38 @@
 	 * dp->i_offset + dp->i_count would yield the space.
 	 */
 	ep = (struct direct *)dirbuf;
-	dsize = DIRSIZ(OFSFMT(dvp), ep);
+	dsize = ep->d_ino ? DIRSIZ(OFSFMT(dvp), ep) : 0;
 	spacefree = ep->d_reclen - dsize;
 	for (loc = ep->d_reclen; loc < dp->i_count; ) {
 		nep = (struct direct *)(dirbuf + loc);
-		if (ep->d_ino) {
-			/* trim the existing

Re: filesystem errors

2001-07-28 Thread Kirk McKusick

To: Michael Harnois [EMAIL PROTECTED]
cc: [EMAIL PROTECTED], [EMAIL PROTECTED]
Subject: Re: filesystem errors 
In-Reply-To: Your message of Wed, 25 Jul 2001 23:14:16 CDT.
 [EMAIL PROTECTED] 
Date: Thu, 26 Jul 2001 15:14:09 +0100
From: Ian Dowse [EMAIL PROTECTED]

In message [EMAIL PROTECTED],
Michael Harnois writes:

I'm tearing my hair out trying to find a filesystem error that's
causing me a panic: ufsdirhash_checkblock: bad dir inode.

When I run fsck from a single user boot, it finds no errors.

When I run it on the same filesystem mounted, it finds errors: but, of
course, it then can't correct them

[Kirk, I'm cc'ing you because here the dirhash code sanity checks
found a directory entry with d_ino == 0 that was not at the start
of a DIRBLKSIZ block. This doesn't happen normally, but it seems
from this report that fsck does not correct this. Is it a basic
filesystem assumption that d_ino == 0 can only happen at the start
of a directory block, or is it something the code should tolerate?]

FFS will never set a directory ino == 0 at a location other
than the first entry in a directory, but fsck will do so to
get rid of an unwanted entry. The readdir routines know to
skip over an ino == 0 entry no matter where in the directory
it is found, so applications will never see such entries.
It would be a fair amount of work to change fsck to `do the
right thing', as the checking code is given only the current
entry with which to work. I am of the opinion that you
should simply accept that mid-directory block ino == 0 is
acceptable rather than trying to `fix' the problem.
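
To make the "skip ino == 0 wherever it appears" behaviour concrete, here is a small userland sketch, under assumptions, of walking one 512-byte directory block. It counts the entries an application would see and, separately, the unused entries that are not at the start of the block. The struct layout is simplified, and walk_dirblock() and put_entry() are hypothetical names for this example, not kernel or libc functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DIRBLKSIZ 512

/* Simplified stand-in for the fixed header of an on-disk entry. */
struct dirent_hdr {
	uint32_t d_ino;
	uint16_t d_reclen;
	uint8_t  d_type;
	uint8_t  d_namlen;
};

struct walk_result {
	int visible;        /* entries an application would see */
	int midblock_zero;  /* d_ino == 0 entries not at offset 0 */
};

/* Walk one block; returns 0 on success, -1 on a malformed record. */
static int
walk_dirblock(const unsigned char *blk, struct walk_result *res)
{
	size_t off = 0;

	res->visible = 0;
	res->midblock_zero = 0;
	while (off < DIRBLKSIZ) {
		struct dirent_hdr h;

		if (DIRBLKSIZ - off < sizeof(h))
			return (-1);
		memcpy(&h, blk + off, sizeof(h));
		if (h.d_reclen < sizeof(h) || h.d_reclen % 4 != 0 ||
		    h.d_reclen > DIRBLKSIZ - off)
			return (-1);
		if (h.d_ino == 0) {
			/* Unused entry: skipped no matter where it sits. */
			if (off != 0)
				res->midblock_zero++;
		} else {
			res->visible++;
		}
		off += h.d_reclen;
	}
	return (0);
}

/* Test helper: write one record at off, return the next offset. */
static size_t
put_entry(unsigned char *blk, size_t off, uint32_t ino, uint16_t reclen,
    const char *name)
{
	struct dirent_hdr h = { ino, reclen, 0, (uint8_t)strlen(name) };

	assert(off + reclen <= DIRBLKSIZ);
	assert(sizeof(h) + strlen(name) + 1 <= reclen);
	memcpy(blk + off, &h, sizeof(h));
	memcpy(blk + off + sizeof(h), name, strlen(name) + 1);
	return (off + reclen);
}
```

A block holding ".", "..", one unused mid-block record and one live file yields three visible entries and one mid-block zero entry, which is the case the dirhash sanity check objected to.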

Interesting - this is an error reported by the UFS_DIRHASH code
that you enabled in your kernel config. A sanity check that the
dirhash code is performing is failing. These checks are designed
to catch bugs in the dirhash code, but in this case I think it may
be a bug that fsck is not finding this problem, or else my sanity
tests are too strict.

A workaround is to turn off the sanity checks with:

sysctl vfs.ufs.dirhash_docheck=0

or to remove UFS_DIRHASH from your kernel config. You could also
try to find the directory that is causing the problems. Copy the
following script to a file called dircheck.pl, and try running:

chmod 755 dircheck.pl
find / -fstype ufs -type d -print0 | xargs -0 ./dircheck.pl

That should show up any directories that would fail that dirhash
sanity check - there will probably just be one or two that resulted
from some old filesystem corruption.

Ian


#!/usr/local/bin/perl

while (defined($dir = shift)) {
	unless (open(DIR, $dir)) {
		print STDERR "$dir: $!\n";
		next;
	}

	$b = 0;
	my(%dir) = ();

	while (sysread(DIR, $dat, 512) == 512) {
		$off = 0;
		while (length($dat) > 0) {
			($dir{'d_ino'}, $dir{'d_reclen'}, $dir{'d_type'},
			    $dir{'d_namlen'}) = unpack("LSCC", $dat);
			$dir{'d_name'} = substr($dat, 8, $dir{'d_namlen'});
			$minreclen = (8 + $dir{'d_namlen'} + 1 + 3) & (~3);
			$gapinfo = ($dir{'d_reclen'} == $minreclen) ? "" :
			    sprintf("[%d]", $dir{'d_reclen'} - $minreclen);

			if ($dir{'d_ino'} == 0 && $off != 0) {
				printf("%s off %d ino %d reclen 0x%x type 0%o"
				    . " namelen %d name '%s' %s\n",
				    $dir, $off, $dir{'d_ino'},
				    $dir{'d_reclen'}, $dir{'d_type'},
				    $dir{'d_namlen'}, $dir{'d_name'},
				    $gapinfo);
			}
			if ($dir{'d_reclen'} > length($dat)) {
				die "reclen too long!\n";
			}
			$dat = substr($dat, $dir{'d_reclen'});
			$off += $dir{'d_reclen'};
		}
		$b++;
	}
}

To Unsubscribe: send mail to [EMAIL PROTECTED]
with unsubscribe freebsd-current in the body of the message



Re: worklist_remove panic

2001-06-13 Thread Kirk McKusick

I have checked in revision 1.99 to ffs_softdep.c which 
builds on the change in revision 1.98 by [EMAIL PROTECTED]
The symptom being treated in 1.98 was to avoid freeing a
pagedep dependency if there was still a newdirblk dependency
referencing it. That change is correct and no longer prints
the warning message ``handle_written_filepage: active pagedep''
when it occurs. The other part of revision 1.98 was to panic
with ``deallocate_dependencies: active pagedep'' when a
newdirblk dependency was encountered during a file truncation.
This fix removes that panic and replaces it with code to find
and delete the newdirblk dependency so that the truncation can
succeed. This delta should clear up the recent problems that
folks have been having with soft updates.

Kirk McKusick

=-=-=-=-=

To: [EMAIL PROTECTED]
Cc: [EMAIL PROTECTED]
Subject: worklist_remove panic
From: Dag-Erling Smorgrav [EMAIL PROTECTED]
Date: 26 May 2001 21:25:32 +0200

No dump (dumps seem to have been broken for about a month now), but a
stacktrace from DDB:

kernel: type 12 trap, code=0
Stopped at  worklist_remove+0x1c:   cmpw    $0,0xa(%ecx)
db> trace
worklist_remove(deadc0de) at worklist_remove+0x1c
free_diradd(deadc0de) at free_diradd+0x26
free_newdirblk(c2e45cd0) at free_newdirblk+0x32
handle_written_inodeblock(c287b200,c6323480) at handle_written_inodeblock+0x2b2
softdep_disk_write_complete(c6323480) at softdep_disk_write_complete+0x6a
bufdone(c6323480,cf2c7f54,c014de93,c6323480,c258b280) at bufdone+0x101
bufdonebio(c6323480) at bufdonebio+0xe
ad_interrupt(c2c5f940,c2564300,cf2c7f7c,c01ba6e4,c258b280) at ad_interrupt+0x3ef
ata_intr(c258b280) at ata_intr+0xae
ithread_loop(c258b200,cf2c7fa8) at ithread_loop+0x424
fork_exit(c01ba2c0,c258b200,cf2c7fa8) at fork_exit+0xf4
fork_trampoline() at fork_trampoline+0x8
db> panic
panic: from debugger
Debugger("panic")
Stopped at  worklist_remove+0x1c:   cmpw    $0,0xa(%ecx)
db>
panic: from debugger
Uptime: 1d0h12m13s

dumping to dev ad0b, offset 131104
dump ata0: resetting devices .. panic: witness_restore: lock (sleep mutex) Giant not locked
Uptime: 1d0h12m13s
Dump already in progress, bailing...
Automatic reboot in 15 seconds - press a key on the console to abort


des@des ~% gdb -k
GNU gdb 4.18
Copyright 1998 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as i386-unknown-freebsd.
(kgdb) exec-file /boot/kernel/kernel
(kgdb) symbol-file /sys/compile/DES/kernel.debug
Reading symbols from /sys/compile/DES/kernel.debug...done.
(kgdb) l *(worklist_remove+0x1c)
0xc0261750 is in worklist_remove (../../ufs/ffs/ffs_softdep.c:432).
427	        struct worklist *item;
428	{
429	
430	        if (lk.lkt_held == -1)
431	                panic("worklist_remove: lock not held");
432	        if ((item->wk_state & ONWORKLIST) == 0) {
433	                FREE_LOCK(&lk);
434	                panic("worklist_remove: not on list");
435	        }
436	        item->wk_state &= ~ONWORKLIST;
(kgdb) l *(free_diradd+0x26)
0xc02640fa is in free_diradd (../../ufs/ffs/ffs_softdep.c:2601).
2596	#ifdef DEBUG
2597	        if (lk.lkt_held == -1)
2598	                panic("free_diradd: lock not held");
2599	#endif
2600	        WORKLIST_REMOVE(&dap->da_list);
2601	        LIST_REMOVE(dap, da_pdlist);
2602	        if ((dap->da_state & DIRCHG) == 0) {
2603	                pagedep = dap->da_pagedep;
2604	        } else {
2605	                dirrem = dap->da_previous;
(kgdb) l *(free_newdirblk+0x32)
0xc026345e is in free_newdirblk (../../ufs/ffs/ffs_softdep.c:2033).
2028	         */
2029	        pagedep = newdirblk->db_pagedep;
2030	        pagedep->pd_state &= ~NEWBLOCK;
2031	        if ((pagedep->pd_state & ONWORKLIST) == 0)
2032	                while ((dap = LIST_FIRST(&pagedep->pd_pendinghd)) != NULL)
2033	                        free_diradd(dap);
2034	        /*
2035	         * If no dependencies remain, the pagedep will be freed.
2036	         */
2037	        for (i = 0; i < DAHASHSZ; i++)

After this panic, fsck complained of bad superblocks on all file
systems.

By the way, fsck is intolerably slow these days: more than twenty
minutes for 'fsck -y' of a 5.5 GB filesystem (roughly 380,000 files)
on a recent and far from sluggish IBM IDE drive.  Most (nearly all) of
that time is spent in phase 2.

DES
-- 
Dag-Erling Smorgrav - [EMAIL PROTECTED]




Re: fsdb broken in -current

2001-04-23 Thread Kirk McKusick

To: Kirk McKusick [EMAIL PROTECTED]
cc: [EMAIL PROTECTED]
Subject: fsdb broken in -current
Date: Mon, 23 Apr 2001 22:23:48 +0100
From: Ian Dowse [EMAIL PROTECTED]

The last set of changes to fsck_ffs moved the initialisation of
dev_bsize to sblock_init(), but this is not called by fsdb(8) so
fsdb dies almost immediately with a floating exception. I'm just
going to commit the obvious fix, which is to have fsdb call
sblock_init() also.

Ian

Right you are. Sorry I missed that. It did not occur to me to
verify fsdb.

Kirk




Re: vm page panic

2001-03-25 Thread Kirk McKusick

Date: Sun, 25 Mar 2001 11:20:17 +0200
From: Jeroen Ruigrok/Asmodai [EMAIL PROTECTED]
To: Kirk McKusick [EMAIL PROTECTED],
   Peter Wemm [EMAIL PROTECTED],
   Paul Saab [EMAIL PROTECTED], Matt Dillon [EMAIL PROTECTED],
   Soeren Schmidt [EMAIL PROTECTED]
Cc: [EMAIL PROTECTED]
Subject: vm page panic

Hi guys,

ok, sources cvsupped yesterday afternoon, just before my ffs_alloc.c
commit [which I did, obviously, add myself locally].

Box had been running for a while when all of a sudden it got into a
panic:

vm_page_alloc: free/cache page 0xc0776fa4 was dirty

a trace in ddb shows:

allocbuf()
getblk()
ffs_balloc()
ffs_write()
vn_rdwr()
elf_coredump()
coredump()

Unfortunately my ata controller didn't get reprobed [just was hanging
there] so I couldn't get a crashdump. =( [HPT366]

So consider this a heads-up, since you might encounter this.

Extra info:

devfs running,
/ is normal FFS
/tmp, /var, /usr, /storage all soft-updated.

-- 
Jeroen Ruigrok van der Werven/Asmodai .oUo. asmodai@[wxs.nl|freebsd.org]
Documentation nutter/C-rated Coder BSD: Technical excellence at its best  
  D78D D0AD 244D 1D12 C9CA  7152 035C 1138 546A B867
Pleasure's a sin, and sometimes sin's a pleasure...

The latest round of changes to ffs_alloc.c adds code which is only ever
used by background fsck, which is not yet being used. So, it seems
very unlikely that your panic has been triggered by these changes.

Kirk




Re: Bad commit?

2000-11-15 Thread Kirk McKusick

Date: Wed, 15 Nov 2000 11:47:07 -0700
From: Warner Losh [EMAIL PROTECTED]
Subject: Bad commit?
Sender: [EMAIL PROTECTED]

As near as I can tell on my laptop, the following change causes panics
with kernel page faults.  With it, my laptop panics every time on
boot (although in slightly different places for my two different
kernels) and without it I'm rock solid.

Has anybody else seen this?

Warner

mckusick    2000/11/14 12:46:02 PST

  Modified files:
    sys/sys              rman.h
    sys/kern             subr_bus.c subr_rman.c
  Log:
  In preparation for deprecating CIRCLEQ macros in favor of TAILQ
  macros which provide the same functionality and are a bit more
  efficient, convert use of CIRCLEQ's in resource manager to TAILQ's.

  Approved by:  Garrett Wollman [EMAIL PROTECTED]

  Revision  Changes    Path
  1.13      +3 -3      src/sys/sys/rman.h
  1.83      +3 -5      src/sys/kern/subr_bus.c
  1.14      +30 -35    src/sys/kern/subr_rman.c

I have checked in revision 1.15 for subr_rman.c which should fix
the problems being experienced with version 1.14. If you continue
to experience problems with version 1.15, please let me know.
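
For readers unfamiliar with the TAILQ macros mentioned in the commit log, the following is a minimal sketch using <sys/queue.h>. The struct region type and the region_insert() and region_find() helpers are hypothetical and are not taken from subr_rman.c. One difference worth keeping in mind when converting: a TAILQ traversal ends at NULL, whereas a CIRCLEQ traversal ends when it comes back around to the list head. Nothing in this thread says what actually went wrong in revision 1.14.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/queue.h>

/* Hypothetical resource range kept on a sorted tail queue. */
struct region {
	unsigned long start;
	unsigned long end;
	TAILQ_ENTRY(region) link;
};
TAILQ_HEAD(region_list, region);

/* Insert r so that the list stays sorted by ascending start. */
static void
region_insert(struct region_list *head, struct region *r)
{
	struct region *cur;

	assert(r->start <= r->end);
	TAILQ_FOREACH(cur, head, link) {
		if (cur->start > r->start) {
			TAILQ_INSERT_BEFORE(cur, r, link);
			return;
		}
	}
	/* Reached the end (cur == NULL): r belongs at the tail. */
	TAILQ_INSERT_TAIL(head, r, link);
}

/* Return the region containing addr, or NULL if there is none. */
static struct region *
region_find(struct region_list *head, unsigned long addr)
{
	struct region *cur;

	TAILQ_FOREACH(cur, head, link) {
		if (addr >= cur->start && addr <= cur->end)
			return (cur);
	}
	return (NULL);
}
```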

Kirk McKusick





Re: Panic: bqrelse: multiple refs

2000-07-25 Thread Kirk McKusick

Date: Tue, 25 Jul 2000 11:47:03 -0400 (EDT)
From: Brian Fundakowski Feldman [EMAIL PROTECTED]
To: Ollivier Robert [EMAIL PROTECTED]
Cc: "FreeBSD Current Users' list" [EMAIL PROTECTED],
[EMAIL PROTECTED]
Subject: Panic: lockmgr: pid 5, not exclusive lock holder 0 unlocking
In-Reply-To: [EMAIL PROTECTED]

On Tue, 25 Jul 2000, Ollivier Robert wrote:

> According to Brian Fundakowski Feldman:
> > Actually, I'm pretty certain this is the fix:
>
> Well it won't panic but isn't it putting the problem under the carpet?
> I agree the panic seems to be here temporarily but...

No, I'm really certain this isn't the case.  You see, struct buf has
a b_lock that until recently was a plain, exclusive lockmgr lock.  In
Kirk's last round of changes, he converted b_lock to be LK_CANRECURSE,
which means that the lock, while still an exclusive lock, may be
relocked multiple times by the same caller.

The panics are plain wrong.  What's left is to determine what is the
proper thing to do in each of these cases, which I'm certain that many
people know already (you see, I'm still a bit green ;). What I
am _almost_ sure about is that the right thing is just to remove one
of the locks and let it get freed back up the call chain.  I'm almost
certain this is the case because if you are grabbing exclusive locks
and recursing upon them, your call chain is the only consumer and in
a recursive-locking-callchain, you will have multiple symmetric lock
and unlock pairs.  Anything else horribly complicates things, and this
makes me a good 95% certain that this is exactly the right fix, not
that it's sweeping any true bugs under the carpet.

Allowing recursive locks is pretty much the only way to solve many of
the problems here because it's simply not possible to support all code
paths without allowing for this recursion.  The code would either be
horribly complicated or non-functional.  I'm certain Kirk may be able
to back me up here.  It seems that the cleanup is meant to make the
locks recursive mostly to facilitate correct/proper call chains, and
that's consistent with my understanding at least :)

Indeed, if you look at the comment in brelse() from the delta, you
will see that the intention of allowing this very situation to occur
and simply BUF_UNLOCK() was planned for and the panic()s were for
debugging during the previous time that b_locks weren't LK_CANRECURSE.
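
The symmetric lock/unlock pairing argued for above can be modelled in a few lines. This is a single-threaded toy, assuming an integer owner id and no atomicity at all; it is not lockmgr, and the names rlock_init(), rlock_acquire() and rlock_release() are hypothetical. It only shows that a recursive exclusive lock is released by dropping one level at a time, so an inner unlock is legitimate rather than a reason to panic.

```c
#include <assert.h>

/* Toy recursive exclusive lock (no atomicity; illustration only). */
struct rlock {
	int owner;  /* id of the holder, -1 when free */
	int depth;  /* number of times the holder has acquired it */
};

static void
rlock_init(struct rlock *l)
{
	l->owner = -1;
	l->depth = 0;
}

/* Returns 1 if acquired (possibly recursively), 0 if held by another. */
static int
rlock_acquire(struct rlock *l, int who)
{
	if (l->depth > 0 && l->owner != who)
		return (0);
	l->owner = who;
	l->depth++;
	return (1);
}

/* Drops one level and returns the remaining depth; free only at 0. */
static int
rlock_release(struct rlock *l, int who)
{
	assert(l->depth > 0 && l->owner == who);
	if (--l->depth == 0)
		l->owner = -1;
	return (l->depth);
}
```

The owner keeps the lock after the inner release, and another caller can take it only once the outermost release brings the depth back to zero.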

As always, take what I say with a grain of salt since I'm definitely
not a VFS guru in any manner; I just happen to think I understand this
one :)

 -- 
 Ollivier ROBERT -=- Eurocontrol EEC/ITM -=- [EMAIL PROTECTED]
 The Postman hits! The Postman hits! You have new mail.

--
 Brian Fundakowski Feldman   \  FreeBSD: The Power to Serve!  /
 [EMAIL PROTECTED]`--'

The above explanation is correct. When I made the change to allow
recursive buffer locks, I should have removed that panic (but forgot
that I had put it in there, sigh). I have just made the change on
freefall. Sorry for the problems caused by that change.

    Kirk McKusick





Re: cvs commit: src/sys/contrib/softupdates softdep.h ffs_softdep.c

2000-06-24 Thread Kirk McKusick

Date: Thu, 22 Jun 2000 11:54:26 +0200
From: Adrian Chadd [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Cc: [EMAIL PROTECTED]
Subject: Re: cvs commit: src/sys/contrib/softupdates softdep.h
ffs_softdep.c

On Thu, Jun 22, 2000, Brad Knowles wrote:
> At 10:30 AM +0200 2000/6/22, Adrian Chadd wrote:
>
> > I like this. Would anyone object if this was brought over
> > from NetBSD ?
>
> If you're asking for a vote, you've got mine.

Hmm, Kirk has valid points for leaving a softupdates
filesystem identified by tunefs and not a mount option.

Kirk, do you still want to keep things that way ?

Adrian

Yes, I do want it kept as a tunefs option.


Date: Thu, 22 Jun 2000 13:31:29 +0200
From: Stefan Esser [EMAIL PROTECTED]
To: Adrian Chadd [EMAIL PROTECTED]
Cc: [EMAIL PROTECTED], [EMAIL PROTECTED],
Stefan Esser [EMAIL PROTECTED]
Subject: Re: cvs commit: src/sys/contrib/softupdates softdep.h
ffs_softdep.c

...

I do remember the discussion that led to the requirement
to enable soft-updates with tunefs -n.

But I do not remember, why the soft-updates state could
not be just set in the local copy of the super-block and
flushed to disk when the file system is marked dirty ?

Just before a clean file system is to be mounted R/W, it
is obviously safe to modify the soft-updates state.

The file system must have been cleaned before, or the R/W
mount will not be possible (extra logic can prevent the
modification of the MNT_SOFTDEP bit if a mount of a non-clean
partition is forced, in order to preserve the soft-updates
state for the next fsck run).

If the kernel was compiled without soft-updates, it may be
the right thing to keep MNT_SOFTDEP cleared, to not mislead
FSCK ...

Did I miss something obvious ?

Regards, STefan

Your above proposal would work, though that is not how NetBSD
implemented it. I feel that it is a lot of extra mechanism for very
little gain. Administrators generally make a one-time decision to
run soft updates on a filesystem. It is not the sort of thing that
they want to change on a regular basis. It is possible to run tunefs
on a filesystem that is mounted read-only, so it is no more difficult
to use tunefs than it is to make it a mount-time option (i.e., they
still have to down-grade to read-only, set the option, then upgrade).
Finally, I expect that soft updates will eventually just be defaulted
to `on' when a filesystem is built, and in a few rare instances an
administrator will want to turn it off. I do not want to have an
option that needs to be added to nearly every fstab entry to get
the default behavior. Plus it is just one more bit of trivia that
new system administrators need to learn to make their systems run
well. The more of those details that need not be learned because
they just do the right thing, the better.

Kirk McKusick





Re: Panic with userquota(softupdates?)

2000-06-18 Thread Kirk McKusick

From: Kevin Day [EMAIL PROTECTED]
Subject: Panic with userquota(softupdates?)
To: [EMAIL PROTECTED]
Date: Fri, 16 Jun 2000 18:55:01 -0500 (CDT)
Cc: [EMAIL PROTECTED]

I keep getting panics in dqget(ufs_quota.c), with a -current
from a couple of days ago. I think this might be softupdates
related, since I can't make it happen with softupdates
turned off, although it's quite possible that it has nothing
to do with it. Does anyone have any idea what might be
causing this?

Any other information that might be useful here?

-- Kevin

I have just committed a change to sys/contrib/softupdates/ffs_softdep.c
(delta 1.68) which corrects a panic in the kernel when quotas and soft
updates are used together. While the specific problem that I fixed
appears somewhat different than the one you are reporting, it may be
related. I suggest that you update to the above delta and see if it
solves your problem. If your problem persists, let me know. As always,
a specific set of inputs that triggers the problem is a great help
in tracking it down.

Kirk McKusick





Re: bioops

2000-06-16 Thread Kirk McKusick

To: [EMAIL PROTECTED]
Subject: HEADSUP: bioops patch.
From: Poul-Henning Kamp [EMAIL PROTECTED]
Date: Wed, 14 Jun 2000 22:29:32 +0200

This patch virtualizes & untangles the bioops operations vector.

Background:

The bioops operation vector is a list of OO-like operations which can
be performed on struct buf.  They are used by the softupdates code
to handle dependencies.

Ideally struct buf should have had a real OO like operations vector
like vnodes have it, and struct bioops is the first step towards that.

One of the reasons we should have OO-like struct buf, is that as
long as bwrite(bp) "knows" that the buffer is backed by a disk
device, we cannot use the UFS layer on top of a storage manager
which isn't based on disk-devices:  When UFS modifies a directory
inode, it will call bwrite(bp) on the buffer with the data.  This
would not work if the backing were based on malloc(9) or anonymous
swap-backed VM objects for instance.

In other words: this is the main reason why we don't have a decent
tmpfs in FreeBSD.

Instead of just assuming that it works on a disk, bwrite(bp) should
do a "bp->b_ops->bwrite(bp)" so that each buffer could have its own
private idea of how to write itself, depending on what backing it has.
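
A minimal sketch of that idea, with invented types: each buffer carries a pointer to an operations vector, and the generic write entry point simply dispatches through it. The struct buf shown here is a toy, not the kernel structure, and buf_write(), mem_bwrite(), mem_ops and struct mem_store are hypothetical names used to stand in for a memory-backed store of the kind a tmpfs would need.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct buf;

/* Per-buffer operations vector (toy version). */
struct buf_ops {
	int (*bwrite)(struct buf *bp);
};

/* Toy buffer: payload plus a pointer to whatever backs it. */
struct buf {
	const struct buf_ops *b_ops;
	const char *b_data;
	size_t b_len;
	void *b_backing;
};

/* One possible backing: plain memory instead of a disk device. */
struct mem_store {
	char bytes[64];
	size_t used;
};

static int
mem_bwrite(struct buf *bp)
{
	struct mem_store *ms = bp->b_backing;

	if (bp->b_len > sizeof(ms->bytes))
		return (-1);
	memcpy(ms->bytes, bp->b_data, bp->b_len);
	ms->used = bp->b_len;
	return (0);
}

static const struct buf_ops mem_ops = { mem_bwrite };

/* Generic entry point: knows nothing about the backing store. */
static int
buf_write(struct buf *bp)
{
	assert(bp->b_ops != NULL && bp->b_ops->bwrite != NULL);
	return (bp->b_ops->bwrite(bp));
}
```

A disk-backed buffer would install a different vector, and the caller of buf_write() would not need to change.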

So in order to move bioops closer to become a bp->b_ops, this patch
takes two entries out of bioops:  the "sync" and the "fsync" items
and virtualizes the rest of the elements a bit.

The "sync" item is called only from the syncer, and is a call to the
softupdates code to do what it needs to do for periodic syncing.

The real way of doing that would be to have an event-handler for this,
since other code could need to be part of the sync traffic; raid5
private parity caches could be one example.  I have not done this
yet, since currently softupdates is the only client.

The fsync item really doesn't belong in the fsync system call, it
belongs in ffs_fsync, and has been moved there.

If it had been possible to put the fsync call in ffs_fsync, I would
have done that. Unfortunately, it is not possible and will hang or
panic the kernel if you put it there. The problem is that ffs_fsync
syncs out the data blocks of the associated file. The bioops call
to soft updates requests that any names associated with the file
being sync'ed be sync'ed to disk as well. That is a necessary
semantic of the system call fsync. However, it is not needed by
most clients of VOP_FSYNC. Because the sync'ing of the name requires
a walk up the filesystem tree from the inode in question to the
root of the filesystem, the locking protocol requires that the
nodes lower in the tree be unlocked before locking nodes that are
higher.  This means that the vnode being fsync'ed must be briefly
unlocked while its containing parent is locked. If the vnode being
fsync'ed is a directory, this creates a window where another process
can sneak in and make changes, which leads to panics, two entries
with the same name, etc. This window is not a problem for the fsync
call because it is not creating a new name, but it is a problem if
VOP_FSYNC is called in the open, link, mkdir, etc. paths (as it will
be, for example, if a block allocation fails due to the filesystem
being full). Thus there are two choices: put the code back as it was
or change the VOP_FSYNC call interface to add a flags value that
indicates whether the name needs to be synced out as well as the
data. I chose the former as I did not want to disrupt a widely
used interface.

To give the right behaviour when SOFTUPDATES is not compiled in,
stubs for both of these functions have been added to ffs_softdep_stub.c

Finally all the checks to see if the bioops vector is populated
have been centralized in in-line functions in sys/buf.h thereby
paving the road for the global bioops to become bp->b_ops.

Comments, reviews, tests please

Poul-Henning

Index: contrib/softupdates/ffs_softdep.c
===
RCS file: /home/ncvs/src/sys/contrib/softupdates/ffs_softdep.c,v
retrieving revision 1.64
diff -u -r1.64 ffs_softdep.c
--- contrib/softupdates/ffs_softdep.c	2000/05/26 02:01:59	1.64
+++ contrib/softupdates/ffs_softdep.c	2000/06/14 19:26:46
@@ -222,8 +222,6 @@
 	softdep_disk_io_initiation,		/* io_start */
 	softdep_disk_write_complete,		/* io_complete */
 	softdep_deallocate_dependencies,	/* io_deallocate */
-	softdep_fsync,				/* io_fsync */
-	softdep_process_worklist,		/* io_sync */
 	softdep_move_dependencies,		/* io_movedeps */
 	softdep_count_dependencies,		/* io_countdeps */
 

Re: Mounting one FS on more than one system

1999-12-04 Thread Kirk McKusick

To: [EMAIL PROTECTED]
cc: [EMAIL PROTECTED], [EMAIL PROTECTED]
Subject: Mounting one FS on more than one system
In-reply-to: Your message of "Sat, 04 Dec 1999 10:10:20 MST."
 [EMAIL PROTECTED] 
Date: Sat, 04 Dec 1999 13:53:26 -0800
From: Mike Smith [EMAIL PROTECTED]

(moved to -current where there are more eyes that are interested)

> P.S. Mike, at comdex I spoke to you some about clustering two
> computers and one RAID array, remember? You mentioned that
> someone had pursued that avenue some, perhaps not to a working
> solution, but I don't remember who. Can you (or anyone else) point
> me to the guilty parties? We would like to pick up the work and run
> with it for a while.

Yup, I remember.  I also remember going through my stack
of business cards wondering whether I remembered to get
one from you; obviously not.  8)

The sticking issue that we discussed was allowing more than
one system to mount a given filesystem; I seemed to recall
that Kirk has spoken about this before, and there may be
some folks here (or Kirk himself, also copied) who may
have some more input on the topic.  Once this is resolved,
everything else is (relatively!) straightforward...

-- 
\\ Give a man a fish, and you feed him for a day. \\  Mike Smith
\\ Tell him he should learn how to fish himself,  \\  [EMAIL PROTECTED]
\\ and he'll hate you for a lifetime. \\  [EMAIL PROTECTED]

Mounting on more than one system is generally problematical unless
you are willing to have all systems read-only. The problem is cache
coherence between the machines. If one changes a block, the other
machines will not see it. Basically, this is why we have the NFS
filesystem. That lets a disk be mounted on one machine, but shared
out to others. If you wanted to write a protocol that would allow
for multiple machines, then you would need to have some central
coordinator running some sort of coherency protocol with a complexity
akin to that of NFS.
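
The coherence problem can be shown with a toy model, assuming two hosts that each keep a private block cache in front of one shared disk. Everything here is hypothetical (struct disk, struct host, host_read(), host_write(), host_invalidate()); the point is only that the second host keeps serving its cached copy until something tells it to drop that copy, which is the job a coherency protocol would have to do.

```c
#include <assert.h>
#include <string.h>

#define NBLK  4
#define BLKSZ 16

/* The shared disk: the only state both hosts have in common. */
struct disk {
	char blk[NBLK][BLKSZ];
};

/* A host with a private, uncoordinated block cache. */
struct host {
	struct disk *d;
	char cache[NBLK][BLKSZ];
	int valid[NBLK];
};

static void
host_init(struct host *h, struct disk *d)
{
	h->d = d;
	memset(h->valid, 0, sizeof(h->valid));
}

/* Read block n, going to disk only on a cache miss. */
static const char *
host_read(struct host *h, int n)
{
	assert(n >= 0 && n < NBLK);
	if (!h->valid[n]) {
		memcpy(h->cache[n], h->d->blk[n], BLKSZ);
		h->valid[n] = 1;
	}
	return (h->cache[n]);
}

/* Write block n through the local cache to the shared disk. */
static void
host_write(struct host *h, int n, const char *data)
{
	assert(n >= 0 && n < NBLK);
	strncpy(h->cache[n], data, BLKSZ - 1);
	h->cache[n][BLKSZ - 1] = '\0';
	h->valid[n] = 1;
	memcpy(h->d->blk[n], h->cache[n], BLKSZ);
}

/* What a coherency protocol would have to trigger on the other hosts. */
static void
host_invalidate(struct host *h, int n)
{
	assert(n >= 0 && n < NBLK);
	h->valid[n] = 0;
}
```

If host A writes a block that host B has already cached, B continues to read the old contents from its cache even though the disk holds the new data; only after host_invalidate() does B see the update.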

    Kirk McKusick





Re: Mounting one FS on more than one system

1999-12-04 Thread Kirk McKusick

To: Kirk McKusick [EMAIL PROTECTED]
cc: Mike Smith [EMAIL PROTECTED], [EMAIL PROTECTED],
[EMAIL PROTECTED]
Subject: Re: Mounting one FS on more than one system 
In-reply-to: Your message of "Sat, 04 Dec 1999 12:44:43 PST."
 [EMAIL PROTECTED] 
Date: Sun, 05 Dec 1999 09:44:32 +1000
From: George Michaelson [EMAIL PROTECTED]

Let me start by saying that I do not have the definitive answers
to all your questions. I'll take a crack at some of them.

Allowing for cache writeback delays, is the speed of
direct-to-shared-disk fast enough that using NFS as an
"abstraction" layer would be faster than any network extant?

The gains come from being able to read data directly from the
disk rather than transferring it across the network. However,
the cost of maintaining cache coherency would be at least as
difficult and bandwidth consuming as a distributed filesystem.

Would it be as fast? would the effort to make this work
exceed the cost of making real networks exist?

As the network speed approaches the disk speed, the gains
would diminish.

It would seem that there might be opportunities to do 'cut
through' in the coding for known-private files after open
(ok, inode allocation/extension has problems) to optimize
them to at-worst 'disk+bits' instead of NFS costs.

The problem is in identifying when private goes to shared. Also
as you point out, new block and inode allocations have to be
centrally coordinated.

If one party mounts -r the FS (eg news spool) then does
this reduce the complexity? eg /usr mounted read-mostly
for a bunch of tightly coupled boxes.

If any machine can write, then all the other machines have to
have some way of keeping their caches consistent with the
machine that did the modification.

If some other protocol is used for interlock, does this
make mmap shares across clusters faster?

Mmap sharing across machines is going to be slow. I have never
been a fan of distributed shared memory as a programming model,
and this does not look like a way of making it run any faster.

-George

--
George Michaelson |  DSTC Pty Ltd
Email: [EMAIL PROTECTED]|  University of Qld 4072
Phone: +61 7 3365 4310|  Australia
  Fax: +61 7 3365 4311|  http://www.dstc.edu.au





Re: Disappearing/Reappearing Files... (fwd)

1999-05-09 Thread Kirk McKusick
The changes that I added to soft updates two days ago only kick in
when the soft dependency memory limit is hit. This certainly should
not be happening at system startup, and on any machine with more
than 64Mb of memory, almost never. I did make a couple of minor
textual changes to other parts of the code which should not have
had any effect, but just in case they did, I have put them back to
their previous state in today's delta. I would appreciate your
trying out the current delta (1.27) and seeing if the problem
persists. If it does, please try out the version before I did my
recent rework (1.24). If that version has the problem as well, then
I believe that some other change is triggering the problem, as 1.24
represents a version that has been in production for half a year
without trouble.

Kirk McKusick

=-=-=-=-=-=-=-=

Date: Sun, 9 May 1999 01:39:33 -0700 (PDT)
From: Julian Elischer jul...@whistle.com
To: mckus...@mckusick.com
Subject: Re: Disappearing/Reappearing Files... (fwd)

FYI

also some other people are commenting on odd behaviour where a created
file doesn't show up for a while...
almost as if the readdir() is returning the 'backed out' version of the
directory data.

julian


-- Forwarded message --
Date: Sat, 8 May 1999 14:11:10 -0700 (PDT)
From: John Polstra j...@polstra.com
To: curr...@freebsd.org
Subject: Re: Disappearing/Reappearing Files...

In article 199905082048.naa34...@vashon.polstra.com,
John Polstra  j...@polstra.com wrote:
 
> I'm seeing something possibly related (possibly not) on an Alpha with
> this morning's -current.  First I was getting unaligned accesses and
> core dumps from the cp in /etc/rc that updates the /etc/motd file.
> (I added set -v to /etc/rc to catch it.)  But I could do the copy by
> hand once the system was up.  Now on the latest reboot I got this from
> it:
>
> + cp /tmp/_motd /etc/motd
> + chmod 644 /etc/motd
> chmod: : No such file or directory
> chmod in free(): warning: recursive call
> chmod in free(): warning: recursive call
> chmod in free(): warning: recursive call
> chmod in free(): warning: recursive call
>
> (Hmm, why didn't the filename come out in chmod's error message?)
>
> I'm running with soft-updates but I'll try turning them off.

I tried about 10 reboots, half with and half without soft-updates
enabled on the various filesystems.  With soft-updates disabled, I
didn't see the above problem at all.  With soft-updates enabled, I saw
it most of the time but not always.

John
-- 
  John Polstra   j...@polstra.com
  John D. Polstra & Co., Inc.        Seattle, Washington USA
  Self-interest is the aphrodisiac of belief.   -- James V. DeLong


To Unsubscribe: send mail to majord...@freebsd.org
with unsubscribe freebsd-current in the body of the message






Re: softupdate panic, anyone seen this? (fwd)

1999-02-19 Thread Kirk McKusick
Jeffrey Hsu and I just came to the same conclusion about the splbio
additions earlier this week. I had assumed that Jeffrey had put in
these changes already. Anyway, the two of you need to coordinate
getting the changes put in so that you do not collide.

Kirk McKusick

=-=-=-=-=-=-=-=

Date: Fri, 19 Feb 1999 02:43:20 -0800 (PST)
From: Matthew Dillon dil...@apollo.backplane.com
To: Julian Elischer jul...@whistle.com
Cc: Kirk McKusick mckus...@mckusick.com, Jake j...@checker.org,
Don Lewis don.le...@tsc.tdk.com, curr...@freebsd.org
Subject: Re: softupdate panic, anyone seen this? (fwd) 
References:  pine.bsf.3.95.990218132053.16922a-100...@current1.whistle.com

This may or may not be related.  In tracking down the sched_sync() 
panic I found two bugs.  

First, a couple of places where the worklist was not being protected
at splbio().  I'm not 100% sure that this is a problem but the code
is complex enough that it's just too dangerous not to do it.

Second, a double LIST_REMOVE() was being performed in the case where
VOP_FSYNC() would fail to sync all the dirty pages.  This can occur
legally for both NFS and filesystems with SOFTUPDATES set.

I'd appreciate it if someone could verify the double LIST_REMOVE()
bug.  vn_syncer_add_to_worklist() already removes the vn from
the list (assuming the VONWORKLST v_flag is set, which it should be
in this case).
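
The point about vn_syncer_add_to_worklist() can be illustrated with a
small userland sketch. This is not the kernel code: the list layout
only imitates the BSD LIST macros, and ON_WORKLIST, worklist_add() and
requeue_demo() are hypothetical names standing in for VONWORKLST and
the real routine. It assumes, as described above, that the add routine
unlinks a node that is already queued before inserting it again, so
the caller has no reason to unlink it first.

```c
#include <assert.h>
#include <stddef.h>

#define ON_WORKLIST 0x1         /* hypothetical stand-in for VONWORKLST */

struct node {
	struct node *next;      /* next element, or NULL at the tail */
	struct node **prevp;    /* address of the pointer that points at us */
	int flags;
};

struct list_head {
	struct node *first;
};

static void
list_insert_head(struct list_head *h, struct node *n)
{
	n->next = h->first;
	if (n->next != NULL)
		n->next->prevp = &n->next;
	h->first = n;
	n->prevp = &h->first;
}

static void
list_remove(struct node *n)
{
	if (n->next != NULL)
		n->next->prevp = n->prevp;
	*n->prevp = n->next;
}

/*
 * Model of the add routine: a node that is already queued is unlinked
 * here, so callers must not unlink it themselves beforehand.
 */
static void
worklist_add(struct list_head *slot, struct node *n)
{
	if (n->flags & ON_WORKLIST)
		list_remove(n);
	list_insert_head(slot, n);
	n->flags |= ON_WORKLIST;
}

/* Returns 1 if requeueing left both lists in the expected state. */
int
requeue_demo(void)
{
	struct list_head slot0 = { NULL }, slot1 = { NULL };
	struct node a = { NULL, NULL, 0 }, b = { NULL, NULL, 0 };

	worklist_add(&slot0, &a);       /* slot0: a */
	worklist_add(&slot0, &b);       /* slot0: b -> a */
	worklist_add(&slot1, &a);       /* a moves: slot0: b, slot1: a */

	return (slot0.first == &b && b.next == NULL &&
	    slot1.first == &a && a.next == NULL);
}
```

Calling requeue_demo() from a main() exercises the move. An explicit
list_remove(&a) just before the third worklist_add() would unlink the
node twice, which is the pattern the patch below comments out.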

-Matt
Matthew Dillon 
dil...@backplane.com

Index: kern/vfs_subr.c
===================================================================
RCS file: /home/ncvs/src/sys/kern/vfs_subr.c,v
retrieving revision 1.186
diff -u -r1.186 vfs_subr.c
--- vfs_subr.c  1999/02/04 18:25:39 1.186
+++ vfs_subr.c  1999/02/19 10:40:17
@@ -881,10 +881,8 @@
 /*
  * Add an item to the syncer work queue.
  */
-void
-vn_syncer_add_to_worklist(vp, delay)
-   struct vnode *vp;
-   int delay;
+static void
+vn_syncer_add_to_worklist(struct vnode *vp, int delay)
 {
int s, slot;
 
@@ -928,7 +926,8 @@
starttime = time_second;
 
/*
-* Push files whose dirty time has expired.
+* Push files whose dirty time has expired.  Be careful
+* of interrupt race on slp queue.
 */
s = splbio();
slp = &syncer_workitem_pending[syncer_delayno];
@@ -941,16 +940,20 @@
vn_lock(vp, LK_EXCLUSIVE | LK_RETRY, p);
(void) VOP_FSYNC(vp, p->p_ucred, MNT_LAZY, p);
VOP_UNLOCK(vp, 0, p);
+   s = splbio();
if (LIST_FIRST(slp) == vp) {
if (TAILQ_EMPTY(&vp->v_dirtyblkhd) &&
vp->v_type != VBLK)
-   panic("sched_sync: fsync failed");
+   panic("sched_sync: fsync failed vp %p type %d tag %d", vp, vp->v_type, vp->v_tag);
/*
 * Move ourselves to the back of the sync list.
+* vn_syncer_*worklist() will remove and re-add
+* the node.
 */
-   LIST_REMOVE(vp, v_synclist);
+   /*LIST_REMOVE(vp, v_synclist);*/
vn_syncer_add_to_worklist(vp, syncdelay);
}
+   splx(s);
}
 
/*
@@ -2841,6 +2844,8 @@
 
 /*
  * The syncer vnode is no longer needed and is being decommissioned.
+ *
+ * Modifications to the worklist must be protected at splbio().
  */
 static int
 sync_reclaim(ap)
@@ -2849,12 +2854,15 @@
} */ *ap;
 {
struct vnode *vp = ap->a_vp;
+   int s;
 
+   s = splbio();
vp->v_mount->mnt_syncer = NULL;
if (vp->v_flag & VONWORKLST) {
LIST_REMOVE(vp, v_synclist);
vp->v_flag &= ~VONWORKLST;
}
+   splx(s);
 
return (0);
 }
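
The s = splbio() / splx(s) bracketing in the patch above follows a
save, raise, restore pattern. As a rough userland analogy only, the
sketch below uses a blocked signal in place of a masked interrupt. It
assumes a single-threaded POSIX process; block_alarm(), restore_mask()
and protected_update_demo() are hypothetical names, and nothing here
is kernel code.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <string.h>

static volatile sig_atomic_t alarm_hits;

/* Stands in for an interrupt handler that would touch the shared list. */
static void
on_alarm(int sig)
{
	(void)sig;
	alarm_hits++;
}

/* Analogue of "s = splbio()": block the signal, return the old mask. */
static sigset_t
block_alarm(void)
{
	sigset_t block, old;

	sigemptyset(&block);
	sigaddset(&block, SIGALRM);
	sigprocmask(SIG_BLOCK, &block, &old);
	return (old);
}

/* Analogue of "splx(s)": restore the mask saved by block_alarm(). */
static void
restore_mask(sigset_t old)
{
	sigprocmask(SIG_SETMASK, &old, NULL);
}

/* Returns 1 if the handler ran only after the critical section ended. */
int
protected_update_demo(void)
{
	struct sigaction sa;
	sigset_t s;
	int seen_inside;

	alarm_hits = 0;
	memset(&sa, 0, sizeof(sa));
	sa.sa_handler = on_alarm;
	sigemptyset(&sa.sa_mask);
	if (sigaction(SIGALRM, &sa, NULL) != 0)
		return (-1);

	s = block_alarm();
	raise(SIGALRM);                 /* "interrupt" arrives mid-update */
	seen_inside = alarm_hits;       /* still 0: delivery is deferred */
	restore_mask(s);                /* pending signal is delivered here */

	return (seen_inside == 0 && alarm_hits == 1);
}
```

Saving the previous mask and restoring it, rather than unconditionally
unblocking, keeps the helper safe to use when the caller already had
the signal blocked. That is the same reason splx() takes the value
returned by splbio().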




