Hello Walter,

Thanks for this crazy diff.  It is big so my first comment would be to
split it to ease reviews.  Here's a suggestion:
  
  - At least one diff for the kernel parts that are not WAPBL-only. In
    other words the refactoring needed for integrating WAPBL into FFS.
    This is the critical part in your work since people generally do not
    like regression with their filesystem :)  From my point of view the
    addition of B_CONTIG/B_METAONLY could be easily split.  You could
    also produce a diff for the wrapper functions that you're adding...

  - A diff adding the WAPBL backend to the kernel.  This diff would
    depend on the previous.  It would be nice if you could keep all its
    chunks under "#ifdef WAPBL".  Even if such define is not strictly
    needed for compiling the kernel it helps when reading the code.

  - At least one diff for userland tools support.

Splitting your work would also allow you to give a more precise
descriptions about each subpart of this monstrous work :)

Apart from adding WAPBL to OpenBSD, do you have any idea about how to
start using it?  Is there any drawback if I'm sharing a media with
WAPBL enabled on an old OpenBSD installation (not supporting WAPBL)?

Concerning WAPBL itself did you consider porting the wapbl(9) manual
from NetBSD?  Is its BUGS section still relevant?  What about the
Open issues mentioned by Jörg Sonnenberge in its 2009 BSDCan [0] talk,
are they still open?

Keep the good work!
Martin

Some more comments inline.

[0] https://www.bsdcan.org/2009/schedule/attachments/106_wapbl.pdf

> Index: sys/kern/vfs_bio.c
> ===================================================================
> RCS file: /Volumes/CSP/cvs/src/sys/kern/vfs_bio.c,v
> retrieving revision 1.170
> diff -u -r1.170 vfs_bio.c
> --- sys/kern/vfs_bio.c        19 Jul 2015 16:21:11 -0000      1.170
> +++ sys/kern/vfs_bio.c        23 Oct 2015 20:48:45 -0000
> @@ -647,6 +672,20 @@
>  {
>       int s;
>  
> +     /* If this is a tape block, write the block now. */
> +     if (major(bp->b_dev) < nblkdev &&
> +         bdevsw[major(bp->b_dev)].d_type == D_TAPE) {
> +             bawrite(bp);
> +             return;
> +     }

If I'm reading NetBSD's log correctly this has been done separately to
avoid calling reassignbuf().  That would be a good candidate for a
single diff :)

> +     if (injournal(bp)) {
> +             struct mount *mp = wapbl_vptomp(bp->b_vp);
> +
> +             if (bp->b_iodone != mp->mnt_wapbl_op->wo_wapbl_biodone)
> +                     WAPBL_ADD_BUF(mp, bp);
> +     }

What I was suggesting before is to keep such blocks under #ifdef WAPBL.


> @@ -743,12 +775,28 @@
>        * Determine which queue the buffer should be on, then put it there.
>        */
>  
> +     /* If it's locked, don't report an error; try again later */
> +     if (ISSET(bp->b_flags, (B_LOCKED|B_ERROR)) == (B_LOCKED|B_ERROR))
> +             CLR(bp->b_flags, B_ERROR);
> +     
>       /* If it's not cacheable, or an error, mark it invalid. */
>       if (ISSET(bp->b_flags, (B_NOCACHE|B_ERROR)))
>               SET(bp->b_flags, B_INVAL);
>  
>       if (ISSET(bp->b_flags, B_INVAL)) {
>               /*
> +              * If using WAPBL
> +              */

I don't know what's your sync-with-upstream strategy.  But by comparing
your diff with NetBSD's sources I feel that the comment below should be
there instead of this one ;)

> +             if (ISSET(bp->b_flags, B_LOCKED)) {
> +                     if (wapbl_vphaswapbl(bp->b_vp)) {
> +                             struct mount *mp = wapbl_vptomp(bp->b_vp);
> +                             KASSERT(bp->b_iodone
> +                                 != mp->mnt_wapbl_op->wo_wapbl_biodone);
> +                             WAPBL_REMOVE_BUF(mp, bp);
> +                     }
> +             }
> +             
> +             /*
>                * If the buffer is invalid, free it now rather than leaving
>                * it in a queue and wasting memory.
>                */
                ^^^^^^^


> @@ -1206,6 +1267,17 @@
>  }
>  #endif
>  
> +void
> +buf_adjcnt(struct buf *bp, long ncount)
> +{
> +     KASSERT(ncount <= bp->b_bufsize);
> +     long ocount = bp->b_bcount;
> +     bp->b_bcount = ncount;
> +     if (injournal(bp))
> +             WAPBL_RESIZE_BUF(wapbl_vptomp(bp->b_vp), bp, bp->b_bufsize,
> +                 ocount);
> +}

This is an example of wrapper function that could be easily extended :)

> Index: sys/kern/vfs_biomem.c
> ===================================================================
> RCS file: /Volumes/CSP/cvs/src/sys/kern/vfs_biomem.c,v
> retrieving revision 1.34
> diff -u -r1.34 vfs_biomem.c
> --- sys/kern/vfs_biomem.c     19 Jul 2015 21:21:14 -0000      1.34
> +++ sys/kern/vfs_biomem.c     23 Oct 2015 20:48:45 -0000
> @@ -89,7 +89,7 @@
>  {
>       splassert(IPL_BIO);
>       SET(bp->b_flags, B_BUSY);
> -     if (bp->b_data != NULL) {
> +     if (bp->b_data != NULL && !(bp->b_flags & B_LOCKED)) {

Style nit.  The ISSET() family of macro is almost always used when
manipulating ``bp->b_flags''.  You might want to suggest such changes
to Bitrig :)

> Index: sys/kern/vfs_wapbl.c
> ===================================================================
> RCS file: sys/kern/vfs_wapbl.c
> diff -N sys/kern/vfs_wapbl.c
> --- /dev/null 1 Jan 1970 00:00:00 -0000
> +++ sys/kern/vfs_wapbl.c      23 Oct 2015 20:26:12 -0000
> @@ -0,0 +1,2912 @@
> +/*   $NetBSD: vfs_wapbl.c,v 1.51.2.2 2013/01/02 23:23:15 riz Exp $   */
> +
> +/*-
> + * Copyright (c) 2003, 2008, 2009 The NetBSD Foundation, Inc.
> + * All rights reserved.
> + *
> + * This code is derived from software contributed to The NetBSD Foundation
> + * by Wasabi Systems, Inc.
> + *
> + * Redistribution and use in source and binary forms, with or without
> + * modification, are permitted provided that the following conditions
> + * are met:
> + * 1. Redistributions of source code must retain the above copyright
> + *    notice, this list of conditions and the following disclaimer.
> + * 2. Redistributions in binary form must reproduce the above copyright
> + *    notice, this list of conditions and the following disclaimer in the
> + *    documentation and/or other materials provided with the distribution.
> + *
> + * THIS SOFTWARE IS PROVIDED BY THE NETBSD FOUNDATION, INC. AND CONTRIBUTORS
> + * ``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT 
> LIMITED
> + * TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
> + * PURPOSE ARE DISCLAIMED.  IN NO EVENT SHALL THE FOUNDATION OR CONTRIBUTORS
> + * BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
> + * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
> + * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
> + * INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
> + * CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
> + * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
> + * POSSIBILITY OF SUCH DAMAGE.
> + */
> +
> +/*
> + * This implements file system independent write ahead filesystem logging.
> + */
> +
> +#define WAPBL_INTERNAL
> +
> +#include <sys/cdefs.h>
> +
> +#include <sys/param.h>
> +#include <sys/bitops.h>
> +
> +#ifdef _KERNEL
> +#include <sys/param.h>
> +#include <sys/systm.h>
> +#include <sys/namei.h>
> +#include <sys/proc.h>
> +#include <sys/sysctl.h>
> +#include <sys/uio.h>
> +#include <sys/vnode.h>
> +#include <sys/file.h>
> +#include <sys/resourcevar.h>
> +#include <sys/conf.h>
> +#include <sys/mount.h>
> +#include <sys/kernel.h>
> +#include <sys/pool.h>
> +#include <sys/mutex.h>
> +#include <sys/atomic.h>
> +#include <sys/malloc.h>
> +#include <sys/pool.h>
> +#include <sys/dkio.h>
> +#include <sys/wapbl.h>
> +#include <sys/wapbl_replay.h>

Are you sure all these headers are needed?  <sys/pool.h> twice?

> +#define      wapbl_alloc(s) malloc((s), M_TEMP, M_WAITOK | M_ZERO)
> +#define      wapbl_free(a, s) free((a), M_TEMP, (s))
> +#define      wapbl_calloc(n, s) malloc((n)*(s), M_TEMP, M_WAITOK | M_ZERO)

M_TEMP might be good enough for debugging but I would suggest using a
different type :)

> +/* 
> + * This structure holds per-mount log information.
> + *
> + * Legend:   a = atomic access only
> + *           r = read-only after init
> + *           l = rwlock held
> + *           m = mutex held
> + *           lm = rwlock held writing or mutex held
> + *           u = unlocked access ok
> + *           b = bufcache_lock held
> + */

Do you understand the existing locking strategy?  Do you think it fits 
or is it worth keeping in OpenBSD's kernel?  I'm asking because I find 
a bit confusing the existing mix of locks, mutex, cv and spl in the code 
you're porting.

> Index: sys/sys/bitops.h
> ===================================================================
> RCS file: sys/sys/bitops.h
> diff -N sys/sys/bitops.h
> --- /dev/null 1 Jan 1970 00:00:00 -0000
> +++ sys/sys/bitops.h  23 Oct 2015 20:26:12 -0000
> @@ -0,0 +1,294 @@
> +/*   $NetBSD: bitops.h,v 1.9 2011/07/30 16:35:58 christos Exp $      */
> +
> +/*-
> + * Copyright (c) 2007, 2010 The NetBSD Foundation, Inc.
> + * All rights reserved.
> + *
> + * This code is derived from software contributed to The NetBSD Foundation
> + * by Christos Zoulas and Joerg Sonnenberger.
> + *
> + * Redistribution and use in source and binary forms, with or without
> + * modification, are permitted provided that the following conditions
> + * are met:
> + * 1. Redistributions of source code must retain the above copyright
> + *    notice, this list of conditions and the following disclaimer.
> + * 2. Redistributions in binary form must reproduce the above copyright
> + *    notice, this list of conditions and the following disclaimer in the
> + *    documentation and/or other materials provided with the distribution.
> + *
> + * THIS SOFTWARE IS PROVIDED BY THE NETBSD FOUNDATION, INC. AND CONTRIBUTORS
> + * ``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT 
> LIMITED
> + * TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
> + * PURPOSE ARE DISCLAIMED.  IN NO EVENT SHALL THE FOUNDATION OR CONTRIBUTORS
> + * BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
> + * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
> + * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
> + * INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
> + * CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
> + * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
> + * POSSIBILITY OF SUCH DAMAGE.
> + */
> +#ifndef _SYS_BITOPS_H_
> +#define _SYS_BITOPS_H_
> +
> +#include <sys/stdint.h>
> +
> +/*
> + * Find First Set functions
> + */
> +#ifndef ffs32
> +static __inline int __unused
> +ffs32(uint32_t _n)

libkern already has ffs(9), fls(9) and flsl(9) is there anything to merge
here?

> +/*
> + * Integer logarithm, returns -1 on error. Inspired by the linux
> + * version written by David Howells.
> + */
> +#define _ilog2_helper(_n, _x)        ((_n) & (1ULL << (_x))) ? _x :
> +#define ilog2(_n) \

Funny oce(4) defines its own version!

> +static __inline void
> +fast_divide32_prepare(uint32_t _div, uint32_t * __restrict _m,
> +    uint8_t *__restrict _s1, uint8_t *__restrict _s2)

You don't seem to be using these anywhere, or did I miss something?

> Index: sys/sys/mount.h
> ===================================================================
> RCS file: /Volumes/CSP/cvs/src/sys/sys/mount.h,v
> retrieving revision 1.121
> diff -u -r1.121 mount.h
> --- sys/sys/mount.h   8 Sep 2014 01:47:06 -0000       1.121
> +++ sys/sys/mount.h   23 Oct 2015 20:48:45 -0000
> @@ -358,6 +358,11 @@
>       int             mnt_maxsymlinklen;      /* max size of short symlink */
>       struct statfs   mnt_stat;               /* cache of filesystem stats */
>       void            *mnt_data;              /* private data */
> +     struct wapbl_ops
> +                     *mnt_wapbl_op;          /* logging ops */
> +     struct wapbl    *mnt_wapbl;             /* log info */
> +     struct wapbl_replay
> +                     *mnt_wapbl_replay;      /* replay support XXX: what? */
>  };
>  
>  /*
> @@ -396,7 +401,7 @@
>  /*
>   * Mask of flags that are visible to statfs()
>   */
> -#define      MNT_VISFLAGMASK 0x0400ffff
> +#define      MNT_VISFLAGMASK 0x1400ffff
>  
>  #define      MNT_BITS \
>      "\010\001RDONLY\002SYNCHRONOUS\003NOEXEC\004NOSUID\005NODEV" \
> @@ -413,6 +418,7 @@
>  #define MNT_WANTRDWR 0x02000000      /* want upgrade to read/write */
>  #define MNT_SOFTDEP     0x04000000      /* soft dependencies being done */
>  #define MNT_DOOMED   0x08000000      /* device behind filesystem is gone */
> +#define MNT_LOG              0x10000000      /* use logging */
>  
>  /*
>   * Flags for various system call interfaces.
> @@ -501,6 +507,51 @@
>  extern int bufbackoff(struct uvm_constraint_range*, long);
>  
>  /*
> + * This operations vector is so wapbl can be wrapped into a filesystem lkm.
> + * XXX Eventually, we want to move this functionality
> + * down into the filesystems themselves so that this isn't needed.
> + */

This is interesting.  OpenBSD does not support lkm, do you think it is
worth keeping this layer of indirection?  For diffability maybe?

> Index: sys/sys/specdev.h
> ===================================================================
> RCS file: /Volumes/CSP/cvs/src/sys/sys/specdev.h,v
> retrieving revision 1.34
> diff -u -r1.34 specdev.h
> --- sys/sys/specdev.h 2 Nov 2013 00:16:31 -0000       1.34
> +++ sys/sys/specdev.h 23 Oct 2015 20:48:45 -0000
> @@ -37,6 +37,10 @@
>   * special devices. It is allocated in checkalias and freed
>   * in vgone.
>   */
> +
> +#ifndef _SYS_SPECDEV_H_
> +#define _SYS_SPECDEV_H_
> +
>  struct specinfo {
>       struct  vnode **si_hashchain;
>       struct  vnode *si_specnext;
> @@ -108,3 +112,4 @@
>  int  spec_advlock(void *);
>  
>  #endif       /* _KERNEL */
> +#endif  /* _SYS_SPECDEV_H_ */

This guard could already be committed separately.


> Index: sys/sys/stat.h
> ===================================================================
> RCS file: /Volumes/CSP/cvs/src/sys/sys/stat.h,v
> retrieving revision 1.28
> diff -u -r1.28 stat.h
> --- sys/sys/stat.h    4 Apr 2015 18:06:08 -0000       1.28
> +++ sys/sys/stat.h    23 Oct 2015 20:48:45 -0000
> @@ -173,6 +173,7 @@
>  #define      SF_ARCHIVED     0x00010000      /* file is archived */
>  #define      SF_IMMUTABLE    0x00020000      /* file may not be changed */
>  #define      SF_APPEND       0x00040000      /* writes to file may only 
> append */
> +#define SF_LOG               0x00400000      /* WAPBL log file inode */

Please do not mix tab and space :)

> Index: sys/sys/vnode.h
> ===================================================================
> RCS file: /Volumes/CSP/cvs/src/sys/sys/vnode.h,v
> retrieving revision 1.132
> diff -u -r1.132 vnode.h
> --- sys/sys/vnode.h   7 May 2015 08:53:33 -0000       1.132
> +++ sys/sys/vnode.h   23 Oct 2015 20:48:45 -0000
> @@ -185,13 +185,14 @@
>  /*
>   * Flags for ioflag.
>   */
> -#define      IO_UNIT         0x01            /* do I/O as atomic unit */
> -#define      IO_APPEND       0x02            /* append write to end */
> -#define      IO_SYNC         0x04            /* do I/O synchronously */
> -#define      IO_NODELOCKED   0x08            /* underlying node already 
> locked */
> -#define      IO_NDELAY       0x10            /* FNDELAY flag set in file 
> table */
> -#define      IO_NOLIMIT      0x20            /* don't enforce limits on i/o 
> */
> -#define      IO_NOCACHE      0x40            /* don't cache result of this 
> i/o */
> +#define      IO_UNIT                 0x01    /* I/O as atomic unit */
> +#define      IO_APPEND               0x02    /* append write to end */
> +#define      IO_SYNC                 0x04    /* do I/O synchronously */
> +#define      IO_NODELOCKED           0x08    /* underlying node already 
> locked */
> +#define      IO_NDELAY               0x10    /* FNDELAY flag set in file 
> table */
> +#define      IO_NOLIMIT              0x20    /* don't enforce limits on i/o 
> */
> +#define      IO_NOCACHE              0x40    /* don't cache result of this 
> i/o */
> +#define IO_JOURNALLOCKED     0x80    /* journal is already locked */

Same here :)

Reply via email to