Re: WAPBL: Adding the FFS capability to alloc files contiguously
On Sat, Nov 28, 2015 at 03:49:21PM -0700, Bob Beck wrote:
> On Fri, Nov 27, 2015 at 02:50:57PM -0200, Walter Neto wrote:
>
> You have a number of places here where you introduce a line of 8 spaces
> after your #endif - please clean up the trailing spaces, they shouldn't be
> there.
>

Ok, cleaned :)

> You also have uses of B_METAONLY that are not inside a #ifdef WAPBL in
> ffs_balloc.c
>

Yeah, but I don't know the best way to correct it. Should I use:

#ifdef WAPBL
	foo(..., flags | B_METAONLY, ...);
#else
	foo(..., flags, ...);
#endif

or

	foo(..., flags
#ifdef WAPBL
	    | B_METAONLY
#endif
	    , ...)

? (I'm waiting for this answer before sending the fixed diff.)

> The first one I mostly get - as we are only looking for the first indirect
> block
> this makes sense. the second usage I'm not sure is correct... is it?
>

Yes Bob, it is correct! After many hours reading and re-reading the FFS code
(it is a dragon) I understand it better, and the second one makes sense too,
because it is where another data block is being allocated to store the
addresses of new indirect data blocks, so it is a B_METAONLY data block.
I don't know if I was clear; if you have any doubts, we can discuss.

> I would like some more FFS savvy eyes on this one and not just me.
> (This is a large hint to some other people)
>
> -Bob
> >
> > After mpi@ review
> >
> > --
> > Walter Neto
> >
> > diff --git a/sys/sys/buf.h b/sys/sys/buf.h
> > index c47f3f9..fd38c28 100644
> > --- a/sys/sys/buf.h
> > +++ b/sys/sys/buf.h
> > @@ -254,6 +254,8 @@ struct cluster_save {
> > /* Flags to low-level allocation routines. */
> > #define B_CLRBUF 0x01/* Request allocated buffer be cleared. */
> > #define B_SYNC 0x02/* Do all allocations synchronously.
*/ > > +#define B_METAONLY 0x04/* return indirect block buffer */ > > +#define B_CONTIG 0x08/* allocate file contiguously */ > > > > struct cluster_info { > > daddr_t ci_lastr; /* last read (read-ahead) */ > > diff --git a/sys/ufs/ffs/ffs_alloc.c b/sys/ufs/ffs/ffs_alloc.c > > index 08961b9..807a2d1 100644 > > --- a/sys/ufs/ffs/ffs_alloc.c > > +++ b/sys/ufs/ffs/ffs_alloc.c > > @@ -63,16 +63,19 @@ > > (fs)->fs_fsmnt, (cp)); \ > > } while (0) > > > > -daddr_tffs_alloccg(struct inode *, int, daddr_t, int); > > +daddr_tffs_alloccg(struct inode *, int, daddr_t, int, int); > > struct buf * ffs_cgread(struct fs *, struct inode *, int); > > -daddr_tffs_alloccgblk(struct inode *, struct buf *, daddr_t); > > -daddr_tffs_clusteralloc(struct inode *, int, daddr_t, int); > > +daddr_tffs_alloccgblk(struct inode *, struct buf *, daddr_t, > > int); > > +daddr_tffs_clusteralloc(struct inode *, int, daddr_t, int, > > int); > > ufsino_t ffs_dirpref(struct inode *); > > daddr_tffs_fragextend(struct inode *, int, daddr_t, int, int); > > -daddr_tffs_hashalloc(struct inode *, int, daddr_t, int, > > - daddr_t (*)(struct inode *, int, daddr_t, int)); > > -daddr_tffs_nodealloccg(struct inode *, int, daddr_t, int); > > +daddr_tffs_hashalloc(struct inode *, int, daddr_t, int, int, > > +daddr_t (*)(struct inode *, int, daddr_t, int, int)); > > +daddr_tffs_nodealloccg(struct inode *, int, daddr_t, int, int); > > daddr_tffs_mapsearch(struct fs *, struct cg *, daddr_t, int); > > +void ffs_blkfree_subr(struct fs *, struct vnode *, > > + struct inode *, daddr_t bno, long size); > > + > > > > int ffs1_reallocblks(void *); > > #ifdef FFS2 > > @@ -106,7 +109,7 @@ static const struct timeval fserr_interval = { 2, 0 > > }; > > * available block is located. 
> > */ > > int > > -ffs_alloc(struct inode *ip, daddr_t lbn, daddr_t bpref, int size, > > +ffs_alloc(struct inode *ip, daddr_t lbn, daddr_t bpref, int size, int > > flags, > > struct ucred *cred, daddr_t *bnp) > > { > > static struct timeval fsfull_last; > > @@ -147,7 +150,7 @@ ffs_alloc(struct inode *ip, daddr_t lbn, daddr_t bpref, > > int size, > > cg = dtog(fs, bpref); > > > > /* Try allocating a block. */ > > - bno = ffs_hashalloc(ip, cg, bpref, size, ffs_alloccg); > > + bno = ffs_hashalloc(ip, cg, bpref, size, flags, ffs_alloccg); > > if (bno > 0) { > > /* allocation successful, update inode data */ > > DIP
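A third way to handle the B_METAONLY question asked in this message, offered only as a sketch: define a helper macro once that expands to the flag when WAPBL is compiled in and to 0 otherwise, so call sites need no #ifdef in their argument lists. The macro name WAPBL_METAONLY and the functions foo() and call_foo() are hypothetical and do not appear in the posted diff.

```c
/* Flag values as defined in the buf.h hunk of the diff. */
#define B_CLRBUF	0x01
#define B_SYNC		0x02
#define B_METAONLY	0x04

/* Hypothetical helper, not in the posted diff: vanishes without WAPBL. */
#ifdef WAPBL
#define WAPBL_METAONLY	B_METAONLY
#else
#define WAPBL_METAONLY	0
#endif

/* Stand-in for an allocation routine that takes a flags argument. */
static int
foo(int flags)
{
	return flags;
}

/* A single call site, with no #ifdef inside the argument list. */
static int
call_foo(int flags)
{
	return foo(flags | WAPBL_METAONLY);
}
```

Compiled with -DWAPBL the call passes flags | B_METAONLY; compiled without it the call passes flags unchanged, which matches the behaviour of the two variants proposed in the message.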
Re: WAPBL: Adding the FFS capability to alloc files contiguously
Fixed diff Ok beck@ and tedu@ -- Walter Neto diff --git a/sys/sys/buf.h b/sys/sys/buf.h index c47f3f9..fd38c28 100644 --- a/sys/sys/buf.h +++ b/sys/sys/buf.h @@ -254,6 +254,8 @@ struct cluster_save { /* Flags to low-level allocation routines. */ #define B_CLRBUF 0x01/* Request allocated buffer be cleared. */ #define B_SYNC 0x02/* Do all allocations synchronously. */ +#define B_METAONLY 0x04/* return indirect block buffer */ +#define B_CONTIG 0x08/* allocate file contiguously */ struct cluster_info { daddr_t ci_lastr; /* last read (read-ahead) */ diff --git a/sys/ufs/ffs/ffs_alloc.c b/sys/ufs/ffs/ffs_alloc.c index 08961b9..f692261 100644 --- a/sys/ufs/ffs/ffs_alloc.c +++ b/sys/ufs/ffs/ffs_alloc.c @@ -63,16 +63,19 @@ (fs)->fs_fsmnt, (cp)); \ } while (0) -daddr_tffs_alloccg(struct inode *, int, daddr_t, int); +daddr_tffs_alloccg(struct inode *, int, daddr_t, int, int); struct buf * ffs_cgread(struct fs *, struct inode *, int); -daddr_tffs_alloccgblk(struct inode *, struct buf *, daddr_t); -daddr_tffs_clusteralloc(struct inode *, int, daddr_t, int); +daddr_tffs_alloccgblk(struct inode *, struct buf *, daddr_t, int); +daddr_tffs_clusteralloc(struct inode *, int, daddr_t, int, int); ufsino_t ffs_dirpref(struct inode *); daddr_tffs_fragextend(struct inode *, int, daddr_t, int, int); -daddr_tffs_hashalloc(struct inode *, int, daddr_t, int, - daddr_t (*)(struct inode *, int, daddr_t, int)); -daddr_tffs_nodealloccg(struct inode *, int, daddr_t, int); +daddr_tffs_hashalloc(struct inode *, int, daddr_t, int, int, +daddr_t (*)(struct inode *, int, daddr_t, int, int)); +daddr_tffs_nodealloccg(struct inode *, int, daddr_t, int, int); daddr_tffs_mapsearch(struct fs *, struct cg *, daddr_t, int); +void ffs_blkfree_subr(struct fs *, struct vnode *, + struct inode *, daddr_t bno, long size); + int ffs1_reallocblks(void *); #ifdef FFS2 @@ -106,7 +109,7 @@ static const struct timeval fserr_interval = { 2, 0 }; * available block is located. 
*/ int -ffs_alloc(struct inode *ip, daddr_t lbn, daddr_t bpref, int size, +ffs_alloc(struct inode *ip, daddr_t lbn, daddr_t bpref, int size, int flags, struct ucred *cred, daddr_t *bnp) { static struct timeval fsfull_last; @@ -147,7 +150,7 @@ ffs_alloc(struct inode *ip, daddr_t lbn, daddr_t bpref, int size, cg = dtog(fs, bpref); /* Try allocating a block. */ - bno = ffs_hashalloc(ip, cg, bpref, size, ffs_alloccg); + bno = ffs_hashalloc(ip, cg, bpref, size, flags, ffs_alloccg); if (bno > 0) { /* allocation successful, update inode data */ DIP_ADD(ip, blocks, btodb(size)); @@ -160,6 +163,14 @@ ffs_alloc(struct inode *ip, daddr_t lbn, daddr_t bpref, int size, (void) ufs_quota_free_blocks(ip, btodb(size), cred); nospace: +#ifdef WAPBL + if (flags & B_CONTIG) { + /* +* Fail silently -- it's up to our caller to report errors. +*/ + return (ENOSPC); + } +#endif /* WAPBL */ if (ratecheck(_last, _interval)) { ffs_fserr(fs, cred->cr_uid, "file system full"); uprintf("\n%s: write failed, file system is full\n", @@ -178,7 +189,7 @@ nospace: */ int ffs_realloccg(struct inode *ip, daddr_t lbprev, daddr_t bpref, int osize, -int nsize, struct ucred *cred, struct buf **bpp, daddr_t *blknop) +int nsize, int flags, struct ucred *cred, struct buf **bpp, daddr_t *blknop) { static struct timeval fsfull_last; struct fs *fs; @@ -295,7 +306,7 @@ ffs_realloccg(struct inode *ip, daddr_t lbprev, daddr_t bpref, int osize, panic("ffs_realloccg: bad optim"); /* NOTREACHED */ } - bno = ffs_hashalloc(ip, cg, bpref, request, ffs_alloccg); + bno = ffs_hashalloc(ip, cg, bpref, request, flags, ffs_alloccg); if (bno <= 0) goto nospace; @@ -434,7 +445,7 @@ ffs1_reallocblks(void *v) /* * Find the preferred location for the cluster. */ - pref = ffs1_blkpref(ip, start_lbn, soff, sbap); + pref = ffs1_blkpref(ip, start_lbn, soff, 0, sbap); /* * If the block range spans two block maps, get the second map. 
*/ @@ -454,7 +465,7 @@ ffs1_reallocblks(void *v) /* * Search the block map looking for an allocation of the desired size. */ - if ((newblk = ffs_hashalloc(ip, dtog(fs, pref), pref, len, + if ((newblk = ffs_hashalloc(ip, dtog(fs, pref), pref, len, 0, ffs_
Re: WAPBL: Adding the FFS capability to alloc files contiguously
After mpi@ review -- Walter Neto diff --git a/sys/sys/buf.h b/sys/sys/buf.h index c47f3f9..fd38c28 100644 --- a/sys/sys/buf.h +++ b/sys/sys/buf.h @@ -254,6 +254,8 @@ struct cluster_save { /* Flags to low-level allocation routines. */ #define B_CLRBUF 0x01/* Request allocated buffer be cleared. */ #define B_SYNC 0x02/* Do all allocations synchronously. */ +#define B_METAONLY 0x04/* return indirect block buffer */ +#define B_CONTIG 0x08/* allocate file contiguously */ struct cluster_info { daddr_t ci_lastr; /* last read (read-ahead) */ diff --git a/sys/ufs/ffs/ffs_alloc.c b/sys/ufs/ffs/ffs_alloc.c index 08961b9..807a2d1 100644 --- a/sys/ufs/ffs/ffs_alloc.c +++ b/sys/ufs/ffs/ffs_alloc.c @@ -63,16 +63,19 @@ (fs)->fs_fsmnt, (cp)); \ } while (0) -daddr_tffs_alloccg(struct inode *, int, daddr_t, int); +daddr_tffs_alloccg(struct inode *, int, daddr_t, int, int); struct buf * ffs_cgread(struct fs *, struct inode *, int); -daddr_tffs_alloccgblk(struct inode *, struct buf *, daddr_t); -daddr_tffs_clusteralloc(struct inode *, int, daddr_t, int); +daddr_tffs_alloccgblk(struct inode *, struct buf *, daddr_t, int); +daddr_tffs_clusteralloc(struct inode *, int, daddr_t, int, int); ufsino_t ffs_dirpref(struct inode *); daddr_tffs_fragextend(struct inode *, int, daddr_t, int, int); -daddr_tffs_hashalloc(struct inode *, int, daddr_t, int, - daddr_t (*)(struct inode *, int, daddr_t, int)); -daddr_tffs_nodealloccg(struct inode *, int, daddr_t, int); +daddr_tffs_hashalloc(struct inode *, int, daddr_t, int, int, +daddr_t (*)(struct inode *, int, daddr_t, int, int)); +daddr_tffs_nodealloccg(struct inode *, int, daddr_t, int, int); daddr_tffs_mapsearch(struct fs *, struct cg *, daddr_t, int); +void ffs_blkfree_subr(struct fs *, struct vnode *, + struct inode *, daddr_t bno, long size); + int ffs1_reallocblks(void *); #ifdef FFS2 @@ -106,7 +109,7 @@ static const struct timeval fserr_interval = { 2, 0 }; * available block is located. 
*/ int -ffs_alloc(struct inode *ip, daddr_t lbn, daddr_t bpref, int size, +ffs_alloc(struct inode *ip, daddr_t lbn, daddr_t bpref, int size, int flags, struct ucred *cred, daddr_t *bnp) { static struct timeval fsfull_last; @@ -147,7 +150,7 @@ ffs_alloc(struct inode *ip, daddr_t lbn, daddr_t bpref, int size, cg = dtog(fs, bpref); /* Try allocating a block. */ - bno = ffs_hashalloc(ip, cg, bpref, size, ffs_alloccg); + bno = ffs_hashalloc(ip, cg, bpref, size, flags, ffs_alloccg); if (bno > 0) { /* allocation successful, update inode data */ DIP_ADD(ip, blocks, btodb(size)); @@ -159,6 +162,14 @@ ffs_alloc(struct inode *ip, daddr_t lbn, daddr_t bpref, int size, /* Restore user's disk quota because allocation failed. */ (void) ufs_quota_free_blocks(ip, btodb(size), cred); +#ifdef WAPBL + if (flags & B_CONTIG) { + /* +* Fail silently -- it's up to our caller to report errors. +*/ + return (ENOSPC); + } +#endif /* WAPBL */ nospace: if (ratecheck(_last, _interval)) { ffs_fserr(fs, cred->cr_uid, "file system full"); @@ -178,7 +189,7 @@ nospace: */ int ffs_realloccg(struct inode *ip, daddr_t lbprev, daddr_t bpref, int osize, -int nsize, struct ucred *cred, struct buf **bpp, daddr_t *blknop) +int nsize, int flags, struct ucred *cred, struct buf **bpp, daddr_t *blknop) { static struct timeval fsfull_last; struct fs *fs; @@ -295,7 +306,7 @@ ffs_realloccg(struct inode *ip, daddr_t lbprev, daddr_t bpref, int osize, panic("ffs_realloccg: bad optim"); /* NOTREACHED */ } - bno = ffs_hashalloc(ip, cg, bpref, request, ffs_alloccg); + bno = ffs_hashalloc(ip, cg, bpref, request, flags, ffs_alloccg); if (bno <= 0) goto nospace; @@ -434,7 +445,7 @@ ffs1_reallocblks(void *v) /* * Find the preferred location for the cluster. */ - pref = ffs1_blkpref(ip, start_lbn, soff, sbap); + pref = ffs1_blkpref(ip, start_lbn, soff, 0, sbap); /* * If the block range spans two block maps, get the second map. 
*/ @@ -454,7 +465,7 @@ ffs1_reallocblks(void *v) /* * Search the block map looking for an allocation of the desired size. */ - if ((newblk = ffs_hashalloc(ip, dtog(fs, pref), pref, len, + if ((newblk = ffs_hashalloc(ip, dtog(fs, pref), pref, len, 0, ffs_clusteralloc)) == 0) goto
Re: WAPBL: Introducing B_LOCKED buffer flag
After mpi@ review

--
Walter Neto

diff --git a/sys/kern/vfs_bio.c b/sys/kern/vfs_bio.c
index 63bd7ca..29ebc81 100644
--- a/sys/kern/vfs_bio.c
+++ b/sys/kern/vfs_bio.c
@@ -743,6 +743,12 @@ brelse(struct buf *bp)
 	 * Determine which queue the buffer should be on, then put it there.
 	 */
 
+#ifdef WAPBL
+	/* If it's locked, don't report an error; try again later */
+	if (ISSET(bp->b_flags, (B_LOCKED|B_ERROR)) == (B_LOCKED|B_ERROR))
+		CLR(bp->b_flags, B_ERROR);
+#endif /* WAPBL */
+
 	/* If it's not cacheable, or an error, mark it invalid. */
 	if (ISSET(bp->b_flags, (B_NOCACHE|B_ERROR)))
 		SET(bp->b_flags, B_INVAL);
diff --git a/sys/kern/vfs_biomem.c b/sys/kern/vfs_biomem.c
index da0a355..2d7fb29 100644
--- a/sys/kern/vfs_biomem.c
+++ b/sys/kern/vfs_biomem.c
@@ -89,7 +89,7 @@ buf_acquire_nomap(struct buf *bp)
 {
 	splassert(IPL_BIO);
 	SET(bp->b_flags, B_BUSY);
-	if (bp->b_data != NULL) {
+	if (bp->b_data != NULL && !ISSET(bp->b_flags, B_LOCKED)) {
 		TAILQ_REMOVE(&buf_valist, bp, b_valist);
 		bcstats.kvaslots_avail--;
 		bcstats.busymapped++;
@@ -143,6 +143,10 @@ buf_map(struct buf *bp)
 		pmap_update(pmap_kernel());
 		bp->b_data = (caddr_t)va;
 	} else {
+#ifdef WAPBL
+		if (ISSET(bp->b_flags, B_LOCKED))
+			return;
+#endif /* WAPBL */
 		TAILQ_REMOVE(&buf_valist, bp, b_valist);
 		bcstats.kvaslots_avail--;
 	}
@@ -157,7 +161,7 @@ buf_release(struct buf *bp)
 
 	KASSERT(bp->b_flags & B_BUSY);
 	splassert(IPL_BIO);
-	if (bp->b_data) {
+	if (bp->b_data != NULL && !ISSET(bp->b_flags, B_LOCKED)) {
 		bcstats.busymapped--;
 		TAILQ_INSERT_TAIL(&buf_valist, bp, b_valist);
 		bcstats.kvaslots_avail++;
@@ -191,6 +195,7 @@ buf_dealloc_mem(struct buf *bp)
 	bp->b_data = NULL;
 
 	if (data) {
+		KASSERT(!ISSET(bp->b_flags, B_LOCKED));
 		if (bp->b_flags & B_BUSY)
 			bcstats.busymapped--;
 		pmap_kremove((vaddr_t)data, bp->b_bufsize);
@@ -237,6 +242,7 @@ buf_fix_mapping(struct buf *bp, vsize_t newsize)
 		 * buffers read in by bread_cluster
 		 */
 		bp->b_bufsize = newsize;
+		KASSERT(!ISSET(bp->b_flags, B_LOCKED));
 	}
 }
 
diff --git a/sys/sys/buf.h b/sys/sys/buf.h
index fd38c28..28b1a32 100644
--- a/sys/sys/buf.h
+++ b/sys/sys/buf.h
@@ -221,12 +221,14 @@ struct bufcache {
 #define	B_COLD		0x01000000	/* buffer is on the cold queue */
 #define	B_BC		0x02000000	/* buffer is managed by the cache */
 #define	B_DMA		0x04000000	/* buffer is DMA reachable */
+#define	B_LOCKED	0x08000000	/* Locked in core (not reusable). */
 
 #define	B_BITS	"\20\001AGE\002NEEDCOMMIT\003ASYNC\004BAD\005BUSY" \
     "\006CACHE\007CALL\010DELWRI\011DONE\012EINTR\013ERROR" \
     "\014INVAL\015NOCACHE\016PHYS\017RAW\020READ" \
     "\021WANTED\022WRITEINPROG\023XXX(FORMAT)\024DEFERRED" \
-    "\025SCANNED\026DAEMON\027RELEASED\030WARM\031COLD\032BC\033DMA"
+    "\025SCANNED\026DAEMON\027RELEASED\030WARM\031COLD\032BC\033DMA" \
+    "\034LOCKED"
 
 /*
  * This structure describes a clustered I/O. It is stored in the b_saveaddr
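The B_BITS change follows the kernel's %b naming convention, in which each flag name is preceded by the 1-based position of its bit, written as an octal escape. A small sketch of that mapping, assuming this convention applies here; bit_for_position() is a hypothetical helper written for this note. Under that reading "\034LOCKED" names bit 28, which corresponds to the mask 0x08000000, and "\033DMA" names bit 27. This is my reading of the format string, not a value taken from the message.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Map a 1-based bit position, as written in a %b-style name string,
 * to the corresponding 32-bit mask.  Hypothetical helper for illustration.
 */
static uint32_t
bit_for_position(unsigned int pos)
{
	assert(pos >= 1 && pos <= 32);
	return (uint32_t)1 << (pos - 1);
}
```

The octal literal can be passed directly, for example bit_for_position(034), which keeps the call visually close to the escape used in the string.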
WAPBL: Introducing B_LOCKED buffer flag
Introducing B_LOCKED buffer flag With this flag we can protect buffers that will be used by WAPBL while doing the log of transactions. -- Walter Neto diff --git a/sys/kern/vfs_bio.c b/sys/kern/vfs_bio.c index 63bd7ca..aa0575f 100644 --- a/sys/kern/vfs_bio.c +++ b/sys/kern/vfs_bio.c @@ -743,6 +743,10 @@ brelse(struct buf *bp) * Determine which queue the buffer should be on, then put it there. */ + /* If it's locked, don't report an error; try again later */ + if (ISSET(bp->b_flags, (B_LOCKED|B_ERROR)) == (B_LOCKED|B_ERROR)) + CLR(bp->b_flags, B_ERROR); + /* If it's not cacheable, or an error, mark it invalid. */ if (ISSET(bp->b_flags, (B_NOCACHE|B_ERROR))) SET(bp->b_flags, B_INVAL); diff --git a/sys/kern/vfs_biomem.c b/sys/kern/vfs_biomem.c index da0a355..9f98c70 100644 --- a/sys/kern/vfs_biomem.c +++ b/sys/kern/vfs_biomem.c @@ -89,7 +89,7 @@ buf_acquire_nomap(struct buf *bp) { splassert(IPL_BIO); SET(bp->b_flags, B_BUSY); - if (bp->b_data != NULL) { + if (bp->b_data != NULL && !(bp->b_flags & B_LOCKED)) { TAILQ_REMOVE(_valist, bp, b_valist); bcstats.kvaslots_avail--; bcstats.busymapped++; @@ -143,8 +143,11 @@ buf_map(struct buf *bp) pmap_update(pmap_kernel()); bp->b_data = (caddr_t)va; } else { - TAILQ_REMOVE(_valist, bp, b_valist); - bcstats.kvaslots_avail--; + if (!(bp->b_flags & B_LOCKED)) { + TAILQ_REMOVE(_valist, bp, b_valist); + bcstats.kvaslots_avail--; + } else + return; } bcstats.busymapped++; @@ -157,7 +160,7 @@ buf_release(struct buf *bp) KASSERT(bp->b_flags & B_BUSY); splassert(IPL_BIO); - if (bp->b_data) { + if (bp->b_data && !(bp->b_flags & B_LOCKED)) { bcstats.busymapped--; TAILQ_INSERT_TAIL(_valist, bp, b_valist); bcstats.kvaslots_avail++; @@ -191,6 +194,7 @@ buf_dealloc_mem(struct buf *bp) bp->b_data = NULL; if (data) { + KASSERT(!(bp->b_flags & B_LOCKED)); if (bp->b_flags & B_BUSY) bcstats.busymapped--; pmap_kremove((vaddr_t)data, bp->b_bufsize); @@ -237,6 +241,7 @@ buf_fix_mapping(struct buf *bp, vsize_t newsize) * buffers read in by 
bread_cluster */ bp->b_bufsize = newsize; + KASSERT(!(bp->b_flags & B_LOCKED)); } } diff --git a/sys/sys/buf.h b/sys/sys/buf.h index fd38c28..28b1a32 100644 --- a/sys/sys/buf.h +++ b/sys/sys/buf.h @@ -221,12 +221,14 @@ struct bufcache { #defineB_COLD 0x0100 /* buffer is on the cold queue */ #defineB_BC0x0200 /* buffer is managed by the cache */ #defineB_DMA 0x0400 /* buffer is DMA reachable */ +#defineB_LOCKED0x0800 /* Locked in core (not reusable). */ #defineB_BITS "\20\001AGE\002NEEDCOMMIT\003ASYNC\004BAD\005BUSY" \ "\006CACHE\007CALL\010DELWRI\011DONE\012EINTR\013ERROR" \ "\014INVAL\015NOCACHE\016PHYS\017RAW\020READ" \ "\021WANTED\022WRITEINPROG\023XXX(FORMAT)\024DEFERRED" \ -"\025SCANNED\026DAEMON\027RELEASED\030WARM\031COLD\032BC\033DMA" +"\025SCANNED\026DAEMON\027RELEASED\030WARM\031COLD\032BC\033DMA" \ +"\034LOCKED" /* * This structure describes a clustered I/O. It is stored in the b_saveaddr
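The brelse() hunk relies on comparing the masked value against the full mask, so the error is cleared only when both B_LOCKED and B_ERROR are set. A userland sketch of that logic follows. The macro bodies mirror the usual SET/CLR/ISSET helpers as I understand them, the flag values are illustrative only, and release_flags() is a hypothetical function that is not part of the diff.

```c
/* Same shape as the kernel's flag helpers; defined locally for this sketch. */
#define SET(t, f)	((t) |= (f))
#define CLR(t, f)	((t) &= ~(f))
#define ISSET(t, f)	((t) & (f))

/* Illustrative values only; the real ones live in sys/sys/buf.h. */
#define B_ERROR		0x01
#define B_INVAL		0x02
#define B_NOCACHE	0x04
#define B_LOCKED	0x08

/* Mirrors the two checks at the top of the brelse() hunk. */
static long
release_flags(long flags)
{
	/* Locked and failed: drop the error so the write is retried later. */
	if (ISSET(flags, (B_LOCKED|B_ERROR)) == (B_LOCKED|B_ERROR))
		CLR(flags, B_ERROR);

	/* Not cacheable, or still in error: mark the buffer invalid. */
	if (ISSET(flags, (B_NOCACHE|B_ERROR)))
		SET(flags, B_INVAL);

	return flags;
}
```

A plain ISSET(flags, B_LOCKED|B_ERROR) test would be true when either bit is set, which is why the first check compares against the whole mask while the second one does not.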
WAPBL: Introducing buf_adjcnt()
Introducing buf_adjcnt() from Bitrig. It is needed to notify WAPBL that the buffer has size changes. -- Walter Neto diff --git a/sys/kern/vfs_bio.c b/sys/kern/vfs_bio.c index 9bf9b36..459b4fa 100644 --- a/sys/kern/vfs_bio.c +++ b/sys/kern/vfs_bio.c @@ -1206,6 +1206,14 @@ bcstats_print( } #endif +void +buf_adjcnt(struct buf *bp, long ncount) +{ + KASSERT(ncount <= bp->b_bufsize); + long ocount = bp->b_bcount; + bp->b_bcount = ncount; +} + /* bufcache freelist code below */ /* * Copyright (c) 2014 Ted Unangst <t...@openbsd.org> diff --git a/sys/sys/buf.h b/sys/sys/buf.h index 5d71a5f..2336c9a 100644 --- a/sys/sys/buf.h +++ b/sys/sys/buf.h @@ -292,6 +292,7 @@ voidbrelse(struct buf *); void bufinit(void); void buf_dirty(struct buf *); voidbuf_undirty(struct buf *); +void buf_adjcnt(struct buf *, long); intbwrite(struct buf *); struct buf *getblk(struct vnode *, daddr_t, int, int, int); struct buf *geteblk(int); diff --git a/sys/ufs/ffs/ffs_alloc.c b/sys/ufs/ffs/ffs_alloc.c index 87840a1..42e9a14 100644 --- a/sys/ufs/ffs/ffs_alloc.c +++ b/sys/ufs/ffs/ffs_alloc.c @@ -218,7 +218,7 @@ ffs_realloccg(struct inode *ip, daddr_t lbprev, daddr_t bpref, int osize, if (bpp != NULL) { if ((error = bread(ITOV(ip), lbprev, fs->fs_bsize, )) != 0) goto error; - bp->b_bcount = osize; + buf_adjcnt(bp, osize); } if ((error = ufs_quota_alloc_blocks(ip, btodb(nsize - osize), cred)) @@ -241,7 +241,7 @@ ffs_realloccg(struct inode *ip, daddr_t lbprev, daddr_t bpref, int osize, if (nsize > bp->b_bufsize) panic("ffs_realloccg: small buf"); #endif - bp->b_bcount = nsize; + buf_adjcnt(bp, nsize); bp->b_flags |= B_DONE; memset(bp->b_data + osize, 0, nsize - osize); *bpp = bp; @@ -313,7 +313,7 @@ ffs_realloccg(struct inode *ip, daddr_t lbprev, daddr_t bpref, int osize, if (nsize > bp->b_bufsize) panic("ffs_realloccg: small buf 2"); #endif - bp->b_bcount = nsize; + buf_adjcnt(bp, nsize); bp->b_flags |= B_DONE; memset(bp->b_data + osize, 0, nsize - osize); *bpp = bp; diff --git 
a/sys/ufs/ffs/ffs_balloc.c b/sys/ufs/ffs/ffs_balloc.c index e4b21cd..f42bfcb 100644 --- a/sys/ufs/ffs/ffs_balloc.c +++ b/sys/ufs/ffs/ffs_balloc.c @@ -165,7 +165,7 @@ ffs1_balloc(struct inode *ip, off_t startoffset, int size, struct ucred *cred, brelse(*bpp); return (error); } - (*bpp)->b_bcount = osize; + buf_adjcnt((*bpp), osize); } return (0); } else { @@ -535,7 +535,7 @@ ffs2_balloc(struct inode *ip, off_t off, int size, struct ucred *cred, brelse(*bpp); return (error); } - (*bpp)->b_bcount = osize; + buf_adjcnt((*bpp), osize); } return (0); diff --git a/sys/ufs/ffs/ffs_inode.c b/sys/ufs/ffs/ffs_inode.c index 670a23e..6b66f99 100644 --- a/sys/ufs/ffs/ffs_inode.c +++ b/sys/ufs/ffs/ffs_inode.c @@ -262,7 +262,7 @@ ffs_truncate(struct inode *oip, off_t length, int flags, struct ucred *cred) (void) uvm_vnp_uncache(ovp); if (ovp->v_type != VDIR) memset(bp->b_data + offset, 0, size - offset); - bp->b_bcount = size; + buf_adjcnt(bp, size); if (aflags & B_SYNC) bwrite(bp); else diff --git a/sys/ufs/ffs/ffs_subr.c b/sys/ufs/ffs/ffs_subr.c index c075f1e..dcbb4ac 100644 --- a/sys/ufs/ffs/ffs_subr.c +++ b/sys/ufs/ffs/ffs_subr.c @@ -72,7 +72,7 @@ ffs_bufatoff(struct inode *ip, off_t offset, char **res, struct buf **bpp) brelse(bp); return (error); } - bp->b_bcount = bsize; + buf_adjcnt(bp, bsize); if (res) *res = (char *)bp->b_data + blkoff(fs, offset); *bpp = bp;
Re: WAPBL: Introducing buf_adjcnt()
Sorry guys, my bad. The previous diff will not compile: there is a warning.
Here is the correct diff.

--
Walter Neto

diff --git a/sys/kern/vfs_bio.c b/sys/kern/vfs_bio.c
index 2ce876a..63bd7ca 100644
--- a/sys/kern/vfs_bio.c
+++ b/sys/kern/vfs_bio.c
@@ -1206,6 +1206,13 @@ bcstats_print(
 }
 #endif
 
+void
+buf_adjcnt(struct buf *bp, long ncount)
+{
+	KASSERT(ncount <= bp->b_bufsize);
+	bp->b_bcount = ncount;
+}
+
 /* bufcache freelist code below */
 /*
  * Copyright (c) 2014 Ted Unangst <t...@openbsd.org>
diff --git a/sys/sys/buf.h b/sys/sys/buf.h
index ce1d35c..c47f3f9 100644
--- a/sys/sys/buf.h
+++ b/sys/sys/buf.h
@@ -292,6 +292,7 @@ void	brelse(struct buf *);
 void	bufinit(void);
 void	buf_dirty(struct buf *);
 void	buf_undirty(struct buf *);
+void	buf_adjcnt(struct buf *, long);
 int	bwrite(struct buf *);
 struct	buf *getblk(struct vnode *, daddr_t, int, int, int);
 struct	buf *geteblk(int);
diff --git a/sys/ufs/ffs/ffs_alloc.c b/sys/ufs/ffs/ffs_alloc.c
index da4ff2a..08961b9 100644
--- a/sys/ufs/ffs/ffs_alloc.c
+++ b/sys/ufs/ffs/ffs_alloc.c
@@ -218,7 +218,7 @@ ffs_realloccg(struct inode *ip, daddr_t lbprev, daddr_t bpref, int osize,
 	if (bpp != NULL) {
 		if ((error = bread(ITOV(ip), lbprev, fs->fs_bsize, &bp)) != 0)
 			goto error;
-		bp->b_bcount = osize;
+		buf_adjcnt(bp, osize);
 	}
 
 	if ((error = ufs_quota_alloc_blocks(ip, btodb(nsize - osize), cred))
@@ -241,7 +241,7 @@ ffs_realloccg(struct inode *ip, daddr_t lbprev, daddr_t bpref, int osize,
 			if (nsize > bp->b_bufsize)
 				panic("ffs_realloccg: small buf");
 #endif
-			bp->b_bcount = nsize;
+			buf_adjcnt(bp, nsize);
 			bp->b_flags |= B_DONE;
 			memset(bp->b_data + osize, 0, nsize - osize);
 			*bpp = bp;
@@ -313,7 +313,7 @@ ffs_realloccg(struct inode *ip, daddr_t lbprev, daddr_t bpref, int osize,
 		if (nsize > bp->b_bufsize)
 			panic("ffs_realloccg: small buf 2");
 #endif
-		bp->b_bcount = nsize;
+		buf_adjcnt(bp, nsize);
 		bp->b_flags |= B_DONE;
 		memset(bp->b_data + osize, 0, nsize - osize);
 		*bpp = bp;
diff --git a/sys/ufs/ffs/ffs_balloc.c b/sys/ufs/ffs/ffs_balloc.c
index 16f9b6f..ef9f6d4 100644
--- a/sys/ufs/ffs/ffs_balloc.c
+++ b/sys/ufs/ffs/ffs_balloc.c
@@ -165,7 +165,7 @@ ffs1_balloc(struct inode *ip, off_t startoffset, int size, struct ucred *cred,
 					brelse(*bpp);
 					return (error);
 				}
-				(*bpp)->b_bcount = osize;
+				buf_adjcnt((*bpp), osize);
 			}
 			return (0);
 		} else {
@@ -535,7 +535,7 @@ ffs2_balloc(struct inode *ip, off_t off, int size, struct ucred *cred,
 					brelse(*bpp);
 					return (error);
 				}
-				(*bpp)->b_bcount = osize;
+				buf_adjcnt((*bpp), osize);
 			}
 
 			return (0);
diff --git a/sys/ufs/ffs/ffs_inode.c b/sys/ufs/ffs/ffs_inode.c
index 1eaa8b5..25c5fd5 100644
--- a/sys/ufs/ffs/ffs_inode.c
+++ b/sys/ufs/ffs/ffs_inode.c
@@ -262,7 +262,7 @@ ffs_truncate(struct inode *oip, off_t length, int flags, struct ucred *cred)
 		(void) uvm_vnp_uncache(ovp);
 		if (ovp->v_type != VDIR)
 			memset(bp->b_data + offset, 0, size - offset);
-		bp->b_bcount = size;
+		buf_adjcnt(bp, size);
 		if (aflags & B_SYNC)
 			bwrite(bp);
 		else
diff --git a/sys/ufs/ffs/ffs_subr.c b/sys/ufs/ffs/ffs_subr.c
index 938af62..4282779 100644
--- a/sys/ufs/ffs/ffs_subr.c
+++ b/sys/ufs/ffs/ffs_subr.c
@@ -72,7 +72,7 @@ ffs_bufatoff(struct inode *ip, off_t offset, char **res, struct buf **bpp)
 		brelse(bp);
 		return (error);
 	}
-	bp->b_bcount = bsize;
+	buf_adjcnt(bp, bsize);
 	if (res)
 		*res = (char *)bp->b_data + blkoff(fs, offset);
 	*bpp = bp;
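Routing every b_bcount update through one function gives a single place where a journal can later be told about size changes. The following userland sketch shows that idea under stated assumptions: struct buf is reduced to the two fields used here, assert() stands in for KASSERT, and resize_hook and record_resize() are hypothetical names invented for this illustration, not part of the diff or of WAPBL.

```c
#include <assert.h>
#include <stddef.h>

/* Reduced stand-in for the kernel's struct buf. */
struct buf {
	long b_bufsize;		/* allocated size of the buffer */
	long b_bcount;		/* valid byte count */
};

/* Hypothetical observer, standing in for a journal that tracks sizes. */
static void (*resize_hook)(struct buf *, long, long);

/* Values seen by the sample observer below; -1 means "never called". */
static long last_old = -1, last_new = -1;

/* Sample observer that only records its arguments. */
static void
record_resize(struct buf *bp, long ocount, long ncount)
{
	(void)bp;
	last_old = ocount;
	last_new = ncount;
}

/* Single choke point for changing the valid byte count. */
static void
buf_adjcnt(struct buf *bp, long ncount)
{
	long ocount = bp->b_bcount;

	assert(ncount <= bp->b_bufsize);	/* KASSERT in the kernel */
	bp->b_bcount = ncount;
	if (resize_hook != NULL)
		resize_hook(bp, ocount, ncount);
}
```

With no hook installed the function behaves like the plain assignment it replaces; installing record_resize shows the old and new counts arriving at the observer.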
Re: WAPBL: Introducing buf_adjcnt()
On Thu, Nov 26, 2015 at 11:01:58AM -0700, Bob Beck wrote:
> Duhh.. my bad walter. it's early.. as of yet undercaffinated.
>

No problem buddy!! :)
WAPBL: Adding the FFS capability to alloc files contiguously
This diff adds the B_CONTIG and B_METAONLY low-level allocation flags, and
the code that lets FFS allocate files contiguously. This will be used to
allocate the WAPBL log file.

--
Walter Neto

diff --git a/sys/sys/buf.h b/sys/sys/buf.h
index c47f3f9..fd38c28 100644
--- a/sys/sys/buf.h
+++ b/sys/sys/buf.h
@@ -254,6 +254,8 @@ struct cluster_save {
 /* Flags to low-level allocation routines. */
 #define B_CLRBUF	0x01	/* Request allocated buffer be cleared. */
 #define B_SYNC		0x02	/* Do all allocations synchronously. */
+#define B_METAONLY	0x04	/* return indirect block buffer */
+#define B_CONTIG	0x08	/* allocate file contiguously */
 
 struct cluster_info {
 	daddr_t	ci_lastr;	/* last read (read-ahead) */
diff --git a/sys/ufs/ffs/ffs_alloc.c b/sys/ufs/ffs/ffs_alloc.c
index 08961b9..ef0c518 100644
--- a/sys/ufs/ffs/ffs_alloc.c
+++ b/sys/ufs/ffs/ffs_alloc.c
@@ -63,16 +63,19 @@
 	    (fs)->fs_fsmnt, (cp));				\
 } while (0)
 
-daddr_t	ffs_alloccg(struct inode *, int, daddr_t, int);
+daddr_t	ffs_alloccg(struct inode *, int, daddr_t, int, int);
 struct buf *	ffs_cgread(struct fs *, struct inode *, int);
-daddr_t	ffs_alloccgblk(struct inode *, struct buf *, daddr_t);
-daddr_t	ffs_clusteralloc(struct inode *, int, daddr_t, int);
+daddr_t	ffs_alloccgblk(struct inode *, struct buf *, daddr_t, int);
+daddr_t	ffs_clusteralloc(struct inode *, int, daddr_t, int, int);
 ufsino_t	ffs_dirpref(struct inode *);
 daddr_t	ffs_fragextend(struct inode *, int, daddr_t, int, int);
-daddr_t	ffs_hashalloc(struct inode *, int, daddr_t, int,
-		    daddr_t (*)(struct inode *, int, daddr_t, int));
-daddr_t	ffs_nodealloccg(struct inode *, int, daddr_t, int);
+daddr_t	ffs_hashalloc(struct inode *, int, daddr_t, int, int,
+		    daddr_t (*)(struct inode *, int, daddr_t, int, int));
+daddr_t	ffs_nodealloccg(struct inode *, int, daddr_t, int, int);
 daddr_t	ffs_mapsearch(struct fs *, struct cg *, daddr_t, int);
+void	ffs_blkfree_subr(struct fs *, struct vnode *,
+	    struct inode *, daddr_t bno, long size);
+
 
 int	ffs1_reallocblks(void *);
 #ifdef FFS2
@@ -106,7 +109,7 @@
static const struct timeval fserr_interval = { 2, 0 }; * available block is located. */ int -ffs_alloc(struct inode *ip, daddr_t lbn, daddr_t bpref, int size, +ffs_alloc(struct inode *ip, daddr_t lbn, daddr_t bpref, int size, int flags, struct ucred *cred, daddr_t *bnp) { static struct timeval fsfull_last; @@ -147,7 +150,7 @@ ffs_alloc(struct inode *ip, daddr_t lbn, daddr_t bpref, int size, cg = dtog(fs, bpref); /* Try allocating a block. */ - bno = ffs_hashalloc(ip, cg, bpref, size, ffs_alloccg); + bno = ffs_hashalloc(ip, cg, bpref, size, flags, ffs_alloccg); if (bno > 0) { /* allocation successful, update inode data */ DIP_ADD(ip, blocks, btodb(size)); @@ -159,6 +162,12 @@ ffs_alloc(struct inode *ip, daddr_t lbn, daddr_t bpref, int size, /* Restore user's disk quota because allocation failed. */ (void) ufs_quota_free_blocks(ip, btodb(size), cred); + if (flags & B_CONTIG) { + /* +* Fail silently -- it's up to our caller to report errors. +*/ + return (ENOSPC); + } nospace: if (ratecheck(_last, _interval)) { ffs_fserr(fs, cred->cr_uid, "file system full"); @@ -178,7 +187,7 @@ nospace: */ int ffs_realloccg(struct inode *ip, daddr_t lbprev, daddr_t bpref, int osize, -int nsize, struct ucred *cred, struct buf **bpp, daddr_t *blknop) +int nsize, int flags, struct ucred *cred, struct buf **bpp, daddr_t *blknop) { static struct timeval fsfull_last; struct fs *fs; @@ -295,7 +304,7 @@ ffs_realloccg(struct inode *ip, daddr_t lbprev, daddr_t bpref, int osize, panic("ffs_realloccg: bad optim"); /* NOTREACHED */ } - bno = ffs_hashalloc(ip, cg, bpref, request, ffs_alloccg); + bno = ffs_hashalloc(ip, cg, bpref, request, flags, ffs_alloccg); if (bno <= 0) goto nospace; @@ -434,7 +443,7 @@ ffs1_reallocblks(void *v) /* * Find the preferred location for the cluster. */ - pref = ffs1_blkpref(ip, start_lbn, soff, sbap); + pref = ffs1_blkpref(ip, start_lbn, soff, 0, sbap); /* * If the block range spans two block maps, get the second map. 
*/ @@ -454,7 +463,7 @@ ffs1_reallocblks(void *v) /* * Search the block map looking for an allocation of the desired size. */ - if ((newblk = ffs_hashalloc(ip, dtog(fs, pref), pref, len, + if ((newb
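The B_CONTIG branch in ffs_alloc() changes only how a failed allocation is reported: the caller still gets ENOSPC, but nothing is logged. A reduced userland sketch of that behaviour follows. mock_alloc(), report_full(), the have_space parameter and the fserr_reports counter are all hypothetical stand-ins written for this note; the real function performs the block search and calls ffs_fserr() and uprintf().

```c
#include <errno.h>

#define B_CONTIG	0x08	/* value from the buf.h hunk in the diff */

static int fserr_reports;	/* counts "file system full" reports */

/* Stand-in for ffs_fserr()/uprintf(); it only counts calls. */
static void
report_full(void)
{
	fserr_reports++;
}

/*
 * Failure tail of a mock allocator.  have_space is a hypothetical
 * parameter standing in for the result of the real block search.
 */
static int
mock_alloc(int have_space, int flags)
{
	if (have_space)
		return 0;
	if (flags & B_CONTIG)
		return ENOSPC;		/* fail silently; the caller reports */
	report_full();
	return ENOSPC;
}
```

This models only the path that falls through to the new check. In the diffs that place the check before the nospace: label, code that jumps directly to that label would not pass through it, which is a detail this sketch does not try to reproduce.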
Changes needed in buffercache(9) for WAPBL
Changes needed in buffercache(9) for WAPBL

- All changes needed in vfs_bio.c
- Adding WAPBL headers
- Introducing buf_adjcnt to inform WAPBL when a buffer has changed its
  size (taken from Bitrig)

Hi guys, with this diff I'm trying to introduce WAPBL to OpenBSD in smaller
pieces than before.

If anyone wants to see the whole implementation, you can get it at
https://github.com/radixo/openbsd-src/tree/wapbl

I hope to get the feature into OpenBSD as soon as possible.
Thanks in advance guys. :)

Index: sys/kern/vfs_bio.c
===================================================================
RCS file: /Volumes/CSP/cvs/src/sys/kern/vfs_bio.c,v
retrieving revision 1.170
diff -u -r1.170 vfs_bio.c
--- sys/kern/vfs_bio.c	19 Jul 2015 16:21:11 -0000	1.170
+++ sys/kern/vfs_bio.c	25 Nov 2015 14:37:01 -0000
@@ -56,7 +56,7 @@
 #include
 #include
 #include
-#include
+#include
 #include
 
 int nobuffers;
@@ -556,6 +556,16 @@
 		mp = NULL;
 
 	/*
+	 * If using WAPBL, convert it to a delayed write
+	 */
+	if (mp && mp->mnt_wapbl) {
+		if (bp->b_iodone != mp->mnt_wapbl_op->wo_wapbl_biodone) {
+			bdwrite(bp);
+			return 0;
+		}
+	}
+
+	/*
 	 * Remember buffer type, to switch on it later. If the write was
 	 * synchronous, but the file system was mounted with MNT_ASYNC,
 	 * convert it to a delayed write.
@@ -628,7 +638,6 @@
 	return (rv);
 }
 
-
 /*
  * Delayed write.
  *
@@ -647,6 +656,20 @@
 {
 	int s;
 
+	/* If this is a tape block, write the block now. */
+	if (major(bp->b_dev) < nblkdev &&
+	    bdevsw[major(bp->b_dev)].d_type == D_TAPE) {
+		bawrite(bp);
+		return;
+	}
+
+	if (wapbl_vphaswapbl(bp->b_vp)) {
+		struct mount *mp = wapbl_vptomp(bp->b_vp);
+
+		if (bp->b_iodone != mp->mnt_wapbl_op->wo_wapbl_biodone)
+			WAPBL_ADD_BUF(mp, bp);
+	}
+
 	/*
 	 * If the block hasn't been seen before:
 	 * (1) Mark it as having been seen,
@@ -663,13 +686,6 @@
 		curproc->p_ru.ru_oublock++; /* XXX */
 	}
 
-	/* If this is a tape block, write the block now. */
-	if (major(bp->b_dev) < nblkdev &&
-	    bdevsw[major(bp->b_dev)].d_type == D_TAPE) {
-		bawrite(bp);
-		return;
-	}
-
 	/* Otherwise, the "write" is done, so mark and release the buffer.
*/ CLR(bp->b_flags, B_NEEDCOMMIT); SET(bp->b_flags, B_DONE); @@ -743,12 +759,28 @@ * Determine which queue the buffer should be on, then put it there. */ + /* If it's locked, don't report an error; try again later */ + if (ISSET(bp->b_flags, (B_LOCKED|B_ERROR)) == (B_LOCKED|B_ERROR)) + CLR(bp->b_flags, B_ERROR); + /* If it's not cacheable, or an error, mark it invalid. */ if (ISSET(bp->b_flags, (B_NOCACHE|B_ERROR))) SET(bp->b_flags, B_INVAL); if (ISSET(bp->b_flags, B_INVAL)) { /* +* If using WAPBL +*/ + if (ISSET(bp->b_flags, B_LOCKED)) { + if (wapbl_vphaswapbl(bp->b_vp)) { + struct mount *mp = wapbl_vptomp(bp->b_vp); + KASSERT(bp->b_iodone + != mp->mnt_wapbl_op->wo_wapbl_biodone); + WAPBL_REMOVE_BUF(mp, bp); + } + } + + /* * If the buffer is invalid, free it now rather than leaving * it in a queue and wasting memory. */ @@ -1079,6 +,19 @@ if (!ISSET(bp->b_flags, B_DELWRI)) panic("Clean buffer on dirty queue"); #endif + + +#ifdef WAPBL + if (ISSET(bp->b_flags, B_LOCKED) && + wapbl_vphaswapbl(bp->b_vp)) { + brelse(bp); + struct mount *mp = wapbl_vptomp(bp->b_vp); + wapbl_flush(mp->mnt_wapbl, 1); + s = splbio(); + continue; + } +#endif /* WAPBL */ + if (LIST_FIRST(>b_dep) != NULL && !ISSET(bp->b_flags, B_DEFERRED) && buf_countdeps(bp, 0, 0)) { @@ -1206,6 +1251,17 @@ } #endif +void +buf_adjcnt(struct buf *bp, long ncount) +{ + KASSERT(ncount <= bp->b_bufsize); + long ocount = bp->b_bcount; + bp->b_bcount = ncount; + if (wapbl_vphaswapbl(bp->b_vp)) + WAPBL_RESIZE_BUF(wapbl_vptomp(bp->b_vp), bp, bp->b_bufsize, + ocount); +} + /* bufcache freelist code
Re: Changes needed in buffercache(9) for WAPBL
On Wed, Nov 25, 2015 at 11:01:40AM -0700, Bob Beck wrote:
> At first go, you need a bit more #ifdef WAPBL in there.. the idea
> being we want to be able to build kernels selectively with
> and without WAPBL compiled in at all and initially we may not (by
> default) put it in GENERIC.
>
> I'll mail you a diff shortly.

Thanks Bob, I'll wait for your diff before sending my next attempt at a
clear and acceptable diff. mpi@ talked with me too, and gave me some tips.

>
> On Wed, Nov 25, 2015 at 10:55 AM, Bob Beck <b...@openbsd.org> wrote:
> > Nice walter.. I was just going to start separating this out myself and
> > theo distracted me :)
> >
> > I'll take a look at this right away.
> >
> >
> > On Wed, Nov 25, 2015 at 8:27 AM, Walter Neto <wsouz...@gmail.com> wrote:
> >> Changes needed in buffercache(9) for WAPBL
> >>
> >> - All changes needed in vfs_bio.c
> >> - Adding WAPBL headers
> >> - Introducing buf_adjcnt to inform WAPBL when a buffer has changed its
> >>   size (taken from Bitrig)
> >>
> >> Hi guys,
> >>
> >> With this diff I'm trying to introduce WAPBL to OpenBSD in smaller
> >> pieces than before.
> >>
> >> If anyone wants to see the whole implementation, it is available at
> >> https://github.com/radixo/openbsd-src/tree/wapbl
> >>
> >> I hope to get the feature into OpenBSD as soon as possible.
> >> Thanks in advance guys. :)
> >>
> >> Index: sys/kern/vfs_bio.c
> >> ===
> >> RCS file: /Volumes/CSP/cvs/src/sys/kern/vfs_bio.c,v
> >> retrieving revision 1.170
> >> diff -u -r1.170 vfs_bio.c
> >> --- sys/kern/vfs_bio.c	19 Jul 2015 16:21:11 -	1.170
> >> +++ sys/kern/vfs_bio.c	25 Nov 2015 14:37:01 -
> >> @@ -56,7 +56,7 @@
> >>  #include
> >>  #include
> >>  #include
> >> -#include
> >> +#include
> >>  #include
> >>
> >>  int nobuffers;
> >> @@ -556,6 +556,16 @@
> >>  		mp = NULL;
> >>
> >>  	/*
> >> +	 * If using WAPBL, convert it to a delayed write
> >> +	 */
> >> +	if (mp && mp->mnt_wapbl) {
> >> +		if (bp->b_iodone != mp->mnt_wapbl_op->wo_wapbl_biodone) {
> >> +			bdwrite(bp);
> >> +			return 0;
> >> +		}
> >> +	}
> >> +
> >> +	/*
> >>  	 * Remember buffer type, to switch on it later.  If the write was
> >>  	 * synchronous, but the file system was mounted with MNT_ASYNC,
> >>  	 * convert it to a delayed write.
> >> @@ -628,7 +638,6 @@
> >>  	return (rv);
> >>  }
> >>
> >> -
> >>  /*
> >>   * Delayed write.
> >>   *
> >> @@ -647,6 +656,20 @@
> >>  {
> >>  	int s;
> >>
> >> +	/* If this is a tape block, write the block now. */
> >> +	if (major(bp->b_dev) < nblkdev &&
> >> +	    bdevsw[major(bp->b_dev)].d_type == D_TAPE) {
> >> +		bawrite(bp);
> >> +		return;
> >> +	}
> >> +
> >> +	if (wapbl_vphaswapbl(bp->b_vp)) {
> >> +		struct mount *mp = wapbl_vptomp(bp->b_vp);
> >> +
> >> +		if (bp->b_iodone != mp->mnt_wapbl_op->wo_wapbl_biodone)
> >> +			WAPBL_ADD_BUF(mp, bp);
> >> +	}
> >> +
> >>  	/*
> >>  	 * If the block hasn't been seen before:
> >>  	 *	(1) Mark it as having been seen,
> >> @@ -663,13 +686,6 @@
> >>  		curproc->p_ru.ru_oublock++;		/* XXX */
> >>  	}
> >>
> >> -	/* If this is a tape block, write the block now. */
> >> -	if (major(bp->b_dev) < nblkdev &&
> >> -	    bdevsw[major(bp->b_dev)].d_type == D_TAPE) {
> >> -		bawrite(bp);
> >> -		return;
> >> -	}
> >> -
> >>  	/* Otherwise, the "write" is done, so mark and release the buffer. */
> >>  	CLR(bp->b_flags, B_NEEDCOMMIT);
> >>  	SET(bp->b_flags, B_DONE);
> >> @@ -743,12 +759,28 @@
> >>
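Bob's point about building kernels with and without WAPBL is usually handled by making every journal hook disappear when the option is off. A minimal sketch of that pattern follows; the toy_* names are hypothetical and this is not the diff Bob mentions, only an illustration of the #ifdef technique.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical buffer with a single flag standing in for journal state. */
struct toy_buf {
	int in_journal;
};

/*
 * Build with -DWAPBL to compile the hook in.  Without it the macro
 * expands to a no-op, so callers need no #ifdef of their own.
 */
#ifdef WAPBL
#define TOY_WAPBL_ADD_BUF(bp)	((bp)->in_journal = 1)
#else
#define TOY_WAPBL_ADD_BUF(bp)	((void)(bp))
#endif

/* Stand-in for a delayed write path that calls the hook unconditionally. */
int
toy_bdwrite(struct toy_buf *bp)
{
	assert(bp != NULL);
	TOY_WAPBL_ADD_BUF(bp);
	return bp->in_journal;
}
```

Keeping the conditional in one header-style macro, rather than around each call site, is one way to limit the number of #ifdef blocks scattered through vfs_bio.c.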
Re: WAPBL implementation
Adding WAPBL support for dumpfs(8)

next diffs:
- tunefs(8) showing log information and setting log size
- fsck_ffs(8) WAPBL support

ok jasper@

Index: sbin/dumpfs/dumpfs.c
===
RCS file: /Volumes/CSP/cvs/src/sbin/dumpfs/dumpfs.c,v
retrieving revision 1.32
diff -u -r1.32 dumpfs.c
--- sbin/dumpfs/dumpfs.c	20 Jan 2015 18:22:21 -	1.32
+++ sbin/dumpfs/dumpfs.c	28 Oct 2015 10:40:26 -
@@ -40,14 +40,18 @@
 #include /* DEV_BSIZE MAXBSIZE isset */
 #include
+#include
+#include
 #include
+#include
 #include
 #include
 #include
 #include
 #include
+#include
 #include
 #include
 #include
@@ -68,12 +72,26 @@
 } cgun;
 #define	acg	cgun.cg
 
-int	dumpfs(int, const char *);
-int	dumpcg(const char *, int, int);
-int	marshal(const char *);
-int	open_disk(const char *);
-void	pbits(void *, int);
-__dead void	usage(void);
+union {
+	struct wapbl_wc_header wh;
+	struct wapbl_wc_null wn;
+	char pad[MAXBSIZE];
+} jbuf;
+#define	awh	jbuf.wh
+#define	awn	jbuf.wn
+
+int	dojournal = 0;
+
+int		dumpfs(int, const char *);
+int		dumpcg(const char *, int, int);
+int		marshal(const char *);
+int		open_disk(const char *);
+void		pbits(void *, int);
+int		print_journal(const char *, int);
+const char *	wapbl_type_string(unsigned);
+void		print_journal_header(const char *);
+off_t		print_journal_entries(const char *, size_t);
+__dead void	usage(void);
 
 int
 main(int argc, char *argv[])
@@ -84,8 +102,11 @@
 
 	domarshal = eval = 0;
 
-	while ((ch = getopt(argc, argv, "m")) != -1) {
+	while ((ch = getopt(argc, argv, "jm")) != -1) {
 		switch (ch) {
+		case 'j':	/* WAPBL journal */
+			dojournal = 1;
+			break;
 		case 'm':
 			domarshal = 1;
 			break;
@@ -258,6 +279,15 @@
 	    afs.fs_cgrotor, afs.fs_fmod, afs.fs_ronly, afs.fs_clean);
 	printf("avgfpdir %d\tavgfilesize %d\n",
 	    afs.fs_avgfpdir, afs.fs_avgfilesize);
+	if (dojournal) {
+		printf("wapbl version 0x%x\tlocation %u\tflags 0x%x\n",
+		    afs.fs_journal_version, afs.fs_journal_location,
+		    afs.fs_journal_flags);
+		printf("wapbl loc0 %llu\tloc1 %llu",
+		    afs.fs_journallocs[0], afs.fs_journallocs[1]);
+		printf("\tloc2 %llu\tloc3 %llu\n",
+		    afs.fs_journallocs[2], afs.fs_journallocs[3]);
+	}
 	printf("flags\t");
 	if (afs.fs_magic == FS_UFS2_MAGIC ||
 	    afs.fs_ffs1_flags & FS_FLAGS_UPDATED)
@@ -270,6 +300,8 @@
 		printf("unclean ");
 	if (fsflags & FS_DOSOFTDEP)
 		printf("soft-updates ");
+	if (fsflags & FS_DOWAPBL)
+		printf("wapbl ");
 	if (fsflags & FS_FLAGS_UPDATED)
 		printf("updated ");
 #if 0
@@ -281,6 +313,10 @@
 	printf("fsmnt\t%s\n", afs.fs_fsmnt);
 	printf("volname\t%s\tswuid\t%ju\n",
 	    afs.fs_volname, (uintmax_t)afs.fs_swuid);
+	if (dojournal) {
+		printf("\n");
+		print_journal(name, fd);
+	}
 	printf("\ncs[].cs_(nbfree,ndir,nifree,nffree):\n\t");
 	afs.fs_csp = calloc(1, afs.fs_cssize);
 	for (i = 0, j = 0; i < afs.fs_cssize; i += afs.fs_bsize, j++) {
@@ -457,9 +493,182 @@
 	printf("\n");
 }
 
+int
+print_journal(const char *name, int fd)
+{
+	daddr_t off;
+	size_t count, blklen, bno, skip;
+	off_t boff, head, tail, len;
+	uint32_t generation;
+
+	if (afs.fs_journal_version != UFS_WAPBL_VERSION)
+		return 0;
+
+	generation = 0;
+	head = tail = 0;
+
+	switch (afs.fs_journal_location) {
+	case UFS_WAPBL_JOURNALLOC_END_PARTITION:
+	case UFS_WAPBL_JOURNALLOC_IN_FILESYSTEM:
+
+		off    = afs.fs_journallocs[0];
+		count  = afs.fs_journallocs[1];
+		blklen = afs.fs_journallocs[2];
+
+		for (bno=0; bno= 2 * blklen &&
+			    ((head >= tail && (boff < tail || boff >= head)) ||
+			    (head < tail && (boff >= head && boff < tail
+				continue;
+
+			printf("journal block %lu offset %lld\n",
+			    (unsigned long)bno, (long long) boff);
+
+			if (lseek(fd, (off_t)(off*blklen) + boff, SEEK_SET)
+			    == (off_t)-1)
+				return (1);
+			if (read(fd, , blklen) != (ssize_t)blklen) {
+
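The head/tail comparison that survives in print_journal above is the usual test for whether an offset falls outside the live part of a circular log. A small sketch of just that predicate, assuming the live region is the half-open range from tail up to head (the function name is hypothetical, and the additional boff >= 2 * blklen guard from the diff is left out):

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Return true when byte offset boff lies outside the live region
 * [tail, head) of a circular log.  When head < tail the live region
 * wraps around the end of the log, so the dead region is the gap
 * between head and tail.
 */
bool
toy_outside_live_region(long long boff, long long head, long long tail)
{
	assert(boff >= 0 && head >= 0 && tail >= 0);

	if (head >= tail)
		return boff < tail || boff >= head;
	return boff >= head && boff < tail;
}
```

With head == tail the function reports every offset as outside, which matches an empty log under this assumption.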
WAPBL implementation
As recommended by other developers, I started developing WAPBL support
for OpenBSD. Looking at NetBSD and Bitrig, I made a first functional patch.

Index: sbin/mount/mntopts.h
===
RCS file: /Volumes/CSP/cvs/src/sbin/mount/mntopts.h,v
retrieving revision 1.16
diff -u -r1.16 mntopts.h
--- sbin/mount/mntopts.h	13 Jul 2014 12:01:30 -	1.16
+++ sbin/mount/mntopts.h	23 Oct 2015 15:07:07 -
@@ -66,6 +66,8 @@
 	| MFLAG_OPT }
 #define MOPT_SOFTDEP	{ "softdep",	MNT_SOFTDEP, MFLAG_SET }
 
+#define MOPT_LOG	{ "log",	MNT_LOG, MFLAG_SET }
+
 /* Control flags. */
 #define MOPT_FORCE	{ "force",	MNT_FORCE, MFLAG_SET }
 #define MOPT_UPDATE	{ "update",	MNT_UPDATE, MFLAG_SET }
Index: sbin/mount/mount.c
===
RCS file: /Volumes/CSP/cvs/src/sbin/mount/mount.c,v
retrieving revision 1.60
diff -u -r1.60 mount.c
--- sbin/mount/mount.c	16 Jan 2015 06:39:59 -	1.60
+++ sbin/mount/mount.c	23 Oct 2015 15:07:07 -
@@ -94,6 +94,7 @@
 	{ MNT_ROOTFS,		1,	"root file system",	"" },
 	{ MNT_SYNCHRONOUS,	0,	"synchronous",		"sync" },
 	{ MNT_SOFTDEP,		0,	"softdep",		"softdep" },
+	{ MNT_LOG,		0,	"log",			"log" },
 	{ 0,			0,	"",			"" }
 };
 
Index: sbin/mount_ffs/mount_ffs.c
===
RCS file: /Volumes/CSP/cvs/src/sbin/mount_ffs/mount_ffs.c,v
retrieving revision 1.21
diff -u -r1.21 mount_ffs.c
--- sbin/mount_ffs/mount_ffs.c	16 Jan 2015 06:39:59 -	1.21
+++ sbin/mount_ffs/mount_ffs.c	23 Oct 2015 15:07:07 -
@@ -53,6 +53,7 @@
 	MOPT_RELOAD,
 	MOPT_FORCE,
 	MOPT_SOFTDEP,
+	MOPT_LOG,
 	{ NULL }
 };
 
Index: sys/conf/GENERIC
===
RCS file: /Volumes/CSP/cvs/src/sys/conf/GENERIC,v
retrieving revision 1.220
diff -u -r1.220 GENERIC
--- sys/conf/GENERIC	10 Aug 2015 20:35:36 -	1.220
+++ sys/conf/GENERIC	23 Oct 2015 15:07:07 -
@@ -43,6 +43,7 @@
 option		FIFO		# FIFOs; RECOMMENDED
 option		TMPFS		# efficient memory file system
 option		FUSE		# FUSE
+option		WAPBL		# Write Ahead Physical Block Logging
 option		SOCKET_SPLICE	# Socket Splicing for TCP and UDP
 option		TCP_SACK	# Selective Acknowledgements for TCP
 
Index: sys/conf/files
===
RCS file: /Volumes/CSP/cvs/src/sys/conf/files,v
retrieving revision 1.604
diff -u -r1.604 files
--- sys/conf/files	9 Oct 2015 01:17:21 -	1.604
+++ sys/conf/files	23 Oct 2015 15:07:07 -
@@ -732,6 +732,7 @@
 file kern/vfs_vops.c
 file kern/vfs_vnops.c
 file kern/vfs_getcwd.c
+file kern/vfs_wapbl.c			wapbl
 file kern/spec_vnops.c
 file miscfs/deadfs/dead_vnops.c
 file miscfs/fifofs/fifo_vnops.c		fifo
@@ -887,6 +888,7 @@
 file ufs/ffs/ffs_vfsops.c		ffs | mfs
 file ufs/ffs/ffs_vnops.c		ffs | mfs
 file ufs/ffs/ffs_softdep.c		ffs_softupdates
+file ufs/ffs/ffs_wapbl.c		ffs & wapbl
 file ufs/mfs/mfs_vfsops.c		mfs
 file ufs/mfs/mfs_vnops.c		mfs
 file ufs/ufs/ufs_bmap.c			ffs | mfs | ext2fs
@@ -898,6 +900,7 @@
 file ufs/ufs/ufs_quota_stub.c		ffs | mfs
 file ufs/ufs/ufs_vfsops.c		ffs | mfs | ext2fs
 file ufs/ufs/ufs_vnops.c		ffs | mfs | ext2fs
+file ufs/ufs/ufs_wapbl.c		ffs & wapbl
 file ufs/ext2fs/ext2fs_alloc.c		ext2fs
 file ufs/ext2fs/ext2fs_balloc.c		ext2fs
 file ufs/ext2fs/ext2fs_bmap.c		ext2fs
Index: sys/kern/spec_vnops.c
===
RCS file: /Volumes/CSP/cvs/src/sys/kern/spec_vnops.c,v
retrieving revision 1.83
diff -u -r1.83 spec_vnops.c
--- sys/kern/spec_vnops.c	10 Feb 2015 21:56:09 -	1.83
+++ sys/kern/spec_vnops.c	23 Oct 2015 15:07:07 -
@@ -408,6 +408,10 @@
 	return (EOPNOTSUPP);
 }
 
+#ifdef WAPBL
+extern int ffs_wapbl_fsync_vfs(struct vnode *, int);
+#endif
+
 /*
  * Synch buffers associated with a block device
  */
@@ -422,6 +426,15 @@
 	if (vp->v_type == VCHR)
 		return (0);
 
+
+
+#ifdef WAPBL
+	if (vp->v_type == VBLK &&
+	    vp->v_specmountpoint != NULL &&
+	    vp->v_specmountpoint->mnt_wapbl != NULL)
+		return (ffs_wapbl_fsync_vfs(vp, ap->a_waitfor));
+#endif
+
 	/*
 	 * Flush all dirty buffers associated with a block device.
 	 */
Index: sys/kern/vfs_bio.c
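The mount(8) side of the patch only adds a "log" entry to existing option tables. How such a table maps an option name to a flag bit can be sketched as below; the flag values, the two-column table layout and the toy_* names are all assumptions for illustration and do not reproduce the real mntopts.h structures.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical flag bits; the real MNT_* values live in sys/mount.h. */
#define TOY_MNT_SOFTDEP	0x1
#define TOY_MNT_LOG	0x2

struct toy_opt {
	const char *name;	/* option as typed after -o */
	int flag;		/* bit to set in the mount flags */
};

/* NULL-terminated table, in the spirit of the MOPT_* lists above. */
static const struct toy_opt toy_opts[] = {
	{ "softdep",	TOY_MNT_SOFTDEP },
	{ "log",	TOY_MNT_LOG },
	{ NULL,		0 }
};

/* Return the flag bit for an option name, or 0 when it is unknown. */
int
toy_opt_flag(const char *name)
{
	const struct toy_opt *o;

	assert(name != NULL);
	for (o = toy_opts; o->name != NULL; o++)
		if (strcmp(o->name, name) == 0)
			return o->flag;
	return 0;
}
```

Adding a new option then costs one table row, which is why the mount.c and mount_ffs.c hunks are single-line additions.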
Re: Journaled Soft Updates
Hi guys,

Studying WAPBL more, I saw that it is a little faster than SU+J and much
easier to implement. Please correct me if I'm wrong.

As for a diff with working code: my next one would have had working code,
but not the whole thing. The plan was to proceed in homeopathic (small)
doses.

But now I have doubts whether to continue the SU+J implementation or to
implement WAPBL (easier, because Pedro has already implemented it in Bitrig).

Any suggestions?

> On Sep 1, 2015, at 8:08 PM, Bob Beck <b...@openbsd.org> wrote:
>
> I would much rather see something like wapbl ported than this.. I
> believe there be dragons here.
>
> You're welcome to try, but I anticipate heartbreak ;)
>
> If you're going to try I'd rather see a diff you got *working* rather
> than just the structure definitions for
> something that might never be brought to fruition.
>
>
> On Tue, Sep 1, 2015 at 3:14 PM, Ted Unangst <t...@tedunangst.com> wrote:
>> Walter Neto wrote:
>>> Hi,
>>>
>>> Here is the first patch to bring journaling to OpenBSD, based on the
>>> McKusick paper:
>>> https://www.bsdcan.org/2010/schedule/attachments/141_suj-slides.pd and
>>> FreeBSD 10
>>>
>>> This first patch is just for structures and definitions.
>>>
>>> I know this is my first patch suggestion and that this feature is
>>> unusual, but I am fully focused on making it work, and I hope to
>>> receive help from expert developers.
>>
>> Yikes!
>>
>> Do I understand correctly that you do not currently have anything working?
>>
>> My opinion is that this feature is too complicated and disruptive. My
>> takeaway from the talk was "we got it to work, but it was a lot harder
>> than we thought and probably wouldn't do it again". I think I can safely
>> speak for most of the other openbsd developers that we are kind of scared
>> of any large (or even small!) change to the softdep code.
>>
>> If this is a topic that interests you, I think the NetBSD wapbl is a
>> simpler approach. There is even code in bitrig which can likely be made
>> to work on openbsd.
>>
>> The softdep approach is very interesting, so don't let me stop you from
>> studying it, but if you want something a little more "practical" I think
>> there are better things to do.
>>
Journaled Soft Updates
Hi,

Here is the first patch to bring journaling to OpenBSD, based on the
McKusick paper:
https://www.bsdcan.org/2010/schedule/attachments/141_suj-slides.pd and
FreeBSD 10

This first patch is just for structures and definitions.

I know this is my first patch suggestion and that this feature is unusual,
but I am fully focused on making it work, and I hope to receive help from
expert developers.

Walter Neto

Index: sys/sys/mount.h
===
RCS file: /Volumes/CSP/cvs/src/sys/sys/mount.h,v
retrieving revision 1.121
diff -u -u -r1.121 mount.h
--- sys/sys/mount.h	8 Sep 2014 01:47:06 -	1.121
+++ sys/sys/mount.h	1 Sep 2015 20:23:03 -
@@ -413,6 +413,7 @@
 #define MNT_WANTRDWR	0x0200	/* want upgrade to read/write */
 #define MNT_SOFTDEP	0x0400	/* soft dependencies being done */
 #define MNT_DOOMED	0x0800	/* device behind filesystem is gone */
+#define MNT_SUJ		0x1000	/* using journaled soft updates */
 
 /*
  * Flags for various system call interfaces.
Index: sys/ufs/ffs/softdep.h
===
RCS file: /Volumes/CSP/cvs/src/sys/ufs/ffs/softdep.h,v
retrieving revision 1.17
diff -u -u -r1.17 softdep.h
--- sys/ufs/ffs/softdep.h	11 Jun 2013 16:42:18 -	1.17
+++ sys/ufs/ffs/softdep.h	1 Sep 2015 20:23:03 -
@@ -165,6 +165,12 @@
 LIST_HEAD(allocindirhd, allocindir);
 LIST_HEAD(allocdirecthd, allocdirect);
 TAILQ_HEAD(allocdirectlst, allocdirect);
+LIST_HEAD(jaddrefhd, jaddref);
+LIST_HEAD(jremrefhd, jremref);
+LIST_HEAD(jmvrefhd, jmvref);
+LIST_HEAD(jnewblkhd, jnewblk);
+LIST_HEAD(jblkdephd, jblkdep);
+TAILQ_HEAD(jseglst, jseg);
 
 /*
  * The "pagedep" structure tracks the various dependencies related to
@@ -587,3 +593,246 @@
 #	define db_state db_list.wk_state /* unused */
 	struct	pagedep *db_pagedep;	/* associated pagedep */
 };
+
+/*
+ * The inoref structure holds the elements common to jaddref and jremref
+ * so they may easily be queued in-order on the inodedep.
+ */
+struct inoref {
+	struct	worklist if_list;	/* Journal pending or jseg entries. */
+#	define	if_state if_list.wk_state
+	TAILQ_ENTRY(inoref) if_deps;	/* Links for inodedep. */
+	struct	jsegdep	*if_jsegdep;	/* Will track our journal record. */
+	doff_t		if_diroff;	/* Directory offset. */
+	ufsino_t	if_ino;		/* Inode number. */
+	ufsino_t	if_parent;	/* Parent inode number. */
+	nlink_t		if_nlink;	/* nlink before addition. */
+	uint16_t	if_mode;	/* File mode, needed for IFMT. */
+};
+
+/*
+ * A "jaddref" structure tracks a new reference (link count) on an inode
+ * and prevents the link count increase and bitmap allocation until a
+ * journal entry can be written.  Once the journal entry is written,
+ * the inode is put on the pendinghd of the bmsafemap and a diradd or
+ * mkdir entry is placed on the bufwait list of the inode.  The DEPCOMPLETE
+ * flag is used to indicate that all of the required information for writing
+ * the journal entry is present.  MKDIR_BODY and MKDIR_PARENT are used to
+ * differentiate . and .. links from regular file names.  NEWBLOCK indicates
+ * a bitmap is still pending.  If a new reference is canceled by a delete
+ * prior to writing the journal the jaddref write is canceled and the
+ * structure persists to prevent any disk-visible changes until it is
+ * ultimately released when the file is freed or the link is dropped again.
+ */
+struct jaddref {
+	struct	inoref	ja_ref;		/* see inoref above. */
+#	define	ja_list	ja_ref.if_list	/* Jrnl pending, id_inowait, dm_jwork.*/
+#	define	ja_state ja_ref.if_list.wk_state
+	LIST_ENTRY(jaddref) ja_bmdeps;	/* Links for bmsafemap. */
+	union {
+		struct	diradd	*jau_diradd;	/* Pending diradd. */
+		struct	mkdir	*jau_mkdir;	/* MKDIR_{PARENT,BODY} */
+	} ja_un;
+};
+#define	ja_diradd	ja_un.jau_diradd
+#define	ja_mkdir	ja_un.jau_mkdir
+#define	ja_diroff	ja_ref.if_diroff
+#define	ja_ino		ja_ref.if_ino
+#define	ja_parent	ja_ref.if_parent
+#define	ja_mode		ja_ref.if_mode
+
+/*
+ * A "jremref" structure tracks a removed reference (unlink) on an
+ * inode and prevents the directory remove from proceeding until the
+ * journal entry is written.  Once the journal has been written the remove
+ * may proceed as normal.
+ */
+struct jremref {
+	struct	inoref	jr_ref;		/* see inoref above. */
+#	define	jr_list	jr_ref.if_list	/* Linked to softdep_journal_pending. */
+#	define	jr_state jr_ref.if_list.wk_state
+	LIST_ENTRY(jremref) jr_deps;	/* Links for dirrem. */
+	struct	dirrem	*jr_dirrem;
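The inoref comment above describes a common C idiom: put the shared fields in one struct, embed it as the first member of each specific record, and hide the extra level with accessor macros. A stripped-down sketch, with hypothetical toy_* and tja_* names and only a few of the fields, might look like this:

```c
#include <assert.h>
#include <stddef.h>

/* Fields common to both record types (a small subset, for illustration). */
struct toy_inoref {
	unsigned if_ino;	/* inode number */
	unsigned if_parent;	/* parent inode number */
	long if_diroff;		/* directory offset */
};

/* "Add reference" record: common part first, then its own data. */
struct toy_jaddref {
	struct toy_inoref tja_ref;
	int tja_pending;
};
#define tja_ino		tja_ref.if_ino
#define tja_parent	tja_ref.if_parent

/* "Remove reference" record shares the same leading layout. */
struct toy_jremref {
	struct toy_inoref tjr_ref;
};

/* Code that only needs the shared fields can work on the common part. */
unsigned
toy_inoref_ino(const struct toy_inoref *ref)
{
	assert(ref != NULL);
	return ref->if_ino;
}
```

Because both record types start with the same struct, a single queue of struct toy_inoref pointers can hold either kind in order, which is the property the softdep.h comment relies on.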
Re: Journal Implementation
On Jun 3, 2015, at 12:52 PM, Walter Neto wsouz...@gmail.com wrote:

Analising the tips, I decided to implement the one given by Paul. It is
less dramatic and solves the problem.

Analysing*
Sorry for the English, I'll improve!

ok?

On Jun 3, 2015, at 2:37 AM, Ville Valkonen weezeld...@gmail.com wrote:

Hi,

On Jun 3, 2015 3:17 AM, Walter Neto wsouz...@gmail.com wrote:

Thanks guys. I will read all the tips and start to code.
Once I have a diff, I will share it.

On Jun 2, 2015, at 9:06 PM, Walter Neto wsouz...@gmail.com wrote:

On Jun 2, 2015, at 5:03 PM, Paul de Weerd we...@weirdnet.nl wrote:

On Tue, Jun 02, 2015 at 07:33:58PM +, Stefan wrote:
| http://www.openbsd.org/faq/faq8.html#Journaling

Right, that doesn't help, it's not a tip for someone interested in
*developing a journaling system for UFS*... You can rest assured
they're already aware that OpenBSD doesn't support journaling.

| On Tue, Jun 2, 2015, 2:31 PM Walter Neto wsouz...@gmail.com wrote:
| Hi.
|
| I want to help OpenBSD by developing a journaling system for UFS.
|
| Can someone give me a tip?

Please have a look at this implementation from the FFS author:

http://www.mckusick.com/softdep/suj.pdf

I believe the code is available in FreeBSD. All it takes is porting.

Before you start your work, keep in mind that there's no guarantee
that it will be incorporated in OpenBSD. But, if you present your
case with proper diffs, you should at least get some attention.

Thank you Paul!

I intend to help the OpenBSD project, because it is the OS I like best,
and the feature is the one I need now. So, I will do my best!

Good luck!

Paul 'WEiRD' de Weerd

--
[++-]+++.+++[---].+++[+ +++-].++[-]+.--.[-]
http://www.weirdnet.nl/

Have a look at Bitrig; they have implemented NetBSD's journaling, and it
is an OpenBSD fork.

--
Regards,
Ville
Re: Journal Implementation
Analising the tips, I decided to implement the one given by Paul. It is
less dramatic and solves the problem.

ok?

On Jun 3, 2015, at 2:37 AM, Ville Valkonen weezeld...@gmail.com wrote:

Hi,

On Jun 3, 2015 3:17 AM, Walter Neto wsouz...@gmail.com wrote:

Thanks guys. I will read all the tips and start to code.
Once I have a diff, I will share it.

On Jun 2, 2015, at 9:06 PM, Walter Neto wsouz...@gmail.com wrote:

On Jun 2, 2015, at 5:03 PM, Paul de Weerd we...@weirdnet.nl wrote:

On Tue, Jun 02, 2015 at 07:33:58PM +, Stefan wrote:
| http://www.openbsd.org/faq/faq8.html#Journaling

Right, that doesn't help, it's not a tip for someone interested in
*developing a journaling system for UFS*... You can rest assured
they're already aware that OpenBSD doesn't support journaling.

| On Tue, Jun 2, 2015, 2:31 PM Walter Neto wsouz...@gmail.com wrote:
| Hi.
|
| I want to help OpenBSD by developing a journaling system for UFS.
|
| Can someone give me a tip?

Please have a look at this implementation from the FFS author:

http://www.mckusick.com/softdep/suj.pdf

I believe the code is available in FreeBSD. All it takes is porting.

Before you start your work, keep in mind that there's no guarantee
that it will be incorporated in OpenBSD. But, if you present your
case with proper diffs, you should at least get some attention.

Thank you Paul!

I intend to help the OpenBSD project, because it is the OS I like best,
and the feature is the one I need now. So, I will do my best!

Good luck!

Paul 'WEiRD' de Weerd

--
[++-]+++.+++[---].+++[+ +++-].++[-]+.--.[-]
http://www.weirdnet.nl/

Have a look at Bitrig; they have implemented NetBSD's journaling, and it
is an OpenBSD fork.

--
Regards,
Ville
Journal Implementation
Hi.

I want to help OpenBSD by developing a journaling system for UFS.

Can someone give me a tip?

Thanks.
Re: Journal Implementation
Thanks guys. I will read all the tips and start to code.
Once I have a diff, I will share it.

On Jun 2, 2015, at 9:06 PM, Walter Neto wsouz...@gmail.com wrote:

On Jun 2, 2015, at 5:03 PM, Paul de Weerd we...@weirdnet.nl wrote:

On Tue, Jun 02, 2015 at 07:33:58PM +, Stefan wrote:
| http://www.openbsd.org/faq/faq8.html#Journaling

Right, that doesn't help, it's not a tip for someone interested in
*developing a journaling system for UFS*... You can rest assured
they're already aware that OpenBSD doesn't support journaling.

| On Tue, Jun 2, 2015, 2:31 PM Walter Neto wsouz...@gmail.com wrote:
| Hi.
|
| I want to help OpenBSD by developing a journaling system for UFS.
|
| Can someone give me a tip?

Please have a look at this implementation from the FFS author:

http://www.mckusick.com/softdep/suj.pdf

I believe the code is available in FreeBSD. All it takes is porting.

Before you start your work, keep in mind that there's no guarantee
that it will be incorporated in OpenBSD. But, if you present your
case with proper diffs, you should at least get some attention.

Thank you Paul!

I intend to help the OpenBSD project, because it is the OS I like best,
and the feature is the one I need now. So, I will do my best!

Good luck!

Paul 'WEiRD' de Weerd

--
[++-]+++.+++[---].+++[+ +++-].++[-]+.--.[-]
http://www.weirdnet.nl/
Re: Journal Implementation
On Jun 2, 2015, at 5:03 PM, Paul de Weerd we...@weirdnet.nl wrote:

On Tue, Jun 02, 2015 at 07:33:58PM +, Stefan wrote:
| http://www.openbsd.org/faq/faq8.html#Journaling

Right, that doesn't help, it's not a tip for someone interested in
*developing a journaling system for UFS*... You can rest assured
they're already aware that OpenBSD doesn't support journaling.

| On Tue, Jun 2, 2015, 2:31 PM Walter Neto wsouz...@gmail.com wrote:
| Hi.
|
| I want to help OpenBSD by developing a journaling system for UFS.
|
| Can someone give me a tip?

Please have a look at this implementation from the FFS author:

http://www.mckusick.com/softdep/suj.pdf

I believe the code is available in FreeBSD. All it takes is porting.

Before you start your work, keep in mind that there's no guarantee
that it will be incorporated in OpenBSD. But, if you present your
case with proper diffs, you should at least get some attention.

Thank you Paul!

I intend to help the OpenBSD project, because it is the OS I like best,
and the feature is the one I need now. So, I will do my best!

Good luck!

Paul 'WEiRD' de Weerd

--
[++-]+++.+++[---].+++[+ +++-].++[-]+.--.[-]
http://www.weirdnet.nl/