Hi, here are some comments / man page updates for things I have learned in my adventures through msdosfs and vfs_bio.c. I have also added the bread_cluster function to buffercache(9). It would be nice if someone who knows the buffer/disk stuff could review this.
Hint for the paragraph on block sizes: In ufs/ffs/ffs_vfsops.c there is:

	sbp->f_iosize = fs->fs_bsize;

Then look at the use of fs_bsize in ufs/ffs/fs.h

Cheers,
Stefan

diff --git a/share/man/man9/buffercache.9 b/share/man/man9/buffercache.9
index 84df0c06513..0c4c8f1ad96 100644
--- a/share/man/man9/buffercache.9
+++ b/share/man/man9/buffercache.9
@@ -108,6 +108,7 @@
 .Sh NAME
 .Nm buffercache ,
 .Nm bread ,
+.Nm bread_cluster ,
 .Nm breadn ,
 .Nm bwrite ,
 .Nm bawrite ,
@@ -126,6 +127,9 @@
 .Fn bread "struct vnode *vp" "daddr_t blkno" "int size" \
 "struct buf **bpp"
 .Ft int
+.Fn bread_cluster "struct vnode *vp" "daddr_t blkno" "int size" \
+"struct buf **bpp"
+.Ft int
 .Fn breadn "struct vnode *vp" "daddr_t blkno" "int size" \
 "daddr_t rablks[]" "int rasizes[]" "int nrablks" \
 "struct buf **bpp"
@@ -163,6 +167,11 @@ In addition to describing a cached block, a
 .Em buf
 structure is also used to describe an I/O request as a part of
 the disk driver interface.
+.Pp
+The block size used for logical block numbers depends on the type of the
+given vnode.
+For file vnodes, this is f_iosize of the underlying filesystem.
+For block device vnodes, this will usually be DEV_BSIZE.
 .\" XXX struct buf, B_ flags, MP locks, etc.
 .\" XXX free list, hash queue, etc.
 .\" ------------------------------------------------------------
@@ -184,6 +193,10 @@ to allocate a buffer with enough pages for
 .Fa size
 and reads the specified disk block into it.
 .Pp
+.Fn bread
+always returns a buffer, even if it returns an error due to an I/O
+error.
+.Pp
 The buffer returned by
 .Fn bread
 is marked as busy.
@@ -222,6 +235,30 @@ and
 The read-ahead blocks aren't returned, but are available in cache for
 future accesses.
 .\" - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
+.It Xo
+.Fo bread_cluster
+.Fa "vp"
+.Fa "blkno"
+.Fa "size"
+.Fa "bpp"
+.Fc
+.Xc
+Read a block of size
+.Fa "size"
+corresponding to
+.Fa vp
+and
+.Fa blkno ,
+with readahead.
+If neither the first block, nor a part of the next MAXBSIZE bytes is already
+in the buffer cache,
+.Fn bread_cluster
+will perform a read-ahead of MAXBSIZE bytes in a single I/O operation.
+This is currently more efficient than
+.Fn breadn .
+The read-ahead data isn't returned, but is available in cache for
+future accesses.
+.\" - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
 .It Fn bwrite "bp"
 Write a block.
 Start I/O for write using
diff --git a/sys/kern/vfs_bio.c b/sys/kern/vfs_bio.c
index 88adfeff237..4804d370f49 100644
--- a/sys/kern/vfs_bio.c
+++ b/sys/kern/vfs_bio.c
@@ -568,6 +568,12 @@ bread_cluster_callback(struct buf *bp)
 	}
 }
 
+/*
+ * Read-ahead multiple disk blocks, but make sure only one (big) I/O
+ * request is sent to the disk.
+ * XXX This should probably be dropped and breadn should instead be optimized
+ * XXX to do fewer I/O requests.
+ */
 int
 bread_cluster(struct vnode *vp, daddr_t blkno, int size, struct buf **rbpp)
 {
@@ -1023,6 +1029,9 @@ geteblk(size_t size)
 
 /*
  * Allocate a buffer.
+ * If vp is given, put it into the buffer cache for that vnode.
+ * If size != 0, allocate memory and call buf_map().
+ * If there is already a buffer for the given vnode/blkno, return NULL.
  */
 struct buf *
 buf_get(struct vnode *vp, daddr_t blkno, size_t size)
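
PS: To make the block-size paragraph in the man page part a bit more concrete, here is a small userland sketch, not kernel code. It assumes that the byte offset addressed by a logical block number is simply blkno times the block size in effect for the vnode. EX_DEV_BSIZE, EX_FS_IOSIZE and ex_lblk_to_off() are made-up names for illustration; 512 is the usual DEV_BSIZE, and 16384 is only an assumed f_iosize (an FFS fs_bsize could be something else).

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative values only; the real ones come from the kernel and the fs. */
#define EX_DEV_BSIZE	512	/* usual DEV_BSIZE for block device vnodes */
#define EX_FS_IOSIZE	16384	/* assumed f_iosize for a file vnode */

/*
 * Byte offset addressed by a logical block number, assuming the offset
 * is blkno * bsize. Hypothetical helper, not part of the buffer cache API.
 */
static int64_t
ex_lblk_to_off(int64_t blkno, int64_t bsize)
{
	assert(blkno >= 0 && bsize > 0);
	return blkno * bsize;
}
```

So the same blkno means different places depending on the vnode type: under these assumptions, ex_lblk_to_off(8, EX_DEV_BSIZE) is byte 4096 of a block device, while ex_lblk_to_off(8, EX_FS_IOSIZE) is byte 131072 of a file.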