Re: [PATCH 0/6 EARLY RFC] Btrfs: Get rid of whole page I/O.
On Thu, Mar 20, 2014 at 08:50:27AM +0530, Aneesh Kumar K.V wrote:
> > On Tue, Mar 18, 2014 at 01:48:00PM +0630, chandan wrote:
> >> The earlier patchset posted by Chandra Seetharaman was to get 4k
> >> blocksize to work with ppc64's 64k PAGE_SIZE.
> >
> > Are we talking about metadata block sizes or data block sizes?
> >
> >> The root node of "tree root" tree has 1957 bytes being written by
> >> make_btrfs() (in btrfs-progs). Hence I chose to do 2k blocksize for
> >> the initial subpagesize-blocksize work. So with this patchset the
> >> supported blocksizes would be in the range 2k-64k.
> >
> > So it's metadata blocks, and in this case 2k looks like the only
> > allowed size that's smaller than 4k, and thus can demonstrate sub-page
> > size allocations. I'm not sure if this is limiting for potential future
> > extensions of metadata structures that could be larger.
> >
> > 2k is ok for testing purposes, but I think a 4k-page machine will hardly
> > use a smaller page size. All the more so now that 16k metadata blocks
> > are the default.
>
> The goal is to remove the assumption that the supported block size is
> >= page size. The primary reason to do that is to support migration of
> disk devices across different architectures. If we have a btrfs disk
> created on an x86 box with data blocksize 4K and metadata block size 16K
> we should make sure that the disk can be read/written from a ppc64 box
> (which has a page size of 64K). To enable easy testing and community
> development we are now focusing on achieving 2K data blocksize and 2K
> metadata block size on x86. As you said this will never be used in
> production.
>
> To achieve that we did the following:
>
> *) Add offset and len to btrfs_io_bio. These are the file offset and
>    len. They are later used to unlock the extent io tree.
>
> *) Make sure that submit_extent_page only submits a contiguous range
>    of file offsets, i.e. if we have holes in between we split the range
>    into two submit_extent_page calls. This ensures that the
>    btrfs_io_bio offset and len represent a contiguous range.
>
> Please let us know whether the above approach is acceptable.

I don't see any apparent problem with this approach.

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/6 EARLY RFC] Btrfs: Get rid of whole page I/O.
David Sterba writes:

> On Tue, Mar 18, 2014 at 01:48:00PM +0630, chandan wrote:
>> The earlier patchset posted by Chandra Seetharaman was to get 4k
>> blocksize to work with ppc64's 64k PAGE_SIZE.
>
> Are we talking about metadata block sizes or data block sizes?
>
>> The root node of "tree root" tree has 1957 bytes being written by
>> make_btrfs() (in btrfs-progs). Hence I chose to do 2k blocksize for
>> the initial subpagesize-blocksize work. So with this patchset the
>> supported blocksizes would be in the range 2k-64k.
>
> So it's metadata blocks, and in this case 2k looks like the only
> allowed size that's smaller than 4k, and thus can demonstrate sub-page
> size allocations. I'm not sure if this is limiting for potential future
> extensions of metadata structures that could be larger.
>
> 2k is ok for testing purposes, but I think a 4k-page machine will hardly
> use a smaller page size. All the more so now that 16k metadata blocks
> are the default.

The goal is to remove the assumption that the supported block size is
>= page size. The primary reason to do that is to support migration of
disk devices across different architectures. If we have a btrfs disk
created on an x86 box with data blocksize 4K and metadata block size 16K
we should make sure that the disk can be read/written from a ppc64 box
(which has a page size of 64K). To enable easy testing and community
development we are now focusing on achieving 2K data blocksize and 2K
metadata block size on x86. As you said this will never be used in
production.

To achieve that we did the following:

*) Add offset and len to btrfs_io_bio. These are the file offset and
   len. They are later used to unlock the extent io tree.

*) Make sure that submit_extent_page only submits a contiguous range
   of file offsets, i.e. if we have holes in between we split the range
   into two submit_extent_page calls. This ensures that the
   btrfs_io_bio offset and len represent a contiguous range.

Please let us know whether the above approach is acceptable.

-aneesh
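The second point above (splitting at holes so that each btrfs_io_bio
covers one contiguous file range) can be sketched in userspace C roughly
as follows. This is only an illustration of the idea, not code from the
patchset: struct io_range and split_contiguous() are hypothetical names,
and the real btrfs_io_bio and submit_extent_page() have different
layouts and signatures.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the (offset, len) pair added to btrfs_io_bio. */
struct io_range {
    uint64_t offset; /* file offset of the first byte */
    uint64_t len;    /* length in bytes */
};

/*
 * Merge a sorted list of block-sized file offsets into contiguous ranges.
 * A gap between two blocks (a hole) closes the current range and starts a
 * new one, so every emitted range describes one contiguous span.
 * Returns the number of ranges written to out (at most max_out).
 */
size_t split_contiguous(const uint64_t *block_offsets, size_t nr_blocks,
                        uint64_t blocksize,
                        struct io_range *out, size_t max_out)
{
    size_t nr_out = 0;

    assert(blocksize > 0);

    for (size_t i = 0; i < nr_blocks; i++) {
        uint64_t start = block_offsets[i];

        /* Block directly follows the current range: extend it. */
        if (nr_out > 0 &&
            out[nr_out - 1].offset + out[nr_out - 1].len == start) {
            out[nr_out - 1].len += blocksize;
            continue;
        }

        /* Hole (or first block): start a new range if there is room. */
        if (nr_out == max_out)
            break;
        out[nr_out].offset = start;
        out[nr_out].len = blocksize;
        nr_out++;
    }
    return nr_out;
}
```

For blocks at file offsets 0, 2048, 8192, 10240 and 12288 with a 2k
blocksize, this yields two ranges, (offset 0, len 4096) and
(offset 8192, len 6144), because the hole between 4096 and 8192 forces
a split.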
Re: [PATCH 0/6 EARLY RFC] Btrfs: Get rid of whole page I/O.
On Tue, Mar 18, 2014 at 01:48:00PM +0630, chandan wrote:
> The earlier patchset posted by Chandra Seetharaman was to get 4k
> blocksize to work with ppc64's 64k PAGE_SIZE.

Are we talking about metadata block sizes or data block sizes?

> The root node of "tree root" tree has 1957 bytes being written by
> make_btrfs() (in btrfs-progs). Hence I chose to do 2k blocksize for
> the initial subpagesize-blocksize work. So with this patchset the
> supported blocksizes would be in the range 2k-64k.

So it's metadata blocks, and in this case 2k looks like the only
allowed size that's smaller than 4k, and thus can demonstrate sub-page
size allocations. I'm not sure if this is limiting for potential future
extensions of metadata structures that could be larger.

2k is ok for testing purposes, but I think a 4k-page machine will hardly
use a smaller page size. All the more so now that 16k metadata blocks
are the default.
Re: [PATCH 0/6 EARLY RFC] Btrfs: Get rid of whole page I/O.
Hello David,

> I looked at previous postings of this patchset, but haven't found what
> are the expected supported block sizes.
>
> I assume powers of two starting with 512b, until 64k.

The earlier patchset posted by Chandra Seetharaman was to get 4k
blocksize to work with ppc64's 64k PAGE_SIZE. I chose to do 2k blocksize
on x86_64's 4k PAGE_SIZE since that would allow others in the community
to work/experiment with the subpagesize-blocksize feature.

The root node of "tree root" tree has 1957 bytes being written by
make_btrfs() (in btrfs-progs). Hence I chose to do 2k blocksize for
the initial subpagesize-blocksize work. So with this patchset the
supported blocksizes would be in the range 2k-64k.

Thanks,
chandan
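As a rough illustration of the arithmetic behind the 2k choice (a sketch
with hypothetical helper names, not code from btrfs-progs or the
kernel): 1957 bytes rounds up to the next power of two, 2048, and a 4k
page then holds two such blocks.

```c
#include <assert.h>
#include <stdint.h>

/* Smallest power of two that is >= n. n must be in [1, 2^31]. */
uint32_t round_up_pow2(uint32_t n)
{
    uint32_t p = 1;

    assert(n > 0 && n <= (UINT32_C(1) << 31));
    while (p < n)
        p <<= 1;
    return p;
}

/*
 * Number of whole blocks that fit in one page.
 * Integer division makes this 0 when blocksize is larger than the page.
 */
uint32_t blocks_per_page(uint32_t page_size, uint32_t blocksize)
{
    assert(blocksize > 0);
    return page_size / blocksize;
}
```

With these helpers, round_up_pow2(1957) gives 2048,
blocks_per_page(4096, 2048) gives 2 (the x86_64 test setup), and
blocks_per_page(65536, 4096) gives 16 (4k blocks on ppc64's 64k pages).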
Re: [PATCH 0/6 EARLY RFC] Btrfs: Get rid of whole page I/O.
On Wed, Mar 12, 2014 at 07:50:27PM +0530, Chandan Rajendra wrote:
> The patchset has been tested with only a handful of trivial I/O tests i.e. I/O
> on different kinds of files (e.g. files with holes, etc). The tests were run
> on 2k and 4k blocksized instances of Btrfs code (on x86_64) that has initial
> support for mounting a 2k blocksized Btrfs filesystem.

> 3. Compression does not work with 2k blocksize filesystem instance.

I looked at previous postings of this patchset, but haven't found what
are the expected supported block sizes.

I assume powers of two starting with 512b, until 64k. I'm asking because
the compression container would need to take this into account.
(I originally had only >= 4k sizes in mind.)

thanks,
david
[PATCH 0/6 EARLY RFC] Btrfs: Get rid of whole page I/O.
This patchset describes a possible solution to the whole page I/O issue
present in Btrfs. Currently whole page I/O is being performed because
bio_vec->[bv_len, bv_offset] can be modified by the Block I/O layer and
hence cannot be used by Btrfs's endio functions to operate on the
original file offset range that the bio was intended for.

This patchset is a precursor to getting the subpagesize-blocksize Btrfs
feature to work correctly, since that feature requires non-whole page
I/O.

The patchset makes use of patches posted earlier by Chandra Seetharaman
(http://article.gmane.org/gmane.comp.file-systems.btrfs/30737). I believe
that getting non whole page I/O to work properly is the first issue to
tackle before working on the review comments posted for Chandra's
patches. Please correct me if my assumption is incorrect.

The patchset has been tested with only a handful of trivial I/O tests i.e. I/O
on different kinds of files (e.g. files with holes, etc). The tests were run
on 2k and 4k blocksized instances of Btrfs code (on x86_64) that has initial
support for mounting a 2k blocksized Btrfs filesystem.

The corresponding git trees can be found at
https://github.com/chandanr/btrfs-progs/tree/subpagesize-blocksize
and
https://github.com/chandanr/linux/tree/btrfs/subpagesize-blocksize

The modified code is limited in features. Its known limitations are:
1. xfstests suite has not been executed.
2. Direct I/O has not been tested.
3. Compression does not work with 2k blocksize filesystem instance.
4. fallocate does not work with 2k blocksize.
5. Data checksum does not work for 2k blocksize. Hence filesystem should
   be mounted by passing the nodatasum option.
6. Performance implications are unknown.

Chandan Rajendra (4):
  Btrfs: subpagesize-blocksize: Get rid of whole page reads.
  Btrfs: subpagesize-blocksize: Get rid of whole page writes.
  Btrfs: subpagesize-blocksize: Work with extents aligned to blocksize.
  Btrfs: subpagesize-blocksize: Hardcode MAX_EXTENT_BUFFERS_PER_PAGE to 2.

Chandra Seetharaman (2):
  Btrfs: subpagesize-blocksize: Define extent_buffer_head
  Btrfs: subpagesize-blocksize: Allow mounting filesystems where sectorsize != PAGE_SIZE

 fs/btrfs/backref.c           |   2 +-
 fs/btrfs/ctree.c             |   2 +-
 fs/btrfs/ctree.h             |   6 +-
 fs/btrfs/disk-io.c           | 117
 fs/btrfs/extent-tree.c       |   6 +-
 fs/btrfs/extent_io.c         | 623 ---
 fs/btrfs/extent_io.h         |  60 -
 fs/btrfs/file.c              |  13 +-
 fs/btrfs/volumes.c           |   2 +-
 fs/btrfs/volumes.h           |   3 +
 include/trace/events/btrfs.h |   2 +-
 11 files changed, 554 insertions(+), 282 deletions(-)

--
1.8.3.1
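The problem stated in the first paragraph of the cover letter can be
modeled in userspace roughly as below. This is a simplified sketch, not
kernel code: struct fake_bvec, struct fake_io_bio, submit_prepare(),
lower_layer_advance() and endio_range_end() are all hypothetical names.
Only the idea is taken from the thread: record the original file range
at submit time and have the completion handler use that record instead
of bv_len/bv_offset, which a lower layer may have changed.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of a bio_vec: one segment inside a page. */
struct fake_bvec {
    uint32_t bv_len;
    uint32_t bv_offset;
};

/* Hypothetical model of a bio that also carries its original file range. */
struct fake_io_bio {
    struct fake_bvec vec;
    uint64_t start_offset; /* file offset recorded at submit time */
    uint64_t len;          /* length recorded at submit time */
};

/* Fill in the segment and remember the file range it was built for. */
void submit_prepare(struct fake_io_bio *b, uint64_t file_offset,
                    uint32_t page_offset, uint32_t len)
{
    b->vec.bv_offset = page_offset;
    b->vec.bv_len = len;
    b->start_offset = file_offset;
    b->len = len;
}

/* Models a lower layer consuming 'done' bytes and editing the vec in place. */
void lower_layer_advance(struct fake_io_bio *b, uint32_t done)
{
    assert(done <= b->vec.bv_len);
    b->vec.bv_offset += done;
    b->vec.bv_len -= done;
}

/* Completion side: inclusive end of the file range, from the saved fields. */
uint64_t endio_range_end(const struct fake_io_bio *b)
{
    assert(b->len > 0);
    return b->start_offset + b->len - 1;
}
```

For a 2k block at file offset 6144, placed at offset 2048 within its
page, the vec reads (bv_offset 4096, bv_len 0) once the lower layer has
consumed all 2048 bytes, so it no longer describes the block. The saved
fields still give the range 6144..8191.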