On Thu, Jul 13, 2017 at 8:01 PM, Qu Wenruo <quwenruo.bt...@gmx.com> wrote:
>
> On 2017-07-14 07:26, Chris Murphy wrote:
>> No benchmarking comparison, but it's known that deletion of snapshots
>> gets more expensive when there are many snapshots, due to backref
>> search and metadata updates. I have no idea how it compares to
>> overlayfs. But then for some use cases I guess it's a non-trivial
>> benefit to leverage a shared page cache.
>
> In fact, except for balance and quota, I can't see much extra
> performance impact from backref walking.
>
> And if it's not snapshots but subvolumes, then more subvolumes means
> smaller subvolume trees, and less contention on subvolume tree locks.
> So more (evenly distributed) subvolumes should in fact lead to higher
> performance.

Interesting.

>>> Btrfs + overlayfs? The copy-up operation in overlayfs can take
>>> advantage of btrfs's clone, but this benefit applies to xfs, too.
>>
>> Btrfs supports fs shrink, and also multiple device add/remove, so it's
>> pretty nice for managing its storage in the cloud. And seed devices
>> might have uses. Some of this is doable with LVM, but it's much
>> simpler, faster and safer with Btrfs.
>
> Faster? Not really.
> For metadata operations, btrfs is slower than traditional FSes.

The LVM equivalent of 'btrfs dev delete' -- remove a device and migrate
its block groups to the remaining devices -- is pvmove, and it's really
slow. Plus LVM lets you remove devices that haven't had pvmove applied,
so data loss is possible (user-induced data loss).

> Due to metadata CoW, any metadata update will lead to a superblock
> update. The extra FUA for the superblock is especially noticeable for
> fsync-heavy but low-concurrency workloads.
> Not to mention that the default data CoW leads to metadata CoW, making
> things even slower.
>
> And contention on fs/subvolume tree locks makes metadata operations
> even slower, especially for multi-threaded IO.
> Unlike other FSes, which use one tree per inode, btrfs uses one tree
> per subvolume, which makes lock contention much hotter.
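As a quick userspace illustration of that clone-based copy-up: `cp
--reflink` exercises the same clone path on btrfs/xfs, and
`--reflink=auto` falls back to a plain copy on filesystems without
clone support. Paths below are made up for the demo:

```shell
# Demo directory (hypothetical path); on btrfs/xfs the copy below
# shares extents via the clone ioctl, elsewhere --reflink=auto simply
# does a normal copy.
mkdir -p /tmp/reflink-demo
echo "container layer data" > /tmp/reflink-demo/lower_file

# Overlayfs copy-up can use the same clone mechanism in-kernel;
# cp --reflink is the userspace equivalent.
cp --reflink=auto /tmp/reflink-demo/lower_file /tmp/reflink-demo/upper_file

# The two files are byte-identical regardless of whether clone was used.
cmp /tmp/reflink-demo/lower_file /tmp/reflink-demo/upper_file \
  && echo "copies match"
```

With `--reflink=always` the copy fails outright on a filesystem that
can't clone, which is a handy way to check whether clone is actually
being used.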
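For reference, the two device-removal flows look roughly like this.
This is an illustrative sketch only, not something to paste into a
shell: it assumes a mount point /mnt, a volume group named myvg (made
up for the example), and a spare device /dev/sdc being removed:

```shell
# Btrfs: one command relocates the block groups off the device and
# shrinks the filesystem.
btrfs device remove /dev/sdc /mnt

# LVM: you must migrate extents off the PV yourself before dropping it
# from the VG. Skipping pvmove and forcing the PV out is where the
# user-induced data loss comes from.
pvmove /dev/sdc
vgreduce myvg /dev/sdc
pvremove /dev/sdc
```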
OK, so this possibly means overlayfs might make things slower, since
all I/O ends up getting dumped into one Btrfs fs tree; whereas with
Docker using Btrfs rw snapshots, each container's I/O goes into its
own subvolume.

> The extent tree used to have the same problem, but delayed refs
> (whether you like them or not) did reduce contention and improve
> performance.
>
> IIRC, some PostgreSQL benchmarks show that XFS/ext4 on LVM-thin
> provide much better performance than btrfs; even ZFS-on-Linux
> outperforms btrfs.

OK.

>> And that's why I'm kinda curious about the combination of Btrfs and
>> overlayfs. Overlayfs managed by Docker. And Btrfs for simpler and
>> more flexible storage management.
>
> Despite the performance problems, (working) btrfs does provide
> flexible and unified management.
>
> So implementing a shared page cache in btrfs would eliminate the need
> for overlayfs. :)
> Just kidding, such support needs quite a lot of VFS and MM
> modification, and I don't know if we will be able to implement it at
> all.

Yeah, I've read it's complicated for everyone; even the overlayfs
folks have had growing pains.

-- 
Chris Murphy