Re: [PATCH] [RFC] Add btrfs autosnap feature
> Prior to making a new snapshot, grab the (stored) transid of the previous snapshot, and check if any files have been modified in the source since that transid:
>
>   btrfs sub find ${source} ${previous_transid}
>
> If nothing is returned, then you don't have to bother making the snapshot at all. Otherwise, after making the snapshot, grab the transid via
>
>   btrfs sub find ${new_snapshot} -1
>
> and store it some place (even a dot file in the root of the snapshot would work).

There might be a small window of time in which the transid and the snapshot are out of sync, since there is no atomic command that provides both the snapshot and the transid. Consider the example below, where tgw is a transaction group write that happens after we have read the transaction group id.

---
sync; read current tran-id and compare
(new tgw occurs)
snapshot
new tgw occurs
sync; read current tran-id again and store
---

This can result in failing to take a snapshot even if there are changes.

Certainly there will be some trade-off, and the logic below seems to be safer:

---
sync; read current tran-id and compare with previous
new tgw occurs
snapshot
new tgw occurs
store tran_id+2 (since tran_id gets added by two for a snapshot)
---

This might lead to a situation where we have two identical snapshots, but it is the safer trade-off.

thanks, Anand
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
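The second (safer) ordering from the message above can be sketched in C. This is only an illustration under stated assumptions: the helper names should_snapshot() and transid_after_snapshot() are hypothetical and not part of btrfs-progs, and the +2 offset simply follows the message's statement that taking a snapshot advances the transaction id by two.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: decide whether a new snapshot is needed.
 * stored_transid is the value saved after the previous snapshot,
 * current_transid is read from the source subvolume after a sync.
 */
bool should_snapshot(uint64_t stored_transid, uint64_t current_transid)
{
	/* Any difference means a transaction group was written since then. */
	return current_transid != stored_transid;
}

/*
 * Hypothetical helper: value to store once the snapshot has been taken.
 * Assumption taken from the message: a snapshot bumps the transid by two.
 */
uint64_t transid_after_snapshot(uint64_t transid_read_before_snapshot)
{
	return transid_read_before_snapshot + 2;
}
```

If a transaction group write slips in between the read and the snapshot, the source's transid ends up above the stored value, so the next run takes one more (possibly identical) snapshot rather than skipping a changed subvolume.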
Re: [PATCH] [RFC] Add btrfs autosnap feature
> ---
> sync; read current tran-id and compare
> (new tgw occurs)
> snapshot
> new tgw occurs
> sync; read current tran-id again and store
> ---
> which will result in failing to take snapshot even if there are changes.

btrfs sub find-new /snapshot- -1 shows the transid of the latest change of the snapshot, not of the whole filesystem.
Re: getdents - ext4 vs btrfs performance
2012/3/4 Jacek Luczak difrost.ker...@gmail.com:
> 2012/3/3 Jacek Luczak difrost.ker...@gmail.com:
>> 2012/3/2 Chris Mason chris.ma...@oracle.com:
>>> On Fri, Mar 02, 2012 at 03:16:12PM +0100, Jacek Luczak wrote:
>>>> 2012/3/2 Chris Mason chris.ma...@oracle.com:
>>>>> On Fri, Mar 02, 2012 at 11:05:56AM +0100, Jacek Luczak wrote:
>>>>>> I've put both to the test. The subject is acp and spd_readdir used with tar, all on ext4:
>>>>>> 1) acp: http://91.234.146.107/~difrost/seekwatcher/acp_ext4.png
>>>>>> 2) spd_readdir: http://91.234.146.107/~difrost/seekwatcher/tar_ext4_readir.png
>>>>>> 3) both: http://91.234.146.107/~difrost/seekwatcher/acp_vs_spd_ext4.png
>>>>>>
>>>>>> The acp looks much better than spd_readdir, but directory copy with spd_readdir decreased to 52m 39sec (30 min less).
>>>>>
>>>>> Do you have stats on how big these files are, and how fragmented they are? For acp and spd to give us this, I think something has gone wrong at writeback time (creating individual fragmented files).
>>>>
>>>> How big? Which files?
>>>
>>> All the files you're reading ;)
>>>
>>> filefrag will tell you how many extents each file has, any file with more than one extent is interesting. (The ext4 crowd may have better suggestions on measuring fragmentation).
>>>
>>> Since you mention this is a compile farm, I'm guessing there are a bunch of .o files created by parallel builds. There are a lot of chances for delalloc and the kernel writeback code to do the wrong thing here.
>>
>> [Most of files are B and K size]
>> All files scanned: 1978149
>> Files fragmented: 313 (0.015%) where 11 have 3+ extents
>> Total size of fragmented files: 7GB (~13% of dir size)
>>
>> BTRFS: None of the files are fragmented according to filefrag - all fit into one extent.
>>
>> tar cf on fragmented files:
>> 1) time: 7sec
>> 2) sw graph: http://91.234.146.107/~difrost/seekwatcher/tar_fragmented.png
>> 3) sw graph with spd_readdir: http://91.234.146.107/~difrost/seekwatcher/tar_fragmented_spd.png
>> 4) both on one: http://91.234.146.107/~difrost/seekwatcher/tar_fragmented_pure_spd.png
>>
>> BTRFS: tar on ext4 fragmented files
>> 1) time: 6sec
>> 2) sw graph: http://91.234.146.107/~difrost/seekwatcher/tar_fragmented_btrfs.png
>>
>> tar cf of fragmented files disturbed with [40,50) K files (in total 4373 files). K files before fragmented M files:
>> 1) size: 7.2GB
>> 2) time: 1m 14sec
>> 3) sw graph: http://91.234.146.107/~difrost/seekwatcher/tar_disturbed.png
>> 4) sw graph with spd_readdir: http://91.234.146.107/~difrost/seekwatcher/tar_disturbed_spd.png
>> 5) both on one: http://91.234.146.107/~difrost/seekwatcher/tar_disturbed_pure_spd.png
>>
>> BTRFS: tar on [40,50) K and ext4 fragmented
>> 1) time: 56sec
>> 2) sw graph: http://91.234.146.107/~difrost/seekwatcher/tar_disturbed_btrfs.png
>
> New test I've included - randomly selected files:
> - size 240MB
> 1) ext4 (time: 34sec) sw graph: http://91.234.146.107/~difrost/seekwatcher/tar_random_ext4.png
> 2) btrfs (time: 55sec) sw graph: http://91.234.146.107/~difrost/seekwatcher/tar_random_btrfs.png

Yet another test. The original issue is in the directory data handling. In my case a lot of dirs are introduced due to the extra .svn. Let's see what tar on those dirs looks like.

Number of .svn directories: 61605

1) Ext4:
- tar time: 10m 53sec
- sw tar graph: http://91.234.146.107/~difrost/seekwatcher/svn_dir_ext4.png
- sw tar graph with spd_readdir: http://91.234.146.107/~difrost/seekwatcher/svn_dir_spd_ext4.png

2) Btrfs:
- tar time: 4m 35sec
- sw tar graph: http://91.234.146.107/~difrost/seekwatcher/svn_dir_btrfs.png
- sw tar graph with ext4: http://91.234.146.107/~difrost/seekwatcher/svn_dir_btrfs_ext4.png

IMO this is not a writeback issue (well, it could be, but then it would mean that it is broken in general), and it's not fragmentation. Sorting files in readdir helps a bit but ext4 is still far behind btrfs.

Any ideas? Is this an issue, or are things just the way they are and one needs to live with it?

-Jacek
Re: getdents - ext4 vs btrfs performance
On Fri 02-03-12 14:32:15, Ted Tso wrote:
> On Fri, Mar 02, 2012 at 09:26:51AM -0500, Chris Mason wrote:
>
> It would be interesting to have a project where someone added fallocate() support into libelf, and then added some heuristics into ext4 so that if a file is fallocated to a precise size, or if the file is fully written and closed before writeback begins, that we use this to more efficiently pack the space used by the files by the block allocator. This is a place where I would not be surprised that XFS has some better code to avoid accelerated file system aging, and where we could do better with ext4 with some development effort.

AFAIK XFS people actually prefer that applications let them do their work using delayed allocation and do not interfere with fallocate(2) calls. The problem they have with fallocate(2) is that it forces you to allocate blocks, while with delayed allocation you can make the decision about allocation later. So for small files which completely fit into pagecache before they get pushed out by writeback, they can make better decisions from delayed allocation. Just dumping my memory from some other thread...

Honza
--
Jan Kara j...@suse.cz
SUSE Labs, CR
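For readers unfamiliar with the call under discussion, here is a minimal userspace sketch of "fallocating a file to a precise size" before writing it, using POSIX posix_fallocate(). The function name write_preallocated() is hypothetical, and nothing here is taken from libelf or ext4. As the reply above points out, delayed allocation alone may well make better placement decisions for small files, so this shows the mechanism only, not a recommendation.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: create 'path', reserve exactly 'len' bytes for it
 * up front, then write 'buf' into the reserved range.
 * Returns 0 on success, -1 on failure (errno is set).
 */
int write_preallocated(const char *path, const void *buf, size_t len)
{
	size_t done = 0;
	int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);

	if (fd < 0)
		return -1;

	if (len > 0) {
		/* posix_fallocate() returns an error number, not -1. */
		int err = posix_fallocate(fd, 0, (off_t)len);
		if (err != 0) {
			close(fd);
			errno = err;
			return -1;
		}
	}

	while (done < len) {
		ssize_t n = write(fd, (const char *)buf + done, len - done);
		if (n < 0) {
			if (errno == EINTR)
				continue; /* interrupted, retry */
			close(fd);
			return -1;
		}
		done += (size_t)n;
	}
	return close(fd);
}
```

The point of the up-front call is that the filesystem learns the final size before any data arrives; with plain buffered writes it learns the size only when writeback starts.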
[PATCH 17/19] btrfs: Convert to new freezing mechanism
We convert btrfs_file_aio_write() to use new freeze check. We also add proper freeze protection to btrfs_page_mkwrite(). Checks in cleaner_kthread() and transaction_kthread() can be safely removed since btrfs_freeze() will lock the mutexes and thus block the threads (and they shouldn't have anything to do anyway).

CC: linux-btrfs@vger.kernel.org
CC: Chris Mason chris.ma...@oracle.com
Signed-off-by: Jan Kara j...@suse.cz
---
 fs/btrfs/disk-io.c |    3 ---
 fs/btrfs/file.c    |    3 ++-
 fs/btrfs/inode.c   |    6 +++++-
 3 files changed, 7 insertions(+), 5 deletions(-)

diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index 811d9f9..fc0f74c 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -1586,8 +1586,6 @@ static int cleaner_kthread(void *arg)
 	struct btrfs_root *root = arg;
 
 	do {
-		vfs_check_frozen(root->fs_info->sb, SB_FREEZE_WRITE);
-
 		if (!(root->fs_info->sb->s_flags & MS_RDONLY) &&
 		    mutex_trylock(&root->fs_info->cleaner_mutex)) {
 			btrfs_run_delayed_iputs(root);
@@ -1618,7 +1616,6 @@ static int transaction_kthread(void *arg)
 
 	do {
 		delay = HZ * 30;
-		vfs_check_frozen(root->fs_info->sb, SB_FREEZE_WRITE);
 		mutex_lock(&root->fs_info->transaction_kthread_mutex);
 
 		spin_lock(&root->fs_info->trans_lock);
diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
index 859ba2d..1aac7ca 100644
--- a/fs/btrfs/file.c
+++ b/fs/btrfs/file.c
@@ -1348,7 +1348,7 @@ static ssize_t btrfs_file_aio_write(struct kiocb *iocb,
 	ssize_t err = 0;
 	size_t count, ocount;
 
-	vfs_check_frozen(inode->i_sb, SB_FREEZE_WRITE);
+	sb_start_write(inode->i_sb);
 
 	mutex_lock(&inode->i_mutex);
 
@@ -1439,6 +1439,7 @@ static ssize_t btrfs_file_aio_write(struct kiocb *iocb,
 			num_written = err;
 	}
 out:
+	sb_end_write(inode->i_sb);
 	current->backing_dev_info = NULL;
 	return num_written ? num_written : err;
 }
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 32214fe..63c9006 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -6405,6 +6405,7 @@ int btrfs_page_mkwrite(struct vm_area_struct *vma, struct vm_fault *vmf)
 	u64 page_start;
 	u64 page_end;
 
+	sb_start_pagefault(inode->i_sb);
 	ret = btrfs_delalloc_reserve_space(inode, PAGE_CACHE_SIZE);
 	if (!ret) {
 		ret = btrfs_update_time(vma->vm_file);
@@ -6495,12 +6496,15 @@ again:
 	unlock_extent_cached(io_tree, page_start, page_end, &cached_state, GFP_NOFS);
 
 out_unlock:
-	if (!ret)
+	if (!ret) {
+		sb_end_pagefault(inode->i_sb);
 		return VM_FAULT_LOCKED;
+	}
 	unlock_page(page);
 out:
 	btrfs_delalloc_release_space(inode, PAGE_CACHE_SIZE);
 out_noreserve:
+	sb_end_pagefault(inode->i_sb);
 	return ret;
 }
--
1.7.1
[PATCH 00/19] Fix filesystem freezing deadlocks
Hallelujah, after a couple of weeks and several rewrites, here comes the third iteration of my patches to improve filesystem freezing. Filesystem freezing is currently racy and thus we can end up with dirty data on frozen filesystem (see changelog patch 06 for detailed race description). This patch series aims at fixing this.

To be able to block all places where inodes get dirtied, I've moved filesystem freeze handling in mnt_want_write() / mnt_drop_write(). This however required some code shuffling and changes to kern_path_create() (see patches 02-05). I think the result is OK but opinions may differ ;). The advantage of this change also is that all filesystems get freeze protection almost for free - even ext2 can handle freezing well now.

Another potential contention point might be patch 19. In that patch we make freeze_super() refuse to freeze the filesystem when there are open but unlinked files which may be impractical in some cases. The main reason for this is the problem with handling of file deletion from fput() called with mmap_sem held (e.g. from munmap(2)), and then there's the fact that we cannot really force such filesystem into a consistent state... But if people think that freezing with open but unlinked files should happen, then I have some possible solutions in mind (maybe as a separate patchset since this is large enough).

I'm not able to hit any deadlocks, lockdep warnings, or dirty data on frozen filesystem despite beating it with fsstress and bash-shared-mapping while freezing and unfreezing for several hours (using ext4 and xfs) so I'm reasonably confident this could finally be the right solution.

And for people wanting to test - this patchset is based on patch series "Push file_update_time() into .page_mkwrite" so you'll need to pull that one in as well.

Changes since v2:
* completely rewritten
* freezing is now blocked at VFS entry points
* two stage freezing to handle both mmapped writes and other IO

The biggest changes since v1:
* have two counters to provide safe state transitions for SB_FREEZE_WRITE and SB_FREEZE_TRANS states
* use percpu counters instead of own percpu structure
* added documentation fixes from the old fs freezing series
* converted XFS to use SB_FREEZE_TRANS counter instead of its private m_active_trans counter

Honza

CC: Alex Elder el...@kernel.org
CC: Anton Altaparmakov an...@tuxera.com
CC: Ben Myers b...@sgi.com
CC: Chris Mason chris.ma...@oracle.com
CC: cluster-de...@redhat.com
CC: David S. Miller da...@davemloft.net
CC: fuse-de...@lists.sourceforge.net
CC: J. Bruce Fields bfie...@fieldses.org
CC: Joel Becker jl...@evilplan.org
CC: KONISHI Ryusuke konishi.ryus...@lab.ntt.co.jp
CC: linux-btrfs@vger.kernel.org
CC: linux-e...@vger.kernel.org
CC: linux-...@vger.kernel.org
CC: linux-ni...@vger.kernel.org
CC: linux-ntfs-...@lists.sourceforge.net
CC: Mark Fasheh mfas...@suse.com
CC: Miklos Szeredi mik...@szeredi.hu
CC: ocfs2-de...@oss.oracle.com
CC: OGAWA Hirofumi hirof...@mail.parknet.co.jp
CC: Steven Whitehouse swhit...@redhat.com
CC: Theodore Ts'o ty...@mit.edu
CC: x...@oss.sgi.com
[PATCH 04/19] btrfs: Push mnt_want_write() outside of i_mutex
When mnt_want_write() starts to handle freezing it will get a full lock semantics requiring proper lock ordering. So push mnt_want_write() call consistently outside of i_mutex.

CC: Chris Mason chris.ma...@oracle.com
CC: linux-btrfs@vger.kernel.org
Signed-off-by: Jan Kara j...@suse.cz
---
 fs/btrfs/ioctl.c |   23 +++++++++++------------
 1 files changed, 11 insertions(+), 12 deletions(-)

diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
index 03bb62a..c855e55 100644
--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -192,6 +192,10 @@ static int btrfs_ioctl_setflags(struct file *file, void __user *arg)
 	if (!inode_owner_or_capable(inode))
 		return -EACCES;
 
+	ret = mnt_want_write_file(file);
+	if (ret)
+		return ret;
+
 	mutex_lock(&inode->i_mutex);
 
 	ip_oldflags = ip->flags;
@@ -206,10 +210,6 @@ static int btrfs_ioctl_setflags(struct file *file, void __user *arg)
 		}
 	}
 
-	ret = mnt_want_write_file(file);
-	if (ret)
-		goto out_unlock;
-
 	if (flags & FS_SYNC_FL)
 		ip->flags |= BTRFS_INODE_SYNC;
 	else
@@ -271,9 +271,9 @@ static int btrfs_ioctl_setflags(struct file *file, void __user *arg)
 		inode->i_flags = i_oldflags;
 	}
 
-	mnt_drop_write_file(file);
 out_unlock:
 	mutex_unlock(&inode->i_mutex);
+	mnt_drop_write_file(file);
 	return ret;
 }
 
@@ -624,6 +624,10 @@ static noinline int btrfs_mksubvol(struct path *parent,
 	struct dentry *dentry;
 	int error;
 
+	error = mnt_want_write(parent->mnt);
+	if (error)
+		return error;
+
 	mutex_lock_nested(&dir->i_mutex, I_MUTEX_PARENT);
 
 	dentry = lookup_one_len(name, parent->dentry, namelen);
@@ -635,13 +639,9 @@ static noinline int btrfs_mksubvol(struct path *parent,
 	if (dentry->d_inode)
 		goto out_dput;
 
-	error = mnt_want_write(parent->mnt);
-	if (error)
-		goto out_dput;
-
 	error = btrfs_may_create(dir, dentry);
 	if (error)
-		goto out_drop_write;
+		goto out_dput;
 
 	down_read(&BTRFS_I(dir)->root->fs_info->subvol_sem);
 
@@ -659,12 +659,11 @@ static noinline int btrfs_mksubvol(struct path *parent,
 	fsnotify_mkdir(dir, dentry);
 out_up_read:
 	up_read(&BTRFS_I(dir)->root->fs_info->subvol_sem);
-out_drop_write:
-	mnt_drop_write(parent->mnt);
 out_dput:
 	dput(dentry);
 out_unlock:
 	mutex_unlock(&dir->i_mutex);
+	mnt_drop_write(parent->mnt);
 	return error;
 }
 
--
1.7.1
Re: getdents - ext4 vs btrfs performance
On Mon, Mar 05, 2012 at 12:32:45PM +0100, Jacek Luczak wrote:
> 2012/3/4 Jacek Luczak difrost.ker...@gmail.com:
>> 2012/3/3 Jacek Luczak difrost.ker...@gmail.com:
>>> 2012/3/2 Chris Mason chris.ma...@oracle.com:
>>>> On Fri, Mar 02, 2012 at 03:16:12PM +0100, Jacek Luczak wrote:
>>>>> 2012/3/2 Chris Mason chris.ma...@oracle.com:
>>>>>> On Fri, Mar 02, 2012 at 11:05:56AM +0100, Jacek Luczak wrote:
>>>>>>> I've put both to the test. The subject is acp and spd_readdir used with tar, all on ext4:
>>>>>>> 1) acp: http://91.234.146.107/~difrost/seekwatcher/acp_ext4.png
>>>>>>> 2) spd_readdir: http://91.234.146.107/~difrost/seekwatcher/tar_ext4_readir.png
>>>>>>> 3) both: http://91.234.146.107/~difrost/seekwatcher/acp_vs_spd_ext4.png
>>>>>>>
>>>>>>> The acp looks much better than spd_readdir, but directory copy with spd_readdir decreased to 52m 39sec (30 min less).
>>>>>>
>>>>>> Do you have stats on how big these files are, and how fragmented they are? For acp and spd to give us this, I think something has gone wrong at writeback time (creating individual fragmented files).
>>>>>
>>>>> How big? Which files?
>>>>
>>>> All the files you're reading ;)
>>>>
>>>> filefrag will tell you how many extents each file has, any file with more than one extent is interesting. (The ext4 crowd may have better suggestions on measuring fragmentation).
>>>>
>>>> Since you mention this is a compile farm, I'm guessing there are a bunch of .o files created by parallel builds. There are a lot of chances for delalloc and the kernel writeback code to do the wrong thing here.
>>>
>>> [Most of files are B and K size]
>>> All files scanned: 1978149
>>> Files fragmented: 313 (0.015%) where 11 have 3+ extents
>>> Total size of fragmented files: 7GB (~13% of dir size)

Ok, so I don't have a lot of great new ideas. My guess is that inode order and disk order for the blocks aren't matching up. You can confirm this with:

acp -b some_dir

You can also try telling acp to make a bigger read ahead window:

acp -s 4096 -r 128 some_dir

You can tell acp to scan all the files in the directory tree first (warning, this might use a good chunk of ram):

acp -w some_dir

and you can combine all of these together.

None of the above will actually help in your workload, but it'll help narrow down what is actually seeky on disk.

-chris
Understanding metadata efficiency of btrfs
I've run a little weird benchmark comparing Btrfs v0.19 and XFS:

There are 2000 directories and each directory contains 1000 files. The workload randomly stats or chmods a file, 200 times. The split between stat and chmod is 50% and 50%.

I monitor the number of disk requests and sectors:

        #Disk Write Requests  #Disk Read Requests  #Disk Write Sectors  #Disk Read Sectors
Btrfs   2403520               1571183              29249216             13512248
XFS     625493                396080               10302718             4932800

I found that the number of write requests of Btrfs is significantly larger than that of XFS.

I am not quite familiar with how btrfs commits metadata changes to disk. From the website, it is said that btrfs uses a COW B-tree which never overwrites previous disk pages. I assume that Btrfs also keeps an in-memory buffer for the metadata changes. But it is unclear to me how often Btrfs commits these changes and what the mechanism behind it is.

Could anyone please comment on the experiment results and give a brief explanation of Btrfs's metadata committing mechanism?

Sincerely,
Kai Ren
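The workload described above could be reproduced roughly as follows. This is a sketch under assumptions, not the benchmark that was actually run: the directory layout (root/dNNNN/fNNNN), the helper names make_tree() and run_workload(), and the use of rand() are all illustrative choices.

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Counters for the operations issued by the workload. */
struct op_counts {
	long stats;
	long chmods;
	long errors;
};

/* Build the (hypothetical) path root/dNNNN/fNNNN for directory d, file f. */
static void file_path(char *buf, size_t len, const char *root, int d, int f)
{
	snprintf(buf, len, "%s/d%04d/f%04d", root, d, f);
}

/* Create ndirs directories under root with nfiles empty files each. */
int make_tree(const char *root, int ndirs, int nfiles)
{
	char path[4096];

	if (mkdir(root, 0755) != 0 && errno != EEXIST)
		return -1;
	for (int d = 0; d < ndirs; d++) {
		snprintf(path, sizeof(path), "%s/d%04d", root, d);
		if (mkdir(path, 0755) != 0 && errno != EEXIST)
			return -1;
		for (int f = 0; f < nfiles; f++) {
			FILE *fp;

			file_path(path, sizeof(path), root, d, f);
			fp = fopen(path, "w");
			if (fp == NULL)
				return -1;
			fclose(fp);
		}
	}
	return 0;
}

/* Issue nops operations; each one picks a random file and, with equal
 * probability, either stat()s it or chmod()s it. */
struct op_counts run_workload(const char *root, int ndirs, int nfiles,
			      long nops, unsigned int seed)
{
	struct op_counts c = { 0, 0, 0 };
	char path[4096];
	struct stat st;

	srand(seed);
	for (long i = 0; i < nops; i++) {
		file_path(path, sizeof(path), root,
			  rand() % ndirs, rand() % nfiles);
		if (rand() % 2 == 0) {
			if (stat(path, &st) == 0)
				c.stats++;
			else
				c.errors++;
		} else {
			if (chmod(path, 0644) == 0)
				c.chmods++;
			else
				c.errors++;
		}
	}
	return c;
}
```

With the parameters from the message this would be make_tree(root, 2000, 1000) followed by run_workload(root, 2000, 1000, nops, seed); the disk request and sector counts would still have to be collected separately, for example from the block device statistics before and after the run.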
Re: Understanding metadata efficiency of btrfs
Kai Ren posted on Mon, 05 Mar 2012 21:16:34 -0500 as excerpted:

> I've run a little weird benchmark on comparing Btrfs v0.19 and XFS:

[snip description of test]

> I monitor the number of disk read requests
>
>         #WriteRq  #ReadRq  #WriteSect  #ReadSect
> Btrfs   2403520   1571183  29249216    13512248
> XFS     625493    396080   10302718    4932800
>
> I found the number of write requests of Btrfs is significantly larger than XFS.
>
> I am not quite familiar with how btrfs commits the metadata change into the disks. From the website, it is said that btrfs uses COW B-tree which never overwrite previous disk pages. I assume that Btrfs also keep an in-memory buffer to keep the metadata changes. But it is unclear to me that how often Btrfs will commit these changes and what is the behind mechanism.
>
> Could anyone please comment on the experiment results and give a brief explanation of Btrfs's metadata committing mechanism?

First... You mentioned the web site, but didn't specify which one. FWIW, the kernel.org breakin of some months ago threw a monkey wrench in a lot of things, one of them being the btrfs wiki. The official btrfs.wiki.kernel.org site is currently a static copy of the wiki from before the breakin, so while it has the general btrfs ideas which haven't changed from back then, current status, etc, is now rather stale. But there's a temporary (that could end up being permanent, it's been months...) btrfs wiki that's MUCH more current, at:

http://btrfs.ipv5.de/index.php?title=Main_Page

So before going further, catch up with things on the current (temporary?) wiki. From your post, I'd suggest you read up a bit more than you have, because you failed to mention at all the most important metadata differences between the two filesystems. I'm not deep enough into filesystem internals to know if these facts explain the whole differences above; in fact, the wiki's where I got most of my btrfs specific info myself, but they certainly explain a good portion of it!

The #1 biggest difference between btrfs and most other filesystems is that btrfs, by default, duplicates all metadata -- two copies of all metadata, one copy of data, by default. On a single disk/partition, that's called DUP mode, else it's referred to (not entirely correctly) as raid1 or raid10 mode depending on layout. (The "not entirely correctly" bit is because a true raid1 will have as many copies as there are active disks, while btrfs presently only does two-way mirroring. As such, with three plus disks, it's not proper raid1, only two-way-mirroring. 3-way and possibly N-way mirroring is on the roadmap for after raid5/6 support, which is roadmapped for kernels 3.4 or 3.5, so multi-way-mirroring is presumably 3.5 or 3.6.) It IS possible to set up only single-copy metadata, SINGLE mode, or to mirror data as well, but by default, btrfs keeps two copies of metadata, only one of data.

So that doubles the btrfs metadata writes, right there, since by default, btrfs double-copies all metadata.

The #2 big factor is that btrfs (again, by default, but this is a major feature of btrfs, otherwise, you might as well run something else) does full checksumming for both data and metadata. Unlike most filesystems, if cosmic rays or whatever start flipping bits on your data, btrfs will catch that, and if possible, retrieve a correct copy from elsewhere. This is actually one of the reasons for dual-copy metadata... and data too if you configure btrfs for it -- if the one copy is bad (fails the checksum validation) and there's another copy, btrfs will try to use it, instead.

And of course all these checksums must be written somewhere as well, so that's another huge increase in written metadata, even for 0-length files, since the metadata itself is checksummed!

And the checksumming goes some way toward explaining all those extra reads as well, as any sysadmin who has run raid5/6 against raid1 can tell you, because in order to write out the new checksums, unchanged (meta)data must be read in, and on btrfs, existing checksums read in and verified as well, to make sure the existing version is valid, before making the change and writing it back out.

As I said, I don't know if this explains /all/ the difference that you're seeing, but it should be quite plain that the btrfs double-metadata and integrity checking is going to be MULTIPLE TIMES more work and I/O than what more traditional filesystems such as the xfs you're comparing against must do.

That's all covered in the wiki, actually, both of them, since those are btrfs basics that haven't changed (except the multi-way-mirroring roadmap) in some time. That they're such big factors and that you didn't mention them at all, indicates to me that you've quite some reading to do about btrfs, since they're so very basic to what makes it what it is. Otherwise, you might as well just be using some other filesystem instead, especially since
[PATCH 1/2] Make find_updated_files to return value instead of printing
From: Anand Jain anand.j...@oracle.com

This patch makes find_updated_files() return the transid through a pointer instead of printing it on stdout. This is needed by autosnap and any other program which may want to find the current transid. Note that when the 3rd parameter, last_gen, is not -1, find_updated_files() might still print values on stdout.

Signed-off-by: Anand Jain anand.j...@oracle.com
---
 btrfs-list.c |    4 ++--
 btrfs_cmds.c |    5 ++++-
 btrfs_cmds.h |    2 +-
 3 files changed, 7 insertions(+), 4 deletions(-)

diff --git a/btrfs-list.c b/btrfs-list.c
index 61eddf9..6b642fb 100644
--- a/btrfs-list.c
+++ b/btrfs-list.c
@@ -872,7 +872,7 @@ static int print_one_extent(int fd, struct btrfs_ioctl_search_header *sh,
 	return 0;
 }
 
-int find_updated_files(int fd, u64 root_id, u64 oldest_gen)
+int find_updated_files(int fd, u64 root_id, u64 oldest_gen, u64 *transid)
 {
 	int ret;
 	struct btrfs_ioctl_search_args args;
@@ -969,7 +969,7 @@ int find_updated_files(int fd, u64 root_id, u64 oldest_gen)
 	}
 	free(cache_dir_name);
 	free(cache_full_name);
-	printf("transid marker was %llu\n", (unsigned long long)max_found);
+	*transid = max_found;
 	return ret;
 }
 
diff --git a/btrfs_cmds.c b/btrfs_cmds.c
index 7aab105..9357305 100644
--- a/btrfs_cmds.c
+++ b/btrfs_cmds.c
@@ -275,6 +275,7 @@ int do_find_newer(int argc, char **argv)
 	int ret;
 	char *subvol;
 	u64 last_gen;
+	u64 tranid;
 
 	subvol = argv[1];
 	last_gen = atoll(argv[2]);
@@ -294,9 +295,11 @@ int do_find_newer(int argc, char **argv)
 		fprintf(stderr, "ERROR: can't access '%s'\n", subvol);
 		return 12;
 	}
-	ret = find_updated_files(fd, 0, last_gen);
+	ret = find_updated_files(fd, 0, last_gen, &tranid);
 	if (ret)
 		return 19;
+
+	printf("transid marker was %llu\n", (unsigned long long)tranid);
 	return 0;
 }
 
diff --git a/btrfs_cmds.h b/btrfs_cmds.h
index f53c113..218ed20 100644
--- a/btrfs_cmds.h
+++ b/btrfs_cmds.h
@@ -35,7 +35,7 @@ int do_set_default_subvol(int nargs, char **argv);
 int do_get_default_subvol(int nargs, char **argv);
 int list_subvols(int fd, int print_parent, struct sv_list **head, char *mnt);
 int do_df_filesystem(int nargs, char **argv);
-int find_updated_files(int fd, u64 root_id, u64 oldest_gen);
+int find_updated_files(int fd, u64 root_id, u64 oldest_gen, u64 *transid);
 int do_find_newer(int argc, char **argv);
 int do_change_label(int argc, char **argv);
 int open_file_or_dir(const char *fname);
--
1.7.9.2.315.g25a78
[PATCH 2/2] Use transaction id to determin if there is any change in the subvol
From: Anand Jain anand.j...@oracle.com

Moved from hash method of determining the FS changes to the transaction record id method

Signed-off-by: Anand Jain anand.j...@oracle.com
---
 autosnap.c |  106 ++--
 autosnap.h |    4 +--
 2 files changed, 70 insertions(+), 40 deletions(-)

diff --git a/autosnap.c b/autosnap.c
index beddf68..1adaf01 100644
--- a/autosnap.c
+++ b/autosnap.c
@@ -45,7 +45,7 @@
 /* during run time if not the below we use /var/spool/cron; */
 char cron_path[]="/var/spool/cron/crontabs";
 char autosnap_conf_file[]="/etc/autosnap/config";
-char tmp_file[]="/etc/autosnap/tmpfile";
+//char tmp_file[]="/etc/autosnap/tmpfile";
 
 /* Take a snapshot with the default dest and adds attributes */
@@ -59,10 +59,10 @@ int do_autosnap_now(int argc, char **argv)
 	char	**ap;
 	char	subvol[BTRFS_VOL_NAME_MAX];
 	char	sspath[BTRFS_VOL_NAME_MAX + 128];
-	char	tag[100];
-	char	new_hash[65];
+	char	tag[TAG_MAX_LEN];
+	u64	cur_tranid = 0;
+	u64	ss_tranid = 0;
 	char	*mnt;
-	FILE	*fp;
 	u8	fsid[BTRFS_FSID_SIZE];
 	struct stat	sb;
 	struct rpolicy_cfg	rp;
@@ -101,6 +101,7 @@ int do_autosnap_now(int argc, char **argv)
 		return -1;
 	fd = open_file_or_dir(mnt);
 	get_fsid(fd,&fsid[0]);
+	close(fd);
 	if ((res = read_config(subvol+strlen(mnt),tag,&rp,NULL,&fsid[0])) == 1) {
 		fprintf(stderr,"need to run autosnap enable for this subvol and tag pair\n");
 		return 1;
@@ -109,28 +110,46 @@ int do_autosnap_now(int argc, char **argv)
 		return 1;
 	}
 
+	/* Check if there is any change in the FS by comparing the transaction id*/
+	if (strcmp(rp.idcal, "older") == 0 ) {
+		/* Sync Subvol*/
+		a[1] = subvol;
+		ap = a;
+		res = do_fssync(1, ap);
+		if(res != 0) {
+			return -1;
+		}
+		fd = open_file_or_dir(subvol);
+		if (fd < 0) {
+			fprintf(stderr, "ERROR: can't access '%s'\n", subvol);
+			return -1;
+		}
+		res = find_updated_files(fd, 0, -1, &cur_tranid);
+		close(fd);
+		if (res)
+			return -1;
+
+		if((stat(rp.last_ss, &sb) == 0) && (rp.last_ss_tranid == cur_tranid)) {
+			printf("FS is identical to the last snapshot. Aborting.\n");
+			return -1;
+		}
+	}
+
 	if ( take_autosnap(subvol, tag, sspath) !=0 )
 		return -1;
 
-	if (strcmp(rp.idcal, "older") == 0 ) {
-		fp = fopen(tmp_file, "w");
-		tree_scan(sspath, fp);
-		fclose(fp);
-		get_sha256(tmp_file, new_hash);
-		if((stat(rp.last_ss, &sb) == 0) && (strcmp(rp.last_ss_hash,new_hash) == 0)) {
-			printf("Newer snapshot is identical to the previous snapshot, deleting the newer\n");
-			a[1] = sspath;
-			ap = a;
-			res = do_delete_subvolume(2,ap);
-			if(res)
-				printf("do_delete_subvolume failed %d\n",res);
-		} else {
-			/* hash does not match so keep the new snasphot OR
-			   Last snapshot was deleted. */
-			update_last_hash(subvol+strlen(mnt),tag,&fsid[0],sspath,new_hash);
-		}
-		unlink(tmp_file);
+	fd = open_file_or_dir(sspath);
+	if (fd < 0) {
+		fprintf(stderr, "ERROR: can't access '%s'\n", sspath);
+		return -1;
 	}
+	res = find_updated_files(fd, 0, -1, &ss_tranid);
+	close(fd);
+	if (res)
+		return -1;
+
+	/* tranid does not match or Last snapshot was deleted. go ahead*/
+	update_last_tranid(subvol+strlen(mnt),tag,&fsid[0],sspath,ss_tranid);
 
 #if 0
 	/* Un-def this when we have synchronous snapshot delete */
@@ -141,7 +160,8 @@ int do_autosnap_now(int argc, char **argv)
 	if (rp.rpval != -1) {
 		res = chk_retain_bynum(subvol, rp.rpval, tag);
 		if(res != 0 ) {
-			fprintf(stderr,"Error: Check for the retainable subvol failed %d\n",res);
+			fprintf(stderr,"Error: Check for the retainable subvol failed %d\n",
+				res);
 			return -1;
 		}
 	}
@@ -457,7 +477,8 @@ int do_autosnap_enable(int argc, char **argv)
 		case 'm':
 			fcnt++;
 			if ((atoi(optarg) > 60) || (atoi(optarg) < 1)) {
-				fprintf(stderr, "Value for option -m: Minutes should be between 1 to 60\n");
+