Re: Disk space accounting and subvolume delete

2010-05-31 Thread Bruce Guenter
On Wed, May 12, 2010 at 01:02:07PM +0800, Yan, Zheng wrote:
> Dropping a tree can be lengthy. It's not good to let sync wait for hours.
> For most Linux FS, 'sync' just forces a transaction/journal commit. I don't
> think they wait for large operations that can span multiple transactions to
> complete.

What happens to the consistency of the filesystem if a crash happens
during this process?

-- 
Bruce Guenter br...@untroubled.org http://untroubled.org/




Re: Disk space accounting and subvolume delete

2010-05-31 Thread Mike Fedyk
On Mon, May 31, 2010 at 12:01 PM, Bruce Guenter br...@untroubled.org wrote:
> On Wed, May 12, 2010 at 01:02:07PM +0800, Yan, Zheng wrote:
> > Dropping a tree can be lengthy. It's not good to let sync wait for hours.
> > For most Linux FS, 'sync' just forces a transaction/journal commit. I don't
> > think they wait for large operations that can span multiple transactions to
> > complete.
>
> What happens to the consistency of the filesystem if a crash happens
> during this process?

There's a good test case for you to try.  Let us know what you find.
--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Disk space accounting and subvolume delete

2010-05-31 Thread Yan, Zheng
On Tue, Jun 1, 2010 at 3:01 AM, Bruce Guenter br...@untroubled.org wrote:
> On Wed, May 12, 2010 at 01:02:07PM +0800, Yan, Zheng wrote:
> > Dropping a tree can be lengthy. It's not good to let sync wait for hours.
> > For most Linux FS, 'sync' just forces a transaction/journal commit. I don't
> > think they wait for large operations that can span multiple transactions to
> > complete.
>
> What happens to the consistency of the filesystem if a crash happens
> during this process?


This does not break the consistency of the filesystem. The next mount will
find the partially dropped tree and restart the dropping process.

Yan, Zheng


Re: Disk space accounting and subvolume delete

2010-05-12 Thread Mike Fleetwood
On 12 May 2010 06:02, Yan, Zheng yanzh...@21cn.com wrote:
> On Tue, May 11, 2010 at 11:45 PM, Bruce Guenter br...@untroubled.org wrote:
> > On Tue, May 11, 2010 at 08:10:38AM +0800, Yan, Zheng wrote:
> > > This is because the snapshot deleting ioctl only removes a link.
> >
> > Right, I understand that.  That part is not unexpected, as it works just
> > like unlink would.  However...
> >
> > > The corresponding tree is dropped in the background by a kernel thread.
> >
> > The surprise is that 'sync', in any form I was able to try, does not
> > wait until all or even most of the I/O is completed.  Apparently the
> > standards spec for sync(2) says it is not required to wait for I/O to
> > complete, but AFAIK all other Linux FS do wait (the man page for sync(2)
> > implies as much, as does the info page for sync in glibc).
> >
> > The only way I've found so far to force this behavior is to unmount, and
> > that's rather intrusive to other users of the FS.
> >
> > > We could probably add another ioctl that waits until the tree has been
> > > completely dropped.
> >
> > Since the expected behavior for sync is to wait until all pending I/O
> > has been completed, I would argue this should be the default action for
> > sync.  Am I misunderstanding something?
>
> Dropping a tree can be lengthy. It's not good to let sync wait for hours.
> For most Linux FS, 'sync' just forces a transaction/journal commit. I don't
> think they wait for large operations that can span multiple transactions to
> complete.

Disclaimer: I know nothing about the internals of Btrfs!

I have an analogy as a way of thinking about what deleting a snapshot
entails (which I hope isn't totally bogus).

Deleting a clone of a file system is not like unlinking a single file.
It is analogous to deleting a directory tree.  Syncing in the middle
of a recursive delete will wait for the in-flight I/O to complete, but
it would not wait for the unlink requests from the portion of the
directory tree not yet traversed.  The same would be true when the
kernel thread deletes the snapshot by recursing through its tree.
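
To make the analogy concrete, here is a toy model in Python.  It is only
an illustration of the reasoning above and says nothing about how Btrfs
is actually implemented; the class and method names are made up.

```python
from collections import deque


class ToyRecursiveDelete:
    """Toy model of a recursive delete; not a model of Btrfs internals."""

    def __init__(self, entries):
        self.untraversed = deque(entries)  # entries the walker has not reached yet
        self.in_flight = []                # unlinks issued but not yet completed
        self.completed = []                # unlinks that have reached the disk

    def walk(self, count):
        """The walker visits `count` more entries and issues their unlinks."""
        for _ in range(min(count, len(self.untraversed))):
            self.in_flight.append(self.untraversed.popleft())

    def sync(self):
        """Complete only the unlinks that were already issued."""
        self.completed.extend(self.in_flight)
        self.in_flight.clear()


delete = ToyRecursiveDelete(["a", "b", "c", "d"])
delete.walk(2)   # the delete is halfway through the tree
delete.sync()    # waits for "a" and "b" only
print(delete.completed, list(delete.untraversed))
```

After the sync, the two entries the walker had reached are done, while
the two it had not reached are untouched and still occupy space.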

Mike


Re: Disk space accounting and subvolume delete

2010-05-11 Thread Yan, Zheng
On Tue, May 11, 2010 at 11:45 PM, Bruce Guenter br...@untroubled.org wrote:
> On Tue, May 11, 2010 at 08:10:38AM +0800, Yan, Zheng wrote:
> > This is because the snapshot deleting ioctl only removes a link.
>
> Right, I understand that.  That part is not unexpected, as it works just
> like unlink would.  However...
>
> > The corresponding tree is dropped in the background by a kernel thread.
>
> The surprise is that 'sync', in any form I was able to try, does not
> wait until all or even most of the I/O is completed.  Apparently the
> standards spec for sync(2) says it is not required to wait for I/O to
> complete, but AFAIK all other Linux FS do wait (the man page for sync(2)
> implies as much, as does the info page for sync in glibc).
>
> The only way I've found so far to force this behavior is to unmount, and
> that's rather intrusive to other users of the FS.
>
> > We could probably add another ioctl that waits until the tree has been
> > completely dropped.
>
> Since the expected behavior for sync is to wait until all pending I/O
> has been completed, I would argue this should be the default action for
> sync.  Am I misunderstanding something?


Dropping a tree can be lengthy. It's not good to let sync wait for hours.
For most Linux FS, 'sync' just forces a transaction/journal commit. I don't
think they wait for large operations that can span multiple transactions to
complete.

Yan, Zheng


Disk space accounting and subvolume delete

2010-05-10 Thread Bruce Guenter
Hi.

When deleting a snapshot, I have observed that the disk space used by
that snapshot is not immediately released (according to statvfs or df).
Neither sync nor btrfs filesystem sync releases the disk space.  The
only way I have found to actually fully release the disk space is to
issue the sync and then sleep until the statvfs free numbers stop
changing.
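
For illustration, the sync-then-poll workaround could be sketched roughly
as follows in Python.  This is not something Btrfs provides: the function
names, the polling interval, and the "unchanged for three polls" rule are
all assumptions, and a stable reading is only a heuristic, since other
writers on the same filesystem also move the number.

```python
import os
import time


def free_bytes(path):
    """Bytes available to unprivileged users, as reported by statvfs."""
    st = os.statvfs(path)
    return st.f_bavail * st.f_frsize


def wait_for_stable_free_space(path, interval=5.0, stable_polls=3,
                               timeout=3600.0, read_free=free_bytes,
                               sleep=time.sleep, sync=os.sync):
    """Sync, then poll until free space is unchanged for `stable_polls` polls.

    Returns the last reading.  Raises TimeoutError if the figure is still
    changing once `timeout` seconds have passed.
    """
    sync()  # requests a commit; it does not wait for the background drop
    deadline = time.monotonic() + timeout
    last = read_free(path)
    unchanged = 0
    while unchanged < stable_polls:
        if time.monotonic() > deadline:
            raise TimeoutError("free space on %s is still changing" % path)
        sleep(interval)
        current = read_free(path)
        if current == last:
            unchanged += 1
        else:
            unchanged = 0   # still moving: start counting again
            last = current
    return last
```

A call such as wait_for_stable_free_space("/backup") would block until
three consecutive readings agree ("/backup" is a hypothetical mount
point).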

This is a rather problematic approach to managing disk space.  Is there
any way to force a wait until the disk space has been released?

My application is automatically managing disk space in the presence of
snapshots.  I allow the disk (a backup) to fill up with snapshots until
it is nearly full, and then delete snapshots until I have a threshold
free.  However, without the disk space being released promptly and no
way to wait until it is released, the loop can't tell how many snapshots
to delete.
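
To show where the missing "wait" fits, here is a rough Python sketch of
such a pruning loop.  The function and parameter names are hypothetical,
and wait_until_released stands for exactly the primitive that is not
available today; the sketch assumes snapshots are removed with the
btrfs command-line tool.

```python
import subprocess


def delete_snapshot(path):
    # Assumes the btrfs command-line tool is installed and that `path`
    # is a snapshot the caller is allowed to delete.
    subprocess.run(["btrfs", "subvolume", "delete", path], check=True)


def prune_snapshots(snapshots, threshold, read_free, wait_until_released,
                    delete=delete_snapshot):
    """Delete snapshots, oldest first, until free space reaches `threshold`.

    `snapshots` is ordered oldest first.  `read_free` returns the current
    free bytes.  `wait_until_released` must block until the space of the
    snapshot just deleted has actually been returned.
    Returns the list of snapshots that were deleted.
    """
    deleted = []
    for snapshot in snapshots:
        if read_free() >= threshold:
            break
        delete(snapshot)
        wait_until_released()  # without this, read_free() is stale
        deleted.append(snapshot)
    return deleted
```

If wait_until_released returns too early, the next read_free() still
shows the old figure and the loop deletes more snapshots than needed.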

-- 
Bruce Guenter br...@untroubled.org http://untroubled.org/




Re: Disk space accounting and subvolume delete

2010-05-10 Thread Josef Bacik
On Mon, May 10, 2010 at 12:23:52PM -0600, Bruce Guenter wrote:
> Hi.
>
> When deleting a snapshot, I have observed that the disk space used by
> that snapshot is not immediately released (according to statvfs or df).
> Neither sync nor btrfs filesystem sync releases the disk space.  The
> only way I have found to actually fully release the disk space is to
> issue the sync and then sleep until the statvfs free numbers stop
> changing.
>
> This is a rather problematic approach to managing disk space.  Is there
> any way to force a wait until the disk space has been released?
>
> My application is automatically managing disk space in the presence of
> snapshots.  I allow the disk (a backup) to fill up with snapshots until
> it is nearly full, and then delete snapshots until I have a threshold
> free.  However, without the disk space being released promptly and no
> way to wait until it is released, the loop can't tell how many snapshots
> to delete.
>

The way BTRFS's COW works is that we can't free up space until after a
transaction has committed.  After the transaction commits (after a sync) we walk
the list of pinned extents and free them asynchronously.  We could probably make
btrfs filesystem sync wait for that part to finish tho.  It shouldn't be too
hard to do, feel free to take a crack at it.  Thanks,

Josef