Hi,
On Wed, Jan 23, 2013 at 1:04 PM, Chen Yang <[email protected]> wrote:
> From: Chen Yang <[email protected]>
> Date: Wed, 23 Jan 2013 11:21:51 +0800
> Subject: [PATCH] Btrfs/send: sparse and pre-allocated file support for
> btrfs-send mechanism
>
> When sending a file with sparse or pre-allocated part,
> these parts will be sent as ZERO streams, and it's unnecessary.
>
> There are two ways to improve this, one is just skip the EMPTY parts,
> and the other one is to add a punch command to send, when an EMPTY parts
> was detected. But considering a case of incremental sends, if we choose
> the first one, when a hole got punched into the file after the initial
> send, the data will be unchanged on the receiving side when received
> incrementally. So the second choice is right.
>
> Signed-off-by: Cheng Yang <[email protected]>
> ---
> fs/btrfs/send.c | 60 ++++++++++++++++++++++++++++++++++++++++++++++++++++++-
> fs/btrfs/send.h | 3 +-
> 2 files changed, 61 insertions(+), 2 deletions(-)
>
> diff --git a/fs/btrfs/send.c b/fs/btrfs/send.c
> index 5445454..31e9aef 100644
> --- a/fs/btrfs/send.c
> +++ b/fs/btrfs/send.c
> @@ -3585,6 +3585,52 @@ out:
> return ret;
> }
>
> +static int send_punch(struct send_ctx *sctx, u64 offset, u32 len)
> +{
> + int ret = 0;
> + struct fs_path *p;
> + mm_segment_t old_fs;
> +
> + p = fs_path_alloc(sctx);
> + if (!p)
> + return -ENOMEM;
> +
> + /*
> + * vfs normally only accepts user space buffers for security reasons.
> + * we only read from the file and also only provide the read_buf buffer
> + * to vfs. As this buffer does not come from a user space call, it's
> + * ok to temporary allow kernel space buffers.
> + */
> + old_fs = get_fs();
> + set_fs(KERNEL_DS);
> +
> +verbose_printk("btrfs: send_fallocate offset=%llu, len=%d\n", offset, len);
> +
> + ret = open_cur_inode_file(sctx);
> + if (ret < 0)
> + goto out;
> +
> + ret = begin_cmd(sctx, BTRFS_SEND_C_PUNCH);
> + if (ret < 0)
> + goto out;
> +
> + ret = get_cur_path(sctx, sctx->cur_ino, sctx->cur_inode_gen, p);
> + if (ret < 0)
> + goto out;
> +
> + TLV_PUT_PATH(sctx, BTRFS_SEND_A_PATH, p);
> + TLV_PUT_U64(sctx, BTRFS_SEND_A_FILE_OFFSET, offset);
> + TLV_PUT_U64(sctx, BTRFS_SEND_A_SIZE, len);
> +
> + ret = send_cmd(sctx);
> +
> +tlv_put_failure:
> +out:
> + fs_path_free(sctx, p);
> + set_fs(old_fs);
> + return ret;
> +}
> +
> /*
> * Read some bytes from the current inode/file and send a write command to
> * user space.
> @@ -3718,6 +3764,7 @@ static int send_write_or_clone(struct send_ctx *sctx,
> u64 pos = 0;
> u64 len;
> u32 l;
> + u64 bytenr;
> u8 type;
>
> ei = btrfs_item_ptr(path->nodes[0], path->slots[0],
> @@ -3731,8 +3778,19 @@ static int send_write_or_clone(struct send_ctx *sctx,
> * sure to send the whole thing
> */
> len = PAGE_CACHE_ALIGN(len);
> - } else {
> + } else if (type == BTRFS_FILE_EXTENT_REG) {
> len = btrfs_file_extent_num_bytes(path->nodes[0], ei);
> + bytenr = btrfs_file_extent_disk_bytenr(path->nodes[0], ei);
> + if (bytenr == 0) {
> + ret = send_punch(sctx, offset, len);
> + goto out;
> + }
> + } else if (type == BTRFS_FILE_EXTENT_PREALLOC) {
> + len = btrfs_file_extent_num_bytes(path->nodes[0], ei);
> + ret = send_punch(sctx, offset, len);
> + goto out;
> + } else {
> + BUG();
> }

Are these two cases really the same? In the bytenr == 0 case we want to
deallocate the range, while in the prealloc case we want to ensure
disk space allocation. Or am I mistaken?

Looking at the receive side, you use the same command for both:

    ret = fallocate(r->write_fd,
                    FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE,
                    offset, len);

Looking at the btrfs_fallocate code, in that case it will always punch the hole:

    static long btrfs_fallocate(struct file *file, int mode,
                                loff_t offset, loff_t len)
    ...
        if (mode & FALLOC_FL_PUNCH_HOLE)
            return btrfs_punch_hole(inode, offset, len);

So maybe you should have two different commands, or add a flag to
distinguish between the two cases?
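
To make the distinction concrete, here is a minimal user-space sketch of how a
receiver might treat the two cases differently. The names range_kind,
range_kind_to_mode and apply_range are hypothetical and exist only for
illustration; they are not part of the patch or of the receive code. The only
real interfaces used are fallocate(2) and its FALLOC_FL_* flags.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <linux/falloc.h>
#include <sys/types.h>

/* Hypothetical range kinds; not taken from the posted patch. */
enum range_kind {
	RANGE_HOLE,     /* disk_bytenr == 0: no extent backs the range */
	RANGE_PREALLOC, /* prealloc extent: space is reserved, reads as zeros */
};

/* Map a range kind to the fallocate(2) mode a receiver might pass. */
int range_kind_to_mode(enum range_kind kind)
{
	switch (kind) {
	case RANGE_HOLE:
		/* Deallocate the range without changing the file size. */
		return FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE;
	case RANGE_PREALLOC:
		/* Reserve disk space without changing the file size. */
		return FALLOC_FL_KEEP_SIZE;
	}
	return -1; /* unknown kind */
}

/* Apply the range to an open file; returns 0 on success, -1 on error. */
int apply_range(int fd, enum range_kind kind, off_t offset, off_t len)
{
	int mode = range_kind_to_mode(kind);

	if (mode < 0)
		return -1;
	return fallocate(fd, mode, offset, len);
}
```

With something like this, a hole would be applied as
apply_range(fd, RANGE_HOLE, offset, len) and a preallocated extent as
apply_range(fd, RANGE_PREALLOC, offset, len). One caveat I have not checked:
on an incremental receive, if the range previously held data, a plain
allocation would presumably leave that data in place, so the prealloc case
might need a punch followed by an allocation.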
Thanks,
Alex.
>
> if (offset + len > sctx->cur_inode_size)
> diff --git a/fs/btrfs/send.h b/fs/btrfs/send.h
> index 1bf4f32..659ac8f 100644
> --- a/fs/btrfs/send.h
> +++ b/fs/btrfs/send.h
> @@ -20,7 +20,7 @@
> #include "ctree.h"
>
> #define BTRFS_SEND_STREAM_MAGIC "btrfs-stream"
> -#define BTRFS_SEND_STREAM_VERSION 1
> +#define BTRFS_SEND_STREAM_VERSION 2
>
> #define BTRFS_SEND_BUF_SIZE (1024 * 64)
> #define BTRFS_SEND_READ_SIZE (1024 * 48)
> @@ -80,6 +80,7 @@ enum btrfs_send_cmd {
> BTRFS_SEND_C_WRITE,
> BTRFS_SEND_C_CLONE,
>
> + BTRFS_SEND_C_PUNCH,
> BTRFS_SEND_C_TRUNCATE,
> BTRFS_SEND_C_CHMOD,
> BTRFS_SEND_C_CHOWN,
> --
> 1.7.7.6
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html