On 02/17/2016 11:58 PM, Eric Blake wrote:
> On 02/17/2016 11:10 AM, Denis V. Lunev wrote:
>> This patch proposes a new command to reduce the amount of data passed
>> through the wire when it is known that the data is all zeroes. This
>> functionality is generally useful for mirroring or backup operations.
>>
>> Currently available NBD_CMD_TRIM command can not be used as the
>> specification explicitely says that "a client MUST NOT make any
> s/explicitely/explicitly/
>
>> assumptions about the contents of the export affected by this
>> [NBD_CMD_TRIM] command, until overwriting it again with `NBD_CMD_WRITE`"
>>
>> Particular use case could be the following:
>>
>> QEMU project uses own implementation of NBD server to transfer data
>> in between different instances of QEMU. Typically we tranfer VM virtual
> s/tranfer/transfer/
>
>> disks over this channel. VM virtual disks are sparse and thus the
>> efficiency of backup and mirroring operations could be improved a lot.
>>
>> Signed-off-by: Denis V. Lunev <[email protected]>
>> ---
>> doc/proto.md | 7 +++++++
>> 1 file changed, 7 insertions(+)
>>
>> diff --git a/doc/proto.md b/doc/proto.md
>> index 43065b7..c94751a 100644
>> --- a/doc/proto.md
>> +++ b/doc/proto.md
>> @@ -241,6 +241,8 @@ immediately after the global flags field in oldstyle
>> negotiation:
>> schedule I/O accesses as for a rotational medium
>> - bit 5, `NBD_FLAG_SEND_TRIM`; should be set to 1 if the server supports
>> `NBD_CMD_TRIM` commands
>> +- bit 6, `NBD_FLAG_SEND_WRITE_ZEROES`; should be set to 1 if the server
>> + supports `NBD_CMD_WRITE_ZEROES` commands
>>
>> ##### Client flags
>>
>> @@ -446,6 +448,11 @@ The following request types exist:
>> about the contents of the export affected by this command, until
>> overwriting it again with `NBD_CMD_WRITE`.
>>
>> +* `NBD_CMD_WRITE_ZEROES` (6)
>> +
>> + A request to write zeroes. The command is functional equivalent of
>> + the NBD_WRITE_COMMAND but without payload sent through the channel.
> This lets us push holes during writes.
From my point of view, this lets the client apply its own policy. For a QCOW2
output target, the client could skip the block. For a RAW file, it could decide
whether to use UNMAP and produce a sparse file, or to use fallocate.
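
A minimal sketch of that policy, only to make the choice concrete. All names
here (sk_target_format, sk_zero_policy, sk_zero_policy_for, prefer_sparse) are
hypothetical and not taken from QEMU or the NBD sources. Skipping the block for
QCOW2 assumes the target cluster is unallocated and therefore already reads as
zeroes.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical target formats and policies, for illustration only. */
enum sk_target_format { SK_TARGET_QCOW2, SK_TARGET_RAW };
enum sk_zero_policy { SK_SKIP_BLOCK, SK_PUNCH_HOLE, SK_ALLOCATE_ZEROED };

/*
 * Decide what to do with a "write zeroes" request on the receiving side.
 * prefer_sparse is an assumed knob: true means "UNMAP and keep the file
 * sparse", false means "keep the range allocated (fallocate)".
 */
static enum sk_zero_policy sk_zero_policy_for(enum sk_target_format fmt,
                                              bool prefer_sparse)
{
    switch (fmt) {
    case SK_TARGET_QCOW2:
        /* Assumes an unallocated cluster already reads back as zeroes. */
        return SK_SKIP_BLOCK;
    case SK_TARGET_RAW:
        return prefer_sparse ? SK_PUNCH_HOLE : SK_ALLOCATE_ZEROED;
    }
    assert(!"unknown target format");
    return SK_ALLOCATE_ZEROED;
}
```

The point is only that the decision stays with the receiving end, because the
request carries no payload that would force one particular behaviour.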
> Do we have the converse
> operation, that is, an easy way to query if a block of data will read as
> all zeroes, and therefore the client can bypass reading that portion of
> the disk (in other words, an equivalent to lseek(SEEK_HOLE/SEEK_DATA))?
>
exactly!

static uint64_t coroutine_fn mirror_iteration(MirrorBlockJob *s)
    ...
    /* query block state */
    ret = bdrv_get_block_status_above(source, NULL, sector_num,
                                      nb_sectors, &pnum, &file);
    if (ret < 0 || pnum < nb_sectors ||
        (ret & BDRV_BLOCK_DATA && !(ret & BDRV_BLOCK_ZERO))) {
        bdrv_aio_readv(source, sector_num, &op->qiov, nb_sectors,
                       mirror_read_complete, op);
    } else if (ret & BDRV_BLOCK_ZERO) {
        /* skip read op if allowed */
        bdrv_aio_write_zeroes(s->target, sector_num, op->nb_sectors,
                              s->unmap ? BDRV_REQ_MAY_UNMAP : 0,
                              mirror_write_complete, op);
    } else {
        assert(!(ret & BDRV_BLOCK_DATA));
        bdrv_aio_discard(s->target, sector_num, op->nb_sectors,
                         mirror_write_complete, op);
    }
    return delay_ns;

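
The three-way branch above can be isolated as a small pure function, which
makes the cases easier to see. This is a sketch, not QEMU code: the SK_* names
and sk_choose_action are hypothetical, and the flag values are stand-ins for
BDRV_BLOCK_DATA and BDRV_BLOCK_ZERO.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for QEMU's BDRV_BLOCK_DATA / BDRV_BLOCK_ZERO bits. */
#define SK_BLOCK_DATA 0x01
#define SK_BLOCK_ZERO 0x02

enum sk_mirror_action { SK_READ_AND_COPY, SK_WRITE_ZEROES, SK_DISCARD };

/*
 * ret:        block status result (negative on error, otherwise a flag mask)
 * pnum:       number of sectors the status is valid for
 * nb_sectors: number of sectors in the chunk being mirrored
 */
static enum sk_mirror_action sk_choose_action(int64_t ret, int pnum,
                                              int nb_sectors)
{
    assert(nb_sectors > 0);

    /* Error, partial status, or real non-zero data: read and copy. */
    if (ret < 0 || pnum < nb_sectors ||
        ((ret & SK_BLOCK_DATA) && !(ret & SK_BLOCK_ZERO))) {
        return SK_READ_AND_COPY;
    }
    /* Known to read as zeroes: no read needed, write zeroes on the target. */
    if (ret & SK_BLOCK_ZERO) {
        return SK_WRITE_ZEROES;
    }
    /* Neither data nor zero: the range is unallocated, discard on target. */
    return SK_DISCARD;
}
```

Only the SK_WRITE_ZEROES case needs the new NBD command. The read path and the
discard path already map onto existing requests.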
Actually, earlier today I tried adding a .bdrv_co_write_zeroes callback to the
NBD driver, and it works as expected. The problem is that the callback cannot
be implemented on top of NBD_CMD_TRIM and still conform to the NBD spec. In
QEMU -> QEMU communication, though, it just works.
http://lists.nongnu.org/archive/html/qemu-devel/2016-02/msg03810.html
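
For illustration, here is roughly what the proposed request would look like on
the wire: an ordinary request header with no payload after it. This is a
sketch under assumptions: it uses the classic 28-byte request layout (magic,
32-bit type, 64-bit handle, 64-bit offset, 32-bit length, all big-endian) and
the command number 6 proposed in the patch quoted above. The sk_* helper names
are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SK_NBD_REQUEST_MAGIC    0x25609513u
#define SK_NBD_CMD_WRITE_ZEROES 6u   /* number proposed in the patch */
#define SK_NBD_REQUEST_SIZE     28

/* Store a 32-bit value in network (big-endian) byte order. */
static void sk_put_be32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)(v >> 24);
    p[1] = (uint8_t)(v >> 16);
    p[2] = (uint8_t)(v >> 8);
    p[3] = (uint8_t)v;
}

/* Store a 64-bit value in network (big-endian) byte order. */
static void sk_put_be64(uint8_t *p, uint64_t v)
{
    sk_put_be32(p, (uint32_t)(v >> 32));
    sk_put_be32(p + 4, (uint32_t)v);
}

/*
 * Fill buf (at least SK_NBD_REQUEST_SIZE bytes) with a write-zeroes request.
 * Returns the number of bytes to send; unlike a write request, no data
 * follows the header.
 */
static size_t sk_encode_write_zeroes(uint8_t *buf, uint64_t handle,
                                     uint64_t offset, uint32_t length)
{
    assert(buf != NULL);
    sk_put_be32(buf, SK_NBD_REQUEST_MAGIC);
    sk_put_be32(buf + 4, SK_NBD_CMD_WRITE_ZEROES);
    sk_put_be64(buf + 8, handle);
    sk_put_be64(buf + 16, offset);
    sk_put_be32(buf + 24, length);
    return SK_NBD_REQUEST_SIZE;
}
```

So zeroing a 64 KiB range costs 28 bytes on the wire instead of 28 bytes plus
the 64 KiB payload a regular write would carry.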
Den
_______________________________________________
Nbd-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nbd-general