On 12/22/2010 11:35 AM, Eric Blake wrote:
> Right now, we create qemu snapshots to a file by using an exec: monitor
> command that passes 'compressor | { dd && dd; }' with stdout connected
> to the target file. However, since dd is using a larger bs= than
> PIPE_MAX, it is conceivable that under heavy machine load, that dd will
> get a short read from the pipe, and unless we use the GNU extension of
> iflag=fullblock, that short read will be padded out to the output block
> size and result in data corruption in the destination file.
>
> The possibility of corruption due to short reads when dd is fed input
> through a pipe but produces output through a large block size has
> occurred several times on the coreutils mailing list:
> http://lists.gnu.org/archive/html/bug-coreutils/2010-11/msg00092.html
> Coreutils currently obeys the (non-intuitive) POSIX requirements for
> this short read behavior, and while it is being considered to make dd
> when POSIXLY_CORRECT is unset be more sensible, you can't rely on that
> being a default.
>
> Should we refuse to make snapshots if we don't detect the GNU extension
> of 'dd iflag=fullblock'? Probably safe to do on GNU/Linux machines, but
> I'm wondering if it will have negative effects on BSD machines where GNU
> coreutils is not installed.I've gone back and read POSIX again. It states: If the bs=expr operand is specified and no conversions other than sync, noerror, or notrunc are requested, the data returned from each input block shall be written as a separate output block; if the read returns less than a full block and the sync conversion is not specified, the resulting output block shall be the same size as the input block. If the bs=expr operand is not specified, or a conversion other than sync, noerror, or notrunc is requested, the input shall be processed and collected into full-sized output blocks until the end of the input is reached. So I stand corrected - because we aren't using conv=sync, there is no zero padding or data corruption; however, behavior can still be improved. With bs=, a short read results in a short write, so we aren't using the full 1M block size that we asked of dd, and end up with more syscalls than necessary. Swapping to ibs= obs= instead of bs= allows dd to do fewer syscalls on output. Patch coming up soon, but I'm grateful that this is not a data corruption bug after all. And the GNU extension of iflag=fullblock is only useful when mixing bs= and conv=sync, so there's no need to use that extension. -- Eric Blake [email protected] +1-801-349-2682 Libvirt virtualization library http://libvirt.org
signature.asc
Description: OpenPGP digital signature
-- libvir-list mailing list [email protected] https://www.redhat.com/mailman/listinfo/libvir-list
