On Thu, Mar 06, 2014 at 04:37:47pm +0100, Jose A. Lopes wrote:
> > >>What happens if one doesn't want to allocate any space on
> > >>the node? Wouldn't it better for the mechanism to directly
> > >>pass the image to the target disk, if the admin chooses so?
> > >>(e.g.: by doing curl <location> | dd <target_disk)
> > >Is 'dd' used to limit the maximum file size, given that curl cannot do
> > >this reliably?
> > 
> > We do this to buffer curl's output and send it to the disk
> > in bigger chunks, by tuning dd's bs parameter. Also, in
> > general we find that using oflag=direct with dd helps with
> > not polluting the host page cache with streaming image data,
> > and has better performance.
> 
> Can you elaborate on 'buffer curl's output' ?
> 

Hello Jose,

If you do curl URL >/dev/sda, curl will write() directly to a file
descriptor for /dev/sda. So if curl has an internal buffer of, say, 4k,
every write to the disk will be a 4k write.

If you insert dd in the pipeline, it no longer matters how often curl
invokes write() or with what size; you can tune the size of dd's output
buffer independently.

I'm playing with "strace dd ibs=1M obs=1M". If I just type a few
characters at a time, dd buffers them and only write()s when the buffer
is full, or the input ends (EOF).
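
The same experiment can be reproduced without typing by hand. This is
only a minimal sketch, assuming GNU dd: the function name reblock_demo
and the 4k/64k sizes are made up for illustration, and the first dd
merely stands in for curl as a producer of small writes.

```shell
#!/bin/sh
# Sketch: dd re-blocks many small pipe writes into one large output write.
# reblock_demo and the chosen sizes are illustrative assumptions.

reblock_demo() {
    out=$(mktemp)
    stats=$(mktemp)
    # Producer: sixteen separate 4 KiB writes into the pipe (stand-in for curl).
    # Consumer: dd fills a 64 KiB output buffer (obs=64k) and calls write()
    # only when that buffer is full or the input reaches EOF.
    LC_ALL=C dd if=/dev/zero bs=4k count=16 2>/dev/null |
        LC_ALL=C dd of="$out" ibs=4k obs=64k 2>"$stats"
    # dd reports "<full>+<partial> records out" on stderr; show that line.
    grep 'records out' "$stats"
    rm -f "$out" "$stats"
}

reblock_demo
```

With GNU dd this should report a single full output record for the 64 KiB
of input, even though the producer wrote it in sixteen pieces.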

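In our case this is important, because we do direct I/O writes to the
disk when writing out the image. With direct I/O there is no page cache
to coalesce small writes, so every write() reaches the device at exactly
the size it was issued.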
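
For illustration, such a pipeline could be wrapped roughly as follows.
This is a sketch under assumptions, not our actual code: write_image,
IMAGE_URL, /dev/sdX and the 64k/1M sizes are hypothetical, and oflag= is
a GNU dd option. Note that ibs and obs are set separately on purpose;
with a single bs= dd writes each input block out as it was read, so
short reads from the pipe would turn into short writes.

```shell
#!/bin/sh
# Sketch: read an image from stdin and write it to a target in large blocks.
# write_image and its defaults are hypothetical names for illustration.

write_image() {
    target=$1
    flags=${2:-}    # optional GNU dd output flags, e.g. "direct"
    if [ -n "$flags" ]; then
        # obs=1M: collect pipe input and write it in 1 MiB blocks.
        # oflag=direct (GNU dd): open the target with O_DIRECT.
        dd of="$target" ibs=64k obs=1M oflag="$flags"
    else
        dd of="$target" ibs=64k obs=1M
    fi
}

# Hypothetical usage (IMAGE_URL and /dev/sdX are placeholders):
#   curl -fsS "$IMAGE_URL" | write_image /dev/sdX direct
```

Keeping the flags optional makes it possible to try the function against
a regular file first, where direct I/O may not be supported.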

Hope this makes it clearer,
Vangelis.

