On Sun, Jul 24, 2016 at 1:06 PM, Gleb Natapov <g...@scylladb.com> wrote:

>
> > So it appears that unless you're using mmap(), OSv does *not* have any
> > writeback buffer, not even for ZFS filesystems: If you do a write() system
> > call to write 10 bytes, you will create a 10-byte disk operation; If I
> > understand you correctly, there is no attempt to somehow coalesce many
> > small writes to one operation, nor is there any attempt to reorder the I/O
> > operations to better fit some assumptions on disk performance.
> >
> No, unless you're using mmap() _our_ write page cache will not be used,
>

Yes, that we already understood after your previous clarifications on the
bug tracker. None of Benoit's tests are using mmap(), so our pagecache.cc
is irrelevant to this discussion.


> instead write will go directly into zfs layer which should not generate
> disk write for each write operation unless it is extremely stupid (which
> I doubt) or mounted/configured incorrectly or we use some kind of wrong
> write.
>

Ok, so I think that what Benoit was seeing is that ZFS *is* being extremely
stupid and generating many small writes.

But, Benoit, is it possible that ZFS *is* trying to coalesce writes, just
somehow misconfigured to aim at the wrong size? What I mean is, if you do
write() operations of 10 bytes, will you see 10-byte operations, or are
they still coalesced to 512-byte sizes?
If there is coalescing, just to 512 bytes (not 4096 as you would have
hoped), then clearly ZFS can coalesce the writes but is somehow
misconfigured to make them so small.
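
A minimal sketch of such a test, assuming a plain POSIX environment (the
helper name small_writes and its parameters are made up for illustration and
are not taken from Benoit's actual tests): it issues many small write() calls
and then fsync()s, so the sizes of the resulting disk operations can be
observed from outside the program.

```c
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Issue `count` write() calls of `chunk` bytes each to `path`, then fsync().
 * Returns the total number of bytes written, or -1 on any error.
 * Hypothetical helper: name and parameters exist only for this sketch. */
long small_writes(const char *path, size_t chunk, int count)
{
    char buf[4096];

    /* Only chunk sizes that fit in the local buffer are supported. */
    if (chunk == 0 || chunk > sizeof(buf))
        return -1;
    memset(buf, 'x', chunk);

    /* No O_DIRECT and no mmap(): just ordinary buffered-looking writes. */
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;

    long total = 0;
    for (int i = 0; i < count; i++) {
        ssize_t n = write(fd, buf, chunk);
        if (n != (ssize_t)chunk) {   /* treat short writes as failure */
            close(fd);
            return -1;
        }
        total += n;
    }

    /* Force everything out so the disk operations actually happen. */
    if (fsync(fd) != 0) {
        close(fd);
        return -1;
    }
    close(fd);
    return total;
}
```

For example, small_writes("/tmp/small.dat", 10, 1000) should return 10000.
Whether the disk then sees 10-byte, 512-byte or larger operations cannot be
told from inside the program; it has to be observed on the host side, with
whatever block-level tracing is available there.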


> > So basically in OSv, any write() to ZFS is using O_DIRECT even if you
> > didn't ask for that.
> No. Not intentional anyway.
>

Oh. Good to know.
