On Jan 22, 2016, at 11:54 AM, James K. Lowden <jklowden at schemamania.org> 
wrote:
> 
> On Fri, 22 Jan 2016 06:24:08 +0000
> Simon Slavin <slavins at bigfraud.org> wrote:
> 
>> This is, of course, all about waiting for a rotating disc to be in
>> the right place.
> 
> All true, but I think you're exaggerating if you're implying that's
> what the user will see.  A call to write(2) doesn't necessarily involve
> the rotating media; it merely transfers the data from userspace to the
> kernel buffer cache (using Linux as an example).  Even fsync, on
> consumer-grade disks, may return when the data have been flushed to the
> device's cache, before they come to rest on the platter.  Both buffers
> ameliorate the effects of latency and track-to-track seek.  

First, SQLite *does* fsync() on each transaction commit before returning (at 
least with its default synchronous setting), on purpose, to provide the D in 
ACID:

  https://www.sqlite.org/lockingv3.html
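
To see the per-commit sync cost for yourself, here is a minimal sketch using 
Python's standard sqlite3 module.  The helper names (timed_inserts, compare), 
the table layout and the row count are all made up for illustration, and the 
size of the gap depends entirely on the storage underneath: on an SSD or a 
RAM-backed temp directory it may be small.

```python
import os
import sqlite3
import tempfile
import time


def timed_inserts(path, n, batch):
    """Insert n rows into a fresh database; return (elapsed seconds, row count).

    batch=False: every INSERT is its own transaction, so every row is a commit.
    batch=True:  all INSERTs share one transaction, so there is one commit.
    """
    # isolation_level=None puts the module in autocommit mode, so the
    # transaction boundaries are exactly the ones written out below.
    conn = sqlite3.connect(path, isolation_level=None)
    conn.execute("PRAGMA synchronous = FULL")  # sync on commit
    conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, v TEXT)")

    start = time.perf_counter()
    if batch:
        conn.execute("BEGIN")
    for i in range(n):
        conn.execute("INSERT INTO t (v) VALUES (?)", (str(i),))
    if batch:
        conn.execute("COMMIT")
    elapsed = time.perf_counter() - start

    count = conn.execute("SELECT COUNT(*) FROM t").fetchone()[0]
    conn.close()
    return elapsed, count


def compare(n=100):
    """Run both variants in a scratch directory and return their results."""
    with tempfile.TemporaryDirectory() as d:
        per_row, c1 = timed_inserts(os.path.join(d, "a.db"), n, batch=False)
        batched, c2 = timed_inserts(os.path.join(d, "b.db"), n, batch=True)
    return per_row, batched, c1, c2


if __name__ == "__main__":
    per_row, batched, c1, c2 = compare()
    print(f"one commit per row: {per_row:.3f} s for {c1} rows")
    print(f"one commit total:   {batched:.3f} s for {c2} rows")
```

Both runs store the same rows; the only difference is how many times SQLite 
has to wait for the commit to reach the disk.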

Second, even if you're using the sort of consumer-grade disk that lies about 
fsync [*] you still have seek time to cope with.  The track on disk where the 
data lands is probably not the track where the indices and other metadata 
structures live.  The head may have to go back and forth several times to 
complete a transaction.

Even when the disk lies about fsync, that cost eventually has to be paid.  If 
it's left unpaid too long, the write buffer fills up, and then SQLite will have 
to wait for buffer space to open up.
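
The two stages being argued about here (write() handing data to the kernel, 
fsync() pushing it toward the device) can be timed separately.  This is only 
a rough sketch: the function name and the 1 MiB default are my own choices, 
and the numbers say nothing about whether the drive itself reports completion 
honestly.

```python
import os
import tempfile
import time


def time_write_and_fsync(n_bytes=1 << 20):
    """Return (write seconds, fsync seconds, bytes written) for a temp file."""
    fd, path = tempfile.mkstemp()
    try:
        data = b"\0" * n_bytes
        written = 0

        t0 = time.perf_counter()
        # os.write may write fewer bytes than asked, so loop until done.
        while written < n_bytes:
            written += os.write(fd, data[written:])
        t1 = time.perf_counter()

        # Ask the OS to flush its cached copy of the file to the device.
        os.fsync(fd)
        t2 = time.perf_counter()
    finally:
        os.close(fd)
        os.remove(path)
    return t1 - t0, t2 - t1, written


if __name__ == "__main__":
    w, f, n = time_write_and_fsync()
    print(f"write: {w * 1000:.2f} ms, fsync: {f * 1000:.2f} ms, {n} bytes")
```

On rotating media I would expect the fsync() figure to dominate, but that is 
an expectation, not a measurement.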

> Given...the capacity of the raw disk (about 100 MB/s)

You mean transfer rate, not capacity, of course.

But you only get 100 MByte/sec in linear reads, not random writes, which is 
what multi-track writes effectively are.  Typical disks drop into the single 
digits of MByte/sec on random writes.
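
A back-of-the-envelope model shows where the single digits come from.  The 
100 MB/s linear rate is the figure quoted above; the 9 ms average seek, the 
7200 RPM spindle speed and the 64 KiB write size are assumptions I picked for 
illustration, not measurements of any particular drive.

```python
def random_write_throughput(chunk_bytes, seek_s, rpm, linear_bytes_per_s):
    """Estimate bytes/second when every chunk costs a seek plus a half rotation."""
    # On average the platter must turn half a revolution to reach the sector.
    rotational_latency_s = 0.5 * 60.0 / rpm
    # Time to actually move the data once the head is in position.
    transfer_s = chunk_bytes / linear_bytes_per_s
    return chunk_bytes / (seek_s + rotational_latency_s + transfer_s)


if __name__ == "__main__":
    # Assumed drive: 9 ms seek, 7200 RPM, 100 MB/s linear transfer.
    for chunk in (4 * 1024, 64 * 1024):
        rate = random_write_throughput(chunk, 0.009, 7200, 100e6)
        print(f"{chunk // 1024:3d} KiB chunks: {rate / 1e6:.2f} MB/s")
```

Under those assumptions 64 KiB random writes come out at roughly 4.7 MB/s, 
and 4 KiB writes at well under 1 MB/s, because nearly all of the time goes to 
positioning the head rather than transferring data.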


[*] See "Disks from the Perspective of a File System", by Marshall Kirk 
McKusick [**] in ACM Queue: https://queue.acm.org/detail.cfm?id=2367378 

[**] https://en.wikipedia.org/wiki/Marshall_Kirk_McKusick
