A question about something that has always been in the back of my mind.

According to Jay Kreps:

> The first [reason that Kafka is so fast despite writing to disk] is that 
> Kafka does only sequential file I/O.

I wonder how true this statement is, because Kafka keeps three files per
segment (the log plus its offset and time indexes), so even with a single
topic and a single partition per broker and disk, the writes would not be
strictly sequential. Now say we have 1000 partitions per broker/disk, i.e.
3000 files just for the active segments. How can concurrent/interleaved
writes to thousands of files on a single disk be considered sequential
file I/O?
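
To make the arithmetic concrete, here is a toy sketch (my own code, not
Kafka's; the log dir path and the per-partition layout of *.log / *.index /
*.timeindex files are assumptions) that just counts the files under a
broker's log dir:

    import java.io.IOException;
    import java.nio.file.Files;
    import java.nio.file.Path;
    import java.nio.file.Paths;
    import java.util.stream.Stream;

    public class CountLogFiles {
        public static void main(String[] args) throws IOException {
            // Assumed layout: one directory per topic-partition under the
            // broker's log dir, each holding *.log, *.index and *.timeindex
            // segment files.
            Path logDir = Paths.get(args.length > 0 ? args[0] : "/var/kafka-logs");
            try (Stream<Path> partitionDirs = Files.list(logDir)) {
                long files = partitionDirs
                        .filter(Files::isDirectory)
                        .flatMap(CountLogFiles::list)
                        .filter(Files::isRegularFile)
                        .count();
                System.out.println(files + " files under " + logDir);
            }
        }

        private static Stream<Path> list(Path dir) {
            try {
                return Files.list(dir);
            } catch (IOException e) {
                return Stream.empty();
            }
        }
    }

With 1000 partitions that count is in the thousands, and all of those files
are live write targets.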

Isn't the reason Kafka is so fast despite writing to disk the fact that it
does not fsync to disk, leaving that to the OS? The OS would, I assume, be
smart enough to order the writes when it flushes its page cache to disk in a
way that minimizes random seeks. But then, wouldn't the manner in which Kafka
writes to its files be more or less irrelevant? Or, put differently: if Kafka
were synchronously flushing to disk, wouldn't it have to limit itself to
writing all partitions for a broker/disk to a single file in order to do
sequential file I/O?
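
If it helps to see what I mean, here is a minimal NIO sketch (my own toy
code, not Kafka's; the file name and the batch contents are made up)
contrasting the two behaviours: appending and letting the page cache decide
when to flush, versus fsync'ing every write:

    import java.io.IOException;
    import java.nio.ByteBuffer;
    import java.nio.channels.FileChannel;
    import java.nio.file.Paths;
    import java.nio.file.StandardOpenOption;

    public class AppendWithAndWithoutFsync {
        public static void main(String[] args) throws IOException {
            ByteBuffer batch = ByteBuffer.wrap("some record batch".getBytes());
            try (FileChannel log = FileChannel.open(Paths.get("segment.log"),
                    StandardOpenOption.CREATE, StandardOpenOption.WRITE,
                    StandardOpenOption.APPEND)) {

                // Variant A: append and return. The bytes land in the OS page
                // cache; the kernel decides later how and in what order they
                // reach the disk.
                log.write(batch.duplicate());

                // Variant B: append and fsync. force() blocks until the data
                // has actually been written to the device, so every call pays
                // the physical-I/O cost up front.
                log.write(batch.duplicate());
                log.force(true);
            }
        }
    }

My assumption is that in variant A the application's write pattern barely
matters, because the kernel is free to reorder and batch the flushes, whereas
in variant B it would matter a great deal.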

For reading (historical, non-real-time) data that is not in the OS page
cache, the statement of course makes sense: keeping the data in append-only
files means it can be read back sequentially.
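
As a toy illustration (again my own sketch, with an arbitrary file name and
buffer size), a cold read of such an append-only segment is just one forward
scan, which the disk can stream sequentially:

    import java.io.IOException;
    import java.nio.ByteBuffer;
    import java.nio.channels.FileChannel;
    import java.nio.file.Paths;
    import java.nio.file.StandardOpenOption;

    public class ColdSegmentScan {
        public static void main(String[] args) throws IOException {
            ByteBuffer buf = ByteBuffer.allocate(1 << 20); // 1 MiB read buffer
            try (FileChannel log = FileChannel.open(Paths.get("segment.log"),
                    StandardOpenOption.READ)) {
                // The file position only ever moves forward, so the access
                // pattern is a single sequential scan.
                while (log.read(buf) != -1) {
                    buf.clear(); // a real consumer would decode the batch here
                }
                System.out.println("Scanned " + log.position() + " bytes in order");
            }
        }
    }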
