Good point. We've only got two disks per node and two topics, so I was planning
to have one partition per disk.

Our workload is very write heavy so I'm mostly concerned about write 
throughput. Will we get write speed improvements by sticking to 1 
partition/disk or will the difference between 1 and 3 partitions/node be 
negligible?
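
For reference, here is a rough sketch of how I understand partition placement
across disks to work. It assumes the broker lists one directory per disk in
log.dirs and puts each new partition in the directory that currently holds the
fewest partitions. The paths and partition names are made up for illustration,
and this is a toy model in Python, not Kafka's actual code.

```python
def assign_partitions(log_dirs, partitions):
    """Place each partition in the directory that currently holds the fewest."""
    placement = {d: [] for d in log_dirs}
    for p in partitions:
        # min() returns the first directory on a tie, so ties go in list order.
        target = min(log_dirs, key=lambda d: len(placement[d]))
        placement[target].append(p)
    return placement


# Hypothetical layout: two disks per node, one log directory per disk.
log_dirs = ["/disk1/kafka-logs", "/disk2/kafka-logs"]

# Two topics with one partition each on this node: one partition per disk.
two = assign_partitions(log_dirs, ["topicA-0", "topicB-0"])

# Six partitions on the same node: three per disk, so each disk has three
# append-only logs competing for the write head instead of one.
six = assign_partitions(log_dirs, [f"topicA-{i}" for i in range(6)])

for d in log_dirs:
    print(d, two[d], six[d])
```

Under those assumptions, two partitions per node gives each disk a single log
to append to, while more partitions per node means several logs per disk.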

> On 24/06/2014, at 9:42 pm, Paul Mackles <pmack...@adobe.com> wrote:
> 
> You'll want to account for the number of disks per node. Normally,
> partitions are spread across multiple disks. Even more important, the OS
> file cache reduces the amount of seeking provided that you are reading
> mostly sequentially and your consumers are keeping up.
> 
>> On 6/24/14 3:58 AM, "Daniel Compton" <d...@danielcompton.net> wrote:
>> 
>> I've been reading the Kafka docs and one thing that I'm having trouble
>> understanding is how partitions affect sequential disk IO. One of the
>> reasons Kafka is so fast is that you can do lots of sequential IO with
>> read-ahead cache and all of that goodness. However, if your broker is
>> responsible for say 20 partitions, then won't the disk be seeking to 20
>> different spots for its writes and reads? I thought that maybe letting
>> the OS handle fsync would make this less of an issue but it still seems
>> like it could be a problem.
>> 
>> In our particular situation, we are going to have 6 brokers, 3 in each
>> DC, with mirror maker replication from the secondary DC to the primary
>> DC. We aren't likely to need to add more nodes for a while so would it be
>> faster to have 1 partition/node than say 3-4/node to minimise the seek
>> times on disk?
>> 
>> Are my assumptions correct or is this not an issue in practice? There are
>> some nice things about having more partitions like rebalancing more
>> evenly if we lose a broker but we don't want to make things significantly
>> slower to get this.
>> 
>> Thanks, Daniel.
> 
