Hello there,

We ran into a situation on our dev Kafka cluster (3 nodes, v0.8.2) where we ran 
out of disk space on one of the nodes. To free up disk space, we reduced 
log.retention.hours to something more manageable (from 72 hours to 52 hours) and 
moved the log directory to a 200GB disk. We did this on all 3 nodes.

Now we are preparing for production and would like to understand how this 
works in Kafka. As our data grows over time, we would like to mount more 
storage (in chunks of 200GB at a time) and have our topic's storage expand into 
the newly mounted directories. Kafka, by design, supports adding more storage 
space on an as-needed basis. For example, say we have a topic "myTopic" and we 
estimate that 200GB is reasonable storage to begin with, so we mount it on the 
path /kafkastore1. Later, we realize that this is not sufficient and we need to 
add another 200GB chunk of storage, which we mount on /kafkastore2.

How do I configure Kafka so that my data expands into additional directories 
as I add more space?

To be specific, say we configure log.dirs as the following:

log.dirs=/kafkastore1,/kafkastore2 (comma separated)

So, when I create my topic "myTopic", can I just create it with partitions = 1 
(kafka-topics.sh --create --topic myTopic --partitions 1)?

In this case, my questions are:

(1) Would the data that belongs to myTopic automatically expand into 
/kafkastore2 once /kafkastore1 is completely full?
(2) If not, do we have to create the topic with multiple partitions? If we have 
to create multiple partitions, how can we ensure the order of the messages for 
the consumer (first published, first consumed)? We need to consume the data in 
the same order in which it is published.
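
For context, my reading of the Kafka documentation is that each newly created 
partition is placed, as a whole, in the log.dirs directory that currently holds 
the fewest partitions. Below is a small Python simulation of that rule as I 
understand it. It is only a sketch, not Kafka code: the function names are 
hypothetical, and the tie-breaking (the first directory listed wins) is my own 
assumption.

```python
# Illustrative simulation only; this is not Kafka code.
# Assumption: a new partition goes to the directory with the fewest partitions.
# Assumption: on a tie, the first directory listed in log.dirs is chosen.

def pick_log_dir(log_dirs, partitions_per_dir):
    """Return the directory that currently holds the fewest partitions."""
    return min(log_dirs, key=lambda d: partitions_per_dir[d])

def place_partitions(log_dirs, partitions):
    """Assign each partition to a directory, one partition at a time."""
    counts = {d: 0 for d in log_dirs}
    placement = {}
    for partition in partitions:
        target = pick_log_dir(log_dirs, counts)
        placement[partition] = target
        counts[target] += 1
    return placement

log_dirs = ["/kafkastore1", "/kafkastore2"]

# One partition: everything for myTopic lands in a single directory.
print(place_partitions(log_dirs, ["myTopic-0"]))

# Two partitions: one partition per directory.
print(place_partitions(log_dirs, ["myTopic-0", "myTopic-1"]))
```

If this reading is right, a topic with a single partition would sit entirely 
in one directory, which is the reason for the two questions above.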


Thanks,
avi lele


