Hello there,

We ran into a situation on our dev Kafka cluster (3 nodes, v0.8.2) where one of the nodes ran out of disk space. To free up space, we reduced log.retention.hours from 72 to 52 and moved the log directory to a 200 GB disk. We made both changes on all 3 nodes.
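For reference, the change amounts to roughly the following in each broker's server.properties. This is only a sketch: the directory path is a placeholder I made up for this mail, not our actual mount point.

```properties
# Retention lowered from 72 hours to 52 hours
log.retention.hours=52

# Log directory moved to the 200 GB disk
# (hypothetical path, for illustration only)
log.dirs=/mnt/kafka-200gb
```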
Now we are preparing for production and would like to understand how storage expansion works in Kafka. As our data grows over time, we would like to mount more storage, 200 GB at a time, and have our topic's storage expand into the newly mounted directories.

Our understanding is that Kafka, by design, supports adding storage on an as-needed basis. For example, say we have a topic "myTopic" and we estimate that 200 GB is reasonable to begin with, so we mount that on /kafkastore1. Later we realize it is not sufficient, so we add another 200 GB and mount it on /kafkastore2. How do I configure Kafka so that the data expands into additional directories as I add more space?

To be specific, say we configure log.dirs as a comma-separated list:

    log.dirs=/kafkastore1,/kafkastore2

and I create "myTopic" with a single partition:

    kafka-topics.sh --create --topic myTopic --partitions 1

In this case, my questions are:

(1) Will the data that belongs to myTopic automatically expand into /kafkastore2 once /kafkastore1 is completely full?

(2) If not, do we have to create the topic with multiple partitions? If so, how can we preserve message order for the consumer (first published, first consumed)? We need to consume the data in the same order in which it was published.

Thanks,
avi lele