Messages for a topic are kept in the Kafka broker's memory before they are flushed to disk. The flush is triggered by either a message-count limit or a time limit, whichever is hit first. If you kill the Kafka server process before a message has been flushed to disk, that message is lost. The config parameters log.flush.interval, log.default.flush.scheduler.interval.ms, and log.default.flush.interval.ms (defined in kafka.server.KafkaConfig and documented at http://kafka.apache.org/configuration.html) should help clarify this.
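For reference, here is a minimal server.properties sketch using those three parameters. The property names are the ones cited above; the values are illustrative assumptions, not recommendations:

```properties
# Flush a partition's log to disk after this many messages have accumulated.
# Setting it to 1 flushes on every message (maximum durability, lower throughput).
log.flush.interval=1

# How often (in ms) the background flusher thread checks logs against the time limit.
log.default.flush.scheduler.interval.ms=1000

# Flush any log whose unflushed data is older than this many ms,
# even if the message-count limit has not been reached.
log.default.flush.interval.ms=1000
```

With these settings, unflushed data is bounded to at most one message or roughly one second of writes, whichever comes first, so a hard kill of the broker loses at most that window.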
Thanks,
Swapnil

On 2/19/13 3:28 AM, "Jason Huang" <[email protected]> wrote:

>Hello,
>
>I am confused about "log file flush". In my naive understanding, once
>a message is produced and sent to the kafka server, it will be written
>to the hard drive at the log file. Since it is in the hard drive
>already, what exactly do you mean by "log file flush"?
>
>I asked because we found that if we manually kill the zookeeper and
>kafka server processes, the messages stored in the log file will be
>lost. Is this expected behavior? Is there any setting to allow us keep
>all the existing messages once they are written to the log file?
>
>thanks,
>
>Jason
