All data is backed by the Kafka cluster. Data that is stored locally is basically a cache, and Kafka Streams will recreate the local data if you lose it.
Thus, I am not sure how the KTable data could be stale. One possibility might be a misconfiguration: I assume that you read the topic directly as a table (i.e., builder.table("topic")). If you do this, the input topic must be configured with log compaction. If it is configured with retention instead, you might lose data from the input topic, and if you also lose the local cache, Kafka Streams cannot recreate the local state because it was deleted from the topic (log compaction guards the input topic against data loss).

-Matthias

On 12/24/18 12:22 PM, Edmondo Porcu wrote:
> Hello Kafka users,
>
> we are running Kafka Streams as a fully stateless application, meaning
> that we are not persisting /tmp/kafka-streams on a durable volume but we
> are rather losing it at each restart. This application performs a
> KTable-KTable join of data coming from Kafka Connect, and sometimes we
> want to force the output to tick, so we update records in the right
> table from the database, but we see that the left table is "stale".
>
> Is it possible that, because of reboots, the application loses some
> messages? How is the state reconstructed when /tmp/kafka-streams is not
> available? Is the state saved in an intermediate topic?
>
> Thanks,
> Edmondo