[
https://issues.apache.org/jira/browse/KAFKA-4113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16535133#comment-16535133
]
Matthias J. Sax commented on KAFKA-4113:
----------------------------------------
[~graphex] What you report is a bug that is tracked via KAFKA-3514 – it's at the
top of the list of things we want to fix! I completely agree that KAFKA-3514
breaks the expected behavior that a KTable with older record timestamps should
be loaded before processing of stream records starts. The difference is that
KAFKA-3514 is a bug, while the request discussed here is a semantic change
request – thus, I think it's best to discuss both separately.
> Allow KTable bootstrap
> ----------------------
>
> Key: KAFKA-4113
> URL: https://issues.apache.org/jira/browse/KAFKA-4113
> Project: Kafka
> Issue Type: New Feature
> Components: streams
> Reporter: Matthias J. Sax
> Assignee: Guozhang Wang
> Priority: Major
>
> On the mailing list, there are multiple requests about the possibility to
> "fully populate" a KTable before actual stream processing starts.
> Even if it is somewhat difficult to define when the initial populating phase
> should end, there are multiple possibilities:
> The main idea is that there is a rarely updated topic that contains the
> data. Only after this topic has been read completely and the KTable is ready
> should the application start processing. This implies that, on startup,
> the current partition sizes (end offsets) must be fetched and stored, and
> after the KTable has been populated up to those offsets, stream processing
> can start.
> Other discussed ideas are:
> 1) an initial fixed time period for populating
> (it might be hard for a user to estimate the correct value)
> 2) an "idle" period, i.e., if a KTable receives no update for a certain
> time, we consider it as populated
> 3) a timestamp cut-off point, i.e., all records with an older timestamp
> belong to the initial populating phase
> The API change is not decided yet, and the API design is part of this JIRA.
> One suggestion (for option 3)) was:
> {noformat}
> // Populate the table without reading any other topics until a record
> // with timestamp 1000 is seen.
> KTable table = builder.table("topic", 1000);
> {noformat}
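For illustration only, the "main idea" from the description (store the end offsets at startup, then wait until the table has been read up to them) could be sketched roughly as below. This is not an existing or proposed Kafka Streams API: BootstrapTracker and its methods are hypothetical names, partitions are represented as plain ints to keep the sketch free of dependencies, and "position" is assumed to mean the offset of the next record to read. In a real client the snapshot might be taken with the consumer's endOffsets() call.

```java
import java.util.HashMap;
import java.util.Map;

/**
 * Hypothetical helper (not part of Kafka): tracks whether a table topic has
 * been read up to the end offsets that were captured at startup.
 */
final class BootstrapTracker {
    // partition -> end offset captured at startup, for partitions not yet caught up
    private final Map<Integer, Long> pending = new HashMap<>();

    BootstrapTracker(Map<Integer, Long> endOffsetsAtStartup) {
        for (Map.Entry<Integer, Long> e : endOffsetsAtStartup.entrySet()) {
            // An end offset of 0 means the partition was empty: nothing to wait for.
            if (e.getValue() > 0) {
                pending.put(e.getKey(), e.getValue());
            }
        }
    }

    /** Report the current read position (offset of the next record to read). */
    void updatePosition(int partition, long position) {
        Long target = pending.get(partition);
        if (target != null && position >= target) {
            pending.remove(partition);
        }
    }

    /** True once every partition has reached its startup end offset. */
    boolean isBootstrapped() {
        return pending.isEmpty();
    }
}
```

Records appended to the table topic after startup do not move the target, so the bootstrap phase always terminates once the snapshot offsets are reached; stream processing would only begin after isBootstrapped() returns true.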