[
https://issues.apache.org/jira/browse/SAMZA-679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14937009#comment-14937009
]
Yi Pan (Data Infrastructure) commented on SAMZA-679:
----------------------------------------------------
Hi, [[email protected]], I saw that you have turned on compaction in the
coordinator stream configuration. Did you check whether compaction actually
happened? Kafka recently fixed KAFKA-1374, a bug that let a compacted log grow
forever if it contained a compressed message. If you are not using a
compression setting in your Kafka producer configuration, you might be
hitting some other issue.
> Optimize CoordinatorStream's bootstrap mechanism
> ------------------------------------------------
>
> Key: SAMZA-679
> URL: https://issues.apache.org/jira/browse/SAMZA-679
> Project: Samza
> Issue Type: Sub-task
> Reporter: Naveen Somasundaram
> Fix For: 0.10.0
>
>
> At present, when we bootstrap using the CoordinatorStreamConsumer, we read
> all the messages into a set. That is fine if log compaction is working, but
> given that:
> 1. Log compaction can be turned off or broken for whatever reason
> 2. There is a time interval between compactions
> We should consider fixing the bootstrap method to hold only the latest
> checkpoint (overriding equals and hashCode of the set's elements is one way
> to go about it).
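
The idea in the description above, keeping only the latest message per key during bootstrap so that memory use does not depend on log compaction, could be sketched roughly as follows. This is a minimal illustration and not Samza's actual implementation: the class LatestOnlyBootstrap, the Message type, and the key strings are hypothetical stand-ins, and it uses a map keyed by message key instead of the equals/hashCode override suggested in the issue.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: none of these names come from Samza.
final class LatestOnlyBootstrap {

    // Simplified stand-in for a coordinator stream message.
    static final class Message {
        final String key;    // e.g. identifies the task a checkpoint belongs to
        final String value;  // e.g. the serialized checkpoint

        Message(String key, String value) {
            this.key = key;
            this.value = value;
        }
    }

    // Replays the log in order and keeps one entry per key (last write wins),
    // so the result has the same size whether or not the log was compacted.
    static Map<String, String> bootstrap(Iterable<Message> log) {
        Map<String, String> latest = new LinkedHashMap<>();
        for (Message m : log) {
            latest.put(m.key, m.value); // overwrites any older value for this key
        }
        return latest;
    }
}
```

A map is used here because java.util.Set.add leaves the existing element in place when an equal one is already present, so a set with a key-only equals/hashCode would need a remove before each add to end up holding the newest message.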
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)