[ 
https://issues.apache.org/jira/browse/SAMZA-679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14937009#comment-14937009
 ] 

Yi Pan (Data Infrastructure) commented on SAMZA-679:
----------------------------------------------------

Hi, [[email protected]], I saw that you have turned on compaction in the 
coordinator stream configuration. Did you check whether compaction actually 
happened? Kafka recently fixed KAFKA-1374, a bug that let a compacted log grow 
forever if it contained a compressed message. If you are not using a 
compression setting in your Kafka producer configuration, you might be hitting 
some other issue.

> Optimize CoordinatorStream's bootstrap mechanism
> ------------------------------------------------
>
>                 Key: SAMZA-679
>                 URL: https://issues.apache.org/jira/browse/SAMZA-679
>             Project: Samza
>          Issue Type: Sub-task
>            Reporter: Naveen Somasundaram
>             Fix For: 0.10.0
>
>
> At present, when we bootstrap using the CoordinatorStreamConsumer, we read 
> all the messages into a set. This is fine if log compaction is working, but 
> given that:
> 1. Log compaction can be turned off or broken for whatever reason
> 2. There is a time interval between compactions
> We should consider fixing the bootstrap method to hold only the latest 
> checkpoint (overriding equals and hashCode for the set's elements is one way 
> to go about it)
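
As an illustration of the "hold only the latest" idea in the description above, 
here is a minimal sketch in Java. It is not Samza's actual code: the class 
LatestOnlyBootstrap, the Message type, and its type/key/value fields are 
hypothetical stand-ins for whatever the CoordinatorStreamConsumer reads. It also 
swaps the set for a map keyed by (type, key). This is a deliberate substitution, 
because java.util.Set.add leaves the existing element in place when an equal one 
is already present, so overriding equals and hashCode alone would keep the 
oldest message rather than the latest.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.LinkedHashMap;
import java.util.Map;

final class LatestOnlyBootstrap {

    /** Hypothetical stand-in for one coordinator stream message. */
    public static final class Message {
        public final String type;   // e.g. "checkpoint" (assumed field)
        public final String key;    // e.g. a task name (assumed field)
        public final String value;  // payload (assumed field)

        public Message(String type, String key, String value) {
            this.type = type;
            this.key = key;
            this.value = value;
        }
    }

    /**
     * Reads the log in order and keeps only the last message seen for each
     * (type, key) pair, so memory use is bounded by the number of distinct
     * keys even when the log has not been compacted.
     */
    public static Map<Map.Entry<String, String>, Message> bootstrap(Iterable<Message> log) {
        Map<Map.Entry<String, String>, Message> latest = new LinkedHashMap<>();
        for (Message m : log) {
            // put() overwrites any earlier message with the same (type, key).
            latest.put(new SimpleImmutableEntry<>(m.type, m.key), m);
        }
        return latest;
    }
}
```

With this shape, an uncompacted log containing several checkpoints for the same 
task would leave only the most recently read one in memory after bootstrap.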



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
