[
https://issues.apache.org/jira/browse/SAMZA-679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14938233#comment-14938233
]
Edi Bice commented on SAMZA-679:
--------------------------------
Apparently that was not enough: it turns out Kafka ships with the log
cleaner (which performs compaction) turned off. Once I added
log.cleaner.enable=true to the broker configuration, it compacted the
__samza_coordinator_xxx topics down to a few megabytes.
> Optimize CoordinatorStream's bootstrap mechanism
> ------------------------------------------------
>
> Key: SAMZA-679
> URL: https://issues.apache.org/jira/browse/SAMZA-679
> Project: Samza
> Issue Type: Sub-task
> Reporter: Naveen Somasundaram
> Fix For: 0.10.0
>
>
> At present, when we bootstrap using the CoordinatorStreamConsumer, we read
> all the messages into a set. That is fine if log compaction is working, but
> given that:
> 1. Log compaction can be turned off or broken for whatever reason
> 2. There is a time interval between compactions
> we should consider fixing the bootstrap method to hold only the latest
> checkpoint (overriding equals and hashCode for the set's elements is one way
> to go about it)
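
A minimal sketch of the idea in the description, not the actual Samza code:
Message, LatestOnlyBootstrap and bootstrap() are hypothetical names made up
for illustration. Equality is based on the message key only, so a later
message with the same key replaces the earlier one and the set stays small
even when the topic has not been compacted.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class LatestOnlyBootstrap {
    /** Hypothetical stand-in for a coordinator stream message. */
    static final class Message {
        final String key;
        final String value;

        Message(String key, String value) {
            this.key = key;
            this.value = value;
        }

        // Two messages are "the same" when their keys match; the value is ignored.
        @Override
        public boolean equals(Object o) {
            return o instanceof Message && ((Message) o).key.equals(key);
        }

        @Override
        public int hashCode() {
            return key.hashCode();
        }
    }

    /** Reads the stream in order and keeps only the newest message per key. */
    static Set<Message> bootstrap(Iterable<Message> stream) {
        Set<Message> latest = new LinkedHashSet<>();
        for (Message m : stream) {
            // Set.add() keeps the existing element when an equal one is present,
            // so the older message has to be removed first.
            latest.remove(m);
            latest.add(m);
        }
        return latest;
    }

    public static void main(String[] args) {
        List<Message> stream = Arrays.asList(
            new Message("checkpoint-task-0", "offset=10"),
            new Message("checkpoint-task-1", "offset=7"),
            new Message("checkpoint-task-0", "offset=42"));
        for (Message m : bootstrap(stream)) {
            System.out.println(m.key + " -> " + m.value);
        }
    }
}
```

Note that overriding equals and hashCode alone is not enough with a plain
java.util.Set, because add() does not overwrite an equal element; hence the
explicit remove() above. A Map keyed by the message key would be an
equivalent alternative.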