[ 
https://issues.apache.org/jira/browse/SAMZA-388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14110856#comment-14110856
 ] 

Yan Fang commented on SAMZA-388:
--------------------------------

+1 for the temporary solution. I posted a few nits in RB; feel free to commit. 
We may want to revisit this ticket after KAFKA-1374 is resolved, or open another 
ticket to track it so that we don't forget the update. Thank you.

> Log compaction on checkpoint topics fails with compression
> ----------------------------------------------------------
>
>                 Key: SAMZA-388
>                 URL: https://issues.apache.org/jira/browse/SAMZA-388
>             Project: Samza
>          Issue Type: Bug
>          Components: kafka
>    Affects Versions: 0.8.0
>            Reporter: Chris Riccomini
>            Assignee: Chris Riccomini
>         Attachments: SAMZA-388-0.patch
>
>
> I have a job that has 10,000+ partitions that it's consuming from. After 
> SAMZA-123, it's been switched to use the GroupBySystemStreamPartition 
> strategy, which means it's got 10,000+ tasks, and thus, 10,000+ checkpoint 
> messages being sent every minute.
> To keep the checkpoint topic from getting too large, we enabled log 
> compaction on the Kafka topic, but we discovered that the topic then grew to 
> be very large. This behavior was triggered because we were sending compressed 
> messages to the Kafka checkpoint topic.
> Based on KAFKA-1374, it appears that we can't use compressed checkpoint 
> topics with log compaction.
> I'm mostly opening this ticket as a placeholder for KAFKA-1374. Once that 
> ticket is resolved, we can update the Samza code to default the checkpoint 
> topics to be log compacted (with a small segment size), and not worry about 
> the compression anymore.
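
To illustrate why log compaction matters at this scale, here is a minimal sketch, assuming each task writes one checkpoint message per minute keyed by its task name. The helpers `checkpoint_log` and `compact` are hypothetical and only model the "keep the latest message per key" idea. They do not reproduce Kafka's log cleaner or the compression problem tracked in KAFKA-1374.

```python
def checkpoint_log(num_tasks, num_minutes):
    """Build a simulated checkpoint topic: one (key, value) message per task per minute."""
    return [
        (f"task-{task}", f"offset-at-minute-{minute}")
        for minute in range(num_minutes)
        for task in range(num_tasks)
    ]


def compact(log):
    """Keep only the most recent message for each key, in log order of the survivors."""
    latest = {}
    for position, (key, value) in enumerate(log):
        # A later message with the same key replaces the earlier one.
        latest[key] = (position, value)
    survivors = sorted(latest.items(), key=lambda item: item[1][0])
    return [(key, value) for key, (_, value) in survivors]


# Assumed workload: 10,000 tasks checkpointing once a minute for an hour.
log = checkpoint_log(num_tasks=10_000, num_minutes=60)
compacted = compact(log)

print(len(log))        # messages written in one hour
print(len(compacted))  # messages left after compaction: one per task
```

In this model the uncompacted topic grows by 10,000 messages every minute, while the compacted topic stays bounded by the number of tasks. That matches the description above: when compaction cannot run on the topic, the topic keeps growing.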



--
This message was sent by Atlassian JIRA
(v6.2#6252)
