[ 
https://issues.apache.org/jira/browse/KAFKA-1510?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nicu marasoiu updated KAFKA-1510:
---------------------------------

    Attachment: Unfiltered_to_kafka,_Incremental_to_Zookeeper.patch

Attached a patch: commits to Kafka are now unfiltered, and an offsets 
checkpoint map is used only for incremental commits to ZooKeeper (in both the 
zookeeper-storage and dual-commit modes). In the suggested approach, the 
checkpoint map is read and mutated exclusively within the commitToZk method.
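The scheme above could be sketched roughly as follows (a minimal illustration, 
assuming hypothetical names - OffsetCommitter, zkCheckpoint, and the injected 
commit callbacks are not the actual Kafka classes):

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.BiConsumer;
import java.util.function.Consumer;

class OffsetCommitter {
    private final Consumer<Map<String, Long>> commitToKafka;
    private final BiConsumer<String, Long> writeToZk;
    // last offset successfully written to ZooKeeper, per "topic-partition" key
    private final Map<String, Long> zkCheckpoint = new HashMap<>();

    OffsetCommitter(Consumer<Map<String, Long>> commitToKafka,
                    BiConsumer<String, Long> writeToZk) {
        this.commitToKafka = commitToKafka;
        this.writeToZk = writeToZk;
    }

    void commit(Map<String, Long> offsets, boolean dualCommit) {
        commitToKafka.accept(offsets);       // unfiltered: every offset, every time
        if (dualCommit) commitToZk(offsets); // incremental: only what changed
    }

    // the checkpoint map is read and mutated only here
    private void commitToZk(Map<String, Long> offsets) {
        offsets.forEach((tp, offset) -> {
            if (!offset.equals(zkCheckpoint.get(tp))) {
                writeToZk.accept(tp, offset);
                zkCheckpoint.put(tp, offset);
            }
        });
    }
}
```

Committing the same offsets twice would then reach Kafka twice but ZooKeeper 
only once, which is the intended asymmetry.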

Right now, rearranging topicRegistry into offsetsToCommit, a separate reified 
structure, looks somewhat redundant since no filtering happens during its 
construction anymore. However, the .size check and later uses of that 
structure still depend on it, so to minimize the changes introduced by this 
task (and leave the larger-scale refactoring this class clearly needs for the 
future), I have kept it as is.

A further optimization, not included in this patch, would be to keep the last 
commit timestamp for each partition and use it to filter commits to Kafka, 
based on a configurable maximum idleness of the offset commit for each 
partition.
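That optimization could look roughly like this (a hypothetical sketch - the 
CommitThrottle name, the injected clock, and maxIdleMs are illustrative, not 
part of the patch):

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.LongSupplier;

class CommitThrottle {
    private final long maxIdleMs;     // configurable maximum idleness
    private final LongSupplier clock; // injectable for testing
    private final Map<String, Long> lastCommitMs = new HashMap<>();

    CommitThrottle(long maxIdleMs, LongSupplier clock) {
        this.maxIdleMs = maxIdleMs;
        this.clock = clock;
    }

    // commit when the offset moved, or when the partition has sat idle too long
    boolean shouldCommit(String topicPartition, boolean offsetChanged) {
        long now = clock.getAsLong();
        boolean idleTooLong =
            now - lastCommitMs.getOrDefault(topicPartition, 0L) >= maxIdleMs;
        if (offsetChanged || idleTooLong) {
            lastCommitMs.put(topicPartition, now);
            return true;
        }
        return false;
    }
}
```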

A more primitive form of the same optimization, which would only prevent 
repeatedly committing to healthy brokers because of broken ones, would be to 
keep such state in a structure local to the method for the duration of the 
call, so that only the failed commits are retried.
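A sketch of that primitive variant, under the assumption of a hypothetical 
tryCommit callback that attempts one round and reports which partitions 
failed:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;

class RetryingCommitter {
    // the retry state lives only in this method's local "remaining" map
    static Set<String> commitWithRetries(Map<String, Long> offsets,
                                         Function<Map<String, Long>, Set<String>> tryCommit,
                                         int maxAttempts) {
        Map<String, Long> remaining = new HashMap<>(offsets);
        for (int attempt = 0; attempt < maxAttempts && !remaining.isEmpty(); attempt++) {
            Set<String> failed = tryCommit.apply(remaining);
            remaining.keySet().retainAll(failed); // keep retrying only failures
        }
        return remaining.keySet();                // still uncommitted, if any
    }
}
```

A partition that succeeds on the first round is never re-sent on later rounds, 
which is exactly the protection for the healthy brokers.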

> Force offset commits when migrating consumer offsets from zookeeper to kafka
> ----------------------------------------------------------------------------
>
>                 Key: KAFKA-1510
>                 URL: https://issues.apache.org/jira/browse/KAFKA-1510
>             Project: Kafka
>          Issue Type: Bug
>    Affects Versions: 0.8.2
>            Reporter: Joel Koshy
>            Assignee: Joel Koshy
>              Labels: newbie
>             Fix For: 0.8.2
>
>         Attachments: 
> Patch_to_push_unfiltered_offsets_to_both_Kafka_and_potentially_Zookeeper_when_Kafka_is_con.patch,
>  Unfiltered_to_kafka,_Incremental_to_Zookeeper.patch
>
>
> When migrating consumer offsets from ZooKeeper to kafka, we have to turn on 
> dual-commit (i.e., the consumers will commit offsets to both zookeeper and 
> kafka) in addition to setting offsets.storage to kafka. However, when we 
> commit offsets we only commit offsets if they have changed (since the last 
> commit). For low-volume topics or for topics that receive data in bursts 
> offsets may not move for a long period of time. Therefore we may want to 
> force the commit (even if offsets have not changed) when migrating (i.e., 
> when dual-commit is enabled) - we can add a minimum interval threshold (say 
> force commit after every 10 auto-commits) as well as on rebalance and 
> shutdown.
>
> Also, I think it is safe to switch the default for offsets.storage from 
> zookeeper to kafka and set the default to dual-commit (for people who have 
> not migrated yet). We have deployed this to the largest consumers at linkedin 
> and have not seen any issues so far (except for the migration caveat that 
> this jira will resolve).
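The force-commit threshold proposed in the description above (e.g. force after 
every 10 auto-commits) could be sketched as follows - a hypothetical 
illustration, with ForceCommitPolicy and forceEvery as made-up names rather 
than actual configs:

```java
class ForceCommitPolicy {
    private final int forceEvery;  // e.g. force after every 10 auto-commits
    private int unchangedTicks = 0;

    ForceCommitPolicy(int forceEvery) { this.forceEvery = forceEvery; }

    // called once per auto-commit tick
    boolean shouldCommit(boolean offsetsChanged, boolean dualCommit) {
        if (offsetsChanged) { unchangedTicks = 0; return true; }
        unchangedTicks++;
        if (dualCommit && unchangedTicks >= forceEvery) {
            unchangedTicks = 0;    // forced commit during migration
            return true;
        }
        return false;
    }
}
```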



--
This message was sent by Atlassian JIRA
(v6.2#6252)
