[jira] [Issue Comment Deleted] (KAFKA-1510) Force offset commits when migrating consumer offsets from zookeeper to kafka

2014-08-29 Thread nicu marasoiu (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1510?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nicu marasoiu updated KAFKA-1510:
-

Comment: was deleted

(was: The patch makes the simplest choices:
1. Unfiltered commits when storage=kafka (unfiltered to both storages if 
dual commit applies).
2. Unfiltered retries (even if some of the offsets were already sent 
successfully in previous attempts).

A way to solve point 2 in a more general context would be to account for 
freshly committed offsets, not just changes in offset values, when deciding 
whether to filter an offset.)
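
One possible reading of "accounting for freshly committed offsets" is sketched below in Python. It is an illustration only, not code from the patch: the function `commit_with_retries`, the `send` callback that reports per-partition success, and the `acked` map are all hypothetical names introduced for this example.

```python
from typing import Callable, Dict, Tuple

TopicPartition = Tuple[str, int]
Offsets = Dict[TopicPartition, int]
# Hypothetical transport: sends a batch and reports success per partition.
SendFn = Callable[[Offsets], Dict[TopicPartition, bool]]


def commit_with_retries(offsets: Offsets, send: SendFn,
                        acked: Offsets, max_attempts: int = 3) -> Offsets:
    """Commit offsets, resending only those not yet acknowledged.

    `acked` holds the last offset the store acknowledged per partition and
    is updated in place. Returns whatever is still unacknowledged.
    """
    # Skip offsets that the store has already acknowledged at this value.
    pending = {tp: off for tp, off in offsets.items() if acked.get(tp) != off}
    for _ in range(max_attempts):
        if not pending:
            break
        results = send(pending)
        for tp, ok in results.items():
            if ok and tp in pending:
                # Record the acknowledgement so a retry does not resend it.
                acked[tp] = pending.pop(tp)
    return pending
```

With this shape, a retry carries only the partitions that failed in the previous attempt, which is the filtering that point 2 of the comment gives up.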

 Force offset commits when migrating consumer offsets from zookeeper to kafka
 

 Key: KAFKA-1510
 URL: https://issues.apache.org/jira/browse/KAFKA-1510
 Project: Kafka
  Issue Type: Bug
Affects Versions: 0.8.2
Reporter: Joel Koshy
Assignee: Joel Koshy
  Labels: newbie
 Fix For: 0.8.2

 Attachments: 
 Patch_to_push_unfiltered_offsets_to_both_Kafka_and_potentially_Zookeeper_when_Kafka_is_con.patch


 When migrating consumer offsets from ZooKeeper to Kafka, we have to turn on 
 dual-commit (i.e., the consumers will commit offsets to both ZooKeeper and 
 Kafka) in addition to setting offsets.storage to kafka. However, when we 
 commit offsets, we only commit those that have changed since the last 
 commit. For low-volume topics, or for topics that receive data in bursts, 
 offsets may not move for a long period of time. Therefore we may want to 
 force the commit (even if offsets have not changed) when migrating (i.e., 
 when dual-commit is enabled). We can add a minimum interval threshold (say, 
 force a commit after every 10 auto-commits) and also force commits on 
 rebalance and shutdown.
 Also, I think it is safe to switch the default for offsets.storage from 
 zookeeper to kafka and set the default to dual-commit (for people who have 
 not migrated yet). We have deployed this to the largest consumers at LinkedIn 
 and have not seen any issues so far (except for the migration caveat that 
 this JIRA will resolve).
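
The forced-commit rule proposed in the description can be sketched roughly as follows. This is a minimal Python illustration, not the Kafka consumer's code: the class `CommitFilter` and its parameters are hypothetical, and the default of 10 simply mirrors the "say, every 10 auto-commits" suggestion.

```python
from typing import Dict, Tuple

TopicPartition = Tuple[str, int]
Offsets = Dict[TopicPartition, int]


class CommitFilter:
    """Chooses which offsets to send on each commit (hypothetical helper)."""

    def __init__(self, dual_commit: bool, force_every: int = 10) -> None:
        self.dual_commit = dual_commit
        self.force_every = force_every
        self.commit_count = 0
        self.last_committed: Offsets = {}

    def select(self, current: Offsets,
               shutdown_or_rebalance: bool = False) -> Offsets:
        self.commit_count += 1
        # Force only while migrating, i.e. when dual-commit is enabled.
        force = self.dual_commit and (
            shutdown_or_rebalance
            or self.commit_count % self.force_every == 0
        )
        if force:
            to_send = dict(current)
        else:
            # Normal path: send only offsets that moved since the last commit.
            to_send = {tp: off for tp, off in current.items()
                       if self.last_committed.get(tp) != off}
        # Assumes the commit succeeds; a real client would update on ack.
        self.last_committed.update(to_send)
        return to_send
```

For an idle partition, `select` returns an empty map on most calls and the full map on every `force_every`-th call, or whenever `shutdown_or_rebalance` is set.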



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Issue Comment Deleted] (KAFKA-1510) Force offset commits when migrating consumer offsets from zookeeper to kafka

2014-07-27 Thread nicu marasoiu (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1510?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nicu marasoiu updated KAFKA-1510:
-

Comment: was deleted

(was: Attached the patch, which does what I described in my previous comment.)

 Force offset commits when migrating consumer offsets from zookeeper to kafka
 

 Key: KAFKA-1510
 URL: https://issues.apache.org/jira/browse/KAFKA-1510
 Project: Kafka
  Issue Type: Bug
Affects Versions: 0.8.2
Reporter: Joel Koshy
Assignee: nicu marasoiu
  Labels: newbie
 Fix For: 0.8.2

 Attachments: forceCommitOnShutdownWhenDualCommit.patch







[jira] [Issue Comment Deleted] (KAFKA-1510) Force offset commits when migrating consumer offsets from zookeeper to kafka

2014-07-27 Thread nicu marasoiu (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1510?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nicu marasoiu updated KAFKA-1510:
-

Comment: was deleted

(was: Forcing all commits to ZooKeeper too does indeed have the drawback that 
it will typically copy the same offsets again, and not only once but 
potentially several times (if the Kafka commit is retried).

However, the alternative is to commit to both Kafka and ZooKeeper 
unconditionally in the normal flow (right now, the commit to ZooKeeper happens 
only after a successful commit to Kafka, if any). That too poses the same risk 
of committing multiple times to one system (ZooKeeper) if the other (Kafka) 
needs retries. So a clean way here would be a completely different OffsetDAO 
design: one implementation on Kafka, one on ZooKeeper, and one in dual mode, 
which reads max(both), as now, while writes go to the two implementations, 
each of them doing retries without affecting the other!
)
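
The dual-mode design suggested in this comment might look roughly like the Python sketch below. None of these names come from Kafka: `InMemoryStore` stands in for a Kafka- or ZooKeeper-backed implementation, and `write_with_retries` and `DualStore` are hypothetical helpers written only to show independent retries per store and a max-of-both read.

```python
from typing import Dict, Optional, Tuple

TopicPartition = Tuple[str, int]
Offsets = Dict[TopicPartition, int]


class InMemoryStore:
    """Stand-in for a Kafka- or ZooKeeper-backed offset store (hypothetical)."""

    def __init__(self, failures_before_success: int = 0) -> None:
        self.data: Offsets = {}
        self.write_attempts = 0
        self._failures_left = failures_before_success

    def read(self, tp: TopicPartition) -> Optional[int]:
        return self.data.get(tp)

    def write(self, offsets: Offsets) -> bool:
        self.write_attempts += 1
        if self._failures_left > 0:  # simulate a transient failure
            self._failures_left -= 1
            return False
        self.data.update(offsets)
        return True


def write_with_retries(store: InMemoryStore, offsets: Offsets,
                       max_attempts: int = 3) -> bool:
    # any() stops at the first successful attempt.
    return any(store.write(offsets) for _ in range(max_attempts))


class DualStore:
    """Writes to both stores independently; reads the larger offset."""

    def __init__(self, kafka: InMemoryStore, zk: InMemoryStore) -> None:
        self.kafka = kafka
        self.zk = zk

    def read(self, tp: TopicPartition) -> Optional[int]:
        found = [o for o in (self.kafka.read(tp), self.zk.read(tp))
                 if o is not None]
        return max(found) if found else None

    def write(self, offsets: Offsets) -> bool:
        # Each store retries on its own, so retries against one store
        # never cause repeated writes to the other.
        kafka_ok = write_with_retries(self.kafka, offsets)
        zk_ok = write_with_retries(self.zk, offsets)
        return kafka_ok and zk_ok
```

In this arrangement, a Kafka store that needs three attempts still results in a single ZooKeeper write, which addresses the repeated-copy drawback raised in the comment.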

 Force offset commits when migrating consumer offsets from zookeeper to kafka
 

 Key: KAFKA-1510
 URL: https://issues.apache.org/jira/browse/KAFKA-1510
 Project: Kafka
  Issue Type: Bug
Affects Versions: 0.8.2
Reporter: Joel Koshy
Assignee: nicu marasoiu
  Labels: newbie
 Fix For: 0.8.2

 Attachments: forceCommitOnShutdownWhenDualCommit.patch




