Re: Data replication and zero data loss

2015-05-02 Thread xiao li
Hi, Joong,
Please check the following two links:
- https://cwiki.apache.org/confluence/display/KAFKA/KIP-3+-+Mirror+Maker+Enhancement
- https://cwiki.apache.org/confluence/display/KAFKA/KIP-8+-+Add+a+flush+method+to+the+producer+API
They might help you understand the problem.
Cheers, Xiao Li

Re: Data replication and zero data loss

2015-05-01 Thread Joong Lee
It is based on our understanding from reading the documents. We aren't concerned about data duplication, as that is going to be handled by Elasticsearch. On May 1, 2015, at 12:15 AM, Daniel Compton daniel.compton.li...@gmail.com wrote: When we evaluated MirrorMaker last year we didn't find

Re: Data replication and zero data loss

2015-05-01 Thread Joong Lee
0.8.2.1 On Apr 30, 2015, at 11:28 PM, Jiangjie Qin j...@linkedin.com.INVALID wrote: Which mirror maker version did you look at? The MirrorMaker in trunk should not have data loss if you just use the default setting. On 4/30/15, 7:53 PM, Joong Lee jo...@me.com wrote: Hi, We are

Re: Data replication and zero data loss

2015-05-01 Thread Joe Stein
If you want 0 data loss you should also look into the min.insync.replicas setting in 0.8.2.1, as it guarantees data on multiple replicas. If you don't have that set, then the following scenario is possible. Let's say 1 topic, 1 partition, replication 3. You are producing with ACK=-1 b1, b2, b3 (where
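
A minimal sketch of the rule behind that advice, as a toy model rather than Kafka code: with ACK=-1 the leader waits only for the replicas currently in the in-sync set, so if that set has shrunk to the leader alone, a write can be acknowledged with a single copy unless min.insync.replicas is set higher than 1. The function and exception names below are hypothetical and exist only for illustration.

```python
class NotEnoughReplicas(Exception):
    """Toy stand-in for the broker rejecting a write (hypothetical name)."""


def guaranteed_copies(acks, isr_size, min_insync_replicas=1):
    """Return how many replicas hold a message at the moment it is acknowledged.

    acks                -- producer setting: 1 (leader only) or -1 (all in-sync replicas)
    isr_size            -- current number of in-sync replicas, leader included
    min_insync_replicas -- minimum in-sync replicas required for an acks=-1 write
    """
    if acks == 1:
        # Only the leader has to write the message before acknowledging.
        return 1
    if acks == -1:
        # The write is refused when too few replicas are in sync.
        if isr_size < min_insync_replicas:
            raise NotEnoughReplicas(
                f"in-sync replicas: {isr_size}, required: {min_insync_replicas}"
            )
        # Otherwise every in-sync replica has the message before the ack.
        return isr_size
    raise ValueError("this sketch only models acks=1 and acks=-1")


# Replication 3, but two followers have dropped out of the in-sync set.
print(guaranteed_copies(acks=-1, isr_size=1, min_insync_replicas=1))

try:
    guaranteed_copies(acks=-1, isr_size=1, min_insync_replicas=2)
except NotEnoughReplicas as exc:
    print("write rejected:", exc)
```

In this model, the first call shows an acknowledged write that lives on one broker only, which is the window for data loss if that broker fails; raising min.insync.replicas to 2 turns the same situation into a rejected write the producer can retry.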

Re: Data replication and zero data loss

2015-04-30 Thread Jiangjie Qin
Which mirror maker version did you look at? The MirrorMaker in trunk should not have data loss if you just use the default setting. On 4/30/15, 7:53 PM, Joong Lee jo...@me.com wrote: Hi, We are exploring Kafka to keep two data centers (primary and DR) running hosts of elastic search nodes in

Re: Data replication and zero data loss

2015-04-30 Thread Daniel Compton
When we evaluated MirrorMaker last year we didn't find any risk of data loss, only duplicate messages in the case of a network partition. Did you discover data loss in your tests, or were you just looking at the docs? On Fri, 1 May 2015 at 4:31 pm Jiangjie Qin j...@linkedin.com.invalid wrote:

Data replication and zero data loss

2015-04-30 Thread Joong Lee
Hi, We are exploring Kafka to keep two data centers (primary and DR) running hosts of Elasticsearch nodes in sync. One key requirement is that we can't lose any data. We POC'd use of MirrorMaker and felt it may not meet our data loss requirement. I would like to ask the community if we should