New subscriber offset

2015-03-03 Thread Achanta Vamsi Subhash
Hi, We are using HighLevelConsumer and when a new subscription is added to the topic, the HighLevelConsumer for the same group starts from the start of the Kafka topic log. Is there any way we could set the offset of the HighLevelConsumer to the end of the log instead? We don't want to move to

Re: How to easily get all broker of one topic?

2015-03-03 Thread Guozhang Wang
ZK should be treated as the source of truth of the topic metadata, whereas brokers just keep a cache of this data that can be a bit out-of-date (and that is one possible reason for UnknownTopicOrPartition). So I would still suggest getting data from ZK, and if some leader is not online then its

Re: broker restart problems

2015-03-03 Thread Jun Rao
Do you see broker 1 being deleted in the controller log? Thanks, Jun On Fri, Feb 27, 2015 at 5:25 PM, ZhuGe t...@outlook.com wrote: Thanks for the reply. I confirmed that broker 1 is registered in the zk. Date: Fri, 27 Feb 2015 09:27:52 -0800 Subject: Re: broker restart problems From:

Re: Question on ISR inclusion leader election for failed replica on catchup

2015-03-03 Thread Jun Rao
When K1 crashes before K3 fully catches up, by default, Kafka allows K3 to become the new leader. In this case, data in batch 2 will be lost. Our default behavior favors availability over consistency. If you prefer consistency, you can set unclean.leader.election.enable to false on the broker.
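
The availability-versus-consistency trade-off described above can be illustrated with a toy model. This is a sketch only, not Kafka's controller code: `elect_leader` and its arguments are hypothetical names, and the function just mimics the rule that an out-of-sync replica may become leader when unclean election is allowed.

```python
def elect_leader(assigned, isr, live, unclean_enabled):
    """Toy model of leader choice for one partition (not Kafka's actual code).

    assigned: replica broker ids in preference order
    isr: set of broker ids currently in sync
    live: set of broker ids currently alive
    """
    # Prefer a live replica that is still in the ISR: no data is lost.
    for broker in assigned:
        if broker in isr and broker in live:
            return broker
    # No in-sync replica is alive. An unclean election picks any live
    # replica, which may be missing recent messages.
    if unclean_enabled:
        for broker in assigned:
            if broker in live:
                return broker
    # Otherwise the partition stays offline until an ISR member returns.
    return None


# K1 (leader, sole ISR member) has crashed; K3 is alive but was still catching up.
print(elect_leader([1, 3], isr={1}, live={3}, unclean_enabled=True))
print(elect_leader([1, 3], isr={1}, live={3}, unclean_enabled=False))
```

With unclean election enabled the model returns K3 (available, but batch 2 is lost); with it disabled the model returns no leader, which corresponds to the partition staying unavailable.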

publisher spooling ....!

2015-03-03 Thread sunil kalva
Hi, is there any way to spool messages to disk on the publisher side when the Kafka cluster is down or not reachable for the publisher? If Kafka doesn't support this feature, what is the best practice to handle this failure scenario? I was referring to one of the old JIRA links, which is still in open state :

Re: Broker shuts down due to unrecoverable I/O error

2015-03-03 Thread Jun Rao
Which OS is this on? Is this easily reproducible? Thanks, Jun On Sun, Mar 1, 2015 at 8:24 PM, Manikumar Reddy ku...@nmsworks.co.in wrote: Kafka 0.8.2 server got stopped after getting below I/O exception. Any thoughts on below exception? Can it be file system related? [2015-03-01

Re: Broker shuts down due to unrecoverable I/O error

2015-03-03 Thread Manikumar Reddy
Hi, We are running on RedHat Linux with SAN storage. This happened only once. Thanks, Manikumar. On Tue, Mar 3, 2015 at 10:02 PM, Jun Rao j...@confluent.io wrote: Which OS is this on? Is this easily reproducible? Thanks, Jun On Sun, Mar 1, 2015 at 8:24 PM, Manikumar Reddy

Re: New subscriber offset

2015-03-03 Thread tao xiao
You can set the consumer config auto.offset.reset=largest Ref: http://kafka.apache.org/documentation.html#consumerconfigs On Tue, Mar 3, 2015 at 8:30 PM, Achanta Vamsi Subhash achanta.va...@flipkart.com wrote: Hi, We are using HighLevelConsumer and when a new subscription is added to the
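
As a rough illustration of the setting mentioned above, the sketch below renders a consumer properties file from a Python dict. The property names are the 0.8 consumer configs; the host name and group name are made-up placeholders, and `to_properties` is a hypothetical helper. Note that `auto.offset.reset` only applies when the group has no committed offset yet (or the offset is out of range).

```python
# Hypothetical values; only the property names come from the 0.8 consumer configs.
consumer_props = {
    "zookeeper.connect": "zk1.example.com:2181",
    "group.id": "new-subscriber-group",
    # Start from the end of the log when the group has no committed offset.
    "auto.offset.reset": "largest",
}


def to_properties(props):
    """Render a dict as Java .properties text, one key=value per line."""
    return "\n".join(f"{key}={value}" for key, value in props.items())


print(to_properties(consumer_props))
```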

Re: [kafka-clients] Re: [VOTE] 0.8.2.1 Candidate 2

2015-03-03 Thread Gwen Shapira
Hi, Good catch, Joe. Releasing with a broken test is not a good habit. I provided a small patch that fixes the issue in KAFKA-1999. Gwen On Tue, Mar 3, 2015 at 9:08 AM, Joe Stein joe.st...@stealth.ly wrote: Jun, I have most everything looks good except I keep getting test failures from wget

Re: [kafka-clients] Re: [VOTE] 0.8.2.1 Candidate 2

2015-03-03 Thread Joe Stein
Jun, I have most everything looks good except I keep getting test failures from wget https://people.apache.org/~junrao/kafka-0.8.2.1-candidate2/kafka-0.8.2.1-src.tgz tar -xvf kafka-0.8.2.1-src.tgz cd kafka-0.8.2.1-src gradle ./gradlew test kafka.api.ProducerFailureHandlingTest

Re: How to easily get all broker of one topic?

2015-03-03 Thread Guozhang Wang
I meant /brokers/topics/theTopic, no partition child. It already contains the replica list for each partition: https://cwiki.apache.org/confluence/display/KAFKA/Kafka+data+structures+in+Zookeeper On Tue, Mar 3, 2015 at 8:00 AM, Guozhang Wang wangg...@gmail.com wrote: ZK should be treated as
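
A minimal sketch of the idea above: once the data stored at /brokers/topics/theTopic has been read, the brokers hosting the topic are the union of the per-partition replica lists. The payload shape used here (a `partitions` map from partition id to replica broker ids) is an assumption based on the linked wiki page, the sample data is invented, and reading the znode from ZooKeeper is deliberately left out.

```python
import json


def brokers_for_topic(znode_data):
    """Return the sorted broker ids that host any replica of the topic."""
    state = json.loads(znode_data)
    brokers = set()
    # Each entry maps a partition id to its list of replica broker ids.
    for replicas in state["partitions"].values():
        brokers.update(replicas)
    return sorted(brokers)


# Invented sample payload in the assumed znode format.
sample = '{"version": 1, "partitions": {"0": [1, 2], "1": [2, 3]}}'
print(brokers_for_topic(sample))
```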

Re: [kafka-clients] Re: [VOTE] 0.8.2.1 Candidate 2

2015-03-03 Thread Joe Stein
OK, let's fix the transient test failure on trunk; agreed, not a blocker. +1 quick start passed, verified artifacts, updates in scala https://github.com/stealthly/scala-kafka/tree/0.8.2.1 and go https://github.com/stealthly/go_kafka_client/tree/0.8.2.1 look good ~ Joe Stein - - - - - - - - - - - -

Re: Stream naming conventions?

2015-03-03 Thread Julio Castillo
Can you provide some examples on your naming patterns described below? Thanks ** julio On 3/3/15, 6:56 AM, Thunder Stumpges tstump...@ntent.com wrote: I'm not sure who you were asking the question to, but since Gwen's was not bound to any restrictions just a guideline, I'll assume you meant me

Re: publisher spooling ....!

2015-03-03 Thread Jay Kreps
Broker replication is available now and fully documented in the docs. This approach to availability has a lot of advantages discussed in that ticket and the one below. Personally, having tried both approaches, I think this is what most people should do (running a small highly available cluster

Re: New subscriber offset

2015-03-03 Thread Achanta Vamsi Subhash
Thanks a lot Xiao. Somehow missed reading about the config parameter. On Tue, Mar 3, 2015 at 6:51 PM, tao xiao xiaotao...@gmail.com wrote: You can set the consumer config auto.offset.reset=largest Ref: http://kafka.apache.org/documentation.html#consumerconfigs On Tue, Mar 3, 2015 at 8:30 PM,

Re: Using 0.8.2 jars in consumer with producer of version 0.8.1.1

2015-03-03 Thread Jianshi Huang
Nice! I'll try it. Jianshi On Wed, Mar 4, 2015 at 1:33 AM, Jiangjie Qin j...@linkedin.com.invalid wrote: Generally speaking, we prefer the server version to be higher than client versions. But in your particular case, if you are only using a consumer from 0.8.2, I think it should work.

Re: cross-colo writing/reading?

2015-03-03 Thread Yang
thanks guys. it's just quite a lot of ops cost to setup and monitor a separate cluster, connected through mirror maker. sometimes if I have just a single producer/consumer in a new cluster, it would be more desirable to just connect it directly to an existing kafka setup. I remember at least in

Kafka Cluster Upgrade/Migration

2015-03-03 Thread Bryan Baugher
Hi everyone, I'm starting to look at how you might upgrade your cluster even if a major upgrade with non-passive API changes and do this all in uptime (i.e. do not read/write). I realize this may involve knowledge in how Kafka is used, in our case we own all the reading and writing for now. Has

Re: reassign a topic partition which has no ISR and leader set to -1

2015-03-03 Thread Virendra Pratap Singh
Thanks Gwen for your info. I brought another kafka server up with assigned broker id of the failed leader and the reassignment went through. @kafka developers who may be on this distribution list, is there a feature planned in 0.8.2 or can we plan one, where in if the leader and all the replica

Database Replication Question

2015-03-03 Thread Josh Rader
Hi Kafka Experts, We have a use case around RDBMS replication where we are investigating Kafka. In this case ordering is very important. Our understanding is ordering is only preserved within a single partition. This makes sense as a single thread will consume these messages, but our

Re: cross-colo writing/reading?

2015-03-03 Thread Todd Palino
I don't know if I'd go as far as elegant. Functional, definitely. :) Yang, I'm not entirely sure what you're looking for here. You can already specify the acks setting in the producer if you do not care about acknowledgement from your remote produce requests (setting it to 0). If you do care, you

Re: moving replications

2015-03-03 Thread sunil kalva
Is there any way to automate On Mar 3, 2015 11:57 AM, sunil kalva sambarc...@gmail.com wrote: Why can't kafka automatically rebalance partitions with a new broker and adjust with existing brokers? Why should we run it manually? On Tue, Mar 3, 2015 at 6:41 AM, Gwen Shapira gshap...@cloudera.com

Re: Can Mirroring Preserve Every Topic's Partition?

2015-03-03 Thread Guozhang Wang
Hi Alex, Sorry for getting late on this thread. What I originally meant is not the changes in KAFKA-1650 itself, but a slightly new version of MM as a follow-up of KAFKA-1650, details can be seen from here: https://cwiki.apache.org/confluence/display/KAFKA/KIP-3+-+Mirror+Maker+Enhancement

RE: Stream naming conventions?

2015-03-03 Thread Thunder Stumpges
Sure, these are contrived, but you'll get the idea :) Note: the suffixes are generally an enumeration or combination of two enumerations, so the domain of values should always be bounded (so that the number of topics is also bounded). The idea is any time we want to use the same avro schema

Re: Camus reads from multiple offsets in parallel?

2015-03-03 Thread Jun Rao
Camus only fetches from different partitions in parallel. Thanks, Jun On Fri, Feb 27, 2015 at 4:24 PM, Yang tedd...@gmail.com wrote: we have a single partition, and the topic contains 300k events. we fired off a camus job, it finished within 1 minute. this is rather fast. I was guess

Re: Topicmetadata response miss some partitions information sometimes

2015-03-03 Thread Guozhang Wang
Hey Jun, You are right. Previously I thought only in your recent patches you add the partitionWithAvailableLeaders that this gets exposed, however it is the opposite case. Guozhang On Tue, Mar 3, 2015 at 4:40 PM, Jun Rao j...@confluent.io wrote: Guozhang, Actually, we always return all

Re: Database Replication Question

2015-03-03 Thread Xiao
Hey Josh, Sorry, after reading the code: Kafka does fsync the data using a separate thread. The recovery point (oldest transaction timestamp) can be obtained from the file recovery-point-offset-checkpoint. You can adjust the value of config.logFlushOffsetCheckpointIntervalMs if you think the speed is

Re: Database Replication Question

2015-03-03 Thread Stevo Slavić
Have you considered including order information in messages that are sent to Kafka, and then restoring order in logic that is processing messages consumed from Kafka? http://www.enterpriseintegrationpatterns.com/Resequencer.html Kind regards, Stevo Slavic. On Wed, Mar 4, 2015 at 12:15 AM, Josh
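
The resequencing idea above can be sketched as follows. This is a minimal illustration, not code from the thread: it assumes the producer attaches a contiguous integer sequence number to every message, and the `Resequencer` class and its method names are hypothetical.

```python
class Resequencer:
    """Buffers out-of-order messages and releases them in sequence order."""

    def __init__(self, first_seq=0):
        self.next_seq = first_seq  # sequence number expected next
        self.pending = {}          # seq -> payload, waiting for a gap to fill

    def accept(self, seq, payload):
        """Take one message; return the payloads that can now be delivered in order."""
        if seq < self.next_seq:
            # Already delivered: treat as a duplicate and drop it.
            return []
        self.pending[seq] = payload
        ready = []
        # Drain every consecutive message starting at the expected number.
        while self.next_seq in self.pending:
            ready.append(self.pending.pop(self.next_seq))
            self.next_seq += 1
        return ready


r = Resequencer()
print(r.accept(1, "b"))  # held back, 0 has not arrived yet
print(r.accept(0, "a"))  # releases "a" then "b"
print(r.accept(2, "c"))  # releases "c"
```

One design caveat: the buffer grows without bound if a sequence number never arrives, so a real consumer would likely need a timeout or a cap on pending messages.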

Re: Database Replication Question

2015-03-03 Thread Xiao
Hey Josh, Transactions can be applied in parallel on the consumer side based on transaction dependency checking. http://www.google.com.ar/patents/US20080163222 This patent documents how it works. It is easy to understand, however, you also need to consider the hash collision issues. This has

Re: Database Replication Question

2015-03-03 Thread Guozhang Wang
In addition to Jay's recommendation, you also need to take special care in the producer's error handling in order to preserve ordering, since the producer uses batching and async sending. That is, if you already sent messages 1,2,3,4,5 to the producer but are later notified that message 3 failed

reassign a topic partition which has no ISR and leader set to -1

2015-03-03 Thread Virendra Pratap Singh
Ran into a situation where both the leader and replica nodes for a few partitions of a given topic went down. So now these partitions have no in-sync replicas and no leader (leader set to -1). I tried to reassign these partitions to a different set of brokers using partition

Re: Database Replication Question

2015-03-03 Thread Xiao
Hey Josh, If you put different tables into different partitions or topics, it might break transaction ACID at the target side. This is risky for some use cases. Besides unit of work issues, you also need to think about the load balancing too. For failover, you have to find the timestamp for

Re: reassign a topic partition which has no ISR and leader set to -1

2015-03-03 Thread Gwen Shapira
I hate bringing bad news, but... You can't really reassign replicas if the leader is not available. Since the leader is gone, the replicas have nowhere to replicate the data from. Until you bring the leader back (or one of the replicas with unclean leader election), you basically lost this

Re: Database Replication Question

2015-03-03 Thread Jay Kreps
Hey Josh, As you say, ordering is per partition. Technically it is generally possible to publish all changes to a database to a single partition--generally the kafka partition should be high throughput enough to keep up. However there are a couple of downsides to this: 1. Consumer parallelism is

Re: cross-colo writing/reading?

2015-03-03 Thread Jeff Schroeder
Mirror maker is about separating latency and failure domains. I think it is a very elegant solution to a difficult problem. My suspicion is that the LinkedIn / Confluent team agrees. On Tue, Mar 3, 2015 at 3:50 PM, Yang tedd...@gmail.com wrote: thanks guys. it's just quite a lot of ops

Re: [kafka-clients] Re: [VOTE] 0.8.2.1 Candidate 2

2015-03-03 Thread Jun Rao
Hi, Joe, Yes, that unit test does have transient failures from time to time. The issue seems to be with the unit test itself and not the actual code. So, this is not a blocker for 0.8.2.1 release. I think we can just fix it in trunk. Thanks, Jun On Tue, Mar 3, 2015 at 9:08 AM, Joe Stein

Re: Using 0.8.2 jars in consumer with producer of version 0.8.1.1

2015-03-03 Thread Jiangjie Qin
Generally speaking, we prefer the server version to be higher than client versions. But in your particular case, if you are only using a consumer from 0.8.2, I think it should work. -Jiangjie (Becket) Qin On 3/2/15, 10:15 PM, Jianshi Huang jianshi.hu...@gmail.com wrote: 0.8.1.1 On Tue, Mar 3,

Re: Kafka producer failed to send but actually does

2015-03-03 Thread Jiangjie Qin
What do you mean by Kafka embedded broker? Anyway, this could happen. For example, the producer sends a message to the broker. After that, some network issue occurs and the producer does not get confirmation from the broker, so the producer thinks the send failed. But the broker actually got the message. The

Re: Stream naming conventions?

2015-03-03 Thread Maciej Jaśkowski
This approach sounds nice at first but it would fail if you start sending the same message but partitioned in different (orthogonal) ways. How would you go about that? Maciej On 25 February 2015 at 05:17, Gwen Shapira gshap...@cloudera.com wrote: Nice :) I like the idea of tying topic name to

Kafka producer failed to send but actually does

2015-03-03 Thread Arunkumar Srambikkal (asrambik)
Hi, I'm running some tests with the Kafka embedded broker and I see cases where the producer gets the FailedToSendMessageException but in reality the message is transferred and the consumer gets it. Is this an expected / known issue? Thanks Arun My producer config = props.put(producer.type,

RE: Stream naming conventions?

2015-03-03 Thread Thunder Stumpges
I'm not sure who you were asking the question to, but since Gwen's was not bound to any restrictions just a guideline, I'll assume you meant me :) We have a concept of a topic suffix property that is some property in the data that can change dynamically. The full topic name then becomes