Re: integrate Camus and Hive?

2015-03-12 Thread François Langelier
I'm not sure what you are looking for, but in case it helps: we persist the data from our Kafka cluster with Camus and map it into Hive with Camus2Hive. You can look at it here if you want to! https://github.com/mate1/camus2hive François Langelier Software Engineering Student - École

zookeeper-less offset management

2015-03-12 Thread Pierre-Yves Ritschard
Hi list, I was under the impression that consumers still needed to interact with zookeeper to track their offset. Going through recent Jiras to track the progress I see that https://issues.apache.org/jira/browse/KAFKA-1000 and https://issues.apache.org/jira/browse/KAFKA-1012 seem to indicate that

Re: Database Replication Question

2015-03-12 Thread Jay Kreps
Xiao, Not sure about AIX or HP-UX. There are some people running on Windows, though we don't do real systemic testing against that. I would be surprised if z/os worked, someone would have to try. The existing fsync policy already works at the batch level, and Kafka already does batching quite

High Level Consumer Example in 0.8.2

2015-03-12 Thread ankit tyagi
Hi all, we are upgrading our Kafka client version from 0.8.0 to 0.8.2. Is there any documentation for the high-level Kafka consumer with multiple threads, like https://cwiki.apache.org/confluence/display/KAFKA/Consumer+Group+Example, for this newer version?

Re: Database Replication Question

2015-03-12 Thread Guozhang Wang
Jay, I think what Xiao was proposing is allowing synchronous fsync, i.e. calling fsync after appending messages to the log, for applications that require disk data persistency. Maybe we can add another ack condition in addition to #.isr replicated that requires the data to be persisted to disk

Re: integrate Camus and Hive?

2015-03-12 Thread Andrew Otto
Hm, aye, I haven’t tried to write a custom partitioner, and that does look pretty easy. I’ll put that on my backlog to think about. The Camus team in the past has been excited to accept patches, and I think if a Hive partitioner came with Camus it would make it much easier to use. Oh wait,

Re: Does consumer support combination of whitelist and blacklist topic filtering

2015-03-12 Thread tao xiao
Something like dynamic filtering that can be updated at runtime, or "deny all but allow" for a certain set of topics that cannot be specified easily by regex. On Thu, Mar 12, 2015 at 9:06 PM, Guozhang Wang wangg...@gmail.com wrote: Hmm, what kind of customized filtering do you have in mind? I thought

Re: Does consumer support combination of whitelist and blacklist topic filtering

2015-03-12 Thread tao xiao
A little more context about my needs: I have a requirement to start/stop a topic at runtime based on an event sent to MM. At the moment I need to bounce the MM and find a way to exclude the topic from the whitelist, which is not an easy job with regex. If I can pass in a combination of
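To illustrate why excluding a topic through the whitelist regex alone is awkward, here is a minimal sketch that builds such a pattern with a negative lookahead. The helper name `whitelist_excluding` and the topic names are hypothetical. The sketch uses Python's `re` module; the assumption that MirrorMaker's `--whitelist` would accept an equivalent Java regex is not confirmed by this thread.

```python
import re

def whitelist_excluding(excluded_topics, base=".*"):
    """Build a regex that matches `base` but rejects the listed topics.

    Hypothetical helper for illustration only; not part of Kafka.
    """
    if not excluded_topics:
        return f"(?:{base})"
    # Escape each topic so characters such as '.' are matched literally.
    blocked = "|".join(re.escape(t) for t in sorted(excluded_topics))
    # Negative lookahead: fail if the whole topic name is a blocked one.
    return f"(?!(?:{blocked})$)(?:{base})"

pattern = whitelist_excluding({"payments", "audit.log"})

def allowed(topic):
    # Full match, so the pattern has to cover the entire topic name.
    return re.fullmatch(pattern, topic) is not None
```

The pattern is still static: every change to the excluded set means rebuilding the regex and restarting the process that uses it, which is the limitation described in this message.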

Re: Does consumer support combination of whitelist and blacklist topic filtering

2015-03-12 Thread tao xiao
Thank you Guozhang for your advice. A dynamic topic filter is what I need so that I can stop consuming a topic at runtime when I need to. On Thu, Mar 12, 2015 at 9:21 PM, Guozhang Wang wangg...@gmail.com wrote: 1. Dynamic: yeah, that is something we could think of, this could be useful

Re: Does consumer support combination of whitelist and blacklist topic filtering

2015-03-12 Thread Guozhang Wang
Hmm, what kind of customized filtering do you have in mind? I thought with --whitelist you could already specify a regex to do filtering. On Thu, Mar 12, 2015 at 5:56 AM, tao xiao xiaotao...@gmail.com wrote: Hi Guozhang, I meant topicfilter, not topic-count. Sorry for the confusion.

Re: Does consumer support combination of whitelist and blacklist topic filtering

2015-03-12 Thread Guozhang Wang
Note that with filtering in the message handler, records from the source cluster are still considered consumed, since the offsets will be committed. If you dynamically change the filtering back to whitelist these topics, you will lose the data that was consumed during the period of the blacklist.
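The data-loss scenario described above can be sketched with a toy model. This is not MirrorMaker code: the function `mirror`, the record tuples, and the `is_blocked` rule are all hypothetical, and the model assumes a single stream of offsets where the offset is committed for every record, including filtered ones.

```python
def mirror(records, is_blocked):
    """Toy mirror loop: filter in the handler, but commit every offset."""
    mirrored = []
    committed = -1
    for offset, topic, value in records:
        if not is_blocked(offset, topic):
            mirrored.append((offset, topic, value))
        # The offset advances even when the record was dropped by the filter.
        committed = offset
    return mirrored, committed

records = [(0, "a", "m0"), (1, "b", "m1"), (2, "b", "m2"), (3, "b", "m3")]

# Hypothetical rule: topic "b" is blacklisted while offsets 1..2 are processed,
# then whitelisted again.
def is_blocked(offset, topic):
    return topic == "b" and 1 <= offset <= 2

mirrored, committed = mirror(records, is_blocked)
lost = [r for r in records if r not in mirrored]
```

In this model the two records seen during the blacklist period are never mirrored, and because the committed offset has already moved past them, a later restart would not re-read them.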

High Replica Max Lag

2015-03-12 Thread Zakee
With producer throughput as large as 150MB/s to 5 brokers on a continuous basis, I see a consistently high value for Replica Max Lag (in the millions). Is this normal, or is there a way to tune it so as to reduce Replica Max Lag? As per documentation, replica max lag (in messages) between follower

Re: Does consumer support combination of whitelist and blacklist topic filtering

2015-03-12 Thread tao xiao
Yes, that will work. The message handler can filter out messages from certain topics. On Fri, Mar 13, 2015 at 6:30 AM, Jiangjie Qin j...@linkedin.com.invalid wrote: Not sure if it is an option, but does filtering out topics with the message handler work for you? Are you going to resume consuming

Re: Does consumer support combination of whitelist and blacklist topic filtering

2015-03-12 Thread tao xiao
I am not sure how MM is going to be rewritten. Based on the current implementation in trunk, an offset is not committed unless the message is produced to the destination. Assuming this logic remains, MM will not acknowledge the offset back to the source for a filtered message. So I think it is safe to filter

Re: createMessageStreams vs createMessageStreamsByFilter

2015-03-12 Thread Zakee
Is it always the case that there is only one fetcher per broker? Won’t setting num.replica.fetchers greater than the number of brokers cause more fetchers per broker? Let’s say I have 5 brokers and the number of replica fetchers is 8; will there be 2 fetcher threads pulling from each broker? Thanks

Re: [ANNOUNCEMENT] Apache Kafka 0.8.2.1 Released

2015-03-12 Thread Neha Narkhede
Thanks for driving this Jun and everyone for the contributions! On Wed, Mar 11, 2015 at 12:01 PM, Jun Rao jun...@apache.org wrote: The Apache Kafka community is pleased to announce the release for Apache Kafka 0.8.2.1. The 0.8.2.1 release fixes 4 critical issues in 0.8.2.0. All of the

Re: Does consumer support combination of whitelist and blacklist topic filtering

2015-03-12 Thread tao xiao
Yes, you are right. A dynamic topicfilter is more appropriate, where I can filter topics at runtime via some kind of interface, e.g. JMX. On Thu, Mar 12, 2015 at 11:03 PM, Guozhang Wang wangg...@gmail.com wrote: Tao, Based on your description I think the combination of whitelist / blacklist

blog on choosing # topics/partitions in Kafka

2015-03-12 Thread Jun Rao
Since this is a commonly asked question in the mailing list, I summarized some of the considerations in a bit more detail in the following blog. http://blog.confluent.io/2015/03/12/how-to-choose-the-number-of-topicspartitions-in-a-kafka-cluster/ Thanks, Jun

Re: Does consumer support combination of whitelist and blacklist topic filtering

2015-03-12 Thread Guozhang Wang
Tao, Based on your description I think the combination of whitelist / blacklist will not achieve your goal, since it is still static. Guozhang On Thu, Mar 12, 2015 at 6:30 AM, tao xiao xiaotao...@gmail.com wrote: Thank you Guozhang for your advice. A dynamic topic filter is what I need so

Re: blog on choosing # topics/partitions in Kafka

2015-03-12 Thread Gwen Shapira
Nice! Thank you, this is super helpful. I'd add that it is not just the client that can run out of memory with a large number of partitions - brokers can run out of memory too. We allocate max.message.size * #partitions on each broker for replication. Gwen On Thu, Mar 12, 2015 at 8:53 AM, Jun Rao
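The arithmetic in this reply (max.message.size * #partitions per broker) can be worked through with a short sketch. The function name and the sample values below are hypothetical, chosen only to show the scale; they are not figures from the thread.

```python
def replication_buffer_bytes(max_message_bytes, partitions):
    """Estimate from the message above: max.message.size * #partitions."""
    return max_message_bytes * partitions

# Hypothetical inputs: 1,000,000-byte max message size, 10,000 partitions.
estimate = replication_buffer_bytes(1_000_000, 10_000)
print(f"{estimate} bytes, about {estimate / 10**9:.0f} GB")
```

With these assumed inputs the estimate is 10,000,000,000 bytes, which shows how quickly the product grows as the partition count rises.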

Broker Errors - Connection reset by peer

2015-03-12 Thread Zakee
I was wondering what could be the root cause of the below ERROR logs on broker hosts. I see connections closed / reset logged at INFO, but for some the log shows they were closed due to an error, and it does not provide enough clues as to the cause. Appreciate any ideas... [2015-03-12 11:22:02,019] ERROR Closing socket for

leadership election error ?

2015-03-12 Thread Victor L
I have the following error at the Kafka broker, even if I run a single instance: Unable to get Messaging Service:kafka server: In the middle of a leadership election, there is currently no leader for this partition and hence it is unavailable for writes Any advice on how I can change my configuration to

Kafka vs Amps (60east)

2015-03-12 Thread John Lonergan
Has anyone done a comparison of these two? How do they compare in terms of features and scale, but also disaster recovery provision, convenience, operability, etc.?

Re: createMessageStreams vs createMessageStreamsByFilter

2015-03-12 Thread James Cheng
Ah, I understand now. I didn't realize that there was one fetcher thread per broker. Thanks Tao, Guozhang! -James On Mar 11, 2015, at 5:00 PM, tao xiao xiaotao...@gmail.com wrote: Fetcher threads are on a per-broker basis; this ensures at least one fetcher thread per broker. Fetcher thread is

Re: Idle/dead producer connections on broker

2015-03-12 Thread Allen Wang
I wrote a simplified test program that creates 10 producers, sends a few messages from each, and then becomes idle. For both 0.8.1.1 and 0.8.2.0, the connections on the brokers are gone once the producer instance is terminated. In the prod environment, where there are many more producer instances, we

Re: Does consumer support combination of whitelist and blacklist topic filtering

2015-03-12 Thread tao xiao
Hi Guozhang, I meant topicfilter, not topic-count. Sorry for the confusion. What I want to achieve is to pass my own customized topicfilter to MM so that I can filter out whatever topics I like. I know MM doesn't support this now. I am just wondering if this is a good feature to add in

Re: High level consumer replaying the messages

2015-03-12 Thread Bhosale, Deepti
Hi Arjun, I am also seeing a similar issue for my consumer group. Were you able to figure out the cause of this? Thanks, Deepti

Re: Does consumer support combination of whitelist and blacklist topic filtering

2015-03-12 Thread Jiangjie Qin
Not sure if it is an option, but does filtering out topics with the message handler work for you? Are you going to resume consuming from a topic after you stop consuming from it? Jiangjie (Becket) Qin On 3/12/15, 8:05 AM, tao xiao xiaotao...@gmail.com wrote: Yes, you are right. a dynamic

when Kafka request.required.acks is 1

2015-03-12 Thread gaurav agarwal
In Kafka 0.8.1.1, when the Kafka producer sets the property request.required.acks=1, it means that the producer gets an acknowledgement after the leader replica has received the data. How will the producer come to know it got the acknowledgment? Is there any API that I can see at my application level,

Re: Does consumer support combination of whitelist and blacklist topic filtering

2015-03-12 Thread Guozhang Wang
Hi Tao, Sorry I was mistaken before, yes in MM you can only directly specify --whitelist, --blacklist, and the number of streams you want to create via --num.streams, but cannot set specific topic-count. This is because MM is mainly used for cross DC replication, and hence usually will pipe all

Re: High Level Consumer Example in 0.8.2

2015-03-12 Thread Ewen Cheslack-Postava
You actually only need kafka_2.10-0.8.2.1 because it depends on kafka-clients-0.8.2.1 so the new producer code will get pulled in transitively. But there's nothing wrong with explicitly stating the dependency. However, I wouldn't mix versions (i.e. use kafka-clients-0.8.2.1 instead of

Re: Does consumer support combination of whitelist and blacklist topic filtering

2015-03-12 Thread tao xiao
The topic list is not specified in consumer.properties and I don't think there is any property in consumer config that allows us to specify what topics we want to consume. Can you point me to the property if there is any? On Thu, Mar 12, 2015 at 12:14 AM, Guozhang Wang wangg...@gmail.com wrote:

Re: Database Replication Question

2015-03-12 Thread Xiao
Hi, Jay and Guozhang, In the long term, I think Kafka can be evolved to replace general-purpose messaging systems. For example, if I were a customer who is using IBM MQ, could we use Kafka instead with very minor code changes?

Re: Out of Disk Space - Infinite loop

2015-03-12 Thread tao xiao
Did you stop mirror maker? On Thu, Mar 12, 2015 at 8:27 AM, Saladi Naidu naidusp2...@yahoo.com.invalid wrote: We have 3 DC's and created 5 node Kafka cluster in each DC, connected these 3 DC's using Mirror Maker for replication. We were conducting performance testing using Kafka Producer