Re: KafkaConsumer#poll not returning records for all partitions of topic in single call

2016-03-10 Thread Gerard Klijs
I noticed a similar effect with a test tool, which checked whether the order in which the records were produced was the same as the order in which they were consumed. Using only one partition it works fine, but using multiple partitions the order gets messed up. If I'm right this is by design, but I would
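This is indeed by design: Kafka guarantees ordering only within a single partition, not across the partitions of a topic. A minimal stdlib-only Python sketch (no real Kafka client involved; the interleaving shown is one hypothetical fetch order) illustrates the effect:

```python
from collections import defaultdict

# Produce records 0..5 round-robin across two partitions.
partitions = defaultdict(list)
for i in range(6):
    partitions[i % 2].append(i)

# Fetch order across partitions is not deterministic; here the consumer
# happens to drain partition 1 before partition 0.
consumed = partitions[1] + partitions[0]

print(consumed)  # [1, 3, 5, 0, 2, 4] -- not the produce order
# Order *within* each partition is still intact:
print([r for r in consumed if r % 2 == 0])  # [0, 2, 4]
```

A test that needs a global order either has to use one partition or key all records to the same partition.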

Re: Kafka Streams

2016-03-10 Thread Gerard Klijs
Nice read. We just started using Kafka, and have multiple cases which need some kind of stream processing. So we most likely will start testing/using it as soon as it is released, adding stream processing containers to our Docker landscape. On Fri, Mar 11, 2016 at 2:42 AM Jay Kreps

Re: Consumer Offsets Topic cleanup.policy

2016-03-10 Thread Achanta Vamsi Subhash
We have changed the __consumer_offsets topic cleanup policy from "compact" to "delete" and it is working as expected for us. Segments older than segment.ms are purged, as happens for a normal topic. Offset fetch and commit also worked fine. Thanks On Tue, Mar 8, 2016 at 12:59 PM, Achanta Vamsi
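For reference, a change like the one described could be applied with the 0.8/0.9-era topic tooling roughly as follows (a sketch, not the poster's exact commands; the ZooKeeper address `zk:2181` and the 7-day `segment.ms` value are hypothetical):

```shell
# Inspect the current policy of the offsets topic.
bin/kafka-topics.sh --zookeeper zk:2181 --describe --topic __consumer_offsets

# Switch from compaction to time-based deletion.
bin/kafka-topics.sh --zookeeper zk:2181 --alter --topic __consumer_offsets \
  --config cleanup.policy=delete --config segment.ms=604800000
```

One caveat with "delete": committed offsets for consumer groups that stay idle longer than the retention window will be purged along with the segments, so those groups fall back to auto.offset.reset on their next start.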

Re: Kafka Streams

2016-03-10 Thread Jay Kreps
Hey David, Yeah I think the similarity to Spark (and Flink and RxJava) is the stream API style in the DSL. That is totally the way to go for stream processing. We tried really hard to make that work early on when we were doing Samza, but we really didn't understand the whole iterator/observable

Re: Kafka Streams

2016-03-10 Thread David Buschman
Very interesting, it looks like many operations from Spark were brought across. Any plans to integrate with the reactive-streams protocol for interoperability with libraries like akka-stream and RxJava? Thanks, DaVe. David Buschman d...@timeli.io > On Mar 10, 2016, at 2:26 PM, Jay Kreps

Is there a way to fetch latest offset (0.9 consumer)

2016-03-10 Thread Raghu Angadi
We implemented a Kafka connector for Google Dataflow (streaming). We manually assign partitions to each split. The Dataflow SDK lets sources report their backlog, but I didn't see any way to find out the latest offset using the 0.9 consumer.
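The 0.9 consumer has no direct "latest offset" query (an endOffsets() call arrived in later releases); a common workaround is to seek to the end of an assigned partition, read position(), then seek back. A stdlib-only sketch of that pattern, using a hypothetical in-memory stand-in for the consumer:

```python
class FakeConsumer:
    """Hypothetical in-memory stand-in for a consumer over one partition."""
    def __init__(self, log):
        self._log = log      # list of records in the partition
        self._pos = 0        # next offset to fetch

    def position(self):
        return self._pos

    def seek_to_end(self):
        self._pos = len(self._log)

    def seek(self, offset):
        self._pos = offset

def backlog(consumer):
    """Latest offset minus current position = records not yet consumed."""
    current = consumer.position()
    consumer.seek_to_end()
    latest = consumer.position()
    consumer.seek(current)   # restore position so consumption is undisturbed
    return latest - current

c = FakeConsumer(log=["a", "b", "c", "d"])
c.seek(1)
print(backlog(c))  # 3 records still to read
```

With a real consumer the seek/position dance interrupts fetching briefly, so backlog sampling is usually done at a coarse interval rather than per poll.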

Offset Gap

2016-03-10 Thread Vadim Keylis
Hello. We are using Kafka 0.8.1. I implemented a client based on the simple consumer. Is the offset always guaranteed to be sequential, i.e. incremented by 1? Under what circumstances would there be a gap, i.e. a difference between the current and next offset larger than 1? What is the best way to handle such
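For a plain topic, offsets within a partition are sequential; gaps appear mainly when log compaction has removed records. A robust client treats the next record's offset as authoritative instead of assuming +1. A small sketch of a gap check a consumer could run over the offsets it receives:

```python
def find_gaps(offsets):
    """Return (first_missing, next_present) pairs where consecutive
    offsets differ by more than 1, e.g. after compaction removed records."""
    gaps = []
    for prev, cur in zip(offsets, offsets[1:]):
        if cur - prev > 1:
            gaps.append((prev + 1, cur))
    return gaps

print(find_gaps([5, 6, 7, 10, 11]))  # [(8, 10)]: offsets 8 and 9 missing
```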

JRuby Kafka v2.1 (0.8.2.2) and v3.1 (0.9.0.1) released.

2016-03-10 Thread Joe Lawson
Hi Everyone, JRuby Kafka has had two maintenance releases recently. Version 3.1 supports Kafka 0.9.0.1 and version 2.1 supports Kafka 0.8.2.2. The most recent fix updated the jar-dependencies plugin to properly distribute the jars without trying to redownload them at gem install. This makes sure

Kafka Streams

2016-03-10 Thread Jay Kreps
Hey all, Lots of people have probably seen the ongoing work on Kafka Streams. There is no real way to design a system like this in a vacuum, so we put up a blog post, some snapshot docs, and something you can download and use easily to get feedback:

KafkaConsumer#poll not returning records for all partitions of topic in single call

2016-03-10 Thread Shrijeet Paliwal
Version: 0.9.0.1. I have a test which creates two partitions in a topic and writes data to both partitions. Then a single consumer subscribes to the topic, verifies that it has got the assignment of both partitions in that topic & finally issues a poll. The first poll always comes back with records of
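A single poll() is not required to return records from every assigned partition; it returns whatever fetches have completed so far. Tests that assert on all partitions should poll in a loop until every expected partition has been seen. A stdlib-only sketch (the one-partition-per-call behavior of the hypothetical `poll` below mirrors what the test observed):

```python
def poll_until_all_partitions(poll, expected_partitions, max_polls=10):
    """Keep calling poll() until records from every expected partition
    have arrived, or give up. poll() returns {partition: [records]}."""
    seen = {}
    for _ in range(max_polls):
        for tp, recs in poll().items():
            seen.setdefault(tp, []).extend(recs)
        if expected_partitions <= seen.keys():
            return seen
    raise AssertionError("not all partitions returned records")

# Hypothetical poll that returns one partition per call, as observed.
batches = iter([{0: ["a"]}, {1: ["b", "c"]}])
result = poll_until_all_partitions(lambda: next(batches, {}), {0, 1})
print(sorted(result))  # [0, 1]
```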

Re: Kafka broker decommission steps

2016-03-10 Thread Guozhang Wang
Thanks Alexis, these are good points. We are aware of the partition assignment json problem and trying to improve it; as for packaging / configs etc I think the goal of the OS Kafka package itself is to provide universal and simple barebone scripts for operations, and users can wrap / customize

Re: Kafka 0.8 old high level consumer client gets inactive after few hours of inactivity

2016-03-10 Thread Alexis Midon
- To understand what the stuck consumer is doing, it would be useful to collect the logs and a thread dump. I'd try to find out what the fetcher threads are doing. What about the handler/application/stream threads? - are the offsets committed? After a restart, could it be that the consumer is just

Re: Retry Message Consumption On Database Failure

2016-03-10 Thread Christian Posta
Yah, that's a good point. That was brought up in another thread. The granularity of what poll() does needs to be addressed. It tries to do too many things at once, including heartbeating. Not so sure that's entirely necessary. On Thu, Mar 10, 2016 at 1:40 AM, Michael Freeman

Re: Increasing session.timeout.ms

2016-03-10 Thread tao xiao
You need to change group.max.session.timeout.ms on the broker to be larger than what you have in the consumer. On Fri, 11 Mar 2016 at 00:24 Michael Freeman wrote: > Hi, > I'm trying to set the following on a 0.9.0.1 consumer. > > session.timeout.ms=12 >
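The broker rejects a consumer session timeout that falls outside the range [group.min.session.timeout.ms, group.max.session.timeout.ms]; the defaults assumed below (6000 and 30000 ms, from 0.9-era docs) explain the error, and raising the max is the fix. A sketch mirroring that range check:

```python
# Broker-side bounds (default values assumed from 0.9-era documentation).
GROUP_MIN_SESSION_TIMEOUT_MS = 6000
GROUP_MAX_SESSION_TIMEOUT_MS = 30000

def session_timeout_acceptable(session_timeout_ms,
                               lo=GROUP_MIN_SESSION_TIMEOUT_MS,
                               hi=GROUP_MAX_SESSION_TIMEOUT_MS):
    """Mirror of the broker check behind the error
    'The session timeout is not within an acceptable range.'"""
    return lo <= session_timeout_ms <= hi

print(session_timeout_acceptable(120000))             # False with defaults
print(session_timeout_acceptable(120000, hi=180000))  # True after raising max
```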

Increasing session.timeout.ms

2016-03-10 Thread Michael Freeman
Hi, I'm trying to set the following on a 0.9.0.1 consumer. session.timeout.ms=12 request.timeout.ms=144000 I get the below error but I can't find any documentation on acceptable ranges. "The session timeout is not within an acceptable range." Logged by AbstractCoordinator. Any ideas

Re: seekToBeginning doesn't work without auto.offset.reset

2016-03-10 Thread Mansi Shah
I second the need for having a consumer context passed to the rebalance callback. I have run into issues several times because of that. About subscribe vs assign - I have not read through your Spark code yet (will do by EOD), so I am not sure what you mean (other than I do agree that new partitions

Re: seekToBeginning doesn't work without auto.offset.reset

2016-03-10 Thread Cody Koeninger
Mansi, I'd agree that the fact that everything is tied up in poll seems like the source of the awkward behavior. Regarding assign vs subscribe, most people using the Spark integration are just going to want to provide a topic name, not go figure out a bunch of partitions. They're also going to

Re: seekToBeginning doesn't work without auto.offset.reset

2016-03-10 Thread Cody Koeninger
Yeah, so I'd encourage you guys to consider fixing that while the consumer is still in beta. As I said, it makes it awkward to serialize or provide a zero-arg constructor for a consumer rebalance listener, which is necessary in our case for restarting a consumer job on behalf of a user. It also

Re: seekToBeginning doesn't work without auto.offset.reset

2016-03-10 Thread Mansi Shah
Guozhang Sorry for joining the party a little late. I have been thinking about this whole awkward behavior of having to call poll(0) to actually make the underlying subscriptions take effect. Is the core reason for this design the fact that poll is also the actual heartbeat and you want to make
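The behavior discussed in this thread comes from subscribe() only recording intent: the group join and partition assignment happen inside poll(), which is why a poll(0) is needed before a seek takes effect. A stdlib-only model of that lazy assignment (the class and exception names here are hypothetical, loosely mirroring the Java consumer's IllegalStateException):

```python
class IllegalStateError(Exception):
    """Hypothetical analogue of Java's IllegalStateException."""

class LazyConsumer:
    """Model of the new consumer's lazy subscription: subscribe()
    records intent, poll() performs the actual assignment."""
    def __init__(self):
        self.subscription = []
        self.assignment = set()

    def subscribe(self, topics):
        self.subscription = topics       # intent only; nothing assigned yet

    def poll(self, timeout_ms):
        if self.subscription and not self.assignment:
            # group join / partition assignment happens here, not above
            self.assignment = {(t, 0) for t in self.subscription}
        return {}

    def seek_to_beginning(self):
        if not self.assignment:
            raise IllegalStateError("no partitions assigned yet")

c = LazyConsumer()
c.subscribe(["events"])
c.poll(0)                # triggers the actual assignment
c.seek_to_beginning()    # only now does the seek have partitions to act on
print(c.assignment)      # {('events', 0)}
```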

Kafka 0.8 old high level consumer client gets inactive after few hours of inactivity

2016-03-10 Thread Abhishek Chawla
Hi, I'm using Kafka version 0.8.2.1 with the old high level consumer API. I'm following this guide: https://cwiki.apache.org/confluence/display/KAFKA/Consumer+Group+Example The client is working fine, but after a few hours of inactivity it gets inactive and stops receiving messages. But if I restart my

Re: Retry Message Consumption On Database Failure

2016-03-10 Thread Michael Freeman
Thanks Christian, We would want to retry indefinitely, or at least for, say, x minutes. If we don't poll, how do we keep the heartbeat alive to Kafka? We never want to lose this message and only want to commit to Kafka when the message is in Mongo. That's either as
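Because the new consumer heartbeats from inside poll(), a long database retry has to keep polling. The usual pattern is pause() (so poll() returns no new records while you retry), keep calling poll() for the heartbeat, commit only after the write succeeds, then resume(). A stdlib-only sketch with a hypothetical in-memory consumer (the real 0.9 pause/resume calls take explicit partition arguments, omitted here for brevity):

```python
import time

class FakeConsumer:
    """Hypothetical in-memory stand-in for KafkaConsumer."""
    def __init__(self, records):
        self._records = records
        self.paused = False
        self.committed = False
        self.heartbeats = 0

    def poll(self):
        self.heartbeats += 1                  # every poll() heartbeats
        return [] if self.paused else self._records

    def pause(self):  self.paused = True
    def resume(self): self.paused = False
    def commit(self): self.committed = True

def consume_with_retry(consumer, write_to_db, backoff_s=0.0):
    """Pause fetching while a DB write is retried, but keep calling
    poll() so the group heartbeat stays alive; commit only on success."""
    pending = consumer.poll()
    consumer.pause()
    for record in pending:
        while True:
            try:
                write_to_db(record)
                break
            except IOError:
                consumer.poll()               # no records while paused
                time.sleep(backoff_s)
    consumer.commit()                         # offset advances only now
    consumer.resume()
    return pending

# A write that fails twice before succeeding:
attempts = {"n": 0}
def flaky_write(record):
    attempts["n"] += 1
    if attempts["n"] <= 2:
        raise IOError("mongo unavailable")

c = FakeConsumer(["msg-1"])
consume_with_retry(c, flaky_write)
print(c.committed, c.heartbeats)  # True 3  (1 fetch + 2 retry heartbeats)
```

Since the commit happens only after the write, a crash mid-retry means the record is redelivered, so the Mongo write should be idempotent.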