Re: Simple consumer rebalancing

2015-05-20 Thread Gwen Shapira
See inline. On Wed, May 20, 2015 at 12:51 PM, Harut Martirosyan harut.martiros...@gmail.com wrote: Hi. I've got several questions. 1. As far as I understand from the docs, if rebalancing is needed (when a consumer is added or removed), the High-level Consumer should be used. What about Simple

Need some guidance in how to solve this problem.

2015-05-20 Thread Kevin Sjöberg
Hi everyone! This is my first post in the mailing list, so bear with me. I have a backend service written in PHP. This service pushes messages to Apache Kafka (over the topic posts) when posts are created, read and removed. I also have a backend service written in Java. This service consumes

Re: [Announcement] Hermes - pub / sub broker built on top of Kafka

2015-05-20 Thread Daniel Compton
Hi Adam Firstly, thanks for open sourcing this, it looks like a great tool and I can imagine a lot of people will find it very useful. I had a few thoughts reading the docs. I may have misunderstood things but it seems that your goal of meeting a strict SLA conflicts with your goal of

Simple consumer rebalancing

2015-05-20 Thread Harut Martirosyan
Hi. I've got several questions. 1. As far as I understand from the docs, if rebalancing is needed (when a consumer is added or removed), the High-level Consumer should be used. What about the Simple consumer, does it support rebalancing? 2. Also, who implements the rebalancing logic, the broker (I assume) or

Optimal number of partitions for topic

2015-05-20 Thread Carles Sistare
Hi, We are setting up a Kafka cluster with 9 brokers on EC2 instances, and we are trying to find the optimal number of partitions for our topics, specifically the maximum number, so that we never have to change the partition count again. What we understood is that the number of partitions

Re: Need some guidance in how to solve this problem.

2015-05-20 Thread Kevin Sjöberg
Thanks for the link, Sharninder. I'm not entirely sure this is what I'm looking for. I do not know whether the client actually received the messages. The Java application consumes the messages from Kafka and then pushes them to Pusher (https://pusher.com/). The client, connected to

Re: Need some guidance in how to solve this problem.

2015-05-20 Thread Sharninder
I have a backend service written in PHP. This service pushes messages to Apache Kafka (over the topic posts) when posts are created, read and removed. I also have a backend service written in Java. This service consumes messages from Apache Kafka (for the posts topic) and pushes them out over

Re: The origin of the name

2015-05-20 Thread Peter Vandenabeele
A nice side effect is that we are getting into all things 'A' * Kafka * Samza * Data * Scala * Akka * Spark ... Peter On Wed, May 20, 2015 at 6:16 PM, Mark Reddy mark.l.re...@gmail.com wrote: From Jay Kreps: I thought that since Kafka was a system optimized for writing using a writer's

Re: unclean.leader.election.enable question

2015-05-20 Thread tao xiao
Thank you, Mayuresh, for the quick reply. If my producer has acks=all set, would the producer get a callback indicating unsuccessful delivery of the missing 2000 messages, assuming the new Java producer is used? On Wednesday, May 20, 2015, gharatmayures...@gmail.com wrote: This is not unclean leader election

RE: consumer poll returns no records unless called more than once, why?

2015-05-20 Thread Padgett, Ben
I am using Kafka v0.8.2.0 From: Guozhang Wang [wangg...@gmail.com] Sent: Wednesday, May 20, 2015 9:41 AM To: users@kafka.apache.org Subject: Re: consumer poll returns no records unless called more than once, why? Hello Ben, Which version of Kafka are you

Re: The origin of the name

2015-05-20 Thread Mark Reddy
From Jay Kreps: I thought that since Kafka was a system optimized for writing using a writer's name would make sense. I had taken a lot of lit classes in college and liked Franz Kafka. Plus the name sounded cool for an open source project. So basically there is not much of a relationship.

Re: Need some guidance in how to solve this problem.

2015-05-20 Thread Guozhang Wang
Hi Kevin, The current high-level Scala consumer does not have a rewind function, but if the other client is able to notify you through WebSocket periodically as long as it is not disconnected, then what you can do is let your Java app buffer messages even after pushing them to the WebSocket,

Re: consumer poll returns no records unless called more than once, why?

2015-05-20 Thread Jay Kreps
Hey Ben, The consumer actually doesn't promise to return records on any given poll() call, and even in trunk it likely won't return records on the first call. Internally the reason is that it basically does one or two rounds of non-blocking actions and then returns. This could include things like
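Since a single poll() call is not promised to return records, callers usually poll in a loop. The following is only a minimal sketch of that pattern: `RecordSource`, `PollLoopSketch` and `pollUntilRecords` are hypothetical names invented for illustration and stand in for a consumer's poll call; they are not part of the Kafka client API.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for a consumer's poll(timeout) call; NOT the Kafka API.
interface RecordSource {
    List<String> poll(long timeoutMs);
}

final class PollLoopSketch {
    // Polls repeatedly until a non-empty batch arrives or maxAttempts is used up.
    // Returns an empty list if every attempt came back empty.
    static List<String> pollUntilRecords(RecordSource source, long timeoutMs, int maxAttempts) {
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            List<String> batch = source.poll(timeoutMs);
            if (!batch.isEmpty()) {
                return batch;
            }
        }
        return Collections.emptyList();
    }

    public static void main(String[] args) {
        // Fake source (an assumption for the demo): the first two polls return
        // nothing, as if the consumer were still doing internal setup work.
        final int[] calls = {0};
        RecordSource fake = timeoutMs ->
                (++calls[0] < 3) ? Collections.<String>emptyList() : Arrays.asList("m1", "m2");

        List<String> records = pollUntilRecords(fake, 100L, 10);
        System.out.println("records=" + records + " after " + calls[0] + " polls");
    }
}
```

A test that asserts on the result of one poll() call would fail against the fake source above, while the loop succeeds; bounding the attempts keeps the loop from spinning forever when no data ever arrives.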

Re: consumer poll returns no records unless called more than once, why?

2015-05-20 Thread Guozhang Wang
Hello Ben, Which version of Kafka are you using with this consumer client? Guozhang On Wed, May 20, 2015 at 9:03 AM, Padgett, Ben bpadg...@illumina.com wrote: //this code Properties consumerProps = new Properties(); consumerProps.put("bootstrap.servers",

unclean.leader.election.enable question

2015-05-20 Thread tao xiao
Hi team, I know that if a broker is behind the leader by no more than replica.lag.max.messages the broker is considered in sync with the leader. Considering a situation where I have unclean.leader.election.enable=true set in brokers and the follower is now 2000 messages behind (the default
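The in-sync rule as described in this message can be written down as a small check. This is a simplified sketch of the poster's description only, not broker code: `InSyncSketch`, `isInSync` and the offsets and threshold used in `main` are hypothetical values chosen for illustration, and the real default of replica.lag.max.messages is deliberately not assumed here.

```java
// Simplified model of the rule stated in the message: a follower counts as
// in sync while it is behind the leader by no more than the configured
// maximum number of messages. Names and numbers are illustrative assumptions.
final class InSyncSketch {
    static boolean isInSync(long leaderEndOffset, long followerEndOffset, long maxLagMessages) {
        long lag = leaderEndOffset - followerEndOffset;
        return lag <= maxLagMessages;
    }

    public static void main(String[] args) {
        long maxLag = 4000L;       // arbitrary threshold for the demo
        long leaderEnd = 10_000L;  // arbitrary leader log end offset

        // A follower 2000 messages behind is still within the threshold.
        System.out.println(isInSync(leaderEnd, 8_000L, maxLag));
        // A follower 5000 messages behind is not.
        System.out.println(isInSync(leaderEnd, 5_000L, maxLag));
    }
}
```

Under this rule a follower can be counted as in sync while still missing messages that exist only on the leader, which is the situation the question is about.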

Re: unclean.leader.election.enable question

2015-05-20 Thread gharatmayuresh15
This is not unclean leader election since the follower is still in ISR. Yes, we will lose those 2000 messages. Mayuresh Sent from my iPhone On May 20, 2015, at 8:31 AM, tao xiao xiaotao...@gmail.com wrote: Hi team, I know that if a broker is behind the leader by no more than

consumer poll returns no records unless called more than once, why?

2015-05-20 Thread Padgett, Ben
//this code
Properties consumerProps = new Properties();
consumerProps.put("bootstrap.servers", "localhost:9092");
//without deserializer it fails, which makes sense. the documentation however doesn't show this
consumerProps.put("key.deserializer",

Re: consumer poll returns no records unless called more than once, why?

2015-05-20 Thread Guozhang Wang
Hello Ben, This Java consumer client was not yet mature in 0.8.2.0, and lots of bug fixes have been checked in since then. I just tested your code with trunk's consumer and it does not exhibit this problem. Could you try the same on your side and see if this issue goes away? Guozhang On Wed,

Re: KafkaConsumer poll always returns null

2015-05-20 Thread Ewen Cheslack-Postava
I don't have any small examples handy, but the javadoc for KafkaConsumer includes some examples. The one labeled Simple Processing should work fine as long as you stick to a single consumer in the group. On Wed, May 20, 2015 at 7:49 AM, Padgett, Ben bpadg...@illumina.com wrote: @Ewen

RE: consumer poll returns no records unless called more than once, why?

2015-05-20 Thread Padgett, Ben
Thanks for the detailed explanation. I was simply testing Kafka for the first time with a few throwaway unit tests to learn how it works and was curious why I was seeing that behavior. From: Jay Kreps [jay.kr...@gmail.com] Sent: Wednesday, May 20, 2015

0.8.2 changelog difficulties

2015-05-20 Thread Ian Friedman
Hey guys, been a while since I sent a message to this list. I have not been following Kafka development closely for the past 9 or so months, but I'm now evaluating upgrading our installation to Kafka 0.8.2 and wanted to share my experiences in attempting to get a handle on that. First thing I

Re: Optimal number of partitions for topic

2015-05-20 Thread Manoj Khangaonkar
Without knowing the actual implementation details, I would guess that more partitions implies more parallelism, more concurrency, more threads, more files to write to - all of which will contribute to more CPU load. Partitions allow you to scale by partitioning the topic across multiple brokers.

Re: Optimal number of partitions for topic

2015-05-20 Thread Daniel Compton
One of the beautiful things about Kafka is that it uses the disk and OS disk caching really efficiently. Because Kafka writes messages to a contiguous log, it needs very little seek time to move the write head to the next point. Similarly for reading, if the consumers are mostly up to date with

Re: Optimal number of partitions for topic

2015-05-20 Thread Saladi Naidu
In general, partitions improve throughput through parallelism. From your explanation below, yes, partitions are written to different physical locations, but the writes are still append-only. With write-ahead buffering and append-only writes, having partitions will still increase throughput. Below is an

Re: 0.8.2 changelog difficulties

2015-05-20 Thread Guozhang Wang
Hi Ian, Thanks for the email. Whenever we decide to cut a new release of Kafka, one of the committers will go through all the tickets with fix version tagged to the release version and make sure they are all resolved. For some cover-story tickets like KAFKA-1000, I agree that we did not handle

RE: KafkaConsumer poll always returns null

2015-05-20 Thread Padgett, Ben
@Ewen Cheslack-Postava - do you have an example you could post? From: Ewen Cheslack-Postava [e...@confluent.io] Sent: Tuesday, May 19, 2015 3:12 PM To: users@kafka.apache.org Subject: Re: KafkaConsumer poll always returns null The new consumer in trunk is

The origin of the name

2015-05-20 Thread András Serény
Hi All, I wonder how the messaging system Kafka got its name. I believe it was named after Franz Kafka the writer (the names of the systems Samza and Camus reinforce this belief), but how and why? Is there a story behind it? Thanks a lot, András