Doubts Kafka

2015-02-08 Thread Eduardo Costa Alfaia
Hi Guys, I have some doubts about the Kafka, the first is Why sometimes the applications prefer to connect to zookeeper instead brokers? Connecting to zookeeper could create an overhead, because we are inserting other element between producer and consumer. Another question is about the

Can't send a keyedMessage to brokers with partitioner.class=kafka.producer.DefaultPartitioner

2015-02-08 Thread Zijing Guo
Hi,I have a 2 nodes kafka cluster with 2 instances of brokers and zookeepers. And then I create a topic kafka-test with 2 partitions and replication-factor =2. My producer config is:                      {partitioner.class kafka.producer.DefaultPartitioner                     

Re: high cpu and network traffic when cluster has no topic

2015-02-08 Thread Jay Kreps
Sounds like this patch fixed the issue. It would be good to get some review on KAFKA-1919--it is only a four line change. On Wed, Feb 4, 2015 at 1:15 PM, Steven Wu stevenz...@gmail.com wrote: Bhavesh, unfortunately, ps cmd in Mac doesn't display thread id. I tried DTrace, but it only shows

Re: New Producer - ONLY sync mode?

2015-02-08 Thread Jay Kreps
Hey Otis, Yeah, Gwen is correct. The future from the send will be satisfied when the response is received so it will be exactly the same as the performance of the sync producer previously. -Jay On Mon, Feb 2, 2015 at 1:34 PM, Gwen Shapira gshap...@cloudera.com wrote: If I understood the code

Console Producer Throwing LeaderNotAvailableException Despite Existing Leader for Partition

2015-02-08 Thread Alex Melville
Howdy all, I recently upgraded to Kafka 0.8.2.0 and am trying to verify that everything still works as expected. I spin up two brokers, one zk instance, and then create a topic using kafka-topics.sh --create --zookeeper ad-0104:2181 --topic deleteme --partitions 2 --replication-factor 1 Then I

Re: New Producer - ONLY sync mode?

2015-02-08 Thread Jay Kreps
I implemented the flush() call I hypothesized earlier in this thread as a patch on KAFKA-1865. So now producer.flush() will block until all buffered requests complete. The post condition is that all previous send futures are satisfied and have error/offset information. This is a little easier

Re: Not found NewShinyProducer sync performance metrics

2015-02-08 Thread Manikumar Reddy
You can configure the jmx ports by using below system properties. -Dcom.sun.management.jmxremote.port= -Dcom.sun.management.jmxremote.rmi.port=8889 On Fri, Feb 6, 2015 at 9:19 AM, Xinyi Su xiny...@gmail.com wrote: Hi, I try to use Jconsole to connect remote Kafka broker which is

one message consumed by both consumers in the same group?

2015-02-08 Thread Yang
my understanding is that for 2 consumers within the same group, a message goes to only one of the consumers, not both. but I did a simple test: get the kafka distro, copy the code dir to /tmp/kafka/ then run console consumer from both dirs at the same time bin/kafka-console-consumer.sh --.

Re: one message consumed by both consumers in the same group?

2015-02-08 Thread Manikumar Reddy
Hi, bin/kafka-console-consumer.sh --. all the parameters are the same You need to set same group.id to create a consumer group. By default console consumer creates a random group.id. You can set group.id by using --consumer.config /tmp/comsumer.props flag. $$echo group.id=1

High Latency in Kafka

2015-02-08 Thread Vineet Mishra
Hi All, I am having some log files of around 30GB, I am trying to event process these logs by pushing them to Kafka. I could clearly see the throughput achieved while publishing these event to Kafka is quiet slow. So as mentioned for the single log file of 30GB, the Logstash is continuously

Re: Console Producer Throwing LeaderNotAvailableException Despite Existing Leader for Partition

2015-02-08 Thread tao xiao
Alex, I got similar error before due to incorrect network binding of my laptop's wireless interface. You can try with setting advertised.host.name=kafka's server hostname in the server.properties and run it again. On Sun, Feb 8, 2015 at 8:38 AM, Alex Melville amelvi...@g.hmc.edu wrote: Howdy

Re: New Producer - ONLY sync mode?

2015-02-08 Thread Jay Kreps
Steve In terms of mimicing the sync behavior, I think that is what .get() does, no? We are always returning the offset and error information. The example I gave didn't make use of it, but you definitely can make use of it if you want to. -Jay On Wed, Feb 4, 2015 at 9:58 AM, Steve Morin

Re: How to delete defunct topics

2015-02-08 Thread Jagbir Hooda
Hi Joel, The mbean approach depends upon storing traffic stats over a period of time, so is doable but complex. I am thinking about another approach to check logfiles time-stamp (to check producer activity) in conjunction with the mtime for topic offsets (consumers activity). So if both timestamp

[DISCUSS] KIP-8 Add a flush method to the new Java producer

2015-02-08 Thread Jay Kreps
Following up on our previous thread on making batch send a little easier, here is a concrete proposal to add a flush() method to the producer: https://cwiki.apache.org/confluence/display/KAFKA/KIP-8+-+Add+a+flush+method+to+the+producer+API A proposed implementation is here:

Re: Doubts Kafka

2015-02-08 Thread Christopher Piggott
Have there been any changes with 0.8.2 in how the marker gets moved when you use the high-level consumer? One problem I have always had was: what if I pull something from the stream, but then I have an error in processing it? I don't really want to move the marker. I would almost like the

Fetch size

2015-02-08 Thread CJ Woolard
Hi Kafka team, We have a use case where we need to consume from ~20 topics (each with 24 partitions), we have a potential max message size of 20MB so we've set our consumer fetch.size to 20MB but that's causing very poor performance on our consumer (most of our messages are in the 10-100k

Re: [DISCUSS] KIP-8 Add a flush method to the new Java producer

2015-02-08 Thread Gwen Shapira
Looks good to me. I like the idea of not blocking additional sends but not guaranteeing that flush() will deliver them. I assume that with linger.ms = 0, flush will just be a noop (since the queue will be empty). Is that correct? Gwen On Sun, Feb 8, 2015 at 10:25 AM, Jay Kreps

Poor performance consuming multiple topics

2015-02-08 Thread Cj
Hi Kafka team, We have a use case where we need to consume from ~20 topics (each with 24 partitions), we have a potential max message size of 20MB so we've set our consumer fetch.size to 20MB but that's causing very poor performance on our consumer (most of our messages are in the 10-100k

Apache Kafka 0.8.2 Consumer Example

2015-02-08 Thread Ricardo Ferreira
Hi, I had a client running on Kafka 0.8.1 using the High-Level consumer API working fine. Today I updated my Kafka installation for the 0.8.2 version (tried all versions of Scala) but the consumer doesn't get any messages. I tested using the kafka-console-consumer.sh utility tool and works fine,

Re: Apache Kafka 0.8.2 Consumer Example

2015-02-08 Thread Gwen Shapira
I have a 0.8.1 high level consumer working fine with 0.8.2 server. Few of them actually :) AFAIK the API did not change. Do you see any error messages? Do you have timeout configured on the consumer? What does the offset checker tool say? On Fri, Feb 6, 2015 at 4:49 PM, Ricardo Ferreira

Re: High Latency in Kafka

2015-02-08 Thread Gwen Shapira
I'm wondering how much of the time is spent by Logstash reading and processing the log vs. time spent sending data to Kafka. Also, I'm not familiar with log.stash internals, perhaps it can be tuned to send the data to Kafka in larger batches? At the moment its difficult to tell where is the

Re: Doubts Kafka

2015-02-08 Thread Christopher Piggott
Sorry, I should have read the release notes before I asked this question. The answer was in there. Internally the implementation of the offset storage is just a compacted http://kafka.apache.org/documentation.html#compaction Kafka topic ( __consumer_offsets) keyed on the consumer’s group,

Re: Doubts Kafka

2015-02-08 Thread Christopher Piggott
The consumer used Zookeeper to store offsets, in 0.8.2 there's an option to use Kafka itself for that (by setting *offsets.storage = kafka Does it still really live in zookeeper, and kafka is proxying the requests through? On Sun, Feb 8, 2015 at 9:25 AM, Gwen Shapira gshap...@cloudera.com

Re: Doubts Kafka

2015-02-08 Thread Gwen Shapira
On Sun, Feb 8, 2015 at 6:39 AM, Christopher Piggott cpigg...@gmail.com wrote: The consumer used Zookeeper to store offsets, in 0.8.2 there's an option to use Kafka itself for that (by setting *offsets.storage = kafka Does it still really live in zookeeper, and kafka is proxying the requests

Re: Doubts Kafka

2015-02-08 Thread Gwen Shapira
Hi Eduardo, 1. Why sometimes the applications prefer to connect to zookeeper instead brokers? I assume you are talking about the clients and some of our tools? These are parts of an older design and we are actively working on fixing this. The consumer used Zookeeper to store offsets, in 0.8.2

Re: Apache Kafka 0.8.2 Consumer Example

2015-02-08 Thread Ricardo Ferreira
Hi Gwen, Sorry, both the consumer and the broker are 0.8.2? A = Yes So what's on 0.8.1? A = It works fine using 0.8.1 for server AND client. You probably know the consumer group of your application. Can you use the offset checker tool on that? A = Yes, I know from the consumer, and the offset

Re: Console Producer Throwing LeaderNotAvailableException Despite Existing Leader for Partition

2015-02-08 Thread Alex Melville
Thanks Tao, That fixed the problem. Console producer now correctly pushes to the topic and console consumer can read the topic data. -Alex On Sun, Feb 8, 2015 at 1:47 AM, tao xiao xiaotao...@gmail.com wrote: Alex, I got similar error before due to incorrect network binding of my laptop's

Re: [DISCUSS] KIP-8 Add a flush method to the new Java producer

2015-02-08 Thread Jay Kreps
Well actually in the case of linger.ms = 0 the send is still asynchronous so calling flush() blocks until all the previously sent records have completed. It doesn't speed anything up in that case, though, since they are already available to send. -Jay On Sun, Feb 8, 2015 at 10:36 AM, Gwen