Re: resources for simple consumer?

2015-07-15 Thread Ewen Cheslack-Postava
Hi Jeff, The simple consumer hasn't really changed, the info you found should still be relevant. The wiki page on it might be the most useful reference for getting started: https://cwiki.apache.org/confluence/display/KAFKA/0.8.0+SimpleConsumer+Example And if you want a version all setup to

Re: Kafka partitioning is pretty much broken

2015-07-15 Thread Ewen Cheslack-Postava
Also worth mentioning is that the new producer doesn't have this behavior -- it will round robin over available partitions for records without keys. Available means it currently has a leader -- under normal cases this means it distributes evenly across all partitions, but if a partition is down
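
A minimal sketch of the behavior described in this message, written as a plain Python simulation rather than against any Kafka client: records without keys are spread round robin over the partitions that currently have a leader. The names `make_round_robin_picker`, `pick_partition`, and the `leaders` mapping are hypothetical and exist only for illustration.

```python
from itertools import count


def make_round_robin_picker():
    """Return a picker that cycles over partitions that currently have a leader."""
    counter = count()

    def pick_partition(leaders):
        # leaders maps partition id -> leader broker id, or None if the partition is down.
        available = sorted(p for p, leader in leaders.items() if leader is not None)
        if not available:
            raise RuntimeError("no partition currently has a leader")
        # Advance the shared counter on every call so successive records rotate.
        return available[next(counter) % len(available)]

    return pick_partition


if __name__ == "__main__":
    pick = make_round_robin_picker()
    # Partition 2 has no leader, so it is skipped until it comes back.
    leaders = {0: 1, 1: 2, 2: None, 3: 3}
    print([pick(leaders) for _ in range(6)])
```

With every partition available the rotation covers all of them evenly; a partition without a leader simply drops out of the cycle.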

Re: latency performance test

2015-07-15 Thread Ewen Cheslack-Postava
The tests are meant to evaluate different things and the way they send messages is the source of the difference. EndToEndLatency works with a single message at a time. It produces the message then waits for the consumer to receive it. This approach guarantees there is no delay due to queuing. The
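
The one-message-at-a-time approach can be sketched as follows. This is an illustrative Python simulation, not the Kafka tool itself: two in-process queues stand in for the broker, and the function name `measure_end_to_end` is an assumption made for the example.

```python
import queue
import threading
import time


def measure_end_to_end(num_messages, payload=b"x" * 100):
    """Send one message at a time and wait for it to come back before sending the next."""
    to_consumer = queue.Queue()
    to_producer = queue.Queue()

    def consumer():
        # Hand every received message straight back so the sender can stop its timer.
        for _ in range(num_messages):
            to_producer.put(to_consumer.get())

    worker = threading.Thread(target=consumer, daemon=True)
    worker.start()

    latencies = []
    for _ in range(num_messages):
        begin = time.perf_counter_ns()
        to_consumer.put(payload)   # "produce"
        to_producer.get()          # block until the consumer has seen it
        latencies.append(time.perf_counter_ns() - begin)
    worker.join()
    return latencies


if __name__ == "__main__":
    samples = measure_end_to_end(1000)
    print("average latency (ms):", sum(samples) / len(samples) / 1e6)
```

Because the sender never has more than one message in flight, nothing can queue up behind an earlier message, which is why this style of test reports no queuing delay.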

Re: Load Balancing Kafka

2015-07-15 Thread Terry Bates
Greetings Sandy, Folks smarter than me can correct me if I am wrong. Using Python client you don't have to connect to Zookeeper, so just specifying one of the brokers should be sufficient. In terms of what happens to your messages as your client produces them, they should be randomly assigned to

Re: Load Balancing Kafka

2015-07-15 Thread Jiangjie Qin
If you have pretty balanced traffic on each partition and have set auto.leader.rebalance.enable to true, you might not need to do further workload balance. However, in most cases you probably still need to do some sort of load balancing based on the traffic and disk utilization of each

Re: Load Balancing Kafka

2015-07-15 Thread Jiangjie Qin
Ah... It seems you are more focusing on producer side workload balance... If that is the case, please ignore my previous comments. Jiangjie (Becket) Qin On 7/15/15, 6:01 PM, Jiangjie Qin j...@linkedin.com wrote: If you have pretty balanced traffic on each partition and have set

Re: Latency test

2015-07-15 Thread Tao Feng
(Please correct me if I am wrong.) Based on TestEndToEndLatency( https://github.com/apache/kafka/blob/trunk/core/src/test/scala/kafka/tools/TestEndToEndLatency.scala), consumer_fetch_max_wait corresponds to fetch.wait.max.ms in consumer config(

Re: kafka benchmark tests

2015-07-15 Thread Yuheng Du
Jiefu, Have you tried to run benchmark_test.py? I ran it and it asks me for the ducktape.services.service yuhengdu@consumer0:/packages/kafka_2.10-0.8.2.1$ python benchmark_test.py Traceback (most recent call last): File "benchmark_test.py", line 16, in <module> from

Re: Latency test

2015-07-15 Thread Yuheng Du
Tao, If I run the following command on the command line: bin/kafka-run-class.sh kafka.tools.TestEndToEndLatency 192.168.1.3:9092 192.168.1.1:2181 speedx3 5000 100 1 -Xmx1024m it reports that the command is not correct. So where should I put the -Xmx1024m option? Thanks. On Wed, Jul 15, 2015

Re: Latency test

2015-07-15 Thread Yuheng Du
I got a Java out-of-heap-space error when running the end-to-end latency test: yuhengdu@consumer0:/packages/kafka_2.10-0.8.2.1$ bin/kafka-run-class.sh kafka.tools.TestEndToEndLatency 192.168.1.3:9092 192.168.1.1:2181 speedx3 5000 100 1 Exception in thread "main" java.lang.OutOfMemoryError: Java heap space

ExecutionException instead of UnknownTopicOrPartitionException

2015-07-15 Thread Jean-Charles Jabouille
Hi, I'm currently developing a Java application that uses Kafka. My application just pushes an offer synchronously to a topic. I have 3 brokers and 3 ZooKeeper instances. I want to catch exceptions so that my process does not crash but instead retries and runs some code for specific exceptions. So I

latency performance test

2015-07-15 Thread Yuheng Du
Hi, I have run the end-to-end latency test and the ProducerPerformance test on my Kafka cluster according to https://gist.github.com/jkreps/c7ddb4041ef62a900e6c In the end-to-end latency test, the latency was around 2ms. In the ProducerPerformance test, if I use batch size 8196 to send 50,000,000 records:

Re: Latency test

2015-07-15 Thread Yuheng Du
Tao, Thanks. The example on https://gist.github.com/jkreps/c7ddb4041ef62a900e6c is already outdated. The error message shows: USAGE: java kafka.tools.TestEndToEndLatency$ broker_list zookeeper_connect topic num_messages consumer_fetch_max_wait producer_acks Can anyone help me with what should be

Re: Offset not committed

2015-07-15 Thread Vadim Bobrov
thanks Joel and Jiangjie, I have figured it out. In addition to my log4j 2 config file I also needed a log4j 1 config file, then it works. Let me trace what happens when the offsets are not committed and report back On Wed, Jul 15, 2015 at 1:33 PM, Joel Koshy jjkosh...@gmail.com wrote: - You

Re: Offset not committed

2015-07-15 Thread Vadim Bobrov
caught it, thanks for help! any ideas what to do? TRACE 2015-07-15 18:37:58,070 [chaos-akka.actor.jms-dispatcher-1019 ] kafka.network.BoundedByteBufferSend - 113 bytes written. ERROR 2015-07-15 18:37:58,078 [chaos-akka.actor.jms-dispatcher-1019 ] kafka.consumer.ZookeeperConsumerConnector -

Re: Java API for fetching Consumer group from Kafka Server(Not Zookeeper)

2015-07-15 Thread Jiangjie Qin
It looks kafka.admin.ConsumerGroupCommand class is what you need. Jiangjie (Becket) Qin On 7/14/15, 8:23 PM, Swati Suman swatisuman1...@gmail.com wrote: Hi Team, Currently, I am able to fetch the Topic,Partition,Leader,Log Size through TopicMetadataRequest API available in Kafka. Is there any

Re: Latency test

2015-07-15 Thread Yuheng Du
I have run the end-to-end latency test and the ProducerPerformance test on my Kafka cluster according to https://gist.github.com/jkreps/c7ddb4041ef62a900e6c In the end-to-end latency test, the latency was around 2ms. In the ProducerPerformance test, if I use batch size 8196 to send 50,000,000 records:

Kafka BrokerTopicMetrics MessageInPerSec rate

2015-07-15 Thread pushkar priyadarshi
Hi, While benchmarking the new producer and a consumer syncing offsets in ZooKeeper, I see that the MessageInRate reported in BrokerTopicMetrics is not the same as the rate at which I am able to publish and consume messages. Using my own custom reporter I can see the rate at which messages are published and

Re: Offset not committed

2015-07-15 Thread Jiangjie Qin
I am not sure how your project was setup. But I think it depends on what log4j property file you specified when you started your application. Can you check if you have log4j appender defined and the loggers are directed to the correct appender? Thanks, Jiangjie (Becket) Qin On 7/15/15, 8:10 AM,

Re: Offset not committed

2015-07-15 Thread Joel Koshy
- You can also change the log4j level dynamically via the kafka.Log4jController mbean. - You can also look at offset commit request metrics (mbeans) on the broker (just to check if _any_ offset commits are coming through during the period you see no moving offsets). - The alternative is to

kafka TestEndtoEndLatency

2015-07-15 Thread Yuheng Du
In the Kafka performance tests at https://gist.github.com/jkreps/c7ddb4041ef62a900e6c the TestEndtoEndLatency results are typically around 2ms, while ProducerPerformance normally has an average latency of around several hundred ms when using batch size 8196. Are both results talking about end to end

Re: ExecutionException instead of UnknownTopicOrPartitionException

2015-07-15 Thread Guozhang Wang
Hi Jean-Charles, The FutureRecordMetadata.get() will always throw an ExecutionException(Throwable: CauseException), so in your code should look like sth.: --- try { future = producer.send(..); } catch (KafkaException e) { // handle any KafkaException that is not

Re: kafka TestEndtoEndLatency

2015-07-15 Thread Guozhang Wang
Yuheng, Only TestEndtoEndLatency's numbers are end to end; for ProducerPerformance the latency is the send-to-ack latency, which increases as batch size increases. Guozhang On Wed, Jul 15, 2015 at 11:36 AM, Yuheng Du yuheng.du.h...@gmail.com wrote: In kafka performance tests

Re: kafka benchmark tests

2015-07-15 Thread Yuheng Du
Hi Geoffrey, Thank you for your helpful information. Do I have to install the virtual machines? I am using Mac as the testdriver machine or I can use a linux machine to run testdriver too. Thanks. best, Yuheng On Wed, Jul 15, 2015 at 2:55 PM, Geoffrey Anderson ge...@confluent.io wrote: Hi

Re: Offset not committed

2015-07-15 Thread Jiangjie Qin
Is there anything on the broker log? Is it possible that your client and broker are not running on the same version? Jiangjie (Becket) Qin On 7/15/15, 11:40 AM, Vadim Bobrov vadimbob...@gmail.com wrote: caught it, thanks for help! any ideas what to do? TRACE 2015-07-15 18:37:58,070

Kafka partitioning is pretty much broken

2015-07-15 Thread Stefan Miklosovic
I have the following problem. I tried almost everything I could, but without any luck. All I want to do is to have 1 producer, 1 topic, 10 partitions and 10 consumers. All I want is to send 1M messages via the producer to these 10 consumers. I am using Kafka 0.8.3 built from current upstream so I have

Re: Offset not committed

2015-07-15 Thread Vadim Bobrov
There are lots of files under the logs directory of the broker. Just in case, I checked all those modified around the time of the error and found nothing unusual. Both client and broker are 0.8.2.1. Could it have something to do with running it in the cloud? We are on Linode and I remember having random

Re: kafka TestEndtoEndLatency

2015-07-15 Thread Yuheng Du
Guozhang, Thank you for explaining. I see that in ProducerPerformance callback functions were used to get the latency metrics. For TestEndtoEndLatency, does message size matter? What does this end-to-end latency comprise, besides transferring a package from source to destination (typically

Re: kafka TestEndtoEndLatency

2015-07-15 Thread Yuheng Du
Thanks. Here is the source code snippet of the EndtoEndLatency test: for (i <- 0 until numMessages) { val begin = System.nanoTime producer.send( new ProducerRecord[Array[Byte],Array[Byte]](topic, message)) val received = iter.next val elapsed = System.nanoTime - begin // poor man's progress bar if (i

Re: kafka benchmark tests

2015-07-15 Thread Geoffrey Anderson
Hi Yuheng, Running these tests requires a tool we've created at Confluent called 'ducktape', which you need to install with the command: pip install ducktape==0.2.0 Running the tests locally requires some setup (creation of virtual machines etc.) which is outlined here:

Re: Kafka partitioning is pretty much broken

2015-07-15 Thread JIEFU GONG
This is a total shot in the dark here so please ignore this if it fails to make sense, but I remember that on some previous implementation of the producer prior to when round-robin was enabled, producers would send messages to only one of the partitions for a set period of time (configurable, I

Re: Kafka partitioning is pretty much broken

2015-07-15 Thread Stefan Miklosovic
I think I figured it out. I had to use a custom partitioner which does basically nothing. Even though I used it before, it was not taken into consideration because I was sending KeyedMessage without any key. Just partition and payload. Now I am doing it like this: producer.send(new KeyedMessage<String,

Re: Kafka partitioning is pretty much broken

2015-07-15 Thread Stefan Miklosovic
Nice one! That might be it as well. Do you have an idea what that configuration parameter is called? On Thu, Jul 16, 2015 at 12:53 AM, JIEFU GONG jg...@berkeley.edu wrote: This is a total shot in the dark here so please ignore this if it fails to make sense, but I remember that on some previous

resources for simple consumer?

2015-07-15 Thread Jeff Gong
hey all, typically i've only ever had to use the high level consumer for my personal needs when handling data. recently, however, I have found the need to be more selective and careful with managing offsets and want the extended capability to do so. i know that there is a bit of documentation on

Re: Kafka partitioning is pretty much broken

2015-07-15 Thread Stefan Miklosovic
Maybe there is some reason why the producer sticks with a partition for some period of time, mostly performance related. I can imagine that constant switching between partitions can be kind of slow in the sense that the producer has to refocus on another partition to send a message to and this switching

Re: Offset not committed

2015-07-15 Thread Jiangjie Qin
I’m not sure if it is related to running in cloud. Do you see this disconnection issue always happening on committing offsets or it happens randomly? Jiangjie (becket) qin On 7/15/15, 12:53 PM, Vadim Bobrov vadimbob...@gmail.com wrote: there are lots of files under logs directory of the broker,

Re: kafka benchmark tests

2015-07-15 Thread Geoffrey Anderson
Hi Yuheng, Yes, you should be able to run on either mac or linux. The test cluster consists of a test-driver machine and some number of slave machines. Right now, there are roughly two ways to set up the slave machines: 1) Slave machines are virtual machines *on* the test-driver machine. 2)

Re: Offset not committed

2015-07-15 Thread Vadim Bobrov
it is pretty random On Wed, Jul 15, 2015 at 4:22 PM, Jiangjie Qin j...@linkedin.com.invalid wrote: I’m not sure if it is related to running in cloud. Do you see this disconnection issue always happening on committing offsets or it happens randomly? Jiangjie (becket) qin On 7/15/15, 12:53

Re: kafka benchmark tests

2015-07-15 Thread Yuheng Du
Hi Geoffrey, Thank you for your detailed explanation. It is really helpful. I am thinking of going with the second way: since I have bare metal access to all the nodes in the cluster, it's probably better to run real slave machines instead of virtual machines. (correct me if I am wrong) Each

Re: Offset not committed

2015-07-15 Thread Jiangjie Qin
If that is the case, I guess there might still be some value in trying to run the broker and clients locally to see if the issue still exists. Thanks, Jiangjie (Becket) Qin On 7/15/15, 1:23 PM, Vadim Bobrov vadimbob...@gmail.com wrote: it is pretty random On Wed, Jul 15, 2015 at 4:22 PM, Jiangjie Qin

Re: Latency test

2015-07-15 Thread Yuheng Du
Thank you, Tao! On Jul 15, 2015 6:27 PM, Tao Feng fengta...@gmail.com wrote: Sorry Yuheng, You should change it in $KAFKA_HEAP_OPTS. On Wed, Jul 15, 2015 at 3:09 PM, Tao Feng fengta...@gmail.com wrote: Hi Yuheng, You could add the -Xmx1024m in

Re: Kafka partitioning is pretty much broken

2015-07-15 Thread Lance Laursen
From the FAQ: To reduce # of open sockets, in 0.8.0 ( https://issues.apache.org/jira/browse/KAFKA-1017), when the partitioning key is not specified or null, a producer will pick a random partition and stick to it for some time (default is 10 mins) before switching to another one. So, if there are
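
The sticky behavior quoted from the FAQ can be sketched as a small Python simulation. This is an illustration under stated assumptions, not the producer's actual code: the class name `StickyRandomPartitioner` and its parameters are hypothetical, and only the "random partition, kept for about 10 minutes" rule comes from the quoted text.

```python
import random


class StickyRandomPartitioner:
    """Pick a random partition for keyless messages and keep it for a fixed interval."""

    def __init__(self, num_partitions, stick_seconds=600.0, rng=None):
        self.num_partitions = num_partitions
        self.stick_seconds = stick_seconds  # 10 minutes, per the quoted FAQ default
        self.rng = rng or random.Random()
        self.current = None
        self.chosen_at = None

    def partition(self, now):
        # Re-pick only on first use or once the sticky interval has elapsed.
        if self.current is None or now - self.chosen_at >= self.stick_seconds:
            self.current = self.rng.randrange(self.num_partitions)
            self.chosen_at = now
        return self.current


if __name__ == "__main__":
    partitioner = StickyRandomPartitioner(num_partitions=10)
    # A burst sent within one interval all lands on the same partition.
    burst = {partitioner.partition(now=t) for t in range(0, 60)}
    print("partitions used during a 60 s burst:", burst)
```

In this sketch a burst of keyless messages that finishes inside one interval never leaves the partition chosen at the start, so the other partitions receive nothing from that producer.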

Load Balancing Kafka

2015-07-15 Thread Sandy Waters
Hi all, Do I need to load balance against the brokers? I am using the python driver and it seems to only want a single kafka broker host. However, in a situation where I have 10 brokers, is it still fine to just give it one host. Does zookeeper and kafka handle the load balancing and redirect

Re: Kafka partitioning is pretty much broken

2015-07-15 Thread Jagbir Hooda
Hi Stefan, Have you looked at the following output for message distribution across the topic-partitions and which topic-partition is consumed by which consumer thread? kafka-server/bin/kafka-run-class.sh kafka.tools.ConsumerOffsetChecker --zkconnect localhost:2181 --group consumer_group_name

Re: Latency test

2015-07-15 Thread Tao Feng
Hi Yuheng, You could add the -Xmx1024m in https://github.com/apache/kafka/blob/trunk/bin/kafka-run-class.sh KAFKA_JVM_PERFORMANCE_OPTS. On Wed, Jul 15, 2015 at 12:51 AM, Yuheng Du yuheng.du.h...@gmail.com wrote: Tao, If I am running on the command line the following command

Re: Latency test

2015-07-15 Thread Tao Feng
Sorry Yuheng, You should change it in $KAFKA_HEAP_OPTS. On Wed, Jul 15, 2015 at 3:09 PM, Tao Feng fengta...@gmail.com wrote: Hi Yuheng, You could add the -Xmx1024m in https://github.com/apache/kafka/blob/trunk/bin/kafka-run-class.sh KAFKA_JVM_PERFORMANCE_OPTS. On Wed, Jul 15, 2015 at
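
A hedged sketch of the correction above: the heap size goes into the KAFKA_HEAP_OPTS environment variable rather than onto the tool's argument list. The helper name `latency_test_command` is hypothetical, and the broker, ZooKeeper, and topic arguments are copied from the command shown earlier in this thread, so they would need adjusting for another cluster.

```python
import os


def latency_test_command(heap="-Xmx1024m"):
    """Build the environment and argument list for the end-to-end latency test."""
    # The JVM heap flag is passed through the environment, not as a tool argument.
    env = dict(os.environ, KAFKA_HEAP_OPTS=heap)
    argv = [
        "bin/kafka-run-class.sh",
        "kafka.tools.TestEndToEndLatency",
        "192.168.1.3:9092",   # broker_list
        "192.168.1.1:2181",   # zookeeper_connect
        "speedx3",            # topic
        "5000",               # num_messages
        "100",                # consumer_fetch_max_wait
        "1",                  # producer_acks
    ]
    return env, argv


if __name__ == "__main__":
    env, argv = latency_test_command()
    print("KAFKA_HEAP_OPTS=" + env["KAFKA_HEAP_OPTS"], " ".join(argv))
    # To launch it for real, run from the Kafka install directory:
    # import subprocess; subprocess.run(argv, env=env, check=True)
```

The launch line is left commented out so the sketch runs without a Kafka installation.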

Kafka HL Consumer stops periodically

2015-07-15 Thread Marina
Hi, I have a weird problem when processing high volumes of data through my 3-node, 3 topic, 4 partition Kafka cluster. I am not fully sure at which point the issue starts happening, but basically, after some time of processing lots of events (100 M or so in about 5 hr time span) with no issues,

Re: Offset not committed

2015-07-15 Thread Vadim Bobrov
Thanks Jiangjie, unfortunately turning trace level on does not seem to work (any log level actually) I am using log4j2 (through slf4j) and despite including log4j1 bridge and these lines: <Logger name="org.apache.kafka" level="trace"/> <Logger name="kafka" level="trace"/> in my conf file I could not

Re: Consumer that consumes only local partition?

2015-07-15 Thread Robert Metzger
Hi Shef, did you resolve this issue? I'm facing some performance issues and I was wondering whether reading locally would resolve them. On Mon, Jun 22, 2015 at 11:43 PM, Shef she...@yahoo.com wrote: Noob question here. I want to have a single consumer for each partition that consumes only the