Hi Jeff,
The simple consumer hasn't really changed; the info you found should still
be relevant. The wiki page on it might be the most useful reference for
getting started:
https://cwiki.apache.org/confluence/display/KAFKA/0.8.0+SimpleConsumer+Example
And if you want a version all set up to
Also worth mentioning is that the new producer doesn't have this behavior
-- it will round robin over available partitions for records without keys.
Available means it currently has a leader -- under normal cases this
means it distributes evenly across all partitions, but if a partition is
down
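The round-robin-over-available-partitions behavior described above can be sketched roughly as follows. This is an illustrative model only, not the actual Kafka producer code; the function and variable names are made up for the example:

```python
import itertools

def round_robin_partitioner(partitions_with_leader):
    """Cycle over the partitions that currently have a leader.

    Sketch of the behavior described above: records without keys are
    spread round-robin, skipping partitions whose leader is down.
    """
    cycle = itertools.cycle(sorted(partitions_with_leader))
    while True:
        yield next(cycle)

# Partitions 0-3 exist, but partition 2 currently has no leader.
available = {0, 1, 3}
picker = round_robin_partitioner(available)
assignments = [next(picker) for _ in range(6)]
print(assignments)  # [0, 1, 3, 0, 1, 3]
```

Under normal conditions all partitions have leaders, so this degenerates to an even spread across every partition, which matches the behavior described above.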
The tests are meant to evaluate different things and the way they send
messages is the source of the difference.
EndToEndLatency works with a single message at a time. It produces the
message then waits for the consumer to receive it. This approach guarantees
there is no delay due to queuing. The
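A minimal model of that one-message-at-a-time measurement loop (an in-process queue stands in for the real producer/consumer pair; all names here are illustrative, not Kafka code):

```python
import time
from collections import deque

class InProcessTransport:
    """Stand-in for a real producer/consumer pair: a broker
    round-trip is simulated by a local queue (an assumption for
    the sketch, obviously far faster than a network hop)."""
    def __init__(self):
        self.queue = deque()
    def send(self, msg):
        self.queue.append(msg)
    def receive(self):
        return self.queue.popleft()

def end_to_end_latencies(transport, num_messages):
    """One message in flight at a time, as in EndToEndLatency:
    send, block until the 'consumer' sees it, record the elapsed
    time. Because the next send starts only after the previous
    receive, the measurement never includes queuing delay."""
    latencies = []
    for i in range(num_messages):
        begin = time.perf_counter_ns()
        transport.send(b"message-%d" % i)
        transport.receive()  # blocks until the message arrives
        latencies.append(time.perf_counter_ns() - begin)
    return latencies

lat = end_to_end_latencies(InProcessTransport(), 1000)
print(len(lat))  # 1000
```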
Greetings Sandy,
Folks smarter than me can correct me if I am wrong. Using the Python client you
don't have to connect to ZooKeeper, so just specifying one of the brokers
should be sufficient. In terms of what happens to your messages as your
client produces them, they should be randomly assigned to
If you have pretty balanced traffic on each partition and have set
auto.leader.rebalance.enabled to true or false, you might not need to do
further workload balance.
However, in most cases you probably still need to do some sort of load
balancing based on the traffic and disk utilization of each
Ah… It seems you are focusing more on producer-side workload balance… If
that is the case, please ignore my previous comments.
Jiangjie (Becket) Qin
On 7/15/15, 6:01 PM, Jiangjie Qin j...@linkedin.com wrote:
If you have pretty balanced traffic on each partition and have set
(Please correct me if I am wrong.) Based on TestEndToEndLatency(
https://github.com/apache/kafka/blob/trunk/core/src/test/scala/kafka/tools/TestEndToEndLatency.scala),
consumer_fetch_max_wait corresponds to fetch.wait.max.ms in consumer config(
Jiefu,
Have you tried to run benchmark_test.py? I ran it and it asks me for the
ducktape.services.service
yuhengdu@consumer0:/packages/kafka_2.10-0.8.2.1$ python benchmark_test.py
Traceback (most recent call last):
File "benchmark_test.py", line 16, in <module>
from
Tao,
If I am running on the command line the following command
bin/kafka-run-class.sh kafka.tools.TestEndToEndLatency 192.168.1.3:9092
192.168.1.1:2181 speedx3 5000 100 1 -Xmx1024m
It prompted that it is not correct. So where should I put the -Xmx1024m
option? Thanks.
On Wed, Jul 15, 2015
I got java out of heap error when running end to end latency test:
yuhengdu@consumer0:/packages/kafka_2.10-0.8.2.1$ bin/kafka-run-class.sh
kafka.tools.TestEndToEndLatency 192.168.1.3:9092 192.168.1.1:2181 speedx3
5000 100 1
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
Hi,
I'm currently developing an application to use Kafka in Java. My application
just pushes an offer synchronously to a topic. I have 3 brokers and 3 ZooKeeper
instances. I want to catch exceptions so that my process does not crash, but
instead retries and runs some code for specific exceptions.
So I
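A generic sketch of the catch-and-retry pattern the poster describes. The exception type and helper names here are hypothetical stand-ins, not the Kafka client's own API; a real client distinguishes retriable errors (e.g. leader not available) from fatal ones:

```python
import time

class RetriableSendError(Exception):
    """Hypothetical stand-in for a retriable client error."""

def send_with_retry(send_fn, record, max_retries=3, backoff_s=0.0):
    """Synchronous send wrapped in a bounded retry loop, so the
    process keeps running instead of crashing on a transient error."""
    attempt = 0
    while True:
        try:
            return send_fn(record)
        except RetriableSendError:
            attempt += 1
            if attempt > max_retries:
                raise  # out of retries: surface the error to the caller
            time.sleep(backoff_s)

# Demo: a send that fails twice, then succeeds on the third attempt.
calls = {"n": 0}
def flaky_send(record):
    calls["n"] += 1
    if calls["n"] < 3:
        raise RetriableSendError()
    return "ack:" + record

print(send_with_retry(flaky_send, "offer-1"))  # ack:offer-1
```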
Hi,
I have run the end to end latency test and the producerPerformance test on
my kafka cluster according to
https://gist.github.com/jkreps/c7ddb4041ef62a900e6c
In end to end latency test, the latency was around 2ms. In
producerperformance test, if using batch size 8196 to send 50,000,000 records:
Tao,
Thanks. The example on https://gist.github.com/jkreps/c7ddb4041ef62a900e6c
is outdated already. The error message shows:
USAGE: java kafka.tools.TestEndToEndLatency$ broker_list zookeeper_connect
topic num_messages consumer_fetch_max_wait producer_acks
Can anyone help me with what should be
thanks Joel and Jiangjie,
I have figured it out. In addition to my log4j 2 config file I also needed
a log4j 1 config file; then it works. Let me trace what happens when the
offsets are not committed and report back
On Wed, Jul 15, 2015 at 1:33 PM, Joel Koshy jjkosh...@gmail.com wrote:
- You
caught it, thanks for help!
any ideas what to do?
TRACE 2015-07-15 18:37:58,070 [chaos-akka.actor.jms-dispatcher-1019 ]
kafka.network.BoundedByteBufferSend - 113 bytes written.
ERROR 2015-07-15 18:37:58,078 [chaos-akka.actor.jms-dispatcher-1019 ]
kafka.consumer.ZookeeperConsumerConnector -
It looks like the kafka.admin.ConsumerGroupCommand class is what you need.
Jiangjie (Becket) Qin
On 7/14/15, 8:23 PM, Swati Suman swatisuman1...@gmail.com wrote:
Hi Team,
Currently, I am able to fetch the Topic, Partition, Leader, Log Size through
TopicMetadataRequest API available in Kafka.
Is there any
Hi,
While benchmarking the new producer and a consumer syncing offsets in ZooKeeper, I
see that the MessageInRate reported in BrokerTopicMetrics is not the same as the
rate at which I am able to publish and consume messages.
Using my own custom reporter I can see the rate at which messages are
published and
I am not sure how your project was set up. But I think it depends on what
log4j property file you specified when you started your application. Can
you check if you have log4j appender defined and the loggers are directed
to the correct appender?
Thanks,
Jiangjie (Becket) Qin
On 7/15/15, 8:10 AM,
- You can also change the log4j level dynamically via the
kafka.Log4jController mbean.
- You can also look at offset commit request metrics (mbeans) on the
broker (just to check if _any_ offset commits are coming through
during the period you see no moving offsets).
- The alternative is to
In kafka performance tests https://gist.github.com/jkreps/c7ddb4041ef62a900e6c
the TestEndtoEndLatency results are typically around 2ms, while the
ProducerPerformance normally has an average latency around several hundred ms
when using batch size 8196.
Are both results talking about end to end
Hi Jean-Charles,
The FutureRecordMetadata.get() will always throw an
ExecutionException(Throwable: CauseException), so your code should look
something like:
---
try {
future = producer.send(..);
} catch (KafkaException e) {
// handle any KafkaException that is not
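As a language-neutral illustration of the point above (the wrapper exception carries the real failure as its cause), here is a Python sketch using exception chaining. Java's `ExecutionException.getCause()` corresponds to `__cause__` here; this is purely an analogy, not Kafka client code, and all names are made up:

```python
class ExecutionError(Exception):
    """Analogue of Java's ExecutionException: a wrapper whose
    __cause__ is the real failure raised inside the future."""

def future_get(compute):
    """Run the deferred work; wrap any failure, chaining the cause."""
    try:
        return compute()
    except Exception as cause:
        raise ExecutionError("send failed") from cause

def broken_send():
    raise TimeoutError("broker did not ack in time")

try:
    future_get(broken_send)
except ExecutionError as wrapper:
    # The original error survives as the cause, so the caller can
    # branch on the specific failure, as the poster wants to do.
    print(type(wrapper.__cause__).__name__)  # TimeoutError
```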
Yuheng,
Only TestEndtoEndLatency's numbers are end to end; for ProducerPerformance
the latency is the send-to-ack latency, which increases as batch size
increases.
Guozhang
On Wed, Jul 15, 2015 at 11:36 AM, Yuheng Du yuheng.du.h...@gmail.com
wrote:
In kafka performance tests
Hi Geoffrey,
Thank you for your helpful information. Do I have to install the virtual
machines? I am using a Mac as the test-driver machine; or can I use a Linux
machine to run the test driver too?
Thanks.
best,
Yuheng
On Wed, Jul 15, 2015 at 2:55 PM, Geoffrey Anderson ge...@confluent.io
wrote:
Hi
Is there anything on the broker log?
Is it possible that your client and broker are not running on the same
version?
Jiangjie (Becket) Qin
On 7/15/15, 11:40 AM, Vadim Bobrov vadimbob...@gmail.com wrote:
caught it, thanks for help!
any ideas what to do?
TRACE 2015-07-15 18:37:58,070
I have the following problem. I tried almost everything I could, but without any luck.
All I want to do is to have 1 producer, 1 topic, 10 partitions and 10 consumers.
All I want is to send 1M messages via the producer to these 10 consumers.
I am using Kafka 0.8.3 built from current upstream, so I have
there are lots of files under the logs directory of the broker; just in case, I
checked all modified around the time of the error and found nothing unusual
both client and broker are 0.8.2.1
could it have something to do with running it in the cloud? we are on
Linode and I remember having random
Guozhang,
Thank you for explaining. I see that in ProducerPerformance call back
functions were used to get the latency metrics.
For the TestEndtoEndLatency, does message size matter? What does this end-to-end
latency comprise, besides transferring a package from source to
destination (typically
Thanks. Here is the source code snippet of the EndtoEndLatency test:

for (i <- 0 until numMessages) {
  val begin = System.nanoTime
  producer.send(new ProducerRecord[Array[Byte], Array[Byte]](topic, message))
  val received = iter.next
  val elapsed = System.nanoTime - begin
  // poor man's progress bar
  if (i
Hi Yuheng,
Running these tests requires a tool we've created at Confluent called
'ducktape', which you need to install with the command:
pip install ducktape==0.2.0
Running the tests locally requires some setup (creation of virtual machines
etc.) which is outlined here:
This is a total shot in the dark here so please ignore this if it fails to
make sense, but I remember that on some previous implementation of the
producer prior to when round-robin was enabled, producers would send
messages to only one of the partitions for a set period of time
(configurable, I
I think I figured it out.
I had to use a custom partitioner which does basically nothing.
Even though I used it before, it was not taken into consideration because I
was sending a KeyedMessage without any key, just partition and payload.
Now I am doing it like this:
producer.send(new KeyedMessage<String,
Nice one! That might be it as well. Do you have an idea what that
configuration parameter is called?
On Thu, Jul 16, 2015 at 12:53 AM, JIEFU GONG jg...@berkeley.edu wrote:
This is a total shot in the dark here so please ignore this if it fails to
make sense, but I remember that on some previous
hey all,
typically I've only ever had to use the high-level consumer for my personal
needs when handling data. Recently, however, I have found the need to be
more selective and careful with managing offsets and want the extended
capability to do so. I know that there is a bit of documentation on
Maybe there is some reason why the producer sticks with a partition for
some period of time, mostly performance related. I can imagine that
constantly switching between partitions can be kind of slow, in the
sense that the producer has to refocus on another partition to send a
message to, and this switching
I’m not sure if it is related to running in cloud. Do you see this
disconnection issue always happening on committing offsets or it happens
randomly?
Jiangjie (becket) qin
On 7/15/15, 12:53 PM, Vadim Bobrov vadimbob...@gmail.com wrote:
there are lots of files under logs directory of the broker,
Hi Yuheng,
Yes, you should be able to run on either mac or linux.
The test cluster consists of a test-driver machine and some number of slave
machines. Right now, there are roughly two ways to set up the slave
machines:
1) Slave machines are virtual machines *on* the test-driver machine.
2)
it is pretty random
On Wed, Jul 15, 2015 at 4:22 PM, Jiangjie Qin j...@linkedin.com.invalid
wrote:
I’m not sure if it is related to running in cloud. Do you see this
disconnection issue always happening on committing offsets or it happens
randomly?
Jiangjie (becket) qin
On 7/15/15, 12:53
Hi Geoffrey,
Thank you for your detailed explanation. It is really helpful.
I am thinking of going with the second way: since I have bare-metal access
to all the nodes in the cluster, it's probably better to run real slave
machines instead of virtual machines. (Correct me if I am wrong.)
Each
If that is the case, I guess there might still be some value in trying to run
the broker and clients locally and see if the issue still exists.
Thanks,
Jiangjie (Becket) Qin
On 7/15/15, 1:23 PM, Vadim Bobrov vadimbob...@gmail.com wrote:
it is pretty random
On Wed, Jul 15, 2015 at 4:22 PM, Jiangjie Qin
Thank you, Tao!
On Jul 15, 2015 6:27 PM, Tao Feng fengta...@gmail.com wrote:
Sorry Yuheng, you should change it in $KAFKA_HEAP_OPTS.
On Wed, Jul 15, 2015 at 3:09 PM, Tao Feng fengta...@gmail.com wrote:
Hi Yuheng,
You could add the -Xmx1024m in
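Since kafka-run-class.sh only applies its default heap when KAFKA_HEAP_OPTS is unset, the heap can also be overridden for a single run from the shell, without editing the script. A sketch, reusing the broker/ZooKeeper addresses and arguments from the earlier messages in this thread:

```shell
# Override the heap for one invocation; kafka-run-class.sh keeps
# KAFKA_HEAP_OPTS if it is already set in the environment.
KAFKA_HEAP_OPTS="-Xmx1024m" bin/kafka-run-class.sh kafka.tools.TestEndToEndLatency \
  192.168.1.3:9092 192.168.1.1:2181 speedx3 5000 100 1
```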
From the FAQ:
To reduce # of open sockets, in 0.8.0 (
https://issues.apache.org/jira/browse/KAFKA-1017), when the partitioning
key is not specified or null, a producer will pick a random partition and
stick to it for some time (default is 10 mins) before switching to another
one. So, if there are
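That sticky behavior can be modeled roughly like this. An illustrative sketch only, not the actual producer code; a clock function is injected so the window logic is easy to test:

```python
import random
import time

class StickyRandomPartitioner:
    """Sketch of the old-producer behavior described in the FAQ:
    with no key, pick a random partition and stick to it for a
    window (default 10 minutes = 600 s) before re-picking."""
    def __init__(self, partitions, window_s=600.0, clock=time.monotonic):
        self.partitions = list(partitions)
        self.window_s = window_s
        self.clock = clock
        self.current = None
        self.picked_at = float("-inf")

    def partition(self):
        now = self.clock()
        if self.current is None or now - self.picked_at >= self.window_s:
            self.current = random.choice(self.partitions)  # re-pick
            self.picked_at = now
        return self.current

# With a fake clock: same partition inside the window, re-pick after.
t = [0.0]
p = StickyRandomPartitioner(range(8), window_s=600.0, clock=lambda: t[0])
first = p.partition()
t[0] = 599.0
assert p.partition() == first  # still inside the 10-minute window
t[0] = 600.0
p.partition()                  # window elapsed: a fresh random pick
```

This is why, with few producers and keyless messages, traffic can look lopsided over short intervals even though it evens out over time.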
Hi all,
Do I need to load balance against the brokers? I am using the Python
driver and it seems to only want a single Kafka broker host. However, in a
situation where I have 10 brokers, is it still fine to just give it one
host? Do ZooKeeper and Kafka handle the load balancing and redirect
Hi Stefan,
Have you looked at the following output for message distribution
across the topic-partitions and which topic-partition is consumed by
which consumer thread?
kafka-server/bin/kafka-run-class.sh
kafka.tools.ConsumerOffsetChecker --zkconnect localhost:2181 --group
consumer_group_name
Hi Yuheng,
You could add the -Xmx1024m in
https://github.com/apache/kafka/blob/trunk/bin/kafka-run-class.sh
KAFKA_JVM_PERFORMANCE_OPTS.
On Wed, Jul 15, 2015 at 12:51 AM, Yuheng Du yuheng.du.h...@gmail.com
wrote:
Tao,
If I am running on the command line the following command
Sorry Yuheng, you should change it in $KAFKA_HEAP_OPTS.
On Wed, Jul 15, 2015 at 3:09 PM, Tao Feng fengta...@gmail.com wrote:
Hi Yuheng,
You could add the -Xmx1024m in
https://github.com/apache/kafka/blob/trunk/bin/kafka-run-class.sh
KAFKA_JVM_PERFORMANCE_OPTS.
On Wed, Jul 15, 2015 at
Hi, I have a weird problem when processing high volumes of data through my
3-node, 3 topic, 4 partition Kafka cluster.
I am not fully sure at which point the issue starts happening, but basically,
after some time of processing lots of events (100 M or so in about 5 hr time
span) with no issues,
Thanks Jiangjie,
unfortunately turning the trace level on does not seem to work (any log level,
actually). I am using log4j 2 (through slf4j) and despite including the log4j 1
bridge and these lines:
<Logger name="org.apache.kafka" level="trace"/>
<Logger name="kafka" level="trace"/>
in my conf file I could not
Hi Shef,
did you resolve this issue?
I'm facing some performance issues and I was wondering whether reading
locally would resolve them.
On Mon, Jun 22, 2015 at 11:43 PM, Shef she...@yahoo.com wrote:
Noob question here. I want to have a single consumer for each partition
that consumes only the