Hi guys,
I have some doubts about Kafka. The first is: why do some applications
prefer to connect to ZooKeeper instead of the brokers? Connecting to ZooKeeper
could add overhead, because we are inserting another element between producer
and consumer. Another question is about the
Hi, I have a 2-node Kafka cluster with 2 broker instances and 2 ZooKeeper
instances. I then created a topic kafka-test with 2 partitions and
replication-factor=2. My producer config is: {partitioner.class
kafka.producer.DefaultPartitioner
Sounds like this patch fixed the issue. It would be good to get some review
on KAFKA-1919 -- it is only a four-line change.
On Wed, Feb 4, 2015 at 1:15 PM, Steven Wu stevenz...@gmail.com wrote:
Bhavesh,
unfortunately, the ps command on Mac doesn't display thread IDs. I tried
DTrace, but it only shows
Hey Otis,
Yeah, Gwen is correct. The future returned from send() will be satisfied when
the response is received, so it will give exactly the same performance as the
sync producer did previously.
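For illustration, a minimal sketch of that pattern with the 0.8.2 Java
producer (the broker address, topic, and configs here are made up for the
example, not taken from this thread):

import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.clients.producer.RecordMetadata;

public class SyncSendSketch {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        KafkaProducer<String, String> producer = new KafkaProducer<String, String>(props);
        // send() is asynchronous, but blocking on the returned future until the
        // broker response arrives gives the same behavior as the old sync producer.
        RecordMetadata md = producer.send(
            new ProducerRecord<String, String>("test-topic", "key", "value")).get();
        System.out.println("partition " + md.partition() + ", offset " + md.offset());
        producer.close();
    }
}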
-Jay
On Mon, Feb 2, 2015 at 1:34 PM, Gwen Shapira gshap...@cloudera.com wrote:
If I understood the code
Howdy all,
I recently upgraded to Kafka 0.8.2.0 and am trying to verify that
everything still works as expected. I spin up two brokers, one zk instance,
and then create a topic using
kafka-topics.sh --create --zookeeper ad-0104:2181 --topic deleteme
--partitions 2 --replication-factor 1
Then I
I implemented the flush() call I hypothesized earlier in this thread as a
patch on KAFKA-1865.
So now
producer.flush()
will block until all buffered requests complete. The post condition is that
all previous send futures are satisfied and have error/offset information.
This is a little easier
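A sketch of the intended usage, assuming 'producer' is a configured 0.8.2
KafkaProducer<String, String> with the patch applied (topic name illustrative;
checked exceptions elided for brevity):

// batch up sends without blocking on any of them
List<Future<RecordMetadata>> futures = new ArrayList<Future<RecordMetadata>>();
for (int i = 0; i < 1000; i++)
    futures.add(producer.send(new ProducerRecord<String, String>("test-topic", Integer.toString(i))));
producer.flush(); // blocks until all buffered requests complete
for (Future<RecordMetadata> f : futures)
    System.out.println("offset " + f.get().offset()); // get() no longer blocks here

(List/ArrayList come from java.util, Future from java.util.concurrent.)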
You can configure the JMX ports by using the system properties below.
-Dcom.sun.management.jmxremote.port=
-Dcom.sun.management.jmxremote.rmi.port=8889
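A fuller set of flags for a remote connection, for reference (the port
numbers and hostname are illustrative, and disabling authentication/SSL is
only reasonable on a trusted network):

-Dcom.sun.management.jmxremote
-Dcom.sun.management.jmxremote.port=8888
-Dcom.sun.management.jmxremote.rmi.port=8889
-Dcom.sun.management.jmxremote.authenticate=false
-Dcom.sun.management.jmxremote.ssl=false
-Djava.rmi.server.hostname=broker-host.example.com

JConsole can then connect to broker-host.example.com:8888. The stock Kafka
start scripts also pick up a JMX_PORT environment variable.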
On Fri, Feb 6, 2015 at 9:19 AM, Xinyi Su xiny...@gmail.com wrote:
Hi,
I am trying to use JConsole to connect to a remote Kafka broker, which is
my understanding is that for 2 consumers within the same group, a message
goes to only one of the consumers, not both.
but I did a simple test: I got the Kafka distro, copied the code dir to
/tmp/kafka/, and then ran the console consumer from both dirs at the same time:
bin/kafka-console-consumer.sh --.
Hi,
bin/kafka-console-consumer.sh --.
all the parameters are the same
You need to set the same group.id to create a consumer group. By default the
console consumer creates a random group.id. You can set group.id by using the
--consumer.config /tmp/consumer.props flag.
$ echo group.id=1
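Spelled out end to end (group name and topic are illustrative):

$ echo "group.id=group1" > /tmp/consumer.props
$ bin/kafka-console-consumer.sh --zookeeper localhost:2181 --topic test-topic --consumer.config /tmp/consumer.props

Run the same command from both copies of the distro; since both consumers now
share group.id=group1, each message should go to only one of them.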
Hi All,
I have some log files of around 30GB, and I am trying to process these logs
as events by pushing them to Kafka. I can clearly see that the throughput
achieved while publishing these events to Kafka is quite slow.
So, as mentioned, for the single 30GB log file, Logstash is
continuously
Alex,
I got a similar error before due to incorrect network binding of my laptop's
wireless interface. You can try setting advertised.host.name to the Kafka
server's hostname in server.properties and running it again.
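In server.properties that looks like this (hostname illustrative):

# hostname the broker advertises to producers and consumers;
# it must be resolvable and reachable from the client machines
advertised.host.name=broker1.example.com
advertised.port=9092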
On Sun, Feb 8, 2015 at 8:38 AM, Alex Melville amelvi...@g.hmc.edu wrote:
Howdy
Steve
In terms of mimicking the sync behavior, I think that is what .get() does,
no?
We always return the offset and error information. The example I gave didn't
make use of it, but you definitely can make use of it if you want to.
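For example, a sketch of actually consuming that information (assuming a
configured producer; the topic name is illustrative):

import java.util.concurrent.ExecutionException;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.clients.producer.RecordMetadata;

// 'producer' is a configured KafkaProducer<String, String>
try {
    RecordMetadata md = producer.send(
        new ProducerRecord<String, String>("test-topic", "value")).get();
    System.out.println("partition " + md.partition() + ", offset " + md.offset());
} catch (ExecutionException e) {
    System.err.println("send failed: " + e.getCause()); // the error info for this send
} catch (InterruptedException e) {
    Thread.currentThread().interrupt();
}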
-Jay
On Wed, Feb 4, 2015 at 9:58 AM, Steve Morin
Hi Joel,
The MBean approach depends on storing traffic stats over a period of time, so
it is doable but complex. I am thinking about another approach: checking
logfile timestamps (for producer activity) in conjunction with the mtime for
topic offsets (consumer activity). So if both timestamps
Following up on our previous thread on making batch send a little easier,
here is a concrete proposal to add a flush() method to the producer:
https://cwiki.apache.org/confluence/display/KAFKA/KIP-8+-+Add+a+flush+method+to+the+producer+API
A proposed implementation is here:
Have there been any changes in 0.8.2 in how the marker gets moved when you
use the high-level consumer?
One problem I have always had is: what if I pull something from the stream,
but then hit an error while processing it? I don't really want to move the
marker.
I would almost like the
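One common way to get that behavior with the 0.8 high-level consumer is to
turn off auto-commit and commit only after successful processing. A sketch,
assuming a single stream (handleMessage is a hypothetical handler; setup and
exception handling are elided):

import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Properties;
import kafka.consumer.Consumer;
import kafka.consumer.ConsumerConfig;
import kafka.consumer.KafkaStream;
import kafka.javaapi.consumer.ConsumerConnector;
import kafka.message.MessageAndMetadata;

Properties props = new Properties();
props.put("zookeeper.connect", "localhost:2181");
props.put("group.id", "my-group");
props.put("auto.commit.enable", "false"); // don't move the marker automatically

ConsumerConnector connector =
    Consumer.createJavaConsumerConnector(new ConsumerConfig(props));
Map<String, List<KafkaStream<byte[], byte[]>>> streams =
    connector.createMessageStreams(Collections.singletonMap("test-topic", 1));

for (MessageAndMetadata<byte[], byte[]> msg : streams.get("test-topic").get(0)) {
    handleMessage(msg.message()); // hypothetical; may throw
    connector.commitOffsets();    // commit only after successful processing
}

Caveat: commitOffsets() commits the current position of every partition owned
by this connector, and a message that fails has still been consumed from the
iterator in memory, so this gives at-least-once rather than exactly-once.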
Hi Kafka team,
We have a use case where we need to consume from ~20 topics (each with 24
partitions). We have a potential max message size of 20MB, so we've set our
consumer fetch.size to 20MB, but that's causing very poor performance on our
consumer (most of our messages are in the 10-100k
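For reference, in 0.8 the high-level consumer property is
fetch.message.max.bytes (fetch.size was the 0.7 name), and the consumer reads
up to that many bytes into memory per partition per fetch, so ~480 partitions
at 20MB implies a very large buffer footprint, which is likely related to the
slowness. The setting, as a sketch:

# consumer.properties
# must be at least as large as the broker's message.max.bytes
fetch.message.max.bytes=20971520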
Looks good to me.
I like the idea of not blocking additional sends but not guaranteeing that
flush() will deliver them.
I assume that with linger.ms = 0, flush() will just be a no-op (since the
queue will be empty). Is that correct?
Gwen
On Sun, Feb 8, 2015 at 10:25 AM, Jay Kreps
Hi,
I had a client running on Kafka 0.8.1 using the high-level consumer API,
working fine.
Today I updated my Kafka installation to version 0.8.2 (I tried all the Scala
builds), but the consumer doesn't get any messages. I tested using the
kafka-console-consumer.sh utility tool and it works fine,
I have a 0.8.1 high-level consumer working fine with a 0.8.2 server. A few of
them, actually :)
AFAIK the API did not change.
Do you see any error messages? Do you have a timeout configured on the
consumer? What does the offset checker tool say?
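The invocation, for reference (ZooKeeper address and group name illustrative):

bin/kafka-run-class.sh kafka.tools.ConsumerOffsetChecker --zookeeper localhost:2181 --group my-group

It prints the committed offset, log size, and lag per partition for the group.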
On Fri, Feb 6, 2015 at 4:49 PM, Ricardo Ferreira
I'm wondering how much of the time is spent by Logstash reading and
processing the log vs. time spent sending data to Kafka. Also, I'm not
familiar with Logstash internals; perhaps it can be tuned to send the data
to Kafka in larger batches?
At the moment it's difficult to tell where the
Sorry, I should have read the release notes before I asked this question.
The answer was in there.
Internally, the implementation of the offset storage is just a compacted
Kafka topic (__consumer_offsets), keyed on the consumer's group
(see http://kafka.apache.org/documentation.html#compaction),
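The consumer-side switch, as a sketch (0.8.2 high-level consumer properties):

# consumer.properties
offsets.storage=kafka
# while migrating a group off ZooKeeper, commit to both stores during the transition
dual.commit.enabled=true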
The consumer used ZooKeeper to store offsets; in 0.8.2 there's an option
to use Kafka itself for that (by setting offsets.storage = kafka).
Does it still really live in ZooKeeper, with Kafka proxying the requests
through?
Hi Eduardo,
1. Why do some applications prefer to connect to ZooKeeper instead of the
brokers?
I assume you are talking about the clients and some of our tools?
These are parts of an older design, and we are actively working on fixing
this. The consumer used ZooKeeper to store offsets; in 0.8.2
Hi Gwen,
Sorry, both the consumer and the broker are 0.8.2?
A = Yes
So what's on 0.8.1?
A = It works fine using 0.8.1 for server AND client.
You probably know the consumer group of your application. Can you use the
offset checker tool on that?
A = Yes, I know from the consumer, and the offset
Thanks Tao,
That fixed the problem. Console producer now correctly pushes to the topic
and console consumer can read the topic data.
-Alex
On Sun, Feb 8, 2015 at 1:47 AM, tao xiao xiaotao...@gmail.com wrote:
Alex,
I got similar error before due to incorrect network binding of my laptop's
Well, actually, in the case of linger.ms = 0 the send is still asynchronous,
so calling flush() blocks until all the previously sent records have
completed. It doesn't speed anything up in that case, though, since they
are already available to send.
-Jay
On Sun, Feb 8, 2015 at 10:36 AM, Gwen