Nilesh,
It's expected that a lot of memory is used for cache. This makes sense
because under the hood, Kafka mostly just reads and writes data to/from
files. While Kafka does manage some in-memory data, mostly it is writing
produced data (or replicated data) to log files and then serving those
Hello Jason,
Thanks for the reply!
About your proposal: in the general case it might be helpful, but in my
case it won't help much. I'm allowing each ConsumerRecord, or a subset of
ConsumerRecords, to be processed and ACKed independently, outside the HLC
process/thread (so as not to block the partition), and then
Correct me if I'm wrong: if compaction is used, "last offset + 1" no
longer reliably indicates the next offset, because within a compacted
section the offsets do not increase sequentially. I think you need to look
up the next offset of the last processed record to figure out what the
next offset will be.
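The situation described can be sketched in plain Python (hypothetical offsets, no Kafka client involved): after compaction the surviving records' offsets are no longer contiguous, so `last_processed + 1` may point at an offset that no longer exists.

```python
# Hypothetical offsets surviving after log compaction: some records were
# removed, so the remaining offsets are not contiguous.
surviving_offsets = [0, 3, 4, 7, 11]

def next_offset(last_processed, offsets):
    """Offset of the next surviving record after last_processed,
    which may be larger than last_processed + 1."""
    for o in offsets:
        if o > last_processed:
            return o
    return None  # caught up: nothing newer to read

# After processing the record at offset 4, the next record is at 7,
# not at 5 as "last + 1" would suggest.
print(next_offset(4, surviving_offsets))  # -> 7
```

This is only a model of the offset arithmetic, not a claim about what the Kafka client APIs return.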
On Wed, 29 Jul 2015
I haven’t found a way to specify that a consumer should read from the
beginning, or from any other explicit offset, or that the offset should be
“reset” in any way. The command-line shell scripts (which I believe simply
wrap the Scala tools) have flags for this sort of thing. Is there any way
Hi Keith,
you can use the `auto_offset_reset` parameter to kafka-python's
KafkaConsumer. It behaves the same as the java consumer configuration of
the same name. See
http://kafka-python.readthedocs.org/en/latest/apidoc/kafka.consumer.html#kafka.consumer.KafkaConsumer.configure
for more details on
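For reference, a minimal configuration sketch of passing that parameter to the constructor (assuming a broker reachable at localhost:9092 and a topic name of your choosing; recent kafka-python releases use 'earliest'/'latest' as the values, while older releases used 'smallest'/'largest' — check the docs for your version):

```python
from kafka import KafkaConsumer  # kafka-python

consumer = KafkaConsumer(
    'my-topic',
    bootstrap_servers=['localhost:9092'],
    auto_offset_reset='earliest',  # start from the beginning when no committed offset exists
)
```

Note this only controls where the consumer starts when there is no valid committed offset; it does not rewind an existing consumer group.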
It seems less weird if you think of the offset as the position of the
consumer, i.e. it is on record 5. In some sense the consumer is actually
in between records, i.e. if it has processed 4 and not processed 5 do you
think about your position as being on 4 or on 5? Well not on 4 because it
already
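That "between records" convention can be illustrated with a toy loop (no Kafka involved): the position is always the offset of the next record to read, so after processing record 4 the position is 5.

```python
# Toy model of the position convention: the consumer's position is the
# offset of the NEXT record it will read, so it sits "between" records.
records = list(range(10))  # a partition with offsets 0..9

position = 0   # offset of the next record to read
processed = []
while position <= 4:            # consume records 0 through 4
    processed.append(records[position])
    position += 1               # after processing 4, the position is 5

print(processed[-1], position)  # -> 4 5
```

Committing the position (5) rather than the last processed offset (4) is what makes resuming unambiguous.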
Hi Kafka experts,
I am trying to understand how the new Kafka producer works. From the new
producer's code and comments, it indicates the producer is thread safe,
meaning we could have multiple client threads per KafkaProducer. Is
RecordAccumulator the only place that takes care of thread
synchronization for
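As an illustration only (this is a sketch of the pattern, not Kafka's actual RecordAccumulator code), an internally synchronized buffer is what makes it safe for many threads to share one producer instance:

```python
import queue
import threading

class ToyAccumulator:
    """Illustrative stand-in for the idea behind RecordAccumulator:
    a thread-safe buffer that many producer threads append to while
    a single sender thread drains it. Not Kafka's actual code."""

    def __init__(self):
        self._buffer = queue.Queue()  # internally locked, so append needs no extra sync

    def append(self, record):
        self._buffer.put(record)

    def drain(self):
        batch = []
        while True:
            try:
                batch.append(self._buffer.get_nowait())
            except queue.Empty:
                return batch

acc = ToyAccumulator()

def produce(tid):
    for i in range(100):
        acc.append((tid, i))

threads = [threading.Thread(target=produce, args=(t,)) for t in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(len(acc.drain()))  # -> 400: no records lost despite 4 concurrent writers
```

The real accumulator does much more (batching per partition, memory limits, compression), but the thread-safety question reduces to this shape: appends are synchronized inside the accumulator so callers need no external locking.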
I would be using the servers available at my place of work. I don't have
access to AWS servers. I would be starting off with a small number of
nodes in the cluster and then plot a graph with the x-axis as the number
of servers in the cluster and the y-axis as the number of topics with
partitions, before the
I think these would definitely be useful statistics to have and I've tried
to do similar tests! The biggest difference is probably going to be the
hardware specs on whatever cluster you decide to run it on. Maybe
benchmarks performed on different AWS servers would be helpful too, but I'd
like to
@Jiefu Gong,
Are the results of your tests available publicly?
Regards,
Prabhjot
On Tue, Jul 28, 2015 at 10:35 PM, Prabhjot Bharaj prabhbha...@gmail.com
wrote:
I would be using the servers available at my place of work. I don't have
access to AWS servers. I would be starting off with a small
I'm trying to get a basic consumer off the ground. I can create the consumer
but I can't do anything at the message level:
consumer = KafkaConsumer(topic,
                         group_id=group_id,
                         bootstrap_servers=[ip + ':' + port])
for m in consumer:
    print m
Hi,
I'm using Mirror Maker with a cluster of 3 nodes and a cluster of 5 nodes.
I would like to ask: is the number of nodes a restriction for Mirror Maker?
Also, are there any other restrictions or properties that should be common
across both the clusters so that they continue mirroring.
I'm
Hi all,
Currently working on building some tools revolving around the Simple
Consumer, and I was wondering if anyone had any good documentation or
examples for it? I am aware that this:
https://cwiki.apache.org/confluence/display/KAFKA/0.8.0+SimpleConsumer+Example
exists but it doesn't give any
Hi Ewen,
I am using 3 brokers with 12 topics and roughly 120-125 partitions,
without any replication, and the message size is approx 15MB/message.
The problem is that when the cache memory grows and reaches the maximum
available, performance starts degrading. Also, I am using a Storm spout as
This won't be very helpful as I am not too experienced with Python or your
use case, but Java consumers do indeed have to create a String out of the
byte array returned from a successful consumption, like:
String actualmsg = new String(messageAndMetadata.message());
Consult something like this as
Hi Keith,
kafka-python raises FailedPayloadsError on unspecified server failures.
Typically this is caused by a server exception that results in a 0 byte
response. Have you checked your server logs?
-Dana
On Tue, Jul 28, 2015 at 2:01 PM, JIEFU GONG jg...@berkeley.edu wrote:
This won't be
Can you confirm that there are indeed messages in the topic that you
published to?
bin/kafka-console-consumer.sh --zookeeper [details] --topic [topic]
--from-beginning
That should be the right command, and you can use that to first verify that
messages have indeed been published to the topic in
Thank you. It looks like I had the 'topic' slightly wrong. I didn't realize it
was case-sensitive. I got past that error, but now I'm bumping up against
another error:
Traceback (most recent call last):
...
File "/usr/local/lib/python2.7/dist-packages/kafka/consumer/kafka.py", line
59, in