On 4/27/15 11:15 AM, Jay Kreps jay.kr...@gmail.com wrote:
The new producer is pretty new still, so I suspect there is a fair amount of
low-hanging performance work for anyone who wanted to take a shot at it.
I do see something. Will discuss it on a different email thread.
This sounds like flush:
https://cwiki.apache.org/confluence/display/KAFKA/KIP-8+-+Add+a+flush+method+to+the+producer+API
which was recently implemented in trunk.
Joel
On Mon, Apr 27, 2015 at 08:19:40PM +, Roshan Naik wrote:
Been evaluating the perf of the old and new Producer APIs for reliable
@Roshan - if the data was already written to Kafka, your approach will
generate LOTS of duplicates. I'm not convinced it's ideal.
What's wrong with callbacks?
On Mon, Apr 27, 2015 at 2:53 PM, Roshan Naik ros...@hortonworks.com wrote:
@Gwen
- A failure in delivery of one or more events in the
I should have been clearer - I used Roshan's terminology in my reply.
Basically, the old producer's batch send() just took a sequence of
messages. I assumed Roshan is looking for something similar, which allows
mixing messages for multiple partitions and can therefore fail for only some
messages.
Been evaluating the perf of the old and new Producer APIs for reliable high-volume
streaming data movement. I do see one area of improvement that the new API
could use for synchronous clients.
AFAICT, the new API does not support batched synchronous transfers. To do
synchronous send, one needs to
Hi Gwen,
can you clarify: by "batch" do you mean the protocol MessageSet, or some Java
client internal construct?
If the former I was under the impression that a produced MessageSet either
succeeds delivery or errors in its entirety on the broker.
Thanks,
Magnus
2015-04-27 23:05 GMT+02:00 Gwen
The important guarantee needed for a client producer thread is
an indication of success/failure for the batch of events
it pushed. Essentially it needs to retry producer.send() on that same
batch in case of failure. My understanding is that flush will simply flush
data from
What values do you have for below properties? Or are these set to defaults?
message.send.max.retries
retry.backoff.ms
topic.metadata.refresh.interval.ms
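For reference, the usual 0.8.x old-producer defaults for these three properties are below (recalled from the producer config docs; worth double-checking against the exact version in use):

```properties
# Retry a failed send this many times before giving up.
message.send.max.retries=3
# Back off this long (ms) between retries, to let leader election settle.
retry.backoff.ms=100
# Proactively refresh topic metadata every 10 minutes (600000 ms).
topic.metadata.refresh.interval.ms=600000
```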
Thanks
Zakee
On Apr 23, 2015, at 11:48 PM, Madhukar Bharti bhartimadhu...@gmail.com
wrote:
Hi All,
After going through the code, I found
As long as you retain the returned futures somewhere, you can always
iterate over the futures after the flush completes and check for
success/failure. Would that work for you?
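Joel's suggestion can be sketched with plain CompletableFutures standing in for the Future<RecordMetadata> values that producer.send() returns (the producer, record, and offset here are simulated; only the retain-then-check pattern is the point):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;

public class FlushAndCheck {

    // After producer.flush() returns, every future from send() is complete,
    // so joining them cannot block; count how many completed exceptionally.
    static int countFailures(List<CompletableFuture<Long>> futures) {
        int failed = 0;
        for (CompletableFuture<Long> f : futures) {
            try {
                f.join(); // would be future.get() with the real producer
            } catch (Exception e) {
                failed++;
            }
        }
        return failed;
    }

    public static void main(String[] args) {
        // Stand-ins for the futures a real producer.send() call would return.
        List<CompletableFuture<Long>> futures = new ArrayList<>();
        for (int i = 0; i < 5; i++) {
            CompletableFuture<Long> f = new CompletableFuture<>();
            if (i == 3) f.completeExceptionally(new RuntimeException("send failed for record " + i));
            else f.complete((long) i); // pretend an offset came back
            futures.add(f);
        }
        // ...producer.flush() would go here, completing any pending futures...
        System.out.println("failed=" + countFailures(futures)); // prints failed=1
    }
}
```

If countFailures() returns non-zero, the caller can re-send the whole batch, which matches the Flume-style all-or-nothing redelivery discussed in this thread.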
On Mon, Apr 27, 2015 at 08:53:36PM +, Roshan Naik wrote:
The important guarantee that is needed for a client
Batch failure is a bit meaningless, since in the same batch some records
can succeed and others may fail.
To implement error-handling logic (usually different from retry, since
the producer has a configuration controlling retries), we recommend using
the callback option of send().
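The callback pattern Gwen recommends can be sketched without a broker; the SendCallback interface below is a stand-in mirroring the shape of the producer callback (metadata on success, non-null exception on failure), and the send() here is simulated:

```java
import java.util.ArrayList;
import java.util.List;

public class CallbackSketch {

    // Stand-in for the producer callback: exception is null on success.
    interface SendCallback {
        void onCompletion(long offset, Exception exception);
    }

    // Simulated send(): a real producer invokes the callback from its I/O
    // thread once the broker responds; here we call it inline.
    static void send(String record, SendCallback cb) {
        if (record.isEmpty()) {
            cb.onCompletion(-1L, new IllegalArgumentException("empty record"));
        } else {
            cb.onCompletion(42L, null); // 42 is a made-up offset
        }
    }

    // Per-record error handling: collect failed records for logging or
    // dead-lettering instead of failing the whole batch.
    static List<String> sendAll(List<String> records) {
        List<String> failedRecords = new ArrayList<>();
        for (String r : records) {
            send(r, (offset, e) -> {
                if (e != null) failedRecords.add(r);
            });
        }
        return failedRecords;
    }

    public static void main(String[] args) {
        List<String> failed = sendAll(List.of("a", "", "c"));
        System.out.println("failed=" + failed.size()); // prints failed=1
    }
}
```

The key point of the design: the callback sees exactly which record failed, so the handler can be more selective than the retry configuration, which blindly re-sends.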
Gwen
P.S
@Gwen
- A failure in delivery of one or more events in the batch (typical Flume
case) is considered a failure of the entire batch and the client
redelivers the entire batch.
- If clients want more fine grained control, alternative option is to
indicate which events failed in the return value of
Any comments on this issue?
On Apr 24, 2015 8:05 PM, Manikumar Reddy ku...@nmsworks.co.in wrote:
We are testing new producer on a 2 node cluster.
Under some node failure scenarios, the producer is not able
to update its metadata.
Steps to reproduce
1. form a 2 node cluster (K1, K2)
2. create a
I checked out the Kafka 0.8.2 branch and used the ./bin/kafka-topics.sh
script to create a topic:
./bin/kafka-topics.sh --create --zookeeper zk-node:2181 --partitions 12
--replication-factor 1 --topic my-topic
But when I use the --describe to look at the topic both the 'Leader' and
'Isr' are
On 4/27/15 2:59 PM, Gwen Shapira gshap...@cloudera.com wrote:
@Roshan - if the data was already written to Kafka, your approach will
generate LOTS of duplicates. I'm not convinced it's ideal.
Only if the delivery failure rate is very high (i.e. short-lived but very
frequent). This batch
I am trying to issue offset commit requests from a background thread. I am able to
commit so far. I am using the high-level consumer API.
So if I just use the high-level consumer API, and if I have disabled auto
commit, with Kafka as the storage for offsets, will the high-level consumer
API automatically use
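For context, the 0.8.2 high-level consumer settings relevant to Kafka-based offset storage are sketched below (property names from the consumer config section; verify against the version in use):

```properties
# Store offsets in Kafka rather than ZooKeeper (0.8.2 high-level consumer).
offsets.storage=kafka
# During migration, also commit to ZooKeeper; set false once fully on Kafka.
dual.commit.enabled=false
# With auto commit disabled, commitOffsets() must be called explicitly.
auto.commit.enable=false
```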
Maybe add this to the description of
https://issues.apache.org/jira/browse/KAFKA-1843 ? I can't find it now, but
I think there was another bug where I described a similar problem -- in
some cases it makes sense to fall back to the list of bootstrap nodes
because you've gotten into a bad state and
Hi Tao,
KAFKA-2150 has been filed.
Jiangjie
On 4/24/15, 12:38 PM, tao xiao xiaotao...@gmail.com wrote:
Hi team,
I observed a java.lang.IllegalMonitorStateException thrown
from AbstractFetcherThread in mirror maker when it is trying to build the
fetch request. Below is the error:
[2015-04-23
Fine-grained tracking of the status of individual events is quite painful in
contrast to simply blocking on every batch. The old-style batched-sync mode
has great advantages in terms of simplicity and performance.
I may be missing something, but I'm not so convinced that it is that
painful/very
A couple of thoughts:
1. @Joel I agree it's not hard to use the new API but it definitely is more
verbose. If that snippet of code is being written across hundreds of
projects, that probably means we're missing an important API. Right now
I've only seen the one complaint, but it's worth finding
On Apr 26, 2015, at 9:03 PM, Gwen Shapira gshap...@cloudera.com wrote:
Definitely +1 for advertising this in the docs.
What I can't figure out is the upgrade path... if my application assumes
that all data for a single user is in one partition (so it subscribes to a
single partition and
Hey Jiangjie,
Yeah, not sure of the bottleneck. It may be the sender or lock contention on
the writer threads. You could use top or one of the Java tools to check
the per-thread CPU usage.
The benchmarking I had done previously showed an ability to max out the 1G
network cards we had with a
Yeah, I agree we could have handled this better. I think the story we have
now is that you can override it using the partition argument in the
producer (and when we get the patch for a pluggable partitioner we can bundle a
LegacyPartitioner or something like that).
The reason for murmur2 over 3 was
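The murmur2-based default keyed partitioning under discussion can be sketched as below. The constants and structure are recalled from the client's Utils.murmur2 (seed 0x9747b28c, multiplier 0x5bd1e995); treat the exact hash as an approximation of the shipped code rather than a byte-for-byte copy:

```java
import java.nio.charset.StandardCharsets;

public class Murmur2Partitioner {

    // 32-bit MurmurHash2 as used for keyed records in the new producer.
    static int murmur2(final byte[] data) {
        int length = data.length;
        int seed = 0x9747b28c;
        final int m = 0x5bd1e995;
        final int r = 24;
        int h = seed ^ length;
        int length4 = length / 4;
        for (int i = 0; i < length4; i++) {
            final int i4 = i * 4;
            int k = (data[i4] & 0xff) + ((data[i4 + 1] & 0xff) << 8)
                  + ((data[i4 + 2] & 0xff) << 16) + ((data[i4 + 3] & 0xff) << 24);
            k *= m;
            k ^= k >>> r;
            k *= m;
            h *= m;
            h ^= k;
        }
        // Handle the trailing bytes (intentional switch fall-through).
        switch (length % 4) {
            case 3: h ^= (data[(length & ~3) + 2] & 0xff) << 16;
            case 2: h ^= (data[(length & ~3) + 1] & 0xff) << 8;
            case 1: h ^= data[length & ~3] & 0xff;
                    h *= m;
        }
        h ^= h >>> 13;
        h *= m;
        h ^= h >>> 15;
        return h;
    }

    // Default partitioning for a keyed record: positive hash mod partition count.
    static int partition(String key, int numPartitions) {
        return (murmur2(key.getBytes(StandardCharsets.UTF_8)) & 0x7fffffff) % numPartitions;
    }

    public static void main(String[] args) {
        System.out.println("user-42 -> partition " + partition("user-42", 12));
    }
}
```

This is why the upgrade question matters: the old producer hashed keys differently, so the same key can land on a different partition after switching producers unless a legacy partitioner is plugged in.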
Hi Jay,
Does o.a.k.clients.tools.ProducerPerformance provide a multi-threaded test? I
did not find it.
I tweaked the test a little bit to make it multi-threaded, and what I found
is that in the single-thread case, with each message of 10 bytes, a single
caller thread has ~2M messages/second throughput.
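A multi-threaded harness of the kind described can be sketched with a fixed thread pool; the "send" here is just a counter increment (a real benchmark would call producer.send() in its place, and the thread/message counts are arbitrary):

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

public class MultiThreadedSendBench {

    // Run N caller threads, each performing messagesPerThread simulated sends,
    // and return the total number of sends completed.
    static long run(int threads, int messagesPerThread) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        AtomicLong sent = new AtomicLong();
        for (int t = 0; t < threads; t++) {
            pool.submit(() -> {
                for (int i = 0; i < messagesPerThread; i++) {
                    sent.incrementAndGet(); // a real bench calls producer.send() here
                }
            });
        }
        pool.shutdown();
        try {
            pool.awaitTermination(30, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return sent.get();
    }

    public static void main(String[] args) {
        long start = System.nanoTime();
        long total = run(4, 100_000);
        double secs = (System.nanoTime() - start) / 1e9;
        System.out.printf("sent %d messages in %.3f s%n", total, secs);
    }
}
```

With real sends, comparing throughput at 1, 2, and 4 caller threads would show whether the bottleneck is the caller thread or the shared sender/lock, which is the question raised above.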
Hi,
We're using Kafka 0.8.1.1
Here is the snippet from documentation about request.required.acks
- 1, which means that the producer gets an acknowledgement after the leader
replica has received the data. This option provides better durability as the
client waits until the server
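The three values of that setting in 0.8.1.x can be summarized as a config fragment (semantics from the producer config docs for the old Scala producer):

```properties
# request.required.acks controls producer durability (old producer, 0.8.x):
#    0 -> no acknowledgement (lowest latency, weakest durability)
#    1 -> ack once the leader replica has the record (the case quoted above)
#   -1 -> ack once all in-sync replicas have the record (strongest durability)
request.required.acks=1
```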