Re: New Java Producer: Single Producer vs multiple Producers

2015-04-24 Thread Roshan Naik
the performance tool cited in the blog linked from the performance page of the website. That tool is more accurate and uses the new producer. On Fri, Apr 24, 2015 at 2:29 PM, Roshan Naik ros...@hortonworks.com wrote: Can we use the new 0.8.2 producer perf tool against a 0.8.1 broker ? -roshan On 4/24/15

Re: New Java Producer: Single Producer vs multiple Producers

2015-04-24 Thread Roshan Naik
Yes, I too notice the same behavior (with producer/consumer perf tool on 8.1.2) ... adding more threads indeed improved the perf a lot (both with and without --sync). In --sync mode batch size made almost no diff, larger events improved the perf. I was doing some 8.1.2 perf testing with a 1 node

Re: New Java Producer: Single Producer vs multiple Producers

2015-04-24 Thread Roshan Naik
Can we use the new 0.8.2 producer perf tool against a 0.8.1 broker ? -roshan On 4/24/15 1:19 PM, Jay Kreps jay.kr...@gmail.com wrote: Do make sure if you are at all performance sensitive you are using the new producer api we released in 0.8.2. -Jay On Fri, Apr 24, 2015 at 12:46 PM, Roshan

Re: New Producer API - batched sync mode support

2015-04-28 Thread Roshan Naik
@Ewen No I did not use compression in my measurements.

Re: New Producer API - batched sync mode support

2015-04-28 Thread Roshan Naik
@Joel, If flush() works for this use case it may be an acceptable starting point (although not as clean as a native batched sync). I am not as yet clear about some aspects of flush's batch semantics and its suitability for this mode of operation. Allow me to explore it with you folks. 1) flush()

Re: New Java Producer: Single Producer vs multiple Producers

2015-04-27 Thread Roshan Naik
On 4/27/15 11:15 AM, Jay Kreps jay.kr...@gmail.com wrote: The new producer is pretty new still so I suspect there is a fair amount of low-hanging performance work for anyone who wanted to take a shot at it. I do see something. Will discuss it on a different email thread.

New Producer API - batched sync mode support

2015-04-27 Thread Roshan Naik
Been evaluating the perf of old and new Producer APIs for reliable high volume streaming data movement. I do see one area of improvement that the new API could use for synchronous clients. AFAICT, the new API does not support batched synchronous transfers. To do a synchronous send, one needs to

Re: New Producer API - batched sync mode support

2015-04-27 Thread Roshan Naik
, 2015 at 08:19:40PM +, Roshan Naik wrote: Been evaluating the perf of old and new Producer APIs for reliable high volume streaming data movement. I do see one area of improvement that the new API could use for synchronous clients. AFAICT, the new API does not support batched synchronous

Re: New Producer API - batched sync mode support

2015-04-27 Thread Roshan Naik
/failure. Would that work for you? On Mon, Apr 27, 2015 at 08:53:36PM +, Roshan Naik wrote: The important guarantee that is needed for a client producer thread is that it requires an indication of success/failure of the batch of events it pushed. Essentially it needs to retry producer.send

Re: New Producer API - batched sync mode support

2015-04-27 Thread Roshan Naik
On 4/27/15 2:59 PM, Gwen Shapira gshap...@cloudera.com wrote: @Roshan - if the data was already written to Kafka, your approach will generate LOTS of duplicates. I'm not convinced it's ideal. Only if the delivery failure rate is very high (i.e. short lived but very frequent). This batch

Re: New Producer API - batched sync mode support

2015-04-30 Thread Roshan Naik
@Gwen, @Ewen, While atomicity of a batch is nice to have, it is not essential. I don't think users always expect such atomicity. Atomicity is not even guaranteed in many un-batched systems let alone batched systems. As long as the client gets informed about the ones that failed in the batch..

Recording - Storm & Kafka Meetup on April 20th 2017

2017-04-21 Thread Roshan Naik
Louro (Hortonworks) - [20m] – Rethinking the Storm 2.0 Worker - Roshan Naik (Hortonworks) - [57m] – Storm in Retail Context: Catalog data processing using Kafka, Storm & Microservices - Karthik Deivasigamani (WalMart Labs) - [1h:54m:45sec] – Schema Regi