Doubt regarding new Producer and old Producer API

2015-05-13 Thread Madhukar Bharti
Hi all, What are the possible use cases for the new producer API? - Does it only provide async calls with a callback feature? - Has the partitioner class been removed from the new Producer API? If not, how do I implement it if I want to use only the client APIs? Regards, Madhukar

Re: New Producer Async - Metadata Fetch Timeout

2015-05-13 Thread Mohit Gupta
Thanks Jiangjie. This is helpful. Adding to what you have mentioned, I can think of one more scenario which may not be very rare. Say, the application is rebooted and the Kafka brokers registered in the producer are not reachable (could be due to network issues or those brokers are actually down

Experiences testing new producer performance across multiple threads/producer counts

2015-05-13 Thread Garry Turkington
Hi, I talked with Gwen at Strata last week and promised to share some of my experiences benchmarking an app reliant on the new producer. I'm using relatively meaty boxes running my producer code (24 core/64GB RAM) but I wasn't pushing them until I got them on the same 10GB fabric as the Kafka

OutOfMemory error on broker when rolling logs

2015-05-13 Thread Jeff Field
Hello, We are doing a Kafka POC on our CDH cluster. We are running 3 brokers with 24TB (48TB Raw) of available RAID10 storage (XFS filesystem mounted with nobarrier/largeio) (HP Smart Array P420i for the controller, latest firmware) and 48GB of RAM. The broker is running with -Xmx4G -Xms4G

Re: New Producer Async - Metadata Fetch Timeout

2015-05-13 Thread Jiangjie Qin
Isn’t the producer part of the application? The metadata is stored in memory. If the application rebooted (process restarted), all the metadata will be gone. Jiangjie (Becket) Qin On 5/13/15, 9:54 AM, Mohit Gupta success.mohit.gu...@gmail.com wrote: I meant the producer. ( i.e. application

Re: Compression and batching

2015-05-13 Thread Jiangjie Qin
If you are sending in sync mode, producer will just group by partition the list of messages you provided as argument of send() and send them out. You don't need to worry about batch.num.messages. There is a potential that compressed message is even bigger than uncompressed message, though. I'm

Re: Experiences testing new producer performance across multiple threads/producer counts

2015-05-13 Thread Jiangjie Qin
Thanks for sharing this, Garry. I actually did similar tests before but unfortunately lost the test data because my laptop rebooted and I forgot to save the data… Anyway, several things to verify: 1. Remember KafkaProducer holds lock per partition. So if you have only one partition in the target

Kafka log flush time outlier

2015-05-13 Thread Rajiv Kurian
I have a single broker in a cluster of 9 brokers that has a log-flush-time-99th of 260 ms or more. Other brokers have a log-flush-time-99th of less than 30 ms. The misbehaving broker is running on the same kind of machine (c3.4x on EC2) that the other ones are running on. Its bytes-in, bytes-out,

Re: Auto-rebalance not triggering in 2.10-0.8.1.1

2015-05-13 Thread Jiangjie Qin
Automatic preferred leader election hasn't been turned on in 0.8.1.1. It's been turned on in latest trunk though. The config name is "auto.leader.rebalance.enable". Jiangjie (Becket) Qin On 5/13/15, 10:50 AM, Stephen Armstrong stephen.armstr...@linqia.com wrote: Does anyone have any insight

Re: Compression and batching

2015-05-13 Thread Jamie X
Jiangjie, I changed my code to group by partition, then for each partition to group messages into batches of up to 900 KB of uncompressed data, and then sent those batches out. That worked fine and didn't cause any MessageTooLarge errors. So it looks like the issue is that the producer batches all the messages
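The size-bounded batching described in this message can be sketched roughly as follows. This is only an illustration, not the poster's code: the class and method names are hypothetical, the messages are assumed to be already grouped by partition, and only payload bytes are counted (per-message protocol overhead is ignored).

```java
import java.util.ArrayList;
import java.util.List;

public class SizeBoundedBatcher {

    // Hypothetical helper: splits the messages of one partition into batches
    // whose total uncompressed payload stays at or below maxBatchBytes.
    // A single message larger than the limit still gets a batch of its own.
    static List<List<byte[]>> batchBySize(List<byte[]> messages, long maxBatchBytes) {
        List<List<byte[]>> batches = new ArrayList<>();
        List<byte[]> current = new ArrayList<>();
        long currentBytes = 0;
        for (byte[] message : messages) {
            // Close the current batch if adding this message would exceed the limit.
            if (!current.isEmpty() && currentBytes + message.length > maxBatchBytes) {
                batches.add(current);
                current = new ArrayList<>();
                currentBytes = 0;
            }
            current.add(message);
            currentBytes += message.length;
        }
        if (!current.isEmpty()) {
            batches.add(current);
        }
        return batches;
    }

    public static void main(String[] args) {
        // Five 400 KB messages with a 900 KB limit (the limit mentioned in the thread).
        List<byte[]> messages = new ArrayList<>();
        for (int i = 0; i < 5; i++) {
            messages.add(new byte[400 * 1024]);
        }
        List<List<byte[]>> batches = batchBySize(messages, 900L * 1024);
        for (List<byte[]> batch : batches) {
            System.out.println("batch with " + batch.size() + " message(s)");
        }
    }
}
```

Each resulting batch would then be handed to the producer's send() separately, so that no single request carries more uncompressed data than the chosen limit.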

Re: Getting NotLeaderForPartitionException in kafka broker

2015-05-13 Thread Jiangjie Qin
Does this topic exist in Zookeeper? On 5/12/15, 11:35 PM, tao xiao xiaotao...@gmail.com wrote: Hi, Any updates on this issue? I keep seeing this issue happening over and over again On Thu, May 7, 2015 at 7:28 PM, tao xiao xiaotao...@gmail.com wrote: Hi team, I have a 12-node cluster that

Re: Compression and batching

2015-05-13 Thread Jiangjie Qin
Yes, in old producer we don't control the compressed message size. In new producer, we estimate the compressed size heuristically and decide whether to close the batch or not. It is not perfect but at least better than the old one. Jiangjie (Becket) Qin On 5/13/15, 4:00 PM, Jamie X

Re: OutOfMemory error on broker when rolling logs

2015-05-13 Thread Jay Kreps
I think java.lang.OutOfMemoryError: Map failed has usually been out of address space for mmap if memory serves. If you sum the length of all .index files while the service is running (not after stopped), do they sum to something really close to 2GB? If so it is likely either that the OS/arch is
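The check suggested in this message, summing the length of all .index files while the broker is running, could be done with a small program like the one below. It is a sketch under assumptions: the class name is hypothetical, the broker's log directory is passed as an argument rather than read from any Kafka configuration, and the reported file length is taken as a stand-in for the mapped size.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.stream.Stream;

public class IndexSizeSum {

    // Hypothetical helper: walks logDir recursively and sums the length in
    // bytes of every regular file whose name ends in ".index".
    static long sumIndexBytes(Path logDir) throws IOException {
        try (Stream<Path> paths = Files.walk(logDir)) {
            return paths
                    .filter(Files::isRegularFile)
                    .filter(p -> p.getFileName().toString().endsWith(".index"))
                    .mapToLong(p -> p.toFile().length())
                    .sum();
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java IndexSizeSum <kafka-log-dir>");
            return;
        }
        long total = sumIndexBytes(Paths.get(args[0]));
        System.out.println("total .index bytes: " + total);
        // 2 GB is the threshold mentioned in the message above.
        System.out.println("fraction of 2 GB: " + (double) total / (2L * 1024 * 1024 * 1024));
    }
}
```

It should be run against the log directory of a live broker, since the message notes that the sum is only meaningful while the service is running.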

Re: Doubt regarding new Producer and old Producer API

2015-05-13 Thread Guozhang Wang
Hi Madhukar, 1. The Java producer API can also be used for sync calls; you can do it with producer.send().get(). 2. The partitioner class has been removed from the new producer API; instead, a message can now carry a specific partition id. You could calculate the partition id in your customized
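As a rough illustration of computing a partition id on the client side, the sketch below hashes the key bytes and maps the result into the partition range. The class name, the helper, and the hashing scheme are all assumptions made for this example; this is not Kafka's own partitioning logic, and the Kafka calls appear only in comments so that the block runs without the client library.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class PartitionIdSketch {

    // Hypothetical helper: maps a key to a partition id in [0, numPartitions).
    // The sign bit is masked off instead of using Math.abs, because
    // Math.abs(Integer.MIN_VALUE) is still negative.
    static int partitionFor(byte[] key, int numPartitions) {
        if (numPartitions <= 0) {
            throw new IllegalArgumentException("numPartitions must be positive");
        }
        return (Arrays.hashCode(key) & 0x7fffffff) % numPartitions;
    }

    public static void main(String[] args) {
        byte[] key = "user-42".getBytes(StandardCharsets.UTF_8);
        int partition = partitionFor(key, 6); // 6 partitions is an arbitrary example value
        System.out.println("partition = " + partition);

        // With the Kafka client on the classpath, the id would be set on the record
        // and the send made synchronous by waiting on the returned future, e.g.:
        //   ProducerRecord<byte[], byte[]> record =
        //       new ProducerRecord<>("my-topic", partition, key, value); // "my-topic" is a placeholder
        //   producer.send(record).get();
    }
}
```

The same key always maps to the same partition as long as the partition count does not change, which is the property a custom partitioner is usually wanted for.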

the performance of producer[async] degraded seriously after full gc

2015-05-13 Thread pengfei li
Hi, Recently we started using Kafka for message transport, and I found something strange in the producer. We use the producer in async mode, and if I trigger a full GC with jmap -histo:live, the performance of the producer degrades seriously. Then I found there were a lot of

Re: Kafka log compression change in 0.8.2.1?

2015-05-13 Thread Guozhang Wang
Roger found another possible issue with snappy compression that during broker bouncing the snappy compressed messages could get corrupted while re-sending. I am not sure if it is related but would be good to verify after the upgrade. Guozhang On Tue, May 12, 2015 at 3:55 PM, Jun Rao

Re: Pulling Snapshots from Kafka, Log compaction last compact offset

2015-05-13 Thread Jonathan Hodges
Very good points, Gwen. I hadn't thought of Oracle Streams case of dependencies. I wonder if GoldenGate handles this better? The tradeoff of these approaches is that each RDBMS will be proprietary on how to get this CDC information. I guess GoldenGate can be a standard interface on RDBMs, but

Re: Producer garbage collection problem

2015-05-13 Thread pengfei li
Hi, I met the same problem. The Scala bug https://github.com/scala/scala/pull/3450 was fixed in version 2.11, and I tried kafka_2.11-0.8.2.1.tgz, which is compiled with Scala 2.11, but the same problem is still there. Did you find a solution? Thanks 2014-02-05 0:47 GMT+08:00 Florian

Re: Hitting integer limit when setting log segment.bytes

2015-05-13 Thread Lance Laursen
Hey folks, Any update on this? On Thu, Apr 30, 2015 at 5:34 PM, Lance Laursen llaur...@rubiconproject.com wrote: Hey all, I am attempting to create a topic which uses 8GB log segment sizes, like so: ./kafka-topics.sh --zookeeper localhost:2181 --create --topic perftest6p2r --partitions 6

Re: Hitting integer limit when setting log segment.bytes

2015-05-13 Thread Mayuresh Gharat
I suppose it is the way log management works in Kafka. I am not sure of the exact reason for this. Also, the index files that are constructed have a mapping of the offset, relative to the base offset of the log file, to the real offset. The key value in the index file is of the form Int,Int. Thanks, Mayuresh On
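The arithmetic behind the limit in this thread can be checked directly. The sketch below assumes, as the subject line suggests, that the segment size setting is held in a signed 32-bit integer; the class and helper names are hypothetical.

```java
public class SegmentBytesLimit {

    // Hypothetical helper: true if the byte count fits in a signed 32-bit int.
    static boolean fitsInInt(long bytes) {
        return bytes >= Integer.MIN_VALUE && bytes <= Integer.MAX_VALUE;
    }

    public static void main(String[] args) {
        long eightGiB = 8L * 1024 * 1024 * 1024; // the 8GB segment size from the thread
        System.out.println("8 GiB in bytes    = " + eightGiB);
        System.out.println("Integer.MAX_VALUE = " + Integer.MAX_VALUE);
        System.out.println("fits in an int?   " + fitsInInt(eightGiB));
    }
}
```

Under that assumption, the largest representable segment size is just under 2 GiB, so an 8GB value cannot be stored without overflow.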

Re: Experiences testing new producer performance across multiple threads/producer counts

2015-05-13 Thread Jay Kreps
Hey Garry, Super interesting. We honestly never did a ton of performance tuning on the producer. I checked the profiles early on in development and we fixed a few issues that popped up in deployment, but I don't think anyone has done a really scientific look. If you (or anyone else) want to dive

Re: New Producer Async - Metadata Fetch Timeout

2015-05-13 Thread Mayuresh Gharat
By application rebooting, do you mean you bounce the brokers? Thanks, Mayuresh On Wed, May 13, 2015 at 4:06 AM, Mohit Gupta success.mohit.gu...@gmail.com wrote: Thanks Jiangjie. This is helpful. Adding to what you have mentioned, I can think of one more scenario which may not be very rare.

Re: Compression and batching

2015-05-13 Thread Jamie X
(sorry if this messes up the mailing list, I didn't seem to get replies in my inbox) Jiangjie, I am indeed using the old producer, and on sync mode. Notice that the old producer uses number of messages as batch limitation instead of number of bytes. Can you clarify this? I see a setting

Re: New Producer Async - Metadata Fetch Timeout

2015-05-13 Thread Mohit Gupta
I meant the producer (i.e. application using the producer api to push messages into kafka). On Wed, May 13, 2015 at 10:20 PM, Mayuresh Gharat gharatmayures...@gmail.com wrote: By application rebooting, do you mean you bounce the brokers? Thanks, Mayuresh On Wed, May 13, 2015 at 4:06

New Producer API Design

2015-05-13 Thread Mohit Gupta
Hello, I've a question regarding the design of the new Producer API. As per the design (KafkaProducer<K,V>), it seems that a separate producer is required for every combination of key and value type. Whereas, in documentation (and elsewhere) it's recommended to create a single producer instance

Re: New Producer API Design

2015-05-13 Thread Ewen Cheslack-Postava
You can of course use KafkaProducer<Object, Object> to get a producer interface that can accept a variety of types. For example, if you have an Avro serializer that accepts both primitive types (e.g. String, integer types) and complex types (e.g. records, arrays, maps), Object is the only type you

Re: Auto-rebalance not triggering in 2.10-0.8.1.1

2015-05-13 Thread Stephen Armstrong
Does anyone have any insight into this? Am I correct that 0.8.1.1 should be running the leader election automatically? If this is a known issue, is there any reason not to have a cron script that runs the leader election regularly? Thanks Steve On Thu, May 7, 2015 at 2:47 PM, Stephen Armstrong

Re: New Producer API Design

2015-05-13 Thread Guozhang Wang
Hello Mohit, When we originally designed the new producer API we removed the serializer / deserializer from the old producer and made it generic as accepting only message<byte[], byte[]>, but we later concluded it would still be more beneficial to add the serde back into the producer API. And as you

Re: Getting NotLeaderForPartitionException in kafka broker

2015-05-13 Thread tao xiao
Hi, Any updates on this issue? I keep seeing this issue happening over and over again On Thu, May 7, 2015 at 7:28 PM, tao xiao xiaotao...@gmail.com wrote: Hi team, I have a 12-node cluster that has 800 topics, each of which has only 1 partition. I observed that one of the nodes keeps