Re: producer can't push msg sometimes with 1 broker recovered

2016-09-23 Thread kant kodali
@Fei Just curious why you guys are interested in using Kafka. I thought Alcatel-Lucent usually creates its own software, no? On Fri, Sep 23, 2016 10:36 PM, Kamal C kamaltar...@gmail.com wrote: Reduce the metadata refresh interval 'metadata.max.age.ms' from 5 min to your desired time interval.

Re: producer can't push msg sometimes with 1 broker recovered

2016-09-23 Thread Kamal C
Reduce the metadata refresh interval 'metadata.max.age.ms' from 5 min to your desired time interval. This may shrink the window during which the broker appears unavailable. -- Kamal
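For reference, a minimal sketch of lowering this interval on the 0.9/0.10 Java producer (broker address and serializers are placeholders, not from this thread):

    import java.util.Properties;
    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.Producer;

    Properties props = new Properties();
    props.put("bootstrap.servers", "broker1:9092");
    props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
    props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
    // Default is 300000 ms (5 minutes); a smaller value makes the producer
    // refresh cluster metadata sooner and notice a recovered broker earlier.
    props.put("metadata.max.age.ms", "30000");
    Producer<String, String> producer = new KafkaProducer<>(props);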

What is the procedure for upgrading producers from 0.8 to 0.10

2016-09-23 Thread Vadim Keylis
Hello, we have a producer written in C that sends data to Kafka using the 0.8 protocol. We now need to upgrade since the protocol has changed. We will upgrade the brokers first to version 0.10 and set log.message.format.version=0.8.1.1. What is the right approach to upgrading the producer to avoid downtime?
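The 0.10 upgrade notes describe a rolling broker upgrade driven by two properties; a rough sketch of the relevant server.properties lines during the transition (the version strings here are illustrative, not from this message):

    # server.properties while old (0.8.x) clients are still running (sketch)
    inter.broker.protocol.version=0.8.2
    log.message.format.version=0.8.2
    # after all brokers and clients have been upgraded, bump both to 0.10.0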

Does the old consumer API support the 0.10 message format after a jar update?

2016-09-23 Thread Vadim Keylis
Hello everyone. We are in the process of upgrading our brokers and consumers from 0.8.1.1 to 0.10. Our consumers use the 0.8 high-level or simple consumer API. Will the old consumer API support the new message format introduced in 0.10 after we upgrade the consumer jars to 0.10? Thanks so much in advance.

Re: micro-batching in kafka streams

2016-09-23 Thread Guozhang Wang
One way that I can think of is to add an index suffix on the key to differentiate records with the same keys, so you can still store records not as a list but as separate entries in the KV store, like: ... And then when punctuating, you can still scan the whole store or do a range query based on
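A rough sketch of that idea against the 0.10 Processor API (the store name "batch-store", the "key#sequence" key format and the 30-second punctuate interval are assumptions, not from this thread):

    import java.util.ArrayList;
    import java.util.List;
    import org.apache.kafka.streams.KeyValue;
    import org.apache.kafka.streams.processor.Processor;
    import org.apache.kafka.streams.processor.ProcessorContext;
    import org.apache.kafka.streams.state.KeyValueIterator;
    import org.apache.kafka.streams.state.KeyValueStore;

    public class MicroBatchProcessor implements Processor<String, String> {
        private ProcessorContext context;
        private KeyValueStore<String, String> store;
        private long seq = 0;

        @Override
        @SuppressWarnings("unchecked")
        public void init(ProcessorContext context) {
            this.context = context;
            this.store = (KeyValueStore<String, String>) context.getStateStore("batch-store");
            context.schedule(30000); // punctuate roughly every 30 seconds (stream time in 0.10)
        }

        @Override
        public void process(String key, String value) {
            // Suffix an increasing index so records sharing a key stay separate entries.
            store.put(key + "#" + seq++, value);
        }

        @Override
        public void punctuate(long timestamp) {
            // Scan the accumulated entries, forward them downstream as one "batch",
            // then clear them from the store.
            List<String> processed = new ArrayList<>();
            KeyValueIterator<String, String> iter = store.all();
            while (iter.hasNext()) {
                KeyValue<String, String> entry = iter.next();
                context.forward(entry.key, entry.value);
                processed.add(entry.key);
            }
            iter.close();
            for (String key : processed) {
                store.delete(key);
            }
        }

        @Override
        public void close() {}
    }

Whether a range query works instead of the full scan depends on how the suffixed keys are serialized and ordered in the store.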

RE: specifying exact number of records to fetch

2016-09-23 Thread Ramanan, Buvana (Nokia - US)
Thanks, Manikumar! I see that the max.poll.records feature was introduced in the 0.10 release, and we are still working with Kafka 0.9. -Original Message- From: Manikumar [mailto:manikumar.re...@gmail.com] Sent: Friday, September 23, 2016 1:31 PM To: users@kafka.apache.org Subject: Re: specifying exact number of records to fetch

Re: specifying exact number of records to fetch

2016-09-23 Thread Manikumar
"max.poll.records" config property can be used to limit the number of records returned in each consumer poll() method call. On Fri, Sep 23, 2016 at 10:49 PM, Ramanan, Buvana (Nokia - US) < buvana.rama...@nokia-bell-labs.com> wrote: > Hello, > > Do Kafka protocol & KafkaConsumer (java) client addr

specifying exact number of records to fetch

2016-09-23 Thread Ramanan, Buvana (Nokia - US)
Hello, do the Kafka protocol & KafkaConsumer (java) client address the following need? The caller specifies that it needs N records with a max wait time of Tn milliseconds. If N records are available within Tn, the records are returned to the caller. If Tn expires, then the caller gets whatever records have arrived so far.
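On 0.10 the per-poll cap is max.poll.records, as in the reply above; on 0.9, where that property is not available, a small client-side loop around poll() can approximate "N records or Tn milliseconds, whichever comes first". A rough sketch (servers, topic, N and Tn are placeholders, not from this thread):

    import java.util.ArrayList;
    import java.util.Collections;
    import java.util.List;
    import java.util.Properties;
    import org.apache.kafka.clients.consumer.ConsumerRecord;
    import org.apache.kafka.clients.consumer.ConsumerRecords;
    import org.apache.kafka.clients.consumer.KafkaConsumer;

    Properties props = new Properties();
    props.put("bootstrap.servers", "broker1:9092");
    props.put("group.id", "my-group");
    props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
    props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
    // On 0.10+ this caps how many records a single poll() may return:
    // props.put("max.poll.records", "500");

    KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);
    consumer.subscribe(Collections.singletonList("my-topic"));

    List<ConsumerRecord<String, String>> batch = new ArrayList<>();
    int n = 1000;                  // desired number of records
    long tn = 500;                 // max wait in milliseconds
    long deadline = System.currentTimeMillis() + tn;

    while (batch.size() < n) {
        long remaining = deadline - System.currentTimeMillis();
        if (remaining <= 0) {
            break;                 // Tn expired: keep whatever has arrived so far
        }
        ConsumerRecords<String, String> records = consumer.poll(remaining);
        for (ConsumerRecord<String, String> record : records) {
            batch.add(record);     // may overshoot n slightly, since poll() returns a whole fetch
        }
    }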

consumer.id config ignored in kafka 0.10

2016-09-23 Thread Vali Dumitru
Hi guys, it seems like the consumer.id property is ignored in Kafka 0.10. It is also missing from the 0.10 consumer documentation. Is this correct? If yes, is there another way (via code, maybe?) to manually specify a consumer id? In various test scenarios, we need to run the same consumer process
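Not from this thread, but for context: consumer.id is an old (0.8 Scala) consumer property; with the new 0.9/0.10 Java consumer the closest knobs are group.id and client.id, both settable in code. A sketch with placeholder values:

    import java.util.Properties;
    import org.apache.kafka.clients.consumer.KafkaConsumer;

    Properties props = new Properties();
    props.put("bootstrap.servers", "broker1:9092");
    props.put("group.id", "test-group");
    props.put("client.id", "test-consumer-1"); // shows up in broker logs, metrics and quotas
    props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
    props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
    KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);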

Re: micro-batching in kafka streams

2016-09-23 Thread Srikanth
Guozhang, the example works well for aggregate operations. How can we achieve this if processing has to be done in micro-batches? One way would be to store the incoming records in a List-typed KV store and process them in punctuate. With the current KV stores, that would mean (de)serializing a whole list, which

Re: Benchmarking kafka performance

2016-09-23 Thread Kaufman Ng
Kafka includes producer and consumer performance test tools: - kafka-producer-perf-test - kafka-consumer-perf-test You can find more background on the tools here (note that the details are dated): https://engineering.linkedin.com/kafka/benchmarking-apache-kafka-2-million-writes-second-
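Typical invocations look roughly like this (topic name, record count and bootstrap address are placeholders; the exact flags can vary between releases):

    bin/kafka-producer-perf-test.sh --topic perf-test --num-records 1000000 \
        --record-size 100 --throughput -1 \
        --producer-props bootstrap.servers=broker1:9092

    bin/kafka-consumer-perf-test.sh --broker-list broker1:9092 --new-consumer \
        --topic perf-test --messages 1000000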

Re: Benchmarking kafka performance

2016-09-23 Thread Kenny Gorman
Vadim, We mostly made this little script as a joke. Remember the unix utility ‘yes’? It does in fact work if you want to simply direct some random load at Kafka to test things. Throw it into Docker and run a bunch of them. ;-) https://github.com/Eventador/evtools/tree/master/yesbench In terms

Re: Schema for jsonConverter

2016-09-23 Thread Enrico Olivelli
Hi, I'm trying to use the Confluent JDBC Sink as Sri is doing, but without a schema. I do not want to write "schema" + "payload" for each record, as my records are all for the same table and the schema is not going to change (this is a very simple project). Is there a way to configure a fixed schema?
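For what it's worth, the setting that controls the "schema" + "payload" envelope is the JsonConverter's schemas.enable flag in the worker (or connector) configuration; note, though, that the JDBC sink generally still needs schema information to map records to table columns, so schemaless JSON alone may not be enough:

    value.converter=org.apache.kafka.connect.json.JsonConverter
    value.converter.schemas.enable=false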

Kafka Connect 2.0.1 - ByteArrayConverter?

2016-09-23 Thread Olivier Girardot
Hi everyone, is there any way to use a straightforward converter instead of the AvroConverter for Avro data? The NullPointerException (https://github.com/confluentinc/kafka-connect-hdfs/issues/36 and https://github.com/confluentinc/schema-registry/issues/272) is quite blocking, and an upgrade
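One possible workaround, sketched here rather than taken from the thread, is a tiny pass-through Converter that hands the raw bytes to the connector unchanged (the class name is made up, and whether the HDFS connector can do anything useful with plain bytes is a separate question):

    import java.util.Map;
    import org.apache.kafka.connect.data.Schema;
    import org.apache.kafka.connect.data.SchemaAndValue;
    import org.apache.kafka.connect.storage.Converter;

    public class PassThroughBytesConverter implements Converter {
        @Override
        public void configure(Map<String, ?> configs, boolean isKey) {}

        @Override
        public byte[] fromConnectData(String topic, Schema schema, Object value) {
            // Expect the value to already be a byte[] and pass it through untouched.
            return (byte[]) value;
        }

        @Override
        public SchemaAndValue toConnectData(String topic, byte[] value) {
            // Surface the raw bytes with an optional-bytes schema.
            return new SchemaAndValue(Schema.OPTIONAL_BYTES_SCHEMA, value);
        }
    }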

RE: Does Kafka Sync/persist every message from a publisher by default?

2016-09-23 Thread Tauzell, Dave
If by "sync" you mean "fsync", then no, it does not. There are some properties: log.flush.interval.messages and log.flush.interval.ms. In theory you could set log.flush.interval.messages to 1 to fsync on each write. I haven't tried this to see what happens, but I expect performance will drop quite a bit.
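For reference, the broker properties mentioned above (values here are illustrative only):

    # server.properties (sketch)
    log.flush.interval.messages=1      # fsync after every message -- expect a large throughput hit
    # or, alternatively, bound the flush by time:
    log.flush.interval.ms=1000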

RE: why did Kafka choose pull instead of push for a consumer?

2016-09-23 Thread Tauzell, Dave
Kafka writes each message, but the OS is writing those to the in-memory disk cache. Kafka periodically calls fsync() to tell the OS to force the disk cache to actual disk. Kafka gets high availability by replicating messages to other brokers, so that the messages are in memory on several machines at once.

Re: why did Kafka choose pull instead of push for a consumer?

2016-09-23 Thread kant kodali
@Gerard Here are my initial benchmarks. Producer on Machine 1 (m4.xlarge on AWS), Broker on Machine 2 (m4.xlarge on AWS), Consumer on Machine 3 (m4.xlarge on AWS). Data size: 1.2KB. Receive throughput: ~24K; Kafka receive throughput: ~58K (same exact configuration). All the benchmarks I ran are with default

Re: Can Kafka 0.9 guarantee no data loss?

2016-09-23 Thread Kafka
Oh, please ignore my last reply. I find that if leaderReplica.highWatermark.messageOffset >= requiredOffset, this ensures that the LEO (log end offset) of every replica in curInSyncReplicas is >= the requiredOffset. > On Sep 23, 2016, at 3:39 PM, Kafka wrote: > > OK, the earlier example is not enough to expose the problem. > What will ha

Re: Can Kafka 0.9 guarantee no data loss?

2016-09-23 Thread Kafka
OK, the earlier example is not enough to expose the problem. What will happen in the situation where numAcks is 1 and curInSyncReplica.size >= minIsr, but in fact only one replica in curInSyncReplica has caught up to the leader, and that replica is the leader replica itself? This is not

Re: why did Kafka choose pull instead of push for a consumer?

2016-09-23 Thread Gerard Klijs
I haven't tried it myself, and very likely won't in the near future, but since it's also distributed I guess that with a large enough cluster you will be able to handle any load. Some of the things Kafka might be better at are the connectors available, a better at-least-once guarantee, better monitoring