rolling upgrade from 0.9 to 0.11

2018-03-05 Thread Sunil Parmar
Any documents / suggestions / experience that the community can share about a rolling upgrade from 0.9 to 0.11, and any tips/tricks you might have learned? Our producers are C++ (using librdkafka). I have tested that upgrading the client library to 0.9.5 works against both clusters in a test environment, but afraid
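
For reference, the upgrade notes describe a rolling procedure that pins the inter-broker protocol and message-format versions until the whole cluster (and, for the message format, the clients) has been upgraded. A sketch of the server.properties steps, assuming the cluster starts on 0.9.0 (versions here are placeholders for your actual ones):

# Step 1: before replacing any broker binaries, pin the current versions on every broker
inter.broker.protocol.version=0.9.0
log.message.format.version=0.9.0

# Step 2: upgrade the brokers one at a time (shut down, replace the code, restart)

# Step 3: once the whole cluster runs 0.11, bump the protocol and do another rolling restart
inter.broker.protocol.version=0.11.0

# Step 4: after the clients (e.g. the librdkafka producers) are upgraded, bump the
# message format and roll the brokers one final time
log.message.format.version=0.11.0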

Re: when use kafka streams to(topic) method sometime throw error?

2018-03-05 Thread 杰 杨
It seems I don't configure ProducerConfig in my Streams application. Should I configure that? funk...@live.com From: funk...@live.com Date: 2018-03-06 11:23 To: users Subject: when use kafka streams to(topic)

when use kafka streams to(topic) method sometime throw error?

2018-03-05 Thread 杰 杨
Hi: I met a problem today. When I use Kafka Streams to consume one topic, apply the mapValues() method, and write to another topic, it sometimes throws an error. This is the code sample: new StreamsBuilder().stream(xxxtopic, Consumed.with(Serdes.String(), Serdes.String())).mapValues(method).to(newTopic).
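
A minimal runnable version of that topology for reference, assuming String keys/values; the topic names, application id, and the mapped function are placeholders, not taken from the original post:

import java.util.Properties;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.Consumed;
import org.apache.kafka.streams.kstream.Produced;

public class MapValuesExample {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "mapvalues-example");  // placeholder
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");  // placeholder

        StreamsBuilder builder = new StreamsBuilder();
        builder.stream("input-topic", Consumed.with(Serdes.String(), Serdes.String()))
               .mapValues(value -> value.toUpperCase())  // stand-in for the poster's method
               .to("output-topic", Produced.with(Serdes.String(), Serdes.String()));

        KafkaStreams streams = new KafkaStreams(builder.build(), props);
        streams.start();
        Runtime.getRuntime().addShutdownHook(new Thread(streams::close));
    }
}

If the default serdes are not configured, passing explicit Produced serdes avoids the serialization errors that only surface when to() actually writes to the sink topic.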

Re: Kafka Setup for Daily counts on wide array of keys

2018-03-05 Thread Thakrar, Jayesh
Sorry Matt, I don't have much idea about Kafka streaming (or any streaming for that matter). As for saving counts from your application servers to Aerospike directly, that is certainly simpler, requiring less hardware, resources and development effort. One reason some people use Kafka as part of

Re: Kafka Setup for Daily counts on wide array of keys

2018-03-05 Thread Matt Daum
And not to overthink this, but as I'm new to Kafka and streams I want to make sure that it makes the most sense for my use case. With the streams and grouping, it looks like I'd be getting one internal topic created per grouped stream, which would then be written, reread, and totaled in the

Re: kafka-streams, late data, tumbling windows aggregations and event time

2018-03-05 Thread Guozhang Wang
Sounds great! :) On Mon, Mar 5, 2018 at 12:28 PM, Dmitriy Vsekhvalnov wrote: > Thanks, that's an option, I'll take a look at the configuration. > > But yeah, I was thinking the same: if Streams relies on the fact that internal > topics should use the 'CreateTime' configuration,

Re: kafka-streams, late data, tumbling windows aggregations and event time

2018-03-05 Thread Dmitriy Vsekhvalnov
Thanks, that's an option, I'll take a look at the configuration. But yeah, I was thinking the same: if Streams relies on the fact that internal topics should use the 'CreateTime' configuration, then it is the Streams library's responsibility to configure it. I can open a Jira ticket :) On Mon, Mar 5, 2018 at

ListOffsets parameters

2018-03-05 Thread Emmett Butler
Hi users, I'm the maintainer of the PyKafka library and I'm working on improving its support for the ListOffsets API. Some questions: Kafka version: 1.0.0 I'm using this documentation for reference.

Re: kafka-streams, late data, tumbling windows aggregations and event time

2018-03-05 Thread Guozhang Wang
Hello Dmitriy, In your case, you can override this config to CreateTime only for the internal topics created by Streams; this is documented in https://kafka.apache.org/10/javadoc/org/apache/kafka/streams/StreamsConfig.html#TOPIC_PREFIX We are also discussing whether to always override the
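
A small sketch of that override in Java, assuming the Streams properties are built in code (the application id and bootstrap servers are placeholders):

import java.util.Properties;
import org.apache.kafka.common.config.TopicConfig;
import org.apache.kafka.streams.StreamsConfig;

public class InternalTopicTimestampConfig {
    public static Properties streamsProps() {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "my-streams-app");     // placeholder
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");  // placeholder
        // Prefixing a topic-level config with StreamsConfig.topicPrefix(...) applies it only
        // to the internal repartition/changelog topics that Streams itself creates.
        props.put(StreamsConfig.topicPrefix(TopicConfig.MESSAGE_TIMESTAMP_TYPE_CONFIG), "CreateTime");
        return props;
    }
}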

Re: Kafka Setup for Daily counts on wide array of keys

2018-03-05 Thread Thakrar, Jayesh
Yep, exactly. So there is some buffering that you need to do in your client, and you also have to deal with edge cases. E.g. how long should you hold on to a batch before you send a smaller batch to the producer, since you want a balance between batch optimization and expedience. You may need to do some
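
A rough sketch of that kind of size-or-time bounded buffering in front of a KafkaProducer; the thresholds, topic name, and String payloads are illustrative, not from the thread:

import java.util.ArrayList;
import java.util.List;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class BatchingBuffer {
    private static final int MAX_BATCH = 500;       // flush once this many records are buffered
    private static final long MAX_WAIT_MS = 1000;   // ...or once the batch has waited this long

    private final KafkaProducer<String, String> producer;
    private final List<String> buffer = new ArrayList<>();
    private long lastFlushMs = System.currentTimeMillis();

    public BatchingBuffer(KafkaProducer<String, String> producer) {
        this.producer = producer;
    }

    public synchronized void add(String record) {
        buffer.add(record);
        if (buffer.size() >= MAX_BATCH || System.currentTimeMillis() - lastFlushMs >= MAX_WAIT_MS) {
            flush();
        }
    }

    public synchronized void flush() {
        if (!buffer.isEmpty()) {
            // Joined into one message here for brevity; in the thread's design this would be
            // an Avro container holding the batch of records.
            producer.send(new ProducerRecord<>("request-batches", String.join("\n", buffer)));
            buffer.clear();
        }
        lastFlushMs = System.currentTimeMillis();
    }
}

One of the edge cases mentioned above: this only flushes on add(), so an idle buffer also needs a periodic timer (or a flush on shutdown) to avoid holding the last partial batch indefinitely.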

Re: Kafka Setup for Daily counts on wide array of keys

2018-03-05 Thread Matt Daum
Ah, good call, so you really have an Avro wrapper around your single class, right? I.e. an array of records, correct? Then when you hit a size you are happy with, you send it to the producer? On Mon, Mar 5, 2018 at 12:07 PM, Thakrar, Jayesh < jthak...@conversantmedia.com> wrote: > Good luck on
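
For illustration, a hedged sketch of such a wrapper using Avro's Java API; the record and field names are made up, and the "wrapper" is simply an array schema over the per-request record:

import org.apache.avro.Schema;
import org.apache.avro.SchemaBuilder;

public class BatchSchema {
    // One request record (field names are illustrative)
    static final Schema REQUEST = SchemaBuilder.record("Request").fields()
            .requiredString("attributeX")
            .requiredLong("timestamp")
            .endRecord();

    // The wrapper: an Avro array of those records, serialized and sent as a single Kafka message
    static final Schema REQUEST_BATCH = Schema.createArray(REQUEST);
}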

Re: kafka-streams, late data, tumbling windows aggregations and event time

2018-03-05 Thread Dmitriy Vsekhvalnov
Which effectively means the given scenario is not working with LogAppendTime, correct? Because all internal re-partition topics will always contain "now" instead of the real timestamp from the original payload message? Is kafka-streams designed to work with LogAppendTime at all? It seems a lot of stuff will

Re: kafka-streams, late data, tumbling windows aggregations and event time

2018-03-05 Thread Guozhang Wang
If the broker configures log.message.timestamp.type=LogAppendTime universally, it will ignore whatever timestamp is set in the message metadata and override it with the append time. So when the messages are fetched by downstream processors, which always use the metadata timestamp extractor, they will get

Re: kafka-streams, late data, tumbling windows aggregations and event time

2018-03-05 Thread Dmitriy Vsekhvalnov
Hi Guozhang, interesting, will the same logic apply (internal topic rewrite) for brokers configured with log.message.timestamp.type=LogAppendTime? On Mon, Mar 5, 2018 at 8:33 PM, Guozhang Wang wrote: > Hello Dmitriy, > > What you have observed is by design, and it maybe

Re: kafka-streams, late data, tumbling windows aggregations and event time

2018-03-05 Thread Guozhang Wang
Hello Dmitriy, What you have observed is by design, and it may be a bit confusing at first. Let me explain: when you do a group-by aggregation like the above case, during the "groupBy((key, value) -> ..)" stage the Streams library will do a re-partitioning by sending the original data

Offset auto-commit stops after timeout

2018-03-05 Thread ebuck
In our Kafka consumer logs, we're seeing the following messages: 2018-03-05 03:57:03,350 INFO org.apache.kafka.clients.consumer.internals.AbstractCoordinator - Marking the coordinator kafka08:9092 (id: 2147483639 rack: null) dead for group mygroup 2018-03-05 03:57:03,350 WARN

Re: Kafka Setup for Daily counts on wide array of keys

2018-03-05 Thread Thakrar, Jayesh
Good luck on your test! As for the batching within Avro and by the Kafka producer, here are my thoughts, without any empirical proof. There is a certain amount of overhead, in terms of execution AND bytes, in converting a request record into Avro and producing (generating) a Kafka message out of it.

kafka-streams, late data, tumbling windows aggregations and event time

2018-03-05 Thread Dmitriy Vsekhvalnov
Good morning, we have a simple use-case where we want to count the number of events per hour, grouped by some fields from the event itself. Our event timestamp is embedded into the message itself (JSON) and we are using a trivial custom timestamp extractor (which is called and works as expected). What we are facing
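
A hedged sketch of that topology; the topic name, application id, JSON parsing, and grouping field are placeholders, not taken from the original post:

import java.util.Properties;
import java.util.concurrent.TimeUnit;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.Consumed;
import org.apache.kafka.streams.kstream.TimeWindows;
import org.apache.kafka.streams.processor.TimestampExtractor;

public class HourlyCounts {

    // Stand-in for the custom extractor: pulls the event time out of the JSON payload.
    public static class EventTimeExtractor implements TimestampExtractor {
        @Override
        public long extract(ConsumerRecord<Object, Object> record, long previousTimestamp) {
            return parseTimestamp((String) record.value());
        }
        private long parseTimestamp(String json) {
            return System.currentTimeMillis();  // placeholder for real JSON parsing
        }
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "hourly-counts");      // placeholder
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");  // placeholder
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        props.put(StreamsConfig.DEFAULT_TIMESTAMP_EXTRACTOR_CLASS_CONFIG, EventTimeExtractor.class);

        StreamsBuilder builder = new StreamsBuilder();
        builder.stream("events", Consumed.with(Serdes.String(), Serdes.String()))
               .groupBy((key, value) -> extractGroupingField(value))    // re-keys, so Streams repartitions
               .windowedBy(TimeWindows.of(TimeUnit.HOURS.toMillis(1)))  // 1-hour tumbling windows
               .count();                                                // windowed counts in a local state store

        new KafkaStreams(builder.build(), props).start();
    }

    private static String extractGroupingField(String json) {
        return json;  // placeholder for extracting the real grouping fields
    }
}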

Re: Broker cannot start switch to Java9 - weird file system issue ?

2018-03-05 Thread Enrico Olivelli
Workaround: as these brokers are only for test environments, I have set very small values for the index file size, which affects pre-allocation: segment.index.bytes=65536 log.index.size.max.bytes=65536 If anyone has some thoughts it will be very appreciated. Cheers Enrico 2018-03-05 13:21 GMT+01:00

Re: Broker cannot start switch to Java9 - weird file system issue ?

2018-03-05 Thread Enrico Olivelli
The only fact I have found is that with Java 8 Kafka is creating "SPARSE" files and with Java 9 this is not true anymore. Enrico 2018-03-05 12:44 GMT+01:00 Enrico Olivelli : > Hi, > This is a very strange case. I have a Kafka broker (part of a cluster of 3 > brokers) which

Re: Kafka Setup for Daily counts on wide array of keys

2018-03-05 Thread Matt Daum
Thanks for the suggestions! It does look like it's using local RocksDB stores for the state info by default. Will look into using an external one. As for the "millions of different values per grouped attribute", an example would be: assume on each request there is a parameter "X" which at the

Broker cannot start switch to Java9 - weird file system issue ?

2018-03-05 Thread Enrico Olivelli
Hi, This is a very strange case. I have a Kafka broker (part of a cluster of 3 brokers) which cannot start after upgrading Java from Oracle JDK 8 to Oracle JDK 9.0.4. There are a lot of .index and .timeindex files taking 10MB; they are for empty partitions. Running with Java 9 the server seems to rebuild

RE: difference between 2 options

2018-03-05 Thread adrien ruffie
Perfect, Andras! Thanks a lot. I noted all of your explanations. Best regards, Adrien From: Andras Beni Sent: Saturday, March 3, 2018 09:29:16 To: users@kafka.apache.org Subject: Re: difference between 2 options Hello Adrien, I was

Re: Setting topic's offset from the shell

2018-03-05 Thread Zoran
That should be it. Thank you very much. On 02/28/2018 06:59 PM, Manikumar wrote: we can use the "kafka-consumer-groups.sh --reset-offsets" option to reset offsets. This is available from Kafka 0.11.0.0. On Wed, Feb 28, 2018 at 2:59 PM, UMESH CHAUDHARY wrote: You might
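
A hedged example of that reset; the broker address, group, and topic are placeholders, --dry-run previews the new offsets, and --execute applies them (the group must have no active members):

kafka-consumer-groups.sh --bootstrap-server localhost:9092 \
  --group mygroup --topic mytopic \
  --reset-offsets --to-earliest --dry-run

kafka-consumer-groups.sh --bootstrap-server localhost:9092 \
  --group mygroup --topic mytopic \
  --reset-offsets --to-earliest --execute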