Re: Consumer Offsets and Open FDs

2016-07-13 Thread Tom Crayford
Hi, You're running into the issue in https://issues.apache.org/jira/plugins/servlet/mobile#issue/KAFKA-3894 and possibly https://issues.apache.org/jira/plugins/servlet/mobile#issue/KAFKA-3587 (which is fixed in 0.10). Sadly right now there's no way to know how high a dedupe buffer size you need -

Re: Streams Compatibility

2016-07-13 Thread Sharninder
It requires 0.10. On Thu, Jul 14, 2016 at 6:08 AM, Matt Anderson wrote: > Is the new Kafka Streams API compatible with Kafka 0.9.x API and Broker or > does it require v0.10.x? > > Thanks, > Matt > -- -- Sharninder

Re: KStream-to-KStream Join Example

2016-07-13 Thread vivek thakre
Yes, both topics have the same number of partitions and the same partition key, i.e. userId. If I just join the streams without applying the map functions (in this case userClickStream and userEventStream), it works. Thanks, Vivek On Wed, Jul 13, 2016 at 4:53 PM, Philippe Derome
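
If the map() step changes the key, the mapped stream is no longer guaranteed to be partitioned by userId, which the join requires. A minimal sketch of one common workaround in the 0.10-era DSL: route the re-keyed stream through an intermediate topic (pre-created with the same partition count as the other input) before joining. The topic name and the extractUserId helper are hypothetical:

    import org.apache.kafka.common.serialization.Serdes;
    import org.apache.kafka.streams.KeyValue;
    import org.apache.kafka.streams.kstream.KStream;
    import org.apache.kafka.streams.kstream.KStreamBuilder;

    KStreamBuilder builder = new KStreamBuilder();
    KStream<String, String> clicks =
        builder.stream(Serdes.String(), Serdes.String(), "userIdClicks");

    // map() may change the key, so repartition by writing through an
    // intermediate topic before the join.
    KStream<String, String> rekeyed = clicks
        .map((k, v) -> KeyValue.pair(extractUserId(v), v)) // extractUserId: hypothetical helper
        .through(Serdes.String(), Serdes.String(), "clicks-rekeyed");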

KafkaConsumer poll(timeout) doesn't seem to work as expected

2016-07-13 Thread Josh Goodrich
The poll(timeout) method of the Java KafkaConsumer API doesn’t behave the way you would think. If you create a new consumer with a groupId that has been seen before, a poll(0) never returns any records, even if there are new events in the topic. I find I have to put in a loop of 2
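
For comparison, the common pattern is to poll in a loop with a non-zero timeout, since poll(0) usually returns before the group has joined and partitions have been assigned. A minimal sketch (group id, topic, and server address are placeholders):

    import java.util.Collections;
    import java.util.Properties;
    import org.apache.kafka.clients.consumer.ConsumerRecord;
    import org.apache.kafka.clients.consumer.ConsumerRecords;
    import org.apache.kafka.clients.consumer.KafkaConsumer;

    Properties props = new Properties();
    props.put("bootstrap.servers", "localhost:9092");
    props.put("group.id", "my-group");            // placeholder
    props.put("auto.offset.reset", "earliest");   // start from the beginning if the group has no committed offset
    props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
    props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

    KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);
    consumer.subscribe(Collections.singletonList("my-topic"));
    // poll(0) returns immediately, typically before partition assignment has
    // completed; polling repeatedly with a timeout gives the group time to join.
    while (true) {
        ConsumerRecords<String, String> records = consumer.poll(1000);
        for (ConsumerRecord<String, String> record : records)
            System.out.printf("offset=%d value=%s%n", record.offset(), record.value());
    }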

Streams Compatibility

2016-07-13 Thread Matt Anderson
Is the new Kafka Streams API compatible with Kafka 0.9.x API and Broker or does it require v0.10.x? Thanks, Matt

Re: KStream-to-KStream Join Example

2016-07-13 Thread Philippe Derome
Did you specify the same number of partitions for the two input topics you are joining? I think this is usually the first thing people ask to verify with errors similar to yours. If you are experimenting to learn the concepts, it is simpler to always use one partition for your topics. On
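
For reference, partition counts can be verified with the stock tooling (the ZooKeeper address is environment-specific):

    bin/kafka-topics.sh --describe --zookeeper localhost:2181 --topic userIdClicks
    bin/kafka-topics.sh --describe --zookeeper localhost:2181 --topic userIdChannel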

KStream-to-KStream Join Example

2016-07-13 Thread vivek thakre
Hello, I want to join 2 topics (KStreams). Stream 1 topic: userIdClicks (key: userId, value: JSON string with event details). Stream 2 topic: userIdChannel (key: userId, value: JSON string with event details, including a channel value). I could not find any examples of a KStream-to-KStream join.
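
A minimal sketch of a windowed KStream-to-KStream join, assuming the 0.10-era DSL (window name, window size, serdes, and the output topic are illustrative, not prescribed):

    import org.apache.kafka.common.serialization.Serdes;
    import org.apache.kafka.streams.kstream.JoinWindows;
    import org.apache.kafka.streams.kstream.KStream;
    import org.apache.kafka.streams.kstream.KStreamBuilder;

    KStreamBuilder builder = new KStreamBuilder();
    KStream<String, String> clicks =
        builder.stream(Serdes.String(), Serdes.String(), "userIdClicks");
    KStream<String, String> channels =
        builder.stream(Serdes.String(), Serdes.String(), "userIdChannel");

    // Both topics must be keyed by userId and have the same partition count.
    KStream<String, String> joined = clicks.join(
        channels,
        (click, channel) -> click + "," + channel,              // combine the two JSON strings
        JoinWindows.of("click-channel-join").within(5 * 60 * 1000L),
        Serdes.String(), Serdes.String(), Serdes.String());
    joined.to(Serdes.String(), Serdes.String(), "userIdClicksWithChannel");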

kafka-connect-hdfs offset out of range behaviour

2016-07-13 Thread Prabhu V
kafka-connect-hdfs just hangs if the "offset" it expects is no longer present (this happens when messages get deleted because of the retention time). The process in this case does not write any output and the messages get ignored. Is this by design? The relevant code is

Re: Read all record from a Topic.

2016-07-13 Thread James Cheng
Jean-Baptiste, I wrote a blog post recently on this exact subject. https://logallthethings.com/2016/06/28/how-to-read-to-the-end-of-a-kafka-topic/ Let me know if you find it useful. -James Sent from my iPhone > On Jul 13, 2016, at 7:16 AM, g...@netcourrier.com wrote: > > Hi, > > > I'm
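
The gist of the approach, as a sketch against the 0.10 consumer API (topic, partition, and server address are placeholders): record the end offset first, rewind, and poll until the position reaches it:

    import java.util.Collections;
    import java.util.HashMap;
    import java.util.List;
    import java.util.Map;
    import java.util.Properties;
    import org.apache.kafka.clients.consumer.KafkaConsumer;
    import org.apache.kafka.common.TopicPartition;

    Properties props = new Properties();
    props.put("bootstrap.servers", "localhost:9092");
    props.put("group.id", "state-restore");       // placeholder
    props.put("enable.auto.commit", "false");
    props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
    props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

    KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);
    TopicPartition tp = new TopicPartition("state-topic", 0);
    List<TopicPartition> tps = Collections.singletonList(tp);
    consumer.assign(tps);

    Map<String, String> stateMap = new HashMap<>();  // the in-memory state being rebuilt
    consumer.seekToEnd(tps);                 // find the current end of the partition
    long end = consumer.position(tp);
    consumer.seekToBeginning(tps);           // then rewind and read up to it
    while (consumer.position(tp) < end) {
        consumer.poll(500).forEach(r -> stateMap.put(r.key(), r.value()));
    }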

Re: Role of Producer

2016-07-13 Thread Snehal Nagmote
The REST API gives you a standard interface, so that you can hide the implementation details of pushing data to the Kafka topic. You don't need as many producers as data providers; you can have one producer send data to the Kafka topic. Also, if you have a requirement of dividing
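
Illustrating the one-producer point: a single KafkaProducer instance is thread-safe and can be shared by every request handler of such a REST service. A minimal sketch (the class, topic naming scheme, and address are assumptions):

    import java.util.Properties;
    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.Producer;
    import org.apache.kafka.clients.producer.ProducerRecord;

    public class IngestService {
        private final Producer<String, String> producer;

        public IngestService() {
            Properties props = new Properties();
            props.put("bootstrap.servers", "localhost:9092");
            props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
            props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
            this.producer = new KafkaProducer<>(props);  // one shared, thread-safe instance
        }

        // Called from any request-handler thread, e.g. POST /data/{provider}
        public void onData(String provider, String payload) {
            producer.send(new ProducerRecord<>("ingest-" + provider, payload));
        }
    }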

Re: Building API to make Kafka reactive

2016-07-13 Thread Dean Wampler
You don't have the Scala library on the app class path, which is used to implement Akka. Use the same version that's required for the Akka libraries you're using. http://mvnrepository.com/artifact/org.scala-lang/scala-library dean Dean Wampler, Ph.D. Author: Programming Scala, 2nd Edition
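
For example, the Maven coordinates would look like the following; the version shown is only an assumption and must match what your Akka version expects:

    <dependency>
      <groupId>org.scala-lang</groupId>
      <artifactId>scala-library</artifactId>
      <version>2.11.8</version>
    </dependency>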

KTable DSL join

2016-07-13 Thread Srikanth
Hello, I'm trying the following join using KTables. There are two changelog tables. Table1: 111 -> aaa, 222 -> bbb, 333 -> aaa. Table2: aaa -> 999, bbb -> 888, ccc -> 777. My result table should be: 111 -> 999, 222 -> 888, 333 -> 999. It's not a case for join() as the keys don't match.
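
A hedged sketch of one workaround in the 0.10-era DSL: re-key Table1 by its value through an intermediate topic and join on the new key. Note this only works when each Table1 value maps to a single key; the example's many-to-one case (111 and 333 both -> aaa) needs a groupBy/aggregate instead, or the native foreign-key KTable join added much later (KIP-213, Kafka 2.4). Topic names are hypothetical:

    import org.apache.kafka.common.serialization.Serdes;
    import org.apache.kafka.streams.KeyValue;
    import org.apache.kafka.streams.kstream.KStreamBuilder;
    import org.apache.kafka.streams.kstream.KTable;

    KStreamBuilder builder = new KStreamBuilder();
    KTable<String, String> table1 =
        builder.table(Serdes.String(), Serdes.String(), "table1-topic");

    // Re-key table1 by its value (111 -> aaa becomes aaa -> 111) through an
    // intermediate topic, so its key space matches table2's. Caveat: KTable
    // keys are unique, so duplicate values collapse to the latest entry.
    table1.toStream()
          .map((k, v) -> KeyValue.pair(v, k))
          .to(Serdes.String(), Serdes.String(), "table1-by-value");

    KTable<String, String> table1ByValue =
        builder.table(Serdes.String(), Serdes.String(), "table1-by-value");
    KTable<String, String> table2 =
        builder.table(Serdes.String(), Serdes.String(), "table2-topic");

    // Produces aaa -> "111:999" etc.; mapping back to 111 -> 999 takes one more re-key.
    KTable<String, String> joined =
        table1ByValue.join(table2, (id, val) -> id + ":" + val);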

RE: Role of Producer

2016-07-13 Thread Luo, Chao
Hi Snehal, Thanks for your input. They already have their own Java APIs to access data. But why would I create a REST API? What are the benefits? Say, if there are 500 data providers, do I need 500 producers at my end to collect data? At least, the number of producers should be proportional to

Re: Role of Producer

2016-07-13 Thread Michael Freeman
MQ was just shorthand for IBM MQ, ActiveMQ, etc. On Wed, Jul 13, 2016 at 9:42 PM, Luo, Chao wrote: > Hi thanks! > > Yes, I agree it would be best if they can use a Kafka producer client. But I > need to discuss with them if they will accept that. > > Btw, what is MQ? > >

RE: Role of Producer

2016-07-13 Thread Luo, Chao
Hi thanks! Yes, I agree it would be best if they can use a Kafka producer client. But I need to discuss with them if they will accept that. Btw, what is MQ? -Original Message- From: Michael Freeman [mailto:mikfree...@gmail.com] Sent: Wednesday, July 13, 2016 3:36 PM To:

Re: Role of Producer

2016-07-13 Thread Snehal Nagmote
Hi Chao, To solve this problem, I can think of creating a REST API. Your endpoint can take the data provider as one of its parameters if you want to send data to different topics based on the provider. On the backend, when you get data, you can send it to Kafka topics using a Kafka producer at the

Re: Role of Producer

2016-07-13 Thread Michael Freeman
Could you write them a client that uses the Kafka producer? You could also write some RESTful services that send the data to Kafka. If they use MQ you could listen to MQ and send to Kafka. On Wed, Jul 13, 2016 at 9:31 PM, Luo, Chao wrote: > Dear Kafka guys, > > I just

Role of Producer

2016-07-13 Thread Luo, Chao
Dear Kafka guys, I just started to build up a Kafka system two weeks ago. Here I have a question about how to design/implement the producer. In my system, there are many data providers. I need to collect real-time data from them and store it in a NoSQL database. The problem is that different

Java 0.9.0.1 Consumer Does not failover

2016-07-13 Thread Michael Freeman
Hi, I'm running a Kafka cluster with 3 nodes. I have a topic with a replication factor of 3. When I stop node 1 running kafka-topics.sh shows me that node 2 and 3 have successfully failed over the partitions for the topic. The message producers are still sending messages and I can still
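
One cause worth ruling out (an assumption, not a diagnosis): with the new consumer, group coordination runs through the internal __consumer_offsets topic, and if that topic was auto-created with replication factor 1, its partitions stay offline when their broker stops, even though the data topic itself fails over. The relevant broker settings, which must be in place before the offsets topic is first created:

    # server.properties
    offsets.topic.replication.factor=3
    default.replication.factor=3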

Re: Consumer Offsets and Open FDs

2016-07-13 Thread Rakesh Vidyadharan
We ran into this as well, and I ended up with the following that works for us. log.cleaner.dedupe.buffer.size=536870912 log.cleaner.io.buffer.size=2000 On 13/07/2016 14:01, "Lawrence Weikum" wrote: >Apologies. Here is the full trace from a broker: > >[2016-06-24

Re: Contribution : KafkaStreams CEP library

2016-07-13 Thread Guozhang Wang
Added to the ecosystem page; thanks again for sharing! Cheers, Guozhang On Mon, Jul 11, 2016 at 12:40 PM, Florian Hussonnois wrote: > Hi, > > It would be great if you can link my repo. Thanks very much. > > 2016-07-11 18:26 GMT+02:00 Guozhang Wang

Re: Consumer Offsets and Open FDs

2016-07-13 Thread Lawrence Weikum
Apologies. Here is the full trace from a broker: [2016-06-24 09:57:39,881] ERROR [kafka-log-cleaner-thread-0], Error due to (kafka.log.LogCleaner) java.lang.IllegalArgumentException: requirement failed: 9730197928 messages in segment __consumer_offsets-36/.log but offset

Re: Kafka Streams reducebykey and tumbling window - need final windowed KTable values

2016-07-13 Thread Srikanth
Thanks. This would be useful in places where we use a key-value store just to duplicate a KTable for get() operations. Any rough idea when this is targeted for release? It's still not clear how to use this for the case this thread was started for. Does Kafka Streams keep windows alive forever? At

Re: Building API to make Kafka reactive

2016-07-13 Thread Shekar Tippur
Is there any way I can get a small working example to start with? - Shekar On Wed, Jul 13, 2016 at 10:39 AM, Shekar Tippur wrote: > Dean, > > I am having trouble getting this to work. > > import akka.actor.ActorSystem; > import akka.kafka.scaladsl.Producer; > import

Re: Building API to make Kafka reactive

2016-07-13 Thread Shekar Tippur
Dean, I am having trouble getting this to work. import akka.actor.ActorSystem; import akka.kafka.scaladsl.Producer; import akka.stream.javadsl.Source; import akka.kafka.ProducerSettings; import org.apache.kafka.clients.producer.ProducerRecord; import
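
A minimal sketch against the 0.11-era akka-stream-kafka ("reactive-kafka") javadsl; note the javadsl (not scaladsl) packages when calling from Java. Topic, address, and element values are placeholders:

    import akka.actor.ActorSystem;
    import akka.kafka.ProducerSettings;
    import akka.kafka.javadsl.Producer;
    import akka.stream.ActorMaterializer;
    import akka.stream.Materializer;
    import akka.stream.javadsl.Source;
    import org.apache.kafka.clients.producer.ProducerRecord;
    import org.apache.kafka.common.serialization.StringSerializer;

    public class ProducerExample {
        public static void main(String[] args) {
            ActorSystem system = ActorSystem.create("example");
            Materializer materializer = ActorMaterializer.create(system);

            ProducerSettings<String, String> settings =
                ProducerSettings.create(system, new StringSerializer(), new StringSerializer())
                                .withBootstrapServers("localhost:9092");

            // Stream the integers 1..100 into the topic as string values.
            Source.range(1, 100)
                  .map(n -> new ProducerRecord<String, String>("test-topic", n.toString()))
                  .runWith(Producer.plainSink(settings), materializer);
        }
    }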

Re: Consumer Offsets and Open FDs

2016-07-13 Thread Manikumar Reddy
Can you post the complete error stack trace? Yes, you need to restart the affected brokers. You can tweak log.cleaner.dedupe.buffer.size, log.cleaner.io.buffer.size configs. Some related JIRAs: https://issues.apache.org/jira/browse/KAFKA-3587 https://issues.apache.org/jira/browse/KAFKA-3894

Re: Consumer Offsets and Open FDs

2016-07-13 Thread Lawrence Weikum
Oh interesting. I didn’t know about that log file until now. The only error that has appeared among all brokers showing this behavior is: ERROR [kafka-log-cleaner-thread-0], Error due to (kafka.log.LogCleaner) Then we see many messages like this: INFO Compaction for partition

Re: Consumer Offsets and Open FDs

2016-07-13 Thread Manikumar Reddy
Hi, Are you seeing any errors in log-cleaner.log? The log-cleaner thread can crash on certain errors. Thanks Manikumar On Wed, Jul 13, 2016 at 9:54 PM, Lawrence Weikum wrote: > Hello, > > We’re seeing a strange behavior in Kafka 0.9.0.1 which occurs about every > other

Consumer Offsets and Open FDs

2016-07-13 Thread Lawrence Weikum
Hello, We’re seeing a strange behavior in Kafka 0.9.0.1 which occurs about every other week. I’m curious if others have seen it and know of a solution. Setup and Scenario: - Brokers initially setup with log compaction turned off - After 30 days, log compaction was turned on

Read all record from a Topic.

2016-07-13 Thread gibe
Hi, I'm using a compacted Kafka topic to save the state of my application. When the application crashes/restarts I can restore its state by reading the Kafka topic. However, I need to read it completely, especially up to the most recent record, to be sure to restore all data. Is there a

Re: Need a help in understanding __consumer_offsets topic creation in Kafka Cluster

2016-07-13 Thread Prasannalakshmi Sugumaran
Hi, We are using Kafka 0.9.0.2, and by default the log cleaner is not enabled. When we enable the log cleaner, the internal topic “__consumer_offsets” (around 1TB in size) starts compaction, and during compaction we are unable to consume/produce messages. Also, consumer groups fail in leader election.

Re: NetFlow metrics to Kafka

2016-07-13 Thread Michael Noll
Mathieu, yes, this is possible. In a past project of mine we did this, though I wasn't directly involved with coding the Cisco-Kafka part. As far as I know there aren't ready-to-use NetFlow connectors available (for Kafka Connect), so you most probably have to write your own
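
For a sense of what that involves: the heart of a custom source connector is a SourceTask whose poll() turns received flow records into SourceRecords. A bare skeleton, with the NetFlow-receiving side stubbed out and all names hypothetical:

    import java.util.Collections;
    import java.util.List;
    import java.util.Map;
    import org.apache.kafka.connect.data.Schema;
    import org.apache.kafka.connect.source.SourceRecord;
    import org.apache.kafka.connect.source.SourceTask;

    public class NetflowSourceTask extends SourceTask {
        private String topic;

        @Override
        public void start(Map<String, String> config) {
            topic = config.get("topic");
            // open the NetFlow listener socket here
        }

        @Override
        public List<SourceRecord> poll() throws InterruptedException {
            String flow = receiveFlowRecord();  // hypothetical: blocks for the next flow record
            return Collections.singletonList(new SourceRecord(
                null, null,                     // source partition/offset, if resumability is needed
                topic, Schema.STRING_SCHEMA, flow));
        }

        @Override
        public void stop() { /* close the listener */ }

        @Override
        public String version() { return "0.1"; }

        private String receiveFlowRecord() { return "flow"; } // stub
    }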

Re: Kafka Consumer Group Id bug?

2016-07-13 Thread Spico Florin
Hello! It seems to me that a rebalance is the cause of some messages being consumed by either one consumer or the other. If you are using a random client id, it could happen that during the rebalance a client id gets a "lower" position than the consumer that was previously consuming

Re: Kafka Consumer Group Id bug?

2016-07-13 Thread Gerard Klijs
Are you sure the topic itself has indeed 1 partition? If so, the only partition should be matched to one consumer until some error/rebalance occurs; does this indeed happen (a lot)? On Wed, Jul 13, 2016 at 7:19 AM BYEONG-GI KIM wrote: > Hello. > > I'm not sure whether it's a bug
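
Both points can be checked with the stock tools (addresses and names are placeholders):

    bin/kafka-topics.sh --describe --zookeeper localhost:2181 --topic my-topic
    bin/kafka-consumer-groups.sh --new-consumer --bootstrap-server localhost:9092 \
        --describe --group my-group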