Re: Potential memory leak in rocksdb

2017-02-20 Thread Pierre Coquentin
Hi Sachin, So, I have reconfigured to use 6 consumers, each managing only one partition. As you can see in the picture, the memory is still growing over time, but very slowly. It seems the number of partitions has an impact on how fast the memory increases. For now, we will use only the in memory

Implementing a non-key in Kafka Streams using the Processor API

2017-02-20 Thread Frank Lyaruu
Hi all, I'm trying to implement joining two Kafka tables using a 'remote' key, basically as described here: https://cwiki.apache.org/confluence/display/KAFKA/Discussion%3A+Non-key+KTable-KTable+Joins Under the "Implementation Details" there is one line I don't know how to do: 1. First of

Re: Where is offset recorded?

2017-02-20 Thread Jean Changyi
Hi Praveen, Yeah, it's clear. Thank you very much. Best regards --- Jean jeanking...@gmail.com > from: Praveen [mailto:praveev...@gmail.com] > date:

Need guidance to create a pipeline

2017-02-20 Thread Raymond Xie
Hello. I am new to Kafka. I am wondering how to read logs using Kafka and get them parsed in Spark. Please correct me if I am wrong: I want to create a model (pipeline) which takes files dropped in a specific folder, Hive, or whatever storage, and uses each file as the input of a Kafka producer; On

Re: Re: Where is offset recorded?

2017-02-20 Thread Jean Changyi
Yeah, it's clear. Thank you very much! Best regards --- Jean jeanking...@gmail.com > from: Praveen [mailto:praveev...@gmail.com] > to: users@kafka.apache.org

Re: KIP-122: Add a tool to Reset Consumer Group Offsets

2017-02-20 Thread Matthias J. Sax
Hi, thanks for updating the KIP. Couple of follow up comments: * Nit: Why is "Reset to Earliest" and "Reset to Latest" a "reset by time" option -- IMHO it belongs to "reset by position"? * Nit: Description of "Reset to Earliest" > using Kafka Consumer's `auto.offset.reset` to `earliest` I

Re: JMX metrics for replica lag time

2017-02-20 Thread Jun Ma
Hi Guozhang, Thanks for your reply. Could you tell me which one indicates the lag between follower and leader for a specific partition? Thanks, Jun On Mon, Feb 20, 2017 at 4:57 PM, Guozhang Wang wrote: > I don't think the metrics have been changed in 0.9.0.1, in fact even

Re: Where is offset recorded?

2017-02-20 Thread Praveen
Kafka used to use ZooKeeper for managing offsets, but I believe it has since changed to storing offsets in a separate topic called __consumer_offsets. This info is in the documentation. See here: https://kafka.apache.org/090/documentation.html#impl_offsettracking On Mon, Feb 20, 2017 at 5:02
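As a rough illustration of where a group's offsets land inside __consumer_offsets: to the best of my knowledge, the broker picks the partition by taking the Java hash of the group id, making it non-negative, and reducing it modulo the number of partitions of the offsets topic (broker setting offsets.topic.num.partitions, default 50). The Python sketch below reproduces that arithmetic under those assumptions. The function names are made up for illustration and are not part of any Kafka API, and the result should be checked against your own broker configuration.

```python
def java_string_hash(s: str) -> int:
    """Reproduce Java's String.hashCode() as a signed 32-bit integer."""
    h = 0
    # Java hashes UTF-16 code units, so encode the string the same way.
    data = s.encode("utf-16-be")
    for i in range(0, len(data), 2):
        unit = (data[i] << 8) | data[i + 1]
        h = (31 * h + unit) & 0xFFFFFFFF  # emulate 32-bit overflow
    # Convert the unsigned 32-bit value to Java's signed int range.
    return h - (1 << 32) if h >= (1 << 31) else h


def offsets_partition_for(group_id: str, num_partitions: int = 50) -> int:
    """Hypothetical helper: partition of __consumer_offsets for a group id.

    Assumes the broker computes abs(groupId.hashCode()) % numPartitions,
    with Integer.MIN_VALUE mapped to 0, and that num_partitions matches
    the broker's offsets.topic.num.partitions setting.
    """
    h = java_string_hash(group_id)
    non_negative = 0 if h == -(1 << 31) else abs(h)
    return non_negative % num_partitions


if __name__ == "__main__":
    # "my-group" is a placeholder group id.
    print(offsets_partition_for("my-group"))
```

Knowing the partition number narrows down which partition of __consumer_offsets to read when inspecting committed offsets for a single group.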

Where is offset recorded?

2017-02-20 Thread Jean Changyi
Hi, Thank you for reading my question. I'm still using kafka-2.1.1-0.9.0.0. From some articles I learned that offsets were stored in ZooKeeper; however, I can only find some znodes for console-consumer in my ZooKeeper. My question is: where is my offset recorded? That means where can I find

Re: JMX metrics for replica lag time

2017-02-20 Thread Guozhang Wang
I don't think the metrics have been changed in 0.9.0.1; in fact, even in 0.10.x they are still the same as stated in: https://kafka.apache.org/documentation/#monitoring The mechanism for determining which followers have been dropped out of the ISR has changed, but the metrics have not. Guozhang On

Re: Need help in understanding bunch of rocksdb errors on kafka_2.10-0.10.1.1

2017-02-20 Thread Guozhang Wang
First, about the metrics attributes: I now remember there is indeed a change, as described in https://cwiki.apache.org/confluence/display/KAFKA/KIP-105%3A+Addition+of+Recording+Level+for+Sensors We have added a hierarchy to the sensors, and currently there are only two levels: INFO and DEBUG. Along with

Re: How does one deploy to consumers without causing re-balancing for real time use case?

2017-02-20 Thread Onur Karaman
We've only started using kafka-based group coordination for small and simple use cases at LinkedIn so far. Given that you kill -9 your process, your explanation for the long stabilization time makes sense. I'd recommend calling KafkaConsumer.close. It should speed up the rebalance times. Another
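The advice above (close the consumer instead of killing the process with kill -9) can be sketched as a shutdown pattern: catch a catchable signal such as SIGTERM, leave the poll loop, and call close() on the way out. The Python sketch below uses a stand-in StubConsumer rather than a real Kafka client, so every name in it is hypothetical; it only shows the control flow, assuming the real consumer exposes poll() and close().

```python
import signal
import threading


class StubConsumer:
    """Stand-in for a real consumer; not a Kafka client."""

    def __init__(self):
        self.closed = False
        self.polls = 0

    def poll(self, timeout_s):
        # A real consumer would return fetched records here.
        self.polls += 1
        return []

    def close(self):
        # A real consumer would leave the group here, so the
        # coordinator need not wait for a session timeout.
        self.closed = True


def run_until_stopped(consumer, stop_event, poll_timeout_s=0.01):
    """Poll until stop_event is set, then always close the consumer."""
    try:
        while not stop_event.is_set():
            for _record in consumer.poll(poll_timeout_s):
                pass  # process the record here
    finally:
        consumer.close()


def install_sigterm_handler(stop_event):
    """Request a clean stop on SIGTERM; call this from the main thread."""
    signal.signal(signal.SIGTERM, lambda signum, frame: stop_event.set())
```

This only helps if the deployment sends a signal the process can handle; kill -9 (SIGKILL) cannot be caught, so close() would never run.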

Re: Security Documentation contradiction / misleading ?

2017-02-20 Thread Stephane Maarek
Hi Martin, I’m having a bit of trouble reading your email. Per my understanding, if you set zookeeper.set.acl=true, then the zk nodes have to be owned by the kafka brokers. Say in a two-broker setting, kafka/kafka1.example.com & kafka/kafk2.example.com, you’re going to have an issue. If I do

Kafka to connect to Azure Data Lake

2017-02-20 Thread Anup.Bansal
Hi, I am new to Kafka and am evaluating a case to use Kafka as a means of sending data from our on-premise data stores to the Data Lake on Azure PaaS. I could only find two connectors I could use, both of which are indirect means of getting data to the Data Lake: 1. Kafka Connect for Azure IoT Hub

metric.reporters vs. kafka.metrics.reporters?

2017-02-20 Thread Mohammad Kargar
Can someone shed some light on the difference between "metric.reporters" and "kafka.metrics.reporters" when it comes to configuring metrics for brokers? Thanks, Mohammad

Re: Security Documentation contradiction / misleading ?

2017-02-20 Thread Martin Gainty
MG>confusion between JAAS-security terminology and Kafka-SASL terminology? From: Stephane Maarek Sent: Sunday, February 19, 2017 7:28 PM To: users@kafka.apache.org Subject: Security Documentation contradiction / misleading ? Hi,