Re: Kafka encryption

2016-06-02 Thread Jim Hoagland
I'm hesitant to cite it because it wasn't really a proper benchmark, but with the end-to-end encryption through Kafka proof of concept described at http://symc.ly/1pC2CEG, doing the encryption added only 26% to the time taken to send messages and only 6% to the time taken to consume messages. This

RE: Problematic messages in Kafka

2016-06-02 Thread Thakrar, Jayesh
Thanks for the quick reply Danny. The message size as per DumpLogSegments is around 59KB. I used a very high message.max.size and a high fetch size of 1 MB (that's the message.max.size in the broker) and still saw the same hang behavior. Also tried a max-wait-ms so that the consumer does not
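For readers unfamiliar with the tool mentioned above, message sizes in a log segment can be inspected with the DumpLogSegments utility that ships with Kafka. This is an illustrative invocation only; the log directory and segment file path below are placeholders for your own layout.

```
bin/kafka-run-class.sh kafka.tools.DumpLogSegments \
  --files /var/kafka-logs/mytopic-0/00000000000000000000.log \
  --print-data-log
```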

Re: Kafka take too long to update the client with metadata when a broker is gone

2016-06-02 Thread Steve Tian
I see. I'm not sure if this is a known issue. Do you mind sharing the brokers/topics setup and the steps to reproduce this issue? Cheers, Steve On Fri, Jun 3, 2016, 9:45 AM safique ahemad wrote: > you got it right... > > But DialTimeout is not a concern here. Client try

Re: Problematic messages in Kafka

2016-06-02 Thread Danny Bahir
quoting from https://cwiki.apache.org/confluence/display/KAFKA/FAQ The high-level consumer will block if the next message available is larger than the maximum fetch size you have specified - One possibility of a stalled consumer is that the fetch size in the consumer is smaller than the
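A minimal sketch of the size settings involved, assuming the 0.8.x property names: the broker-side limit is message.max.bytes, and the high-level consumer's fetch.message.max.bytes (and the brokers' replica.fetch.max.bytes) must be at least as large as the biggest message, or the consumer can stall exactly as described above. Values are illustrative.

```
# broker (server.properties) - illustrative values
message.max.bytes=1000000
replica.fetch.max.bytes=1048576

# high-level consumer (consumer.properties)
# must be >= the largest message the broker will accept
fetch.message.max.bytes=1048576
```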

Re: Dynamic bootstrap.servers with multiple data centers

2016-06-02 Thread Danny Bahir
Yes, I'm in. Sent from my iPhone > On Jun 2, 2016, at 8:32 AM, Ismael Juma wrote: > > Hi Danny, > > A KIP has not been drafted for that yet. Would you be interested in working > on it? > > Ismael > >> On Thu, Jun 2, 2016 at 1:15 PM, Danny Bahir

Re: Does the Kafka Streams DSL support non-Kafka sources/sinks?

2016-06-02 Thread Christian Posta
Hate to bring up "non-flashy" technology... but Apache Camel would be a great fit for something like this. Two java libraries each with very strong suits. On Thu, Jun 2, 2016 at 6:09 PM, Avi Flax wrote: > On 6/2/16, 07:03, "Eno Thereska"

Re: Kafka take too long to update the client with metadata when a broker is gone

2016-06-02 Thread safique ahemad
you got it right... But DialTimeout is not a concern here. Clients try fetching metadata from the Kafka brokers, but Kafka gives them stale metadata for nearly 30-40 seconds. They try to fetch 3-4 times in between until they get updated metadata. This is a completely different problem than

Re: Does the Kafka Streams DSL support non-Kafka sources/sinks?

2016-06-02 Thread Avi Flax
On 6/2/16, 07:03, "Eno Thereska" wrote: > Using the low-level streams API you can definitely read or write to arbitrary > locations inside the process() method. Ah, good to know — thank you! > However, back to your original question: even with the low-level streams >

Re: Kafka take too long to update the client with metadata when a broker is gone

2016-06-02 Thread Steve Tian
So you are coming from https://github.com/Shopify/sarama/issues/661 , right? I'm not sure if anything on the broker side can help, but it looks like you already found that DialTimeout on the client side can help? Cheers, Steve On Fri, Jun 3, 2016, 8:33 AM safique ahemad wrote: > kafka

Re: Kafka take too long to update the client with metadata when a broker is gone

2016-06-02 Thread safique ahemad
kafka version:0.9.0.0 go sarama client version: 1.8 On Thu, Jun 2, 2016 at 5:14 PM, Steve Tian wrote: > Client version? > > On Fri, Jun 3, 2016, 4:44 AM safique ahemad wrote: > > > Hi All, > > > > We are using Kafka broker cluster in our data

Re: Kafka take too long to update the client with metadata when a broker is gone

2016-06-02 Thread Steve Tian
Client version? On Fri, Jun 3, 2016, 4:44 AM safique ahemad wrote: > Hi All, > > We are using a Kafka broker cluster in our data center. > Recently, we realized that when a Kafka broker goes down, the client tries > to refresh the metadata but gets stale metadata for up to

Re: Unavailable Partitions and Uneven ISR

2016-06-02 Thread Russ Lavoie
Have you verified that the old leader of the partition is using the same id as before? Check in zk /brokers/ids to get a list of available brokers. I would use the reassignment tool to move partition 3 to brokers in the list from zk (specifying only 3 brokers). Make sure to include broker with
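A hedged sketch of the reassignment step suggested above; the topic name, broker ids, and ZooKeeper address are placeholders, and the broker ids should come from the live list under /brokers/ids.

```
# reassign.json - move partition 3 onto three live brokers
{"version":1,"partitions":[
  {"topic":"mytopic","partition":3,"replicas":[1,2,3]}
]}
```

```
bin/kafka-reassign-partitions.sh --zookeeper zk1:2181 \
  --reassignment-json-file reassign.json --execute
```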

Re: broker randomly shuts down

2016-06-02 Thread Russ Lavoie
What about in dmesg? I have run into this issue and it was the OOM killer. I also ran into a heap issue using too much of the direct memory (JVM). Reducing the fetcher threads helped with that problem. On Jun 2, 2016 12:19 PM, "allen chan" wrote: > Hi Tom, > >

Kafka take too long to update the client with metadata when a broker is gone

2016-06-02 Thread safique ahemad
Hi All, We are using a Kafka broker cluster in our data center. Recently, we realized that when a Kafka broker goes down, the client tries to refresh the metadata but gets stale metadata for up to nearly 30 seconds. After about 30-35 seconds, the client obtains updated metadata. This is really a
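For the Java clients, the metadata refresh cadence is governed by a few settings; a sketch with illustrative values follows (the sarama Go client discussed elsewhere in this thread has analogous knobs, e.g. Metadata.RefreshFrequency and Metadata.Retry). Lowering these does not stop a broker from serving briefly stale metadata, but it can shorten the window described above.

```
# producer/consumer client properties - illustrative values
metadata.max.age.ms=30000   # force a metadata refresh at least this often
retry.backoff.ms=100        # wait between metadata fetch retries
reconnect.backoff.ms=50     # wait before reconnecting to a failed broker
```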

Problematic messages in Kafka

2016-06-02 Thread Thakrar, Jayesh
Wondering if anyone has encountered similar issues. Using Kafka 0.8.2.1. Occasionally, we encounter a situation in which a consumer (including kafka-console-consumer.sh) just hangs. If I increment the offset to skip the offending message, things work fine again. I have been able to identify

Re: Restoring Kafka data to one broker

2016-06-02 Thread Meghana Narasimhan
Hi All, Any suggestions or inputs on this ? Any help would be greatly appreciated. Thanks, Meghana On Wed, Jun 1, 2016 at 3:01 PM, Meghana Narasimhan < mnarasim...@bandwidth.com> wrote: > Hi, > I have a 3 node cluster with kafka version 0.9.0.1 with many topics having > replication factor 3 and

Re: broker randomly shuts down

2016-06-02 Thread allen chan
Hi Tom, That is one of the first things that I checked. Active memory never goes above 50% of overall available. File cache uses the rest of the memory but I do not think that causes the OOM killer. Either way there are no entries in /var/log/messages (centos) to show OOM is happening. Thanks On

Re: Create KTable from two topics

2016-06-02 Thread Srikanth
I did try approach 3 yesterday with the following sample data. topic1: 127339 538433 131933 626026 128072 536012 128074 546262 *123507 517631* 128073 542361 128073 560608 topic2: 128074 100282 131933 100394 127339 100445 128073 100710 *123507 100226* I joined these and printed the

Re: Track progress of kafka stream job

2016-06-02 Thread Srikanth
Matthias, """bin/kafka-consumer-groups.sh --zookeeper localhost:2181/kafka10 --list""" output didn't show the group I used in streams app. Also, AbstractTask.java had a commit() API. That made me wonder if offset management was overridden too. I'm trying out KafkaStreams for one new streaming
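One likely explanation, assuming a 0.9+/0.10 Streams application: the new consumer commits offsets to Kafka itself rather than to ZooKeeper, so the ZooKeeper-based listing will not show the group. The new-consumer mode of the same tool should; bootstrap address is a placeholder.

```
bin/kafka-consumer-groups.sh --new-consumer \
  --bootstrap-server localhost:9092 --list
```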

Re: Create KTable from two topics

2016-06-02 Thread Matthias J. Sax
I would not expect a performance difference. -Matthias On 06/02/2016 06:15 PM, Srikanth wrote: > In terms of performance there is not going to be much difference to+table > vs through+aggregateByKey rt? > > Srikanth > > > On Thu, Jun 2, 2016 at 9:21 AM, Matthias J. Sax

Re: Create KTable from two topics

2016-06-02 Thread Srikanth
In terms of performance there is not going to be much difference to+table vs through+aggregateByKey rt? Srikanth On Thu, Jun 2, 2016 at 9:21 AM, Matthias J. Sax wrote: > Hi Srikanth, > > your third approach seems to be the best fit. It uses only one shuffle > of the

RE: Changing default logger to RollingFileAppender (KAFKA-2394)

2016-06-02 Thread Tauzell, Dave
The RollingFileAppender is required for use in production. -Dave -Original Message- From: Dustin Cote [mailto:dus...@confluent.io] Sent: Thursday, June 02, 2016 9:51 AM To: users@kafka.apache.org Subject: Re: Changing default logger to RollingFileAppender (KAFKA-2394) Just to clarify,

Re: Avro deserialization

2016-06-02 Thread Rick Mangi
Thanks! > On May 31, 2016, at 1:00 PM, Michael Noll wrote: > > FYI: I fixed the docs of schema registry (vProps -> props). > > Best, Michael > > > On Tue, May 31, 2016 at 2:05 AM, Rick Mangi wrote: > >> That was exactly the problem, I found the

Re: How to use HDP kafka?

2016-06-02 Thread Igor Kravzov
I used Ambari automatic deployment and it does not require additional configurations unless you need some changes. On Wed, Jun 1, 2016 at 9:15 PM, Shaolu Xu wrote: > Hi All, > > I used the latest HDP 2.4 version. > Did you do some configuration before using HDP? I

Re: Changing default logger to RollingFileAppender (KAFKA-2394)

2016-06-02 Thread Dustin Cote
Just to clarify, do you mean you are using the RollingFileAppender in production, or the naming convention for DailyRollingFileAppender is required by your production systems? On Thu, Jun 2, 2016 at 10:49 AM, Andrew Otto wrote: > +1, this is what Wikimedia uses in

Re: Changing default logger to RollingFileAppender (KAFKA-2394)

2016-06-02 Thread Andrew Otto
+1, this is what Wikimedia uses in production. On Thu, Jun 2, 2016 at 10:38 AM, Tauzell, Dave wrote: > I haven't started using this in production but this is how I will likely > set up the logging as it is easier to manage. > > -Dave > > -Original Message-

RE: Changing default logger to RollingFileAppender (KAFKA-2394)

2016-06-02 Thread Tauzell, Dave
I haven't started using this in production but this is how I will likely set up the logging as it is easier to manage. -Dave -Original Message- From: Dustin Cote [mailto:dus...@confluent.io] Sent: Thursday, June 02, 2016 9:33 AM To: users@kafka.apache.org; d...@kafka.apache.org Subject:

Changing default logger to RollingFileAppender (KAFKA-2394)

2016-06-02 Thread Dustin Cote
Hi all, I'm looking at changing the Kafka default logging setup to use the RollingFileAppender instead of the DailyRollingFileAppender in an effort to accomplish two goals: 1) Avoid filling up users' disks if the log files grow unexpectedly 2) Move off the admittedly unreliable
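A sketch of what the proposed default could look like in config/log4j.properties; the size and retention values below are illustrative, not the values proposed in the KIP discussion.

```
# RollingFileAppender caps disk usage at roughly MaxFileSize * (MaxBackupIndex + 1)
log4j.appender.kafkaAppender=org.apache.log4j.RollingFileAppender
log4j.appender.kafkaAppender.File=${kafka.logs.dir}/server.log
log4j.appender.kafkaAppender.MaxFileSize=100MB
log4j.appender.kafkaAppender.MaxBackupIndex=10
log4j.appender.kafkaAppender.layout=org.apache.log4j.PatternLayout
log4j.appender.kafkaAppender.layout.ConversionPattern=[%d] %p %m (%c)%n
```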

Re: Dynamic bootstrap.servers with multiple data centers

2016-06-02 Thread Enrico Olivelli
I am in the same situation. I use ZooKeeper to publish Kafka broker endpoints for dynamic discovery. On Thu, Jun 2, 2016 at 14:33, Ismael Juma wrote: > Hi Danny, > > A KIP has not been drafted for that yet. Would you be interested in working > on it? > > Ismael > > On Thu, Jun

Re: Kafka broker slow down when consumer try to fetch large messages from topic

2016-06-02 Thread Tom Crayford
Hi there, Firstly, a note that Kafka isn't really designed for this kind of large message. http://ingest.tips/2015/01/21/handling-large-messages-kafka/ covers a lot of tips around this use case however, and covers some tuning that will likely improve your usage. In particular, I expect tuning up

Re: Change Topic Name

2016-06-02 Thread Todd Palino
With the caveat that I’ve never tried this before... I don’t see a reason why this wouldn’t work. There’s no topic information that’s encoded in the log segments, as far as I’m aware. And there’s no information about offsets stored in Zookeeper. So in theory, you should be able to shut down the
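Echoing the caveat above, this procedure is untested; a pseudocode sketch of the idea, with placeholder paths and topic names. Note that the topic metadata znodes under /brokers/topics in ZooKeeper would also need to reflect the new name.

```
# 1. cleanly shut down every broker in the cluster
# 2. on each broker, rename the partition directories on disk, e.g.:
mv /var/kafka-logs/oldname-0 /var/kafka-logs/newname-0
mv /var/kafka-logs/oldname-1 /var/kafka-logs/newname-1
# 3. recreate the topic metadata in ZooKeeper under the new name
#    (and remove the old znode), then restart the brokers
```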

Re: Best monitoring tool for Kafka in production

2016-06-02 Thread Gerard Klijs
Not that I have anything against paying for monitoring, or against Confluent, but you will need your consumers to be using Kafka 0.10 if you want to make the most out of the Confluent solution. We currently are using Zabbix; it's free, and it has complete functionality in one product. It can be a

Re: Track progress of kafka stream job

2016-06-02 Thread Matthias J. Sax
Hi Srikanth, I am not exactly sure if I understand your question correctly. One way to track the progress is to get the current record offset (you can obtain it in the low-level Processor API via the provided Context object). Otherwise, on commit, all writes to intermediate topics are flushed

Re: Create KTable from two topics

2016-06-02 Thread Matthias J. Sax
Hi Srikanth, your third approach seems to be the best fit. It uses only one shuffle of the data (which you cannot prevent in any case). If you want to put everything into a single application, you could use a "dummy" custom aggregation to convert the KStream into a KTable instead of writing into
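A non-runnable sketch of the "dummy" aggregation idea, assuming the 0.10.0-era Streams DSL (method names changed in later releases, e.g. to groupByKey().reduce(...)). The reducer simply keeps the latest value per key, turning the KStream into a KTable without the application writing an intermediate topic itself; the store name is a placeholder.

```java
// hedged sketch against the early Kafka Streams DSL
KStream<String, String> stream =
    builder.stream(Serdes.String(), Serdes.String(), "topic2");

KTable<String, String> table = stream.reduceByKey(
    (oldValue, newValue) -> newValue,   // "dummy" aggregation: keep latest value
    Serdes.String(), Serdes.String(),
    "topic2-to-table-store");           // local state store name (placeholder)
```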

Re: broker randomly shuts down

2016-06-02 Thread Tom Crayford
That looks like somebody is killing the process. I'd suspect either the linux OOM killer or something else automatically killing the JVM for some reason. For the OOM killer, assuming you're on ubuntu, it's pretty easy to find in /var/log/syslog (depending on your setup). I don't know about other

Re: Dynamic bootstrap.servers with multiple data centers

2016-06-02 Thread Ismael Juma
Hi Danny, A KIP has not been drafted for that yet. Would you be interested in working on it? Ismael On Thu, Jun 2, 2016 at 1:15 PM, Danny Bahir wrote: > Thanks Ben. > > The comments on the Jira mention a pluggable component that will manage > the bootstrap list from a

Re: Dynamic bootstrap.servers with multiple data centers

2016-06-02 Thread Danny Bahir
Thanks Ben. The comments on the Jira mention a pluggable component that will manage the bootstrap list from a discovery service. That's exactly what I need. Was a Kip drafted for this enhancement? -Danny > On Jun 1, 2016, at 7:05 AM, Ben Stopford wrote: > > Hey Danny >

Re: Best monitoring tool for Kafka in production

2016-06-02 Thread Michael Noll
Hafsa, since you specifically asked about non-free Kafka monitoring options as well: As of version 3.0.0, the Confluent Platform provides a commercial monitoring tool for Kafka called Confluent Control Center. (Disclaimer: I work for Confluent.) Quoting from the product page at

Re: Does the Kafka Streams DSL support non-Kafka sources/sinks?

2016-06-02 Thread Eno Thereska
Hi Avi, Using the low-level streams API you can definitely read or write to arbitrary locations inside the process() method. However, back to your original question: even with the low-level streams API the sources and sinks can only be Kafka topics for now. So, as Gwen mentioned, Connect

Re: Kafka encryption

2016-06-02 Thread Tom Crayford
Filesystem encryption is transparent to Kafka. You don't need to use SSL, but your encryption requirements may cause you to need SSL as well. With regards to compression, without adding at rest encryption to Kafka (which is a very major piece of work, one that for sure requires a KIP and has
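For reference, encryption in transit can be enabled with the built-in SSL support (Kafka 0.9+); a broker-side sketch with placeholder paths and passwords. This protects data on the wire only; data at rest relies on filesystem/disk encryption, as noted above.

```
# server.properties (0.9+) - paths and passwords are placeholders
listeners=PLAINTEXT://:9092,SSL://:9093
ssl.keystore.location=/var/private/ssl/kafka.server.keystore.jks
ssl.keystore.password=changeme
ssl.key.password=changeme
ssl.truststore.location=/var/private/ssl/kafka.server.truststore.jks
ssl.truststore.password=changeme
```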

Re: ClosedChannelException when trying to read from remote Kafka in AWS

2016-06-02 Thread Mudit Kumar
Glad to hear that your issue is fixed now! On 6/2/16, 2:11 PM, "Marco B." wrote: >Hi Mudit, > >Thanks a lot for your answer. > >However, today we have set "advertised.host.name" on each kafka instance to >the specific IP address of each node. For example, by default kafka
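For reference, the fix described above amounts to setting advertised.host.name (and optionally advertised.port) in each broker's server.properties so that clients outside AWS receive an address they can actually reach; the values below are placeholders.

```
# server.properties on each broker - placeholder values
advertised.host.name=54.0.0.10   # public/elastic IP reachable by clients
advertised.port=9092
```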