Question regarding functionality of MirrorMaker

2016-08-25 Thread UMESH CHAUDHARY
Hey Folks, I was trying to understand the behavior of MirrorMaker, but it looks like I am missing something here. Please see the steps I performed: 1) I configured MM on the source Kafka cluster. 2) Created a topic and pushed some data into it using the console producer. 3) My understanding is that MM

kafka-mirror-maker.sh ssl

2016-08-25 Thread Erik Parienty
As I understand it, mirror-maker supports consumer SSL. I tried to set it, but I get: WARN Property security.protocol is not valid (kafka.utils.VerifiableProperties). And it's connecting without SSL.

Re: Kafka streams

2016-08-25 Thread Abhishek Agarwal
Hi Eno, Thanks for your reply. If my application DAG has three stream processors, first of which is source, would all of them run in single thread? There may be scenarios wherein I want to have different number of threads for different processors since some may be CPU bound and some may be IO

Re: Kafka Producer performance - 400GB of transfer on single instance taking > 72 hours?

2016-08-25 Thread Dominik Safaric
Dear Dana, > I would recommend > other tools for bulk transfers. What tools/languages would you recommend rather than using Python? I could for sure accomplish the same by using the native Java Kafka Producer API, but should this really affect the performance under the assumption that the

Re: Kafka streams

2016-08-25 Thread Eno Thereska
Good question. All of them would run in a single thread. That is the model. Multiple threads would make sense to run separate DAGs. Eno > On 25 Aug 2016, at 18:32, Abhishek Agarwal wrote: > > Hi Eno, > > Thanks for your reply. If my application DAG has three stream

RE: Kafka Producer performance - 400GB of transfer on single instance taking > 72 hours?

2016-08-25 Thread Tauzell, Dave
I would write a Python client that writes dummy data to Kafka to measure how fast you can write to Kafka without MongoDB in the mix. I've been doing load testing recently, and with 3 brokers I can write 100MB/s (using Java clients). -Dave -Original Message- From: Dominik Safaric

Kafka streams

2016-08-25 Thread Abhishek Agarwal
Hi, I was reading up on kafka streams for a project and came across this blog https://softwaremill.com/kafka-streams-how-does-it-fit-stream-landscape/ I wanted to validate some assertions made in blog, with kafka community - Kafka streams is kafka-in, kafka-out application. Does the user need

Re: Kafka streams

2016-08-25 Thread Eno Thereska
Hi Abhishek, - Correct on connecting to external stores. You can use Kafka Connect to get things in or out. (Note that in the 0.10.1 release KIP-67 allows you to directly query Kafka Streams' stores so, for some kinds of data, you don't need to move it to an external store. This is pushed in

Re: Networking errors and durability settings

2016-08-25 Thread Jun Rao
Bryan, https://issues.apache.org/jira/browse/KAFKA-3410 reported a similar issue but only happened when the leader broker's log was manually deleted. In your case, was there any data loss in the broker due to things like power outage? Thanks, Jun On Tue, Aug 23, 2016 at 9:00 AM, Bryan Baugher

Production Use Cases

2016-08-25 Thread Tauzell, Dave
Does anybody do the following in production? If so, what are your experiences? 1. Use .Net applications for producers or consumers 2. Consume messages across the WAN (across datacenters) - I'm wondering if MirrorMaker is always a requirement for cross-WAN -Dave

Re: KIP-33 Opt out from Time Based indexing

2016-08-25 Thread Jun Rao
Jan, Thanks a lot for the feedback. Now I understand your concern better. The following are my comments. The first odd thing that you pointed out could be a real concern. Basically, if a producer publishes messages with really old timestamps, our default log.roll.hours (7 days) will indeed cause

Kafka 0.8.2.2 - CLOSE_WAITS on broker

2016-08-25 Thread Bharath Srinivasan
Hello: We are running a data pipeline application stack using Kafka 0.8.2.2 in production. We have been seeing intermittent CLOSE_WAIT on our kafka brokers frequently and they fill up the file handles pretty quickly. By the time the open file count reaches around 40K, the node becomes

Re: Kafka Producer performance - 400GB of transfer on single instance taking > 72 hours?

2016-08-25 Thread Dana Powers
kafka-python includes some benchmarking scripts in https://github.com/dpkp/kafka-python/tree/master/benchmarks The concurrency and execution model of the JVM are both significantly different than python. I would definitely recommend some background reading on CPython GIL if you are interested on

Broker disappear from zookeeper

2016-08-25 Thread 李斯宁
Hi guys, I have a three-node Kafka cluster. Sometimes one of the Kafka brokers disappears from ZooKeeper "/brokers/ids", but the process is still alive and its logs seem normal. Has anyone else seen this problem? Is this a bug? I suspect that when the session expired, the Kafka broker do

Re: oom of kafka

2016-08-25 Thread 黄川
Is there anybody who can help me? 2016-08-23 8:49 GMT+08:00 黄川 : > Hi, I am using kafka_2.11-0.9.0.1, there are multiple of warnings occur > in kafkaServer.out: > java.lang.OutOfMemoryError: Java heap space. > > >- the jstat output like this: > > # jstat -gc 28591 > S0C

Re: Kafka streams

2016-08-25 Thread Matthias J. Sax
Just want to add something: If you use the Kafka Streams DSL, the library is Kafka-centric. However, you could use the low-level Processor API to get data into your topology from other systems. The problem will be the missing fault tolerance, which you would need to code yourself. When reading from Kafka,

Re: KIP-33 Opt out from Time Based indexing

2016-08-25 Thread Becket Qin
Hi Jan, It seems your main concern is the changed behavior of time-based log rolling and time-based retention. That is actually why we have two timestamp types. If the user sets log.message.timestamp.type to LogAppendTime, the broker will behave exactly the same as before, except the

Joining Streams with Kafka Streams

2016-08-25 Thread Caleb Welton
Hello, I'm trying to understand best practices related to joining streams using the Kafka Streams API. I can configure the topology such that two sources feed into a single processor: topologyBuilder .addSource("A", stringDeserializer, itemDeserializer, "a-topic") .addSource("B",

Re: Networking errors and durability settings

2016-08-25 Thread Guozhang Wang
Hello Bryan, I think you were encountering https://issues.apache.org/jira/browse/KAFKA-3410. Maybe you can take a look at this ticket and see if it matches your scenario. Guozhang On Tue, Aug 23, 2016 at 9:00 AM, Bryan Baugher wrote: > Hi everyone, > > Yesterday we had lots

consumer with version 0.10.0

2016-08-25 Thread Jack Yang
Hi all, I am using kafka 0.10.0.1, and I set up my listeners like: listeners=PLAINTEXT://myhostName:9092 then I have one consumer going using the new api. However, I did not see anything return for the api. The log from kafka is: [2016-08-26 14:39:28,548] INFO [GroupCoordinator 0]: Preparing

Re: Question regarding functionality of MirrorMaker

2016-08-25 Thread UMESH CHAUDHARY
Hello Mate, Thanks for your detailed response and it surely helps. WhiteList is the required config for MM from 0.9.0 onwards. And you are correct that --new-consumer requires --bootstrap-servers rather than --zookeeper . However, did you notice that MM picks the topics which are present at the