Re: kafka shutdown automatically

2015-01-20 Thread Yonghui Zhao
0.8.1.1 we have 2 brokers. broker-0 and broker-1 2015-01-20 8:43 GMT+08:00 Guozhang Wang wangg...@gmail.com: Yonghui, which version of Kafka are you using? And does your cluster only have one (broker-0) server? Guozhang On Sat, Jan 17, 2015 at 11:53 PM, Yonghui Zhao zhaoyong...@gmail.com

typo in wiki

2015-01-20 Thread svante karlsson
In the wiki - there is a statement that a partition must fit on a single machine, while technically true, isn't it so that a partition must fit on a single disk on that machine. https://cwiki.apache.org/confluence/display/KAFKA/FAQ#FAQ-HowmanytopicscanIhave ? A partition is basically a
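Svante's point maps onto the broker's log.dirs setting: each partition's log lives entirely inside one of the listed directories, so with one directory per physical disk a partition can never span disks. A hypothetical server.properties fragment (paths invented for illustration):

```properties
# One log directory per physical disk (JBOD layout); any single
# partition is stored wholly inside one of these directories.
log.dirs=/disk1/kafka-logs,/disk2/kafka-logs,/disk3/kafka-logs
```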

How to handle broker disk failure

2015-01-20 Thread svante karlsson
I'm trying to figure out the best way to handle a disk failure in a live environment. The obvious (and naive) solution is to decommission the broker and let other brokers take over and create new followers. Then replace the disk, clean the remaining log directories and add the broker again.

RE: Backups

2015-01-20 Thread Gene Robichaux
Thanks for the feedback. Our DEV team has built a MirrorMaker-like process that mirrors all topics between two DCs. It is basically a separate consumer/producer process that shovels data from DC A to DC B between two separate Kafka clusters. So in essence we have a replication factor of 6 (3

Re: How to handle broker disk failure

2015-01-20 Thread Yang Fang
I think the best way is RAID, not JBOD. If one disk of a JBOD setup goes bad, the broker shuts down, and then it takes a long time to recover. Brokers which run for a long time will accumulate more and more partition leaders, so I/O pressure will be unbalanced. BTW, I use kafka 0.8.0-beta1

Re: Backups

2015-01-20 Thread Otis Gospodnetic
Could one use ZFS or BTRFS snapshot functionality for this? Otis -- Monitoring * Alerting * Anomaly Detection * Centralized Log Management Solr Elasticsearch Support * http://sematext.com/ On Tue, Jan 20, 2015 at 1:39 AM, Gwen Shapira gshap...@cloudera.com wrote: Hi, As a former DBA, I
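Otis's snapshot idea can be sketched in a few commands. This is only an illustration, assuming the Kafka log directory sits on a hypothetical ZFS dataset named tank/kafka-logs and that a host named backuphost exists; it also glosses over the cross-broker consistency problem Gwen raises later in the thread:

```shell
# Take a point-in-time snapshot of the broker's log dataset (names hypothetical).
zfs snapshot tank/kafka-logs@backup-$(date +%Y%m%d)

# Optionally ship the snapshot off-cluster for safekeeping.
zfs send tank/kafka-logs@backup-$(date +%Y%m%d) | ssh backuphost zfs receive backup/kafka-logs
```

Note this captures one broker's view only; coordinating snapshots across all brokers at the same instant is the hard part.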

Re: Backups

2015-01-20 Thread Gwen Shapira
Nice idea! Console consumer won't necessarily work, since it doesn't parallelize, but using something like Camus to backup to HDFS can be pretty cool. On Tue, Jan 20, 2015 at 10:15 AM, Jayesh Thakrar j_thak...@yahoo.com.invalid wrote: Another option is to copy data from each topic (of

RE: Backups

2015-01-20 Thread Gene Robichaux
We are using a 7 day retention. I like the idea of using a console-consumer to dump the topics nightly, but with a replication factor of 3 in each DC I am not sure it is even worth the trouble. We would have to have a pretty tragic event to take out all 4 of 10 servers. Could I have a

Updated Logstash-Kafka plugin

2015-01-20 Thread Joseph Lawson
Hello Everyone, As we wait for the 1.5 release of Logstash, which will include the Kafka plugin by default, I've put some work into the logstash-kafka plugin which makes it much easier to install and also enables logging from the kafka library when you use --debug or --verbose toggles with

Re: Number of Consumers Connected

2015-01-20 Thread Sa Li
Guozhang Thank you very much for reply, here I print out the kafka-console-consumer.sh help, root@voluminous-mass:/srv/kafka# bin/kafka-console-consumer.sh Missing required argument [zookeeper] Option Description --

Re: Backups

2015-01-20 Thread Gwen Shapira
Interesting question. I think you'll need to sync it for the exact same time across the entire cluster, otherwise you'll recover from an inconsistent state. Not sure if this is feasible, or how Kafka handles starting from inconsistent state. If I were the sysadmin, I'd go with the good old MySQL

Re: Can I run Kafka Producer in BG and then run a program that outputs to the console?

2015-01-20 Thread Su She
Ahh okay got it. Thank you Magnus! On Tue, Jan 20, 2015 at 1:55 PM, Magnus Edenhill mag...@edenhill.se wrote: If you pipe your java program to kafkacat then the java program's stdout (output) will be directed to kafkacat's stdin (input). java program | kafkacat 2015-01-20 22:48

Re: Can I run Kafka Producer in BG and then run a program that outputs to the console?

2015-01-20 Thread Su She
Ahh yea that is what I was looking for, thank you! I only need it on the producer side, are there any other options for that? Thanks! On Tue, Jan 20, 2015 at 12:46 PM, Joe Stein joe.st...@stealth.ly wrote: This is a stdin/stdout producer/consumer that works great for (what I think) you are

Re: Can I run Kafka Producer in BG and then run a program that outputs to the console?

2015-01-20 Thread Magnus Edenhill
If you pipe your java program to kafkacat then the java program's stdout (output) will be directed to kafkacat's stdin (input). java program | kafkacat 2015-01-20 22:48 GMT+01:00 Su She suhsheka...@gmail.com: Thanks Magnus, One clarification...If I have a java program outputting messages,
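The pipe semantics Magnus describes hold for any two commands: the shell connects the left side's stdout to the right side's stdin. A minimal illustration, with `cat` standing in for kafkacat purely to show the plumbing:

```shell
# The shell connects the producer program's stdout to the consumer's stdin.
# 'cat' stands in for kafkacat here just to demonstrate the mechanism.
printf 'msg1\nmsg2\n' | cat
# prints:
# msg1
# msg2
```

With kafkacat on the right-hand side, each of those lines would become one Kafka message.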

Re: Can I run Kafka Producer in BG and then run a program that outputs to the console?

2015-01-20 Thread Magnus Edenhill
Hi Su, It is as simple as piping to it: some_program_outputting_the_stuff | kafkacat -b mybroker -t mytopic That will produce one kafka message to topic 'mytopic' for each line of input. See kafkacat -h for the full range of options (e.g., delimiter, specific partition, etc). Hope that helps,

Re: How to run Kafka Producer in Java environment? How to set mainClass in pom file in EC2 instance?

2015-01-20 Thread Ewen Cheslack-Postava
You should only need jar.with.dependencies.jar -- maven-assembly-plugin's jar-with-dependencies mode collects all your code and project dependencies into one jar file. It looks like the problem is that your mainclass is set to only 'HelloKafkaProducer'. You need to specify the full name
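Ewen's fix amounts to pointing the assembly plugin's manifest at the fully-qualified class. A sketch of the relevant pom.xml fragment; the package name com.spnotes.kafka follows the thread's example and may differ in your project:

```xml
<!-- maven-assembly-plugin sketch: the mainClass must be fully qualified,
     not just 'HelloKafkaProducer'. Package name here is illustrative. -->
<plugin>
  <artifactId>maven-assembly-plugin</artifactId>
  <configuration>
    <archive>
      <manifest>
        <mainClass>com.spnotes.kafka.HelloKafkaProducer</mainClass>
      </manifest>
    </archive>
    <descriptorRefs>
      <descriptorRef>jar-with-dependencies</descriptorRef>
    </descriptorRefs>
  </configuration>
</plugin>
```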

Re: Poll: Producer/Consumer impl/language you use?

2015-01-20 Thread Koert Kuipers
No Scala? Although Scala can indeed use the Java API, it's ugly; we prefer to use the Scala API (which I believe will go away, unfortunately). On Tue, Jan 20, 2015 at 2:52 PM, Otis Gospodnetic otis.gospodne...@gmail.com wrote: Hi, I was wondering which implementations/languages people use

Re: How to run Kafka Producer in Java environment? How to set mainClass in pom file in EC2 instance?

2015-01-20 Thread Su She
Hello Ewen, Thanks for the response. I am running this on an EC2 instance; instead of com.spnotes...I tried the path like home.ec2-user...HelloKafkaProducer, but that did not work. Update/Edit: Fixed it...I had to move my java file to src/main/java and then do the mvn clean install. It seemed

Re: Jar files needed to run in Java environment (without Maven)

2015-01-20 Thread Jun Rao
The following are the jars you need. kafka-clients-0.8.2.0.jar scala-library-2.10.4.jar zookeeper-3.4.6.jar zkclient-0.3.jar metrics-core-2.2.0.jar jopt-simple-3.2.jar slf4j-api-1.7.6.jar snappy-java-1.1.1.6.jar lz4-1.2.0.jar slf4j-log4j12-1.6.1.jar log4j-1.2.16.jar kafka_2.10-0.8.2.0.jar Thanks,
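Jun's jar list translates into a classpath like the following. A sketch only: `my.package.MyProducer` is a placeholder for your own main class, and the jars are assumed to sit in the current directory:

```shell
# Run a producer class against the 0.8.2.0 jars Jun listed, without Maven.
# 'my.package.MyProducer' is a hypothetical main class.
java -cp kafka_2.10-0.8.2.0.jar:kafka-clients-0.8.2.0.jar:scala-library-2.10.4.jar:\
zookeeper-3.4.6.jar:zkclient-0.3.jar:metrics-core-2.2.0.jar:jopt-simple-3.2.jar:\
slf4j-api-1.7.6.jar:slf4j-log4j12-1.6.1.jar:log4j-1.2.16.jar:\
snappy-java-1.1.1.6.jar:lz4-1.2.0.jar my.package.MyProducer
```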

Re: Updated Logstash-Kafka plugin

2015-01-20 Thread Jun Rao
Joe, Thanks for sharing. Do you want to add a link to our wiki ( https://cwiki.apache.org/confluence/display/KAFKA/Ecosystem)? Jun On Tue, Jan 20, 2015 at 7:59 AM, Joseph Lawson jlaw...@roomkey.com wrote: Hello Everyone, As we wait for the 1.5 release of Logstash which will include the

Re: dumping JMX data

2015-01-20 Thread Jaikiran Pai
Just had a quick look at this and it turns out the object name you are passing is incorrect. I had to change it to: ./kafka-run-class.sh kafka.tools.JmxTool --object-name 'kafka.server:name=UnderReplicatedPartitions,type=ReplicaManager' --jmx-url

Re: Number of Consumers Connected

2015-01-20 Thread Guozhang Wang
It seems you are not running the latest version of Kafka; which version are you using? On Tue, Jan 20, 2015 at 9:46 AM, Sa Li sal...@gmail.com wrote: Guozhang Thank you very much for reply, here I print out the kafka-console-consumer.sh help, root@voluminous-mass:/srv/kafka# bin/kafka-console-consumer.sh

Re: Can I run Kafka Producer in BG and then run a program that outputs to the console?

2015-01-20 Thread Su She
Thanks Magnus, One clarification...If I have a java program outputting messages, that is stdout, can the producer use stdout or only stdin? Thanks, Su On Tue, Jan 20, 2015 at 1:10 PM, Magnus Edenhill mag...@edenhill.se wrote: Hi Su, It is as simple as piping to it:

Re: kafka brokers going down within 24 hrs

2015-01-20 Thread Tousif
any help ? On Mon, Jan 19, 2015 at 11:43 AM, Tousif tousif.pa...@gmail.com wrote: Here are the logs from broker id 0 and 1 and it was captured when broker 1 went down. http://paste.ubuntu.com/9782553/ http://paste.ubuntu.com/9782554/ i'm using netty in storm and here are the configs

Re: connection error among nodes

2015-01-20 Thread Jun Rao
1. bootstrap.servers is only used for issuing the very first metadata request. Once the leader of a topic/partition is found, the producer will talk to the leader brokers directly. 2. Do you see any disconnects in the broker log? Thanks, Jun On Mon, Jan 19, 2015 at 11:05 AM, Sa Li

Re: How to handle broker disk failure

2015-01-20 Thread Jun Rao
Actually, you don't need to reassign partitions in this case. You just need to replace the bad disk and restart the broker. It will copy the missing data over automatically. Thanks, Jun On Tue, Jan 20, 2015 at 1:02 AM, svante karlsson s...@csi.se wrote: I'm trying to figure out the best way
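Jun's procedure can be sketched as a command sequence. Paths, device names, and the service name are hypothetical; the key point is that no partition reassignment is issued — replication refills the replaced disk on restart:

```shell
# Sketch of the replace-and-restart procedure (names hypothetical).
service kafka stop                         # stop the broker with the failed disk
# ...physically replace the disk, then rebuild the filesystem and mount it...
mkfs.ext4 /dev/sdX1
mount /dev/sdX1 /data/kafka-logs-3         # same mount point listed in log.dirs
service kafka start                        # broker re-fetches missing data from the leaders
```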

Can I run Kafka Producer in BG and then run a program that outputs to the console?

2015-01-20 Thread Su She
Hello Everyone, Sorry for asking multiple questions, but I am currently trying another approach to run a kafka producer. 1) I started the kafka console producer as mentioned here in the background (just added a & to the end of the producer script) :
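Appending `&` puts the producer in the background. A sketch with placeholder broker address and topic, using the 0.8-era console-producer flags; reading from a file rather than the terminal avoids the backgrounded process being stopped while waiting on stdin:

```shell
# Background console producer (broker/topic/file names are placeholders).
bin/kafka-console-producer.sh --broker-list localhost:9092 --topic mytopic < input.txt &
```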

Re: Can I run Kafka Producer in BG and then run a program that outputs to the console?

2015-01-20 Thread Joe Stein
This is a stdin/stdout producer/consumer that works great for (what I think) you are trying to-do https://github.com/edenhill/kafkacat /*** Joe Stein Founder, Principal Consultant Big Data Open Source Security LLC http://www.stealth.ly Twitter:

Re: How to handle broker disk failure

2015-01-20 Thread Koert Kuipers
I think it would be nice if the recommended setup for Kafka were JBOD and not RAID because: * it makes it easy to test Kafka on an existing hadoop/spark cluster * co-location, for example we colocate Kafka and Spark Streaming (our Spark Streaming app is kafka-partition-location aware) ideally kafka