Re: Kafka server relocation

2015-04-07 Thread tao xiao
You may need to look into the consumer and producer metrics to identify the root cause. Metrics in the kafka.consumer and kafka.producer categories will help you find the problem. This link gives instructions on how to read the metrics: http://kafka.apache.org/documentation.html#monitoring O

Request for adding us to the "Powered By" list

2015-04-07 Thread Anuj Goyal
Dear Kafka team, Could you add us at https://cwiki.apache.org/confluence/display/KAFKA/Powered+By? Here is the blurb: *IFTTT (www.ifttt.com) - We use Kafka to ingest real-time log and tracking data for analytics, dashboards, and machine learning.*

Re: question about Kafka

2015-04-07 Thread Jiangjie Qin
Yes, you might need to write some code to read from the log end and send it to Kafka using Kafka’s producer. On 4/6/15, 2:39 PM, "Sun, Joey" wrote: >Thanks for your info, Becket. > >Does it mean I should program for it? is there any other app can >gracefully glue access_log to Kafka's producer?
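
The suggestion above (read from the end of the log and hand each line to a producer) can be sketched roughly as follows. This is a minimal illustration, not a tool named in the thread: `read_new_lines`, `seek_to_end` and `ship` are hypothetical helpers, and `send` stands in for whatever producer call is available (for example a client method that accepts a topic and a bytes payload, which is an assumption here).

```python
import io
from typing import Callable, Iterator, TextIO


def seek_to_end(log: TextIO) -> None:
    """Skip existing content so only lines appended later are shipped."""
    log.seek(0, io.SEEK_END)


def read_new_lines(log: TextIO) -> Iterator[str]:
    """Yield complete lines that are available after the current position."""
    while True:
        position = log.tell()
        line = log.readline()
        if not line.endswith("\n"):
            # Nothing new, or a line that is still being written:
            # rewind so it is re-read once the writer finishes it.
            log.seek(position)
            return
        yield line.rstrip("\n")


def ship(log: TextIO, send: Callable[[str, bytes], object], topic: str) -> int:
    """Pass every complete new line to `send` and return how many were sent."""
    count = 0
    for line in read_new_lines(log):
        send(topic, line.encode("utf-8"))
        count += 1
    return count
```

In practice `ship` would be called in a loop with a short sleep between calls, after one initial `seek_to_end`. Log rotation and delivery retries are deliberately left out of this sketch.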

Re: Is there a complete Kafka 0.8.* replication design document

2015-04-07 Thread Jun Rao
Yes, the wiki is a bit old. You can find out more about replication in the following links. http://kafka.apache.org/documentation.html#replication http://www.slideshare.net/junrao/kafka-replication-apachecon2013 #1, #2, #8. See the ZK layout in https://cwiki.apache.org/confluence/display/KAFKA/Kaf

Re: Number of Partitions and Performance

2015-04-07 Thread François Méthot
Thanks guys for the clarification about the "rule of thumb formula", I will stick with a reasonably small set of partitions but add a few to make them a multiple of the number of brokers. Todd, I read your post yesterday as well, very helpful. On Tue, Apr 7, 2015 at 1:42 PM, Todd Palino wrote:
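
The "add a few to make them a multiple of the number of brokers" step is simple arithmetic. A small sketch, using the 8 to 16 partitions and 10 brokers mentioned in the original question of this thread (the function name is hypothetical):

```python
def round_up_to_broker_multiple(partitions: int, brokers: int) -> int:
    """Return the smallest multiple of `brokers` that is >= `partitions`."""
    if partitions < 1 or brokers < 1:
        raise ValueError("partitions and brokers must be positive")
    # Ceiling division without floating point.
    multiples = -(-partitions // brokers)
    return multiples * brokers


# With 10 brokers, a topic with 8 partitions would grow to 10,
# and a topic with 16 partitions would grow to 20.
print(round_up_to_broker_multiple(8, 10))
print(round_up_to_broker_multiple(16, 10))
```

Partition counts can be increased but not decreased on an existing topic, and adding partitions changes the key-to-partition mapping, so rounding up is best done before keyed data depends on the layout.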

Re: Kafka server relocation

2015-04-07 Thread nitin sharma
Hi, sorry for the late response. I have been able to fix the issue; the problem was in my approach. I confused my source and target systems while defining the consumer and producer property files. It is fixed now. Now a new issue: the rate at which data is migrated is very, very slow... I mean I

Re: Problem with node after restart no partitions?

2015-04-07 Thread Jason Rosenberg
Thunder, thanks for the detailed info. I can confirm that our incident had the same (or similar) sequence of messages when the first upgraded broker restarted (after having undergone an unclean shutdown). I think it makes sense at this point to file a JIRA issue to track it. (Could mostly just

Re: Kafka question

2015-04-07 Thread Jack
That would be really useful. Thanks for your writing, Guozhang. I will give it a shot and let you know. On Tue, Apr 7, 2015 at 10:06 AM, Guozhang Wang wrote: > Jack, > > Okay I see your point now. I was originally thinking that in each run, you > 1) first create the topic, 2) start producing to

Re: Number of Partitions and Performance

2015-04-07 Thread Todd Palino
Going to stand with Jay here :) I just posted an email yesterday about how we size clusters and topics. Basically, have at least as many partitions as you have consumers in your consumer group (preferably a multiple). If you want to balance it across the cluster, also have it be a multiple of the

Re: Number of Partitions and Performance

2015-04-07 Thread Jay Kreps
I think the blog post was giving that as an upper bound, not a recommended size. I think that blog goes through some of the trade-offs of having more or fewer partitions. -Jay On Tue, Apr 7, 2015 at 10:13 AM, François Méthot wrote: > Hi, > > We initially had configured our topics to have betwe

Number of Partitions and Performance

2015-04-07 Thread François Méthot
Hi, We initially had configured our topics to have between 8 and 16 partitions each on a cluster of 10 brokers (VMs with 2 cores, 16 MB RAM, a few TB of SAN disk). Then I came across the rule of thumb formula *100 x b x r.* ( http://blog.confluent.io/2015/03/12/how-to-choose-the-number-of-topicspar
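
To put the *100 x b x r* figure next to the 8 to 16 partitions mentioned above, here is the arithmetic as a sketch. The snippet does not define the symbols; this assumes b is the number of brokers and r is the replication factor, and the replication factor of 2 is an assumption made only for illustration.

```python
def rule_of_thumb_bound(brokers: int, replication_factor: int) -> int:
    """Evaluate 100 x b x r, read here as an upper bound rather than a target."""
    return 100 * brokers * replication_factor


# 10 brokers come from the message above; replication factor 2 is assumed.
print(rule_of_thumb_bound(10, 2))
```

The result is orders of magnitude above 16 partitions per topic, which fits the replies in this thread that treat the formula as a ceiling and not as a recommended size.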

Re: Kafka question

2015-04-07 Thread Guozhang Wang
Jack, Okay I see your point now. I was originally thinking that in each run, you 1) first create the topic, 2) start producing to the topic, 3) start consuming from the topic, and then 4) delete the topic, stop producers / consumers before complete, but it sounds like you actually only create the

Re: What is the data format of Kafka's data nodes in ZooKeeper?

2015-04-07 Thread Andrei
Thanks a lot, ZKStringSerializer works like a charm! For those googling the same question, here's a gist [1] that instantiates ZkClient and sets the proper serializer. [1]: https://gist.github.com/jjkoshy/3842975 On Tue, Apr 7, 2015 at 6:49 PM, Guozhang

Re: New kafka client for Go (golang)

2015-04-07 Thread Piotr Husiatyński
Sorry if this mail was not sent properly, but I have no idea how to reply to the mailing list from Gmail without having the original message in my mailbox. > How does it compare to Sarama? There are several important differences. It's been about two months since I looked at the Sarama API and I know that

Re: Consumer Group Lag Reporting

2015-04-07 Thread Kyle Banker
Thanks, Otis. I actually already have a reporting and alerting infrastructure. I mainly wanted to confirm that parsing the output of the offset checker is the recommended practice for reporting consumer group offsets. Is this the case? If so, I wanted to find out if any work is under way to make t
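
For readers who want to see what parsing the offset checker output might involve, here is a rough sketch. It assumes a whitespace-separated table whose first six columns are Group, Topic, Pid, Offset, logSize and Lag; that layout and the function names are assumptions and should be checked against the output of the Kafka version in use.

```python
from typing import Dict, List, Union

Row = Dict[str, Union[str, int]]


def parse_offset_checker(output: str) -> List[Row]:
    """Turn offset checker text into one dict per partition row."""
    rows: List[Row] = []
    for line in output.splitlines():
        fields = line.split()
        if len(fields) < 6:
            continue  # blank or unrelated line
        group, topic, pid, offset, log_size, lag = fields[:6]
        try:
            rows.append({
                "group": group,
                "topic": topic,
                "partition": int(pid),
                "offset": int(offset),
                "log_size": int(log_size),
                "lag": int(lag),
            })
        except ValueError:
            continue  # header line or a row with non-numeric columns
    return rows


def total_lag(rows: List[Row]) -> int:
    """Sum the lag over all parsed partitions."""
    return sum(int(row["lag"]) for row in rows)
```

Skipping rows that fail to parse keeps the reporter running if the format shifts, at the cost of silently under-reporting, so a production version would probably log the skipped lines.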

Re: What is the data format of Kafka's data nodes in ZooKeeper?

2015-04-07 Thread Guozhang Wang
Andrei, Kafka uses string serialization when writing data to ZK; you can find its implementation in kafka.utils.ZKStringSerializer. Guozhang On Tue, Apr 7, 2015 at 6:40 AM, Andrei wrote: > I'm trying to read data from ZooKeeper nodes that was written by different > Kafka components. As a speci
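
As a rough illustration of what string serialization means for a client reading these nodes: the payload is plain text encoded as bytes (UTF-8, to the best of my understanding of that serializer), not a Java-serialized object. The sketch below mirrors that behavior in Python; the function names and the sample payload are made up for illustration.

```python
from typing import Optional


def serialize(data: str) -> bytes:
    """Encode a string the way a UTF-8 string serializer would."""
    return data.encode("utf-8")


def deserialize(raw: Optional[bytes]) -> Optional[str]:
    """Decode node data, passing through the empty-node case as None."""
    return None if raw is None else raw.decode("utf-8")


# Hypothetical payload of a consumer offset node: the number stored as text.
raw_node_data = b"1234"
offset = int(deserialize(raw_node_data))
print(offset)
```

A client left on a serializer that expects Java-serialized objects would fail to decode such bytes, which is consistent with the fix reported in this thread.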

Re: What is the data format of Kafka's data nodes in ZooKeeper?

2015-04-07 Thread Patrick Dignan
You need to create the ZkClient with kafka.utils.ZKStringSerializer as the serializer. On Tue, Apr 7, 2015 at 9:40 AM, Andrei wrote: > I'm trying to read data from ZooKeeper nodes that was written by different > Kafka components. As a specific example (just one from a bunch), I'm trying > to

Re: Increasing replication factor of existing topics

2015-04-07 Thread Harsha
Hi Navneet, Any reason that you are looking to modify the zk nodes directly to increase the topic partition? If you are looking for an API to do this, there is AdminUtils.addPartitions. -- Harsha On April 7, 2015 at 6:45:40 AM, Navneet Gupta (Tech - BLR) (navneet.gu...@flipkart.co

Empty topic metadata returned during/shortly after server startup

2015-04-07 Thread David Corley
Hey all, We're trying to write some integration tests around a Ruby-based Kafka client we're developing that leverages both the poseidon and poseidon_cluster gems. We're running Kafka 0.8.0 in a single node config with a single ZK instance supporting it on the same machine. The basic test is as follo

Re: Increasing replication factor of existing topics

2015-04-07 Thread Todd Palino
The partition reassignment is started by writing a zookeeper node in the admin tree. While it's possible to kick off the partition reassignment by writing the zookeeper node that controls it directly, you have to be very careful about doing this, making sure that the format is perfect and you perfo
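
Since the format has to be exact, generating the reassignment JSON programmatically is one way to reduce mistakes. The sketch below builds a document in the shape the partition reassignment tool accepts as far as I know it (a version field plus a list of topic, partition and replicas entries); the topic name and broker ids are hypothetical, and the shape should be checked against the documentation for the Kafka version in use.

```python
import json
from typing import Dict, List


def build_reassignment(topic: str, replicas_by_partition: Dict[int, List[int]]) -> str:
    """Return reassignment JSON listing the desired replicas per partition."""
    partitions = []
    for partition, replicas in sorted(replicas_by_partition.items()):
        if len(set(replicas)) != len(replicas):
            raise ValueError(f"duplicate broker id for partition {partition}")
        partitions.append(
            {"topic": topic, "partition": partition, "replicas": list(replicas)}
        )
    return json.dumps({"version": 1, "partitions": partitions})


# Hypothetical example: raise a two-partition topic to replication factor 2.
print(build_reassignment("events", {0: [1, 2], 1: [2, 3]}))
```

Feeding a generated file like this to the reassignment tool, rather than writing the ZooKeeper node by hand, keeps the tool's own validation in the loop.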

Increasing replication factor of existing topics

2015-04-07 Thread Navneet Gupta (Tech - BLR)
Hi, I found a method to increase the replication factor of topics here. However, I was wondering if it's possible to do it by altering some nodes in ZooKeeper. Thoughts/suggestions welcome. -- Thanks & Regards, Navneet Gupta

What is the data format of Kafka's data nodes in ZooKeeper?

2015-04-07 Thread Andrei
I'm trying to read data from ZooKeeper nodes that were written by different Kafka components. As a specific example (just one from a bunch), I'm trying to read the current offset for a specific group, topic and partition. As far as I understand, it is stored under the path /consumers/data-processing-