mapping between disk and partition

2015-03-05 Thread sunil kalva
Hi, can I map a specific partition to a different disk in a broker? And what are the general recommendations for disk-to-partition mapping, both for partitions for which that broker is leader and for the replicas that broker handles? -- SunilKalva

Re: Set up kafka cluster

2015-03-05 Thread Yuheng Du
Will do, thanks! On Thu, Mar 5, 2015 at 3:35 PM, Gwen Shapira gshap...@cloudera.com wrote: Did you take a look at the quick-start guide? https://kafka.apache.org/082/quickstart.html It shows how to set up a single node, how to validate that it's working and then how to set up multi-node

Re: Database Replication Question

2015-03-05 Thread Roger Hoover
Hi Jonathan, TCP will take care of re-ordering the packets. On Wed, Mar 4, 2015 at 6:05 PM, Jonathan Hodges hodg...@gmail.com wrote: Thanks James. This is really helpful. Another extreme edge case might be that the single producer is sending the database log changes and the network causes

Re: REST/Proxy Consumer access

2015-03-05 Thread Andrew Otto
BTW, Wikimedia uses varnishkafka to produce http requests to Kafka, and we are pretty happy with it. https://github.com/wikimedia/varnishkafka On Mar 5, 2015, at 13:09, Ewen Cheslack-Postava e...@confluent.io wrote: Yes, Confluent built a REST proxy that gives access to cluster metadata

Re: Set up kafka cluster

2015-03-05 Thread Gwen Shapira
Jay Kreps has a gist with step by step instructions for reproducing the benchmarks used by LinkedIn: https://gist.github.com/jkreps/c7ddb4041ef62a900e6c And the blog with the results: https://engineering.linkedin.com/kafka/benchmarking-apache-kafka-2-million-writes-second-three-cheap-machines

Re: Set up kafka cluster

2015-03-05 Thread Yuheng Du
Thank you Gwen, I also need the Kafka cluster to continue to provide message brokering service to a Storm cluster after the benchmarking. I am fairly new to cluster setups. Is there a guide that explains how to set up the three-node Kafka cluster before running the benchmark? That would be

Set up kafka cluster

2015-03-05 Thread Yuheng Du
Hi everyone, I am trying to set up a Kafka cluster consisting of three machines. I want to run a benchmarking program on them. Can anyone recommend a step-by-step tutorial or set of instructions for how to do it? Thanks. best, Yuheng

Re: Database Replication Question

2015-03-05 Thread Jay Kreps
Hey Xiao, That's not quite right. Fsync is controlled by either a time based criteria (flush every 30 seconds) or a number of messages criteria. So if you set the number of messages to 1 the flush is synchronous with the write, which I think is what you are looking for. -Jay On Thu, Mar 5, 2015
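
The two flush criteria Jay describes correspond to the broker settings log.flush.interval.messages and log.flush.interval.ms. Below is a minimal sketch, assuming those settings are collected in a java.util.Properties object; the class and method names (FlushSettings, flushEveryMessage, flushEvery) are hypothetical and only illustrate the two options.

```java
import java.util.Properties;

public class FlushSettings {
    // Message-count criterion: flush to disk after every single message,
    // which makes the flush effectively synchronous with the write.
    public static Properties flushEveryMessage() {
        Properties props = new Properties();
        props.setProperty("log.flush.interval.messages", "1");
        return props;
    }

    // Time-based criterion: flush once the given number of milliseconds has passed.
    public static Properties flushEvery(long millis) {
        Properties props = new Properties();
        props.setProperty("log.flush.interval.ms", Long.toString(millis));
        return props;
    }

    public static void main(String[] args) {
        // Print both variants in server.properties style.
        flushEveryMessage().forEach((k, v) -> System.out.println(k + "=" + v));
        flushEvery(30_000L).forEach((k, v) -> System.out.println(k + "=" + v));
    }
}
```

Flushing on every message trades throughput for durability, so it is probably only worth considering where that guarantee is actually required.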

Re: Set up kafka cluster

2015-03-05 Thread Gwen Shapira
Did you take a look at the quick-start guide? https://kafka.apache.org/082/quickstart.html It shows how to set up a single node, how to validate that it's working and then how to set up a multi-node cluster. Good luck! On Thu, Mar 5, 2015 at 12:30 PM, Yuheng Du yuheng.du.h...@gmail.com wrote:

topics still showing up using list command after deletion

2015-03-05 Thread max square
Hi All, I am using Kafka version 0.8.2.1. As mentioned in the documentation, I enabled the delete.topic.enable property in the config, restarted the brokers and issued the delete command. Then I tried listing the topics, and the topics that I deleted still show up in the list. However, if I try

batching causes replica out of sync

2015-03-05 Thread xiaoyu wang
Hi all, We previously had replica.lag.max.messages set to 4000 and used the sync producer to send data to Kafka, one message at a time. With this, we didn't see many unclean leader elections. Recently, we switched to use sync producer and batch messages. After that, we see unclean leader election more

Re: Increasing the throughput of Kafka Publisher

2015-03-05 Thread Otis Gospodnetic
Roger, Consider using rsyslog with omkafka. rsyslog rocks! And it's pretty popular, too - http://blog.sematext.com/2014/10/06/top-5-most-popular-log-shippers/ Oh, and it's FAST - some numbers and charts with an older version from 1 year ago:

Re: Mirror maker end to end latency metric

2015-03-05 Thread tao xiao
Thanks Jon and Guozhang for the info On Fri, Mar 6, 2015 at 1:10 AM, Jon Bringhurst jbringhu...@linkedin.com.invalid wrote: Hey Tao, Slides 27-30 on http://www.slideshare.net/JonBringhurst/kafka-audit-kafka-meetup-january-27th-2015 have a diagram to visually show what Guozhang is talking

JMS to Kafka: Inbuilt JMSAdaptor/JMSProxy/JMSBridge (Client can speak JMS but hit Kafka)

2015-03-05 Thread Joshi, Rekha
Hi, Kafka is a great alternative to JMS, providing high performance and throughput as a scalable, distributed pub-sub/commit log service. However, there are always traditional systems running on JMS. Rather than rewriting, it would be great if we just had an inbuilt JMSAdaptor/JMSProxy/JMSBridge

Re: Kafka DefaultPartitioner is not behaved as expected.

2015-03-05 Thread tao xiao
You need to use a.getBytes because the default serializer.class is kafka.serializer.DefaultEncoder, which takes byte[] as input. An array's hash code is not based on the equality of its elements, hence every time a new byte array is created which is the case in your sample
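
The hash-code behaviour described here can be reproduced in plain Java. The sketch below is an illustration only: partitionFor is a hypothetical stand-in for a hash-modulo partitioner, not Kafka's DefaultPartitioner, and contentPartition shows what a content-based hash (java.util.Arrays.hashCode) would give instead.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class ByteArrayKeyHash {
    // Hypothetical hash-modulo partitioner: maps any hash into [0, numPartitions).
    public static int partitionFor(int hash, int numPartitions) {
        return (hash & 0x7fffffff) % numPartitions;
    }

    // Content-based variant: equal bytes always map to the same partition.
    public static int contentPartition(byte[] key, int numPartitions) {
        return partitionFor(Arrays.hashCode(key), numPartitions);
    }

    public static void main(String[] args) {
        byte[] first = "a".getBytes(StandardCharsets.UTF_8);
        byte[] second = "a".getBytes(StandardCharsets.UTF_8);

        // byte[].hashCode() is inherited from Object and is identity-based,
        // so two arrays with the same content usually hash differently.
        System.out.println("identity hashes equal: " + (first.hashCode() == second.hashCode()));

        // Arrays.hashCode() is computed from the elements.
        System.out.println("content hashes equal: " + (Arrays.hashCode(first) == Arrays.hashCode(second)));

        System.out.println("content partition: " + contentPartition(first, 2));
    }
}
```

With an identity-based hash, the same logical key can land on different partitions from one send to the next, which matches the behaviour reported in this thread.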

Re: Kafka DefaultPartitioner is not behaved as expected.

2015-03-05 Thread Zijing Guo
Thanks a lot, I really appreciate your help! On Thursday, March 5, 2015 9:17 PM, tao xiao xiaotao...@gmail.com wrote: You need to use a.getBytes because the default serializer.class is kafka.serializer.DefaultEncoder, which takes byte[] as input. The way the array

Re: TopicFilters and 0.9 Consumer

2015-03-05 Thread Guozhang Wang
Vinoth, Yes, we do have plans to continue supporting topic filters in the 0.9 consumer, though the APIs are not there yet. Guozhang On Thu, Mar 5, 2015 at 8:32 AM, Vinoth Chandar vin...@uber.com wrote: Hi guys, I was wondering what the plan in 0.9 was for the topic filters that are today in

Re: Database Replication Question

2015-03-05 Thread Guozhang Wang
Josh, Deduping on the consumer side may be tricky, as it requires some sequence number on the messages in order to achieve idempotency. On the other hand, we are planning to add idempotent producer or transactional messaging https://cwiki.apache.org/confluence/display/KAFKA/Idempotent+Producer
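
As a rough illustration of why consumer-side deduplication needs a sequence number, here is a minimal sketch. It assumes each message carries a producer id and a per-producer sequence number that only increases; both fields, and the SequenceDeduplicator class itself, are hypothetical and not part of any Kafka API.

```java
import java.util.HashMap;
import java.util.Map;

public class SequenceDeduplicator {
    // Highest sequence number already processed, per (hypothetical) producer id.
    private final Map<String, Long> lastSeen = new HashMap<>();

    // Returns true if the message is new and should be processed,
    // false if it is a duplicate or an older redelivery.
    public boolean accept(String producerId, long sequence) {
        Long last = lastSeen.get(producerId);
        if (last != null && sequence <= last) {
            return false;
        }
        lastSeen.put(producerId, sequence);
        return true;
    }

    public static void main(String[] args) {
        SequenceDeduplicator dedup = new SequenceDeduplicator();
        System.out.println(dedup.accept("db-log", 1L)); // first delivery
        System.out.println(dedup.accept("db-log", 1L)); // redelivery of the same message
        System.out.println(dedup.accept("db-log", 2L)); // next message
    }
}
```

The sketch keeps its state in memory only; a real consumer would also have to persist the last-seen sequence together with its offsets, which is part of what makes this tricky.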

TopicFilters and 0.9 Consumer

2015-03-05 Thread Vinoth Chandar
Hi guys, I was wondering what the plan in 0.9 was for the topic filters that are today in the high-level consumer. The new API's (http://people.apache.org/~nehanarkhede/kafka-0.9-consumer-javadoc/doc/org/apache/kafka/clients/consumer/KafkaConsumer.html) subscribe methods seem to be working with

Kafka DefaultPartitioner is not behaved as expected.

2015-03-05 Thread Zijing Guo
Hi community, I have a 2-node test cluster with 2 ZK instances and 2 broker instances running, and I'm experimenting with the Kafka producer in a cluster environment. I created a topic foo with 2 partitions and replication factor 1. I created an async producer without defining partitioner.class (so the partitioner

Re: Kafka DefaultPartitioner is not behaved as expected.

2015-03-05 Thread Zijing Guo
And I'm using Kafka version 0.8.2.0 On Thursday, March 5, 2015 11:51 AM, Zijing Guo alter...@yahoo.com.INVALID wrote: Hi community, I have a 2-node test cluster with 2 ZK instances and 2 broker instances running, and I'm experimenting with the Kafka producer in a cluster environment. So I

Re: Kafka DefaultPartitioner is not behaved as expected.

2015-03-05 Thread Mayuresh Gharat
I suppose the KeyedMessage constructor is KeyedMessage(topic, key, message), so in your case the key is "test message" + e. Thanks, Mayuresh On Thu, Mar 5, 2015 at 9:25 AM, Zijing Guo alter...@yahoo.com.invalid wrote: And I'm using Kafka version 0.8.2.0 On Thursday, March 5, 2015 11:51 AM,

Re: Topicmetadata response miss some partitions information sometimes

2015-03-05 Thread Mayuresh Gharat
Yeah, but that gives them all the partitions and does not differentiate between available and unavailable ones, right? Thanks, Mayuresh On Thu, Mar 5, 2015 at 9:14 AM, Guozhang Wang wangg...@gmail.com wrote: I think today people can get the available partitions by calling partitionsFor() API, and

REST/Proxy Consumer access

2015-03-05 Thread Julio Castillo
I read the description of the new Confluent Platform and it briefly describes some REST access to a producer and a consumer. Does this mean there is a new process(es) running (Jetty based)? This process integrates both the consumer and producer libraries? Thanks Julio Castillo

Re: Kafka DefaultPartitioner is not behaved as expected.

2015-03-05 Thread Zijing Guo
Hi, Thanks for your response. That was just my typo; I meant to say KeyedMessage("foo", "a", "test message" + e). On Thursday, March 5, 2015 12:49 PM, Mayuresh Gharat gharatmayures...@gmail.com wrote: I suppose the KeyedMessage constructor is KeyedMessage(topic, key, message), so in your

Re: Kafka DefaultPartitioner is not behaved as expected.

2015-03-05 Thread Guozhang Wang
Zijing, Which version of Kafka client are you using? On Thu, Mar 5, 2015 at 8:50 AM, Zijing Guo alter...@yahoo.com.invalid wrote: Hi community, I have a 2-node test cluster with 2 ZK instances and 2 broker instances running, and I'm experimenting with the Kafka producer in a cluster environment. So I

Re: Kafka DefaultPartitioner is not behaved as expected.

2015-03-05 Thread Zijing Guo
Hi Guozhang, I'm using Kafka 0.8.2.0. Thanks On Thursday, March 5, 2015 12:57 PM, Guozhang Wang wangg...@gmail.com wrote: Zijing, Which version of Kafka client are you using? On Thu, Mar 5, 2015 at 8:50 AM, Zijing Guo alter...@yahoo.com.invalid wrote: Hi community, I have a 2-node

Re: Increasing the throughput of Kafka Publisher

2015-03-05 Thread Vineet Mishra
Hey Roger, As per your stats you have around 5k msg/s of size 42 bytes: 5000 msgs * 42 bytes = 210000 bytes/s = ~205 KB/s, while I am getting around 500 msg/s of around 350 bytes: 500 msgs * 350 bytes = 175000 bytes/s = ~170 KB/s. Even taken together, that is a very poor write throughput. It seems this rate of
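
The arithmetic in this message can be checked with a few lines of Java. This is only a worked example of the figures quoted above (5000 msg/s at 42 bytes and 500 msg/s at 350 bytes); the class and method names are made up for illustration, and the KB/s values assume 1 KB = 1024 bytes.

```java
public class ThroughputEstimate {
    // Raw byte rate: messages per second times bytes per message.
    public static long bytesPerSecond(long messagesPerSecond, long bytesPerMessage) {
        return messagesPerSecond * bytesPerMessage;
    }

    // Converts a byte rate to kilobytes per second (1 KB = 1024 bytes).
    public static double kilobytesPerSecond(long bytesPerSecond) {
        return bytesPerSecond / 1024.0;
    }

    public static void main(String[] args) {
        long rogerRate = bytesPerSecond(5000, 42);   // 210000 bytes/s
        long vineetRate = bytesPerSecond(500, 350);  // 175000 bytes/s
        System.out.printf("5000 msg/s * 42 B  = %d B/s = %.1f KB/s%n",
                rogerRate, kilobytesPerSecond(rogerRate));
        System.out.printf("500 msg/s * 350 B = %d B/s = %.1f KB/s%n",
                vineetRate, kilobytesPerSecond(vineetRate));
    }
}
```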

Re: Increasing the throughput of Kafka Publisher

2015-03-05 Thread Roger Hoover
I think my test included some grok filters and file input, so it's not necessarily bottlenecked on the Kafka producer. On Thu, Mar 5, 2015 at 12:37 AM, Vineet Mishra clearmido...@gmail.com wrote: Hey Roger, As per your stats you have around 5k msg/s of size 42 bytes 5000msgs * 42 byte = 210000 =

Re: Database Replication Question

2015-03-05 Thread Xiao
Hi, James, This design regarding the restart point has a few potential issues, I think. - The restart point is based on the messages that you last published. The message could be pruned. How large is your log.retention.hours? - If the Kafka message order is different from your log sequence,

Re: kafka monitoring

2015-03-05 Thread Vladimir Tretyakov
Hi Sa Li, For the monitoring piece there is SPM - see http://blog.sematext.com/2015/02/10/kafka-0-8-2-monitoring/. Demo: https://apps.sematext.com/demo (just select the 'SPM.Prod.Kafka' system after you log in as the DEMO user) It will monitor

Re: Database Replication Question

2015-03-05 Thread Xiao
Hey, Jay, Thank you for your answer! Based on my understanding, Kafka fsync is regularly issued by a dedicated helper thread. It is not issued based on the semantics. The producers are unable to issue a COMMIT to trigger fsync. Not sure if this requirement is highly desirable to the

Re: Mirror maker end to end latency metric

2015-03-05 Thread Guozhang Wang
There is no end-to-end latency metric in MM, since such a metric requires timestamp info on the source / dest Kafka clusters. For example, at LinkedIn we add a timestamp in the message header, and have a separate consumer fetch the message on both ends to measure the latency. Guozhang On Wed, Mar
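
The timestamp approach described above can be sketched as follows. This is not LinkedIn's implementation and uses no Kafka API: as an assumption for illustration, the send time is written as an 8-byte prefix in front of the payload, and the hypothetical LatencyProbe helper subtracts it from the receive time on the destination side.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class LatencyProbe {
    // Prefixes the payload with the send time (epoch milliseconds, 8 bytes).
    public static byte[] stamp(long sentAtMillis, byte[] payload) {
        return ByteBuffer.allocate(Long.BYTES + payload.length)
                .putLong(sentAtMillis)
                .put(payload)
                .array();
    }

    // Reads the embedded send time back and returns the elapsed milliseconds.
    public static long latencyMillis(byte[] stamped, long receivedAtMillis) {
        long sentAtMillis = ByteBuffer.wrap(stamped).getLong();
        return receivedAtMillis - sentAtMillis;
    }

    public static void main(String[] args) {
        byte[] message = stamp(System.currentTimeMillis(),
                "hello".getBytes(StandardCharsets.UTF_8));
        // In a real setup this would run in a consumer on the destination cluster.
        System.out.println("latency ms: " + latencyMillis(message, System.currentTimeMillis()));
    }
}
```

The result is only meaningful if the clocks on the producing and consuming hosts are synchronized.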

Re: REST/Proxy Consumer access

2015-03-05 Thread Ewen Cheslack-Postava
Yes, Confluent built a REST proxy that gives access to cluster metadata (e.g. list topics, leaders for partitions, etc), producer (send binary or Avro messages to any topic), and consumer (run a consumer instance and consume messages from a topic). And you are correct, internally it uses Jetty and

Re: Kafka DefaultPartitioner is not behaved as expected.

2015-03-05 Thread Zijing Guo
There is also something that I think is worth mentioning: when I call prod.send(KeyedMessage("foo", "a", "test message")), the data can't be delivered to the brokers; the only way to make it work is through prod.send(KeyedMessage("foo", "a".getBytes, "test message".getBytes)). When I convert the data and key

Re: Database Replication Question

2015-03-05 Thread James Cheng
On Mar 5, 2015, at 12:59 AM, Xiao lixiao1...@gmail.com wrote: Hi, James, This design regarding the restart point has a few potential issues, I think. - The restart point is based on the messages that you last published. The message could be pruned. How large is your

RE: batching causes replica out of sync

2015-03-05 Thread Aditya Auradkar
Xiaoyu, Just FYI - Here's a discussion on this issue if you are interested. https://issues.apache.org/jira/browse/KAFKA-1546 Aditya From: Mayuresh Gharat [gharatmayures...@gmail.com] Sent: Thursday, March 05, 2015 4:41 PM To: users@kafka.apache.org