Re: Message routing, Kafka-to-REST and HTTP API tools/frameworks for Kafka?
Hi, I think for 2) you can use a Kafka consumer and push messages to the Vert.x event bus, which already has a REST implementation (vertx-jersey). I would say a Vert.x cluster can be used to receive data irrespective of topic and then publish to a particular Kafka topic; then consume the messages from Kafka with different consumers and distribute them. Kafka can hold messages without dropping them during bursts, and even when the downstream slows down.

Regards, Nageswara Rao

On Wed, Mar 25, 2015 at 10:58 AM, Manoj Khangaonkar khangaon...@gmail.com wrote:

Hi, For (1), and perhaps even for (2) where distribution/filtering at scale is required, I would look at using Apache Storm with Kafka. For (3), it seems you just need REST services wrapping Kafka consumers/producers. I would start with the usual suspects like Jersey. regards

On Tue, Mar 24, 2015 at 12:06 PM, Valentin kafka-...@sblk.de wrote:

Hi guys, we have three Kafka use cases for which we have written our own PoC implementations, but where I am wondering whether there might be a fitting open source solution/tool/framework out there. Maybe someone of you has some ideas/pointers? :)

1) Message routing/distribution/filter tool
We need to copy messages from a set of input topics to a set of output topics based on their message key values. Each message in an input topic will go to 0 to N output topics, and each output topic will receive messages from 0 to N input topics. So the tool basically acts as a message routing component in our system. Example configuration:
input topic A:output topic K:key value 1,key value 2,key value 3
input topic A:output topic L:key value 2,key value 4
input topic B:output topic K:key value 5,key value 6
...
It would also be interesting to define distribution/filter rules based on regular expressions on the message key or message body.

2) Kafka-to-REST push service
We need to consume messages from a set of topics, translate them into REST web service calls, and forward the data that way to existing, non-Kafka-aware systems with REST APIs.

3) HTTP REST API for consumers and producers
We need to expose the simple consumer and the producer functionality via REST web service calls, with authentication and per-topic authorization at the REST API level, and TLS for transport encryption. Offset tracking is done by the connected systems, not by the broker/ZooKeeper/REST API. We expect a high message volume here in the future, so performance would be a key concern.

Greetings, Valentin

-- http://khangaonkar.blogspot.com/ -- Thanks, Nageswara Rao.V *The LORD reigns*
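The routing-rule format Valentin sketches (one "input topic:output topic:comma-separated key values" rule per line) can be illustrated with a small stand-alone sketch. This is a hypothetical illustration, not an existing tool; the function names and the shortened key values are made up:

```python
def parse_rules(lines):
    """Build {input_topic: [(output_topic, set_of_key_values), ...]}
    from rule lines of the form 'in_topic:out_topic:key1,key2,...'."""
    rules = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        in_topic, out_topic, keys = line.split(":", 2)
        rules.setdefault(in_topic, []).append(
            (out_topic, {k.strip() for k in keys.split(",")})
        )
    return rules

def route(rules, in_topic, key):
    """Return the 0..N output topics a message with this key is copied to."""
    return [out for out, keys in rules.get(in_topic, []) if key in keys]

rules = parse_rules([
    "A:K:key1,key2,key3",
    "A:L:key2,key4",
    "B:K:key5,key6",
])
print(route(rules, "A", "key2"))  # prints ['K', 'L']: key2 matches both rules for A
```

A real implementation would wrap this lookup in a consume-from-input-topics / produce-to-output-topics loop; regex-based rules would replace the key-set membership test with a compiled pattern match.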
kafka.admin as separate module
Hello Apache Kafka community, I like that kafka-clients is now a separate module, and even has no Scala dependency. I'd like to propose that the kafka.admin package gets published as a separate module too. I'm writing some tests, and to be able to use the kafka.admin tools/utils in them I have to bring in the much too large kafka module, with the server code and all its dependencies, like Netty. My test framework happens to use Netty too, but a different version - classpath hell. Any thoughts? Is the proposal sound enough for a JIRA ticket? Kind regards, Stevo Slavic.
Re: lost messages -?
You can use kafka-console-consumer to consume the topic from the beginning:

kafka-console-consumer.sh --zookeeper localhost:2181 --topic test --from-beginning

On Thu, Mar 26, 2015 at 12:17 AM, Victor L vlyamt...@gmail.com wrote: [...]

-- Regards, Tao
Re: kafka.admin as separate module
Hi Stevo,

JFYI: we are now working on a new centralized API for admin commands. This will include:
- A public API to perform TopicCommands (Create/Alter/Delete/List/Describe)
- An out-of-the-box Java client for the Admin API: AdminClient - will be part of /clients
- An interactive CLI for admin commands

The plan was to have all of the above in 0.9 and probably remove the existing tools at that point. You can check for details:
Confluence - https://cwiki.apache.org/confluence/display/KAFKA/KIP-4+-+Command+line+and+centralized+administrative+operations
JIRA - https://issues.apache.org/jira/browse/KAFKA-1694
Mailing list - thread "[DISCUSS] KIP-4 - Command line and centralized administrative operations"

Thanks, Andrii Biletskyi

On Wed, Mar 25, 2015 at 5:26 PM, Stevo Slavić ssla...@gmail.com wrote: [...]
Re: kafka.admin as separate module
Yeah, that would be great, having a separate admin package. +1. Thanks, Mayuresh

On Wed, Mar 25, 2015 at 8:26 AM, Stevo Slavić ssla...@gmail.com wrote: [...]

-- -Regards, Mayuresh R. Gharat (862) 250-7125
Re: lost messages -?
You can use the DumpLogSegments tool. Thanks, Mayuresh

On Wed, Mar 25, 2015 at 9:17 AM, Victor L vlyamt...@gmail.com wrote: [...]

-- -Regards, Mayuresh R. Gharat (862) 250-7125
lost messages -?
Can someone let me know how to dump the contents of topics? I have producers sending messages to 3 brokers, but about half of them don't seem to be consumed. I suppose they are getting stuck in queues, but how can I figure out where? Thks,
Re: Message routing, Kafka-to-REST and HTTP API tools/frameworks for Kafka?
For 3, Confluent wrote a REST proxy that's pretty comprehensive. See the docs: http://confluent.io/docs/current/kafka-rest/docs/intro.html and a blog post describing it + future directions: http://blog.confluent.io/2015/03/25/a-comprehensive-open-source-rest-proxy-for-kafka/

There are a few other REST proxies: https://cwiki.apache.org/confluence/display/KAFKA/Clients#Clients-HTTPREST

But I don't think any of them support everything you need yet -- specifically, the security features aren't included in any of them yet. You could address this, at some performance cost, by putting another HTTP server like nginx in front of the proxies (locally with each instance) to get that control. For Confluent's proxy, we're also thinking about how to add security features, since we'll need something to protect admin operations: https://github.com/confluentinc/kafka-rest/wiki/Project---Admin-APIs

-Ewen

On Tue, Mar 24, 2015 at 11:34 PM, Nagesh nageswara.r...@gmail.com wrote: [...]

-- Thanks, Ewen
Re: kafka.admin as separate module
Ah, great, thanks for the heads up, Andrii!

On Mar 25, 2015 5:39 PM, Andrii Biletskyi andrii.bilets...@stealth.ly wrote: [...]
Re: lost messages -?
DumpLogSegments will give you output something like this:

offset: 780613873770 isvalid: true payloadsize: 8055 magic: 1 compresscodec: GZIPCompressionCodec

If this is what you want, you can use the tool to detect whether the messages are getting to your brokers. Console-Consumer will output the messages themselves for you. Thanks, Mayuresh

On Wed, Mar 25, 2015 at 9:33 AM, tao xiao xiaotao...@gmail.com wrote: [...]

-- -Regards, Mayuresh R. Gharat (862) 250-7125
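DumpLogSegments runs against a broker's on-disk log segment files rather than over the network. A hedged sketch of the invocation (the log directory and segment file name below are assumptions; adjust them to your broker's log.dirs):

```shell
# Inspect a partition's segment on the broker itself; --print-data-log also
# dumps the message payloads, not just the offset/size metadata.
bin/kafka-run-class.sh kafka.tools.DumpLogSegments \
  --files /var/kafka-logs/test-0/00000000000000000000.log \
  --print-data-log
```

Comparing the highest offset in the segment dump against the consumer's committed offset is one way to tell whether messages are stuck on the broker or simply not being consumed.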
Producer Behavior When one or more Brokers' Disk is Full.
Hello Kafka Community, What is the expected behavior on the producer side when the disk of one or more brokers is full, but the topics have not reached their retention limit (by size or by time)? Does the producer keep sending data to those brokers, and/or does the producer queue fill up and always report that the queue is full, or does it depend on configuration? (I have a producer with the non-blocking setting when the queue is full, acks set to 0 or 1, and retries set to 3.) What is the expected behavior of the old [Scala-based] producer vs. the pure Java producer? Here is a reference to a past discussion: http://grokbase.com/t/kafka/users/147h4958k8/how-to-recover-from-a-disk-full-situation-in-kafka-cluster Is there a wiki or cookbook with steps to recover from such a situation? Thanks, Bhavesh
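For reference, the setup Bhavesh describes maps onto old (Scala) producer properties roughly like the following config fragment; the exact values are illustrative assumptions, not his actual configuration:

```properties
# Old (Scala) async producer, non-blocking on a full queue
producer.type=async
# 0 = drop/raise immediately instead of blocking when the send queue is full
queue.enqueue.timeout.ms=0
# acks: 0 (fire and forget) or 1 (leader ack)
request.required.acks=1
message.send.max.retries=3
```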
Re: Kafka server relocation
If you want to do a seamless migration, I think a better way is to temporarily build a cross-datacenter Kafka cluster. So the process is:
1. Add several new Kafka brokers in your new datacenter and add them to the old cluster.
2. Use the replica reassignment tool to reassign all the partitions to brokers in the new datacenter.
3. Perform a controlled shutdown on the brokers in the old datacenter.

Jiangjie (Becket) Qin

On 3/25/15, 2:01 PM, nitin sharma kumarsharma.ni...@gmail.com wrote: [...]
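Step 2 above uses kafka-reassign-partitions.sh. A hedged sketch, assuming the topic is named "mytopic" with its 2 partitions moving to new-datacenter brokers with ids 3 and 4 (all names and ids are assumptions):

```shell
cat > reassign.json <<'EOF'
{"version":1,"partitions":[
  {"topic":"mytopic","partition":0,"replicas":[3,4]},
  {"topic":"mytopic","partition":1,"replicas":[4,3]}
]}
EOF

# Start the reassignment, then poll until it reports completion
bin/kafka-reassign-partitions.sh --zookeeper zk-host:2181 \
  --reassignment-json-file reassign.json --execute
bin/kafka-reassign-partitions.sh --zookeeper zk-host:2181 \
  --reassignment-json-file reassign.json --verify
```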
Kafka server relocation
Hi Team, in my project we have built a new datacenter for the Kafka brokers and want to migrate from the current datacenter to the new one. Switching producers and consumers won't be a problem, provided the new datacenter has all the messages of the existing datacenter. I have only 1 topic with 2 partitions that needs to be migrated... and it is only a one-time activity. Kindly suggest the best way to deal with this situation. Regards, Nitin Kumar Sharma.
Re: Kafka server relocation
You can use MirrorMaker to move data from one datacenter to the other, and once all the data has been moved you can shut down the source datacenter's brokers with a controlled shutdown. Thanks, Mayuresh

On Wed, Mar 25, 2015 at 2:35 PM, Jiangjie Qin j...@linkedin.com.invalid wrote: [...]

-- -Regards, Mayuresh R. Gharat (862) 250-7125
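The MirrorMaker approach amounts to running a consumer against the old cluster and a producer against the new one. A hedged sketch of the invocation; the two .properties file names and the topic name are assumptions (the consumer config points its zookeeper.connect at the old cluster, the producer config its broker list at the new one):

```shell
bin/kafka-mirror-maker.sh \
  --consumer.config old-dc-consumer.properties \
  --producer.config new-dc-producer.properties \
  --whitelist 'mytopic'
```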
Re: Get replication and partition count of a topic
Ewen Cheslack-Postava ewen@... writes: I'm also searching for the shortest way to find the topic partition count, so that the initialization code in the thread pool can set up the right number of threads. So far the shortest way I have found is below:

import kafka.admin.AdminUtils;
import kafka.api.TopicMetadata;
import org.I0Itec.zkclient.ZkClient;

public static void main(String[] args) {
    int sessionTimeoutMs = 10000;     // timeouts in ms (illustrative values)
    int connectionTimeoutMs = 10000;
    ZkClient zkClient = new ZkClient("localhost:2181", sessionTimeoutMs, connectionTimeoutMs);
    TopicMetadata metaData = AdminUtils.fetchTopicMetadataFromZk("forthtopic", zkClient);
    System.out.println(metaData.partitionsMetadata().size());
}

kindly reply if you find any other shorter way.