Re: Message routing, Kafka-to-REST and HTTP API tools/frameworks for Kafka?

2015-03-25 Thread Nagesh
Hi,

I think for 2) you can use a Kafka consumer and push messages to the Vert.x
event bus, which already has a REST implementation (vertx-jersey).

I would say a Vert.x cluster can be used to receive data irrespective of
topic and then publish it to a particular Kafka topic. Messages can then be
consumed from Kafka by different consumers and distributed. Kafka can hold
messages without dropping them during bursts, and even when the downstream
slows down.
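
A rough, untested sketch of that consumer-to-event-bus bridge (assuming the
0.8.x high-level consumer and the Vert.x 3 core API; the topic, group id,
ZooKeeper address and event-bus address below are placeholders):

import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Properties;

import io.vertx.core.Vertx;
import kafka.consumer.Consumer;
import kafka.consumer.ConsumerConfig;
import kafka.consumer.ConsumerIterator;
import kafka.consumer.KafkaStream;
import kafka.javaapi.consumer.ConsumerConnector;
import kafka.message.MessageAndMetadata;

public class KafkaToEventBusBridge {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("zookeeper.connect", "localhost:2181"); // placeholder
        props.put("group.id", "rest-push-bridge");        // placeholder

        ConsumerConnector consumer =
                Consumer.createJavaConsumerConnector(new ConsumerConfig(props));
        Map<String, List<KafkaStream<byte[], byte[]>>> streams =
                consumer.createMessageStreams(Collections.singletonMap("some-topic", 1));

        Vertx vertx = Vertx.vertx();
        // Forward every consumed message to an event-bus address; a REST-capable
        // verticle (e.g. one built with vertx-jersey) can then pick it up.
        ConsumerIterator<byte[], byte[]> it = streams.get("some-topic").get(0).iterator();
        while (it.hasNext()) {
            MessageAndMetadata<byte[], byte[]> record = it.next();
            vertx.eventBus().publish("kafka.some-topic", new String(record.message()));
        }
    }
}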

Regards,
Nageswara Rao

On Wed, Mar 25, 2015 at 10:58 AM, Manoj Khangaonkar khangaon...@gmail.com
wrote:

 Hi,

 For (1), and perhaps even for (2) where distribution/filtering at scale is
 required, I would look at using Apache Storm with Kafka.

 For (3), it seems you just need REST services wrapping Kafka
 consumers/producers. I would start with the usual suspects like Jersey.
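
A minimal sketch of that "REST service wrapping a producer" idea, using
JAX-RS (Jersey) annotations and the 0.8.2 Java producer (the resource path,
broker address and serializer choice are illustrative assumptions, not
anything prescribed in this thread):

import java.util.Properties;

import javax.ws.rs.POST;
import javax.ws.rs.Path;
import javax.ws.rs.PathParam;
import javax.ws.rs.core.Response;

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

@Path("/topics")
public class ProduceResource {

    // One shared producer instance; the new Java producer is thread-safe.
    private static final KafkaProducer<byte[], byte[]> producer = createProducer();

    private static KafkaProducer<byte[], byte[]> createProducer() {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // placeholder
        props.put("key.serializer",
                "org.apache.kafka.common.serialization.ByteArraySerializer");
        props.put("value.serializer",
                "org.apache.kafka.common.serialization.ByteArraySerializer");
        return new KafkaProducer<>(props);
    }

    // POST /topics/{topic} with the raw request body as the message value.
    @POST
    @Path("/{topic}")
    public Response produce(@PathParam("topic") String topic, byte[] body) {
        producer.send(new ProducerRecord<byte[], byte[]>(topic, body));
        return Response.ok().build();
    }
}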

 regards

 On Tue, Mar 24, 2015 at 12:06 PM, Valentin kafka-...@sblk.de wrote:

 
  Hi guys,
 
  we have three Kafka use cases for which we have written our own PoC
  implementations,
  but where I am wondering whether there might be any fitting open source
  solution/tool/framework out there.
  Maybe some of you have ideas/pointers? :)
 
  1) Message routing/distribution/filter tool
  We need to copy messages from a set of input topics to a set of output
  topics
  based on their message key values. Each message in an input topic will go
  to 0 to N output topics, and each output topic will receive messages from
  0 to N input topics.
  So basically the tool acts as a message routing component in our system.
  Example configuration:
   input topic A:output topic K:key value 1,key value 2,key value 3
   input topic A:output topic L:key value 2,key value 4
   input topic B:output topic K:key value 5,key value 6
   ...
  It would also be interesting to define distribution/filter rules based on
  regular expressions on the message key or message body.
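
For illustration only, a key-based routing loop of this kind could look
roughly like the sketch below (this is not an existing tool; the routing
table mirrors the example configuration above, and the topic names, group id
and broker/ZooKeeper addresses are placeholders; it assumes the 0.8.x
high-level consumer and the 0.8.2 Java producer):

import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Properties;

import kafka.consumer.Consumer;
import kafka.consumer.ConsumerConfig;
import kafka.consumer.ConsumerIterator;
import kafka.consumer.KafkaStream;
import kafka.javaapi.consumer.ConsumerConnector;
import kafka.message.MessageAndMetadata;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class KeyBasedRouter {

    // (input topic -> (message key -> output topics)), as in the example above.
    static final Map<String, Map<String, List<String>>> routes = new HashMap<>();
    static {
        Map<String, List<String>> topicA = new HashMap<>();
        topicA.put("key value 1", Arrays.asList("K"));
        topicA.put("key value 2", Arrays.asList("K", "L"));
        topicA.put("key value 3", Arrays.asList("K"));
        topicA.put("key value 4", Arrays.asList("L"));
        routes.put("A", topicA);
    }

    public static void main(String[] args) {
        Properties cProps = new Properties();
        cProps.put("zookeeper.connect", "localhost:2181"); // placeholder
        cProps.put("group.id", "key-router");              // placeholder
        ConsumerConnector consumer =
                Consumer.createJavaConsumerConnector(new ConsumerConfig(cProps));

        Properties pProps = new Properties();
        pProps.put("bootstrap.servers", "localhost:9092"); // placeholder
        pProps.put("key.serializer",
                "org.apache.kafka.common.serialization.ByteArraySerializer");
        pProps.put("value.serializer",
                "org.apache.kafka.common.serialization.ByteArraySerializer");
        KafkaProducer<byte[], byte[]> producer = new KafkaProducer<>(pProps);

        // Single input topic "A" with one stream, for brevity.
        Map<String, List<KafkaStream<byte[], byte[]>>> streams =
                consumer.createMessageStreams(Collections.singletonMap("A", 1));
        ConsumerIterator<byte[], byte[]> it = streams.get("A").get(0).iterator();
        while (it.hasNext()) {
            MessageAndMetadata<byte[], byte[]> m = it.next();
            String key = m.key() == null ? "" : new String(m.key());
            List<String> outputs = routes.getOrDefault(m.topic(), Collections.emptyMap())
                    .getOrDefault(key, Collections.emptyList());
            for (String out : outputs) {
                producer.send(new ProducerRecord<byte[], byte[]>(out, m.key(), m.message()));
            }
        }
    }
}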
 
  2) Kafka-to-REST Push service
  We need to consume messages from a set of topics, translate them into REST
  web service calls, and forward the data that way to existing,
  non-Kafka-aware systems with REST APIs.
 
  3) HTTP REST API for consumers and producers
  We need to expose the simple consumer and the producer functionality via
  REST web service calls, with authentication and per-topic authorization at
  the REST API level, and TLS for transport encryption.
  Offset tracking is done by the connected systems, not by the
  broker/zookeeper/REST API.
  We expect a high message volume in the future here, so performance would
  be a key concern.
 
  Greetings
  Valentin
 



 --
 http://khangaonkar.blogspot.com/




-- 
Thanks,
Nageswara Rao.V

*The LORD reigns*


kafka.admin as separate module

2015-03-25 Thread Stevo Slavić
Hello Apache Kafka community,

I like that kafka-clients is now a separate module, and doesn't even have a
Scala dependency. I'd like to propose that the kafka.admin package gets
published as a separate module too.

I'm writing some tests, and to be able to use the kafka.admin tools/utils in
them I have to bring in the much larger kafka core module, with all the
server code and its dependencies, like Netty. The test framework happens to
use Netty too, but a different version: classpath hell.
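
For example, just creating a topic from test code today needs
kafka.admin.AdminUtils plus a ZkClient from the full kafka core artifact
(a rough sketch; the ZooKeeper address, timeouts and topic settings are
placeholders):

import java.util.Properties;

import kafka.admin.AdminUtils;
import kafka.utils.ZKStringSerializer$;
import org.I0Itec.zkclient.ZkClient;

public class TestTopicSetup {
    public static void createTestTopic(String topic) {
        // Kafka expects its own ZooKeeper string serializer, otherwise the
        // created nodes are not readable by the brokers.
        ZkClient zkClient = new ZkClient("localhost:2181", 10000, 10000,
                ZKStringSerializer$.MODULE$);
        AdminUtils.createTopic(zkClient, topic, 1 /* partitions */,
                1 /* replication factor */, new Properties());
        zkClient.close();
    }
}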

Any thoughts? Is the proposal sound enough for a JIRA ticket?

Kind regards,
Stevo Slavic.


Re: lost messages -?

2015-03-25 Thread tao xiao
You can use kafka-console-consumer to consume the topic from the beginning:

*kafka-console-consumer.sh --zookeeper localhost:2181 --topic test
--from-beginning*


On Thu, Mar 26, 2015 at 12:17 AM, Victor L vlyamt...@gmail.com wrote:

 Can someone let me know how to dump contents of topics?
 I have producers sending messages to 3 brokers but about half of them don't
 seem to be consumed. I suppose they are getting stuck in queues but how can
 i figure out where?
 Thks,




-- 
Regards,
Tao


Re: kafka.admin as separate module

2015-03-25 Thread Andrii Biletskyi
Hi Stevo,

JFYI: we are now working on a new centralized API for admin commands.
This will include:
- A public API to perform TopicCommands (Create/Alter/Delete/List/Describe)
- An out-of-the-box Java client for the Admin API: AdminClient - will be part
of /clients
- An interactive CLI for admin commands
The plan is to have all of the above in 0.9 and probably remove the existing
tools at that point.

You can check the following for details:
Confluence -
https://cwiki.apache.org/confluence/display/KAFKA/KIP-4+-+Command+line+and+centralized+administrative+operations
JIRA - https://issues.apache.org/jira/browse/KAFKA-1694
Mailing list - thread [DISCUSS] KIP-4 - Command line and centralized
administrative operations

Thanks,
Andrii Biletskyi


On Wed, Mar 25, 2015 at 5:26 PM, Stevo Slavić ssla...@gmail.com wrote:

 Hello Apache Kafka community,

 I like that kafka-clients is now separate module, and has no scala
 dependency even. I'd like to propose that kafka.admin package gets
 published as separate module too.

 I'm writing some tests, and to be able to use kafka.admin tools/utils in
 them I have to bring in too large kafka module, with server stuff and all
 dependencies, like netty. Test framework happens to use netty too but
 different version - classpath hell.

 Any thoughts? Is proposal sound enough for a JIRA ticket?

 Kind regards,
 Stevo Slavic.



Re: kafka.admin as separate module

2015-03-25 Thread Mayuresh Gharat
Yeah, it would be great to have a separate admin package. +1.

Thanks,

Mayuresh

On Wed, Mar 25, 2015 at 8:26 AM, Stevo Slavić ssla...@gmail.com wrote:

 Hello Apache Kafka community,

 I like that kafka-clients is now separate module, and has no scala
 dependency even. I'd like to propose that kafka.admin package gets
 published as separate module too.

 I'm writing some tests, and to be able to use kafka.admin tools/utils in
 them I have to bring in too large kafka module, with server stuff and all
 dependencies, like netty. Test framework happens to use netty too but
 different version - classpath hell.

 Any thoughts? Is proposal sound enough for a JIRA ticket?

 Kind regards,
 Stevo Slavic.




-- 
-Regards,
Mayuresh R. Gharat
(862) 250-7125


Re: lost messages -?

2015-03-25 Thread Mayuresh Gharat
You can use the DumpLogSegments tool.

Thanks,

Mayuresh

On Wed, Mar 25, 2015 at 9:17 AM, Victor L vlyamt...@gmail.com wrote:

 Can someone let me know how to dump contents of topics?
 I have producers sending messages to 3 brokers but about half of them don't
 seem to be consumed. I suppose they are getting stuck in queues but how can
 i figure out where?
 Thks,




-- 
-Regards,
Mayuresh R. Gharat
(862) 250-7125


lost messages -?

2015-03-25 Thread Victor L
Can someone let me know how to dump the contents of topics?
I have producers sending messages to 3 brokers, but about half of the
messages don't seem to be consumed. I suppose they are getting stuck in
queues, but how can I figure out where?
Thanks,


Re: Message routing, Kafka-to-REST and HTTP API tools/frameworks for Kafka?

2015-03-25 Thread Ewen Cheslack-Postava
For 3, Confluent wrote a REST proxy that's pretty comprehensive. See the
docs: http://confluent.io/docs/current/kafka-rest/docs/intro.html and a
blog post describing it + future directions:
http://blog.confluent.io/2015/03/25/a-comprehensive-open-source-rest-proxy-for-kafka/

There are a few other REST proxies:
https://cwiki.apache.org/confluence/display/KAFKA/Clients#Clients-HTTPREST
But I don't think any of them supports everything you need yet;
specifically, none of them includes the security features you mention. You
could address this, with some performance hit, by putting another HTTP
server like nginx in front of the proxies (locally with each instance) to
get that control. For Confluent's proxy, we're also thinking about how to
add security features, since we'll need something to protect admin
operations:
https://github.com/confluentinc/kafka-rest/wiki/Project---Admin-APIs
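
As a sketch of what the proxy approach looks like from the client side
(based on the v1 "binary" embedded format in the docs linked above; the URL,
port and topic are placeholders, and the exact content type and JSON shape
should be checked against the docs for the version you deploy):

import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

public class RestProxyProduceExample {
    public static void main(String[] args) throws Exception {
        // Base64-encode the message value, as the binary format requires.
        String value = Base64.getEncoder()
                .encodeToString("hello".getBytes(StandardCharsets.UTF_8));
        String payload = "{\"records\":[{\"value\":\"" + value + "\"}]}";

        HttpURLConnection conn = (HttpURLConnection)
                new URL("http://localhost:8082/topics/test").openConnection();
        conn.setRequestMethod("POST");
        conn.setRequestProperty("Content-Type",
                "application/vnd.kafka.binary.v1+json");
        conn.setDoOutput(true);
        try (OutputStream out = conn.getOutputStream()) {
            out.write(payload.getBytes(StandardCharsets.UTF_8));
        }
        System.out.println("HTTP " + conn.getResponseCode());
    }
}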

-Ewen

On Tue, Mar 24, 2015 at 11:34 PM, Nagesh nageswara.r...@gmail.com wrote:

 Hi,

 I think for 2) you can use Kafka Consumer and push messages to vertex event
 bus, which already have REST implementation (vertx-jersey).

 I would say, Vertx cluster can be used as receive data irrespective of
 topic and then publish to particular kafka topic. Then consume messages
 from kafka by different consumer and distribute. Kafka can hold messages
 without dropping at bursts and even at the time down stream slows down.

 Regards,
 Nageswara Rao

 On Wed, Mar 25, 2015 at 10:58 AM, Manoj Khangaonkar khangaon...@gmail.com
 
 wrote:

  Hi,
 
  For (1) and perhaps even for (2) where distribution/filtering on scale is
  required, I would look at using Apache Storm with kafka.
 
  For (3) , it seems you just need REST services wrapping kafka
  consumers/producers. I would start with usual suspects like jersey.
 
  regards
 
  On Tue, Mar 24, 2015 at 12:06 PM, Valentin kafka-...@sblk.de
 wrote:
 
  
   Hi guys,
  
   we have three Kafka use cases for which we have written our own PoC
   implementations,
   but where I am wondering whether there might be any fitting open source
   solution/tool/framework out there.
   Maybe someone of you has some ideas/pointers? :)
  
   1) Message routing/distribution/filter tool
   We need to copy messages from a set of input topics to a set of output
   topics
   based on their message key values. Each message in an input topic will
 go
   to 0 to N output topics,
   each output topic will receive messages from 0 to N input topics.
   So basically the tool acts as a message routing component in our
 system.
   Example configuration:
   input topic A:output topic K:key value 1,key value 2,key value
  3
   input topic A:output topic L:key value 2,key value 4
   input topic B:output topic K:key value 5,key value 6
   ...
   It would also be interesting to define distribution/filter rules based
 on
   regular expressions on the message key or message body.
  
   2) Kafka-to-REST Push service
   We need to consume messages from a set of topics, translate them into
  REST
   web service calls
   and forward the data to existing, non-Kafka-aware systems with REST
 APIs
   that way.
  
   3) HTTP REST API for consumers and producers
   We need to expose the simple consumer and the producer functionalities
  via
   REST web service calls,
   with authentication and per-topic-authorization on REST API level and
 TLS
   for transport encryption.
   Offset tracking is done by the connected systems, not by the
   broker/zookeeper/REST API.
   We expect a high message volume in the future here, so performance
 would
   be a key concern.
  
   Greetings
   Valentin
  
 
 
 
  --
  http://khangaonkar.blogspot.com/
 



 --
 Thanks,
 Nageswara Rao.V

 *The LORD reigns*




-- 
Thanks,
Ewen


Re: kafka.admin as separate module

2015-03-25 Thread Stevo Slavić
Ah, great, thanks for the heads up, Andrii!
On Mar 25, 2015 5:39 PM, Andrii Biletskyi andrii.bilets...@stealth.ly
wrote:

 Hi Stevo,

 JFYI: we are working now on new centralized API for Admin commands.
 This will include:
 - Public API to perform TopicCommands (Create/Alter/Delete/List/Describe)
 - Out-of-box java client for Admin API: AdminClient - will be part of
 /clients
 - Interactive cli for admin commands
 The plan was to have all mentioned above in 0.9 and probably remove
 existing
 tools at that point.

 You can check for details:
 Confluence -

 https://cwiki.apache.org/confluence/display/KAFKA/KIP-4+-+Command+line+and+centralized+administrative+operations
 JIRA - https://issues.apache.org/jira/browse/KAFKA-1694
 Mailing list - thread [DISCUSS] KIP-4 - Command line and centralized
 administrative operations

 Thanks,
 Andrii Biletskyi


 On Wed, Mar 25, 2015 at 5:26 PM, Stevo Slavić ssla...@gmail.com wrote:

  Hello Apache Kafka community,
 
  I like that kafka-clients is now separate module, and has no scala
  dependency even. I'd like to propose that kafka.admin package gets
  published as separate module too.
 
  I'm writing some tests, and to be able to use kafka.admin tools/utils in
  them I have to bring in too large kafka module, with server stuff and all
  dependencies, like netty. Test framework happens to use netty too but
  different version - classpath hell.
 
  Any thoughts? Is proposal sound enough for a JIRA ticket?
 
  Kind regards,
  Stevo Slavic.
 



Re: lost messages -?

2015-03-25 Thread Mayuresh Gharat
DumpLogSegments will give you output something like this:

offset: 780613873770 isvalid: true payloadsize: 8055 magic: 1 compresscodec:
GZIPCompressionCodec

If this is what you want, you can use the tool to detect whether the messages
are getting to your brokers. The console consumer will output the messages
themselves for you.

Thanks,

Mayuresh


On Wed, Mar 25, 2015 at 9:33 AM, tao xiao xiaotao...@gmail.com wrote:

 You can use kafka-console-consumer consuming the topic from the beginning

 *kafka-console-consumer.sh --zookeeper localhost:2181 --topic test
 --from-beginning*


 On Thu, Mar 26, 2015 at 12:17 AM, Victor L vlyamt...@gmail.com wrote:

  Can someone let me know how to dump contents of topics?
  I have producers sending messages to 3 brokers but about half of them
 don't
  seem to be consumed. I suppose they are getting stuck in queues but how
 can
  i figure out where?
  Thks,
 



 --
 Regards,
 Tao




-- 
-Regards,
Mayuresh R. Gharat
(862) 250-7125


Producer Behavior When one or more Brokers' Disk is Full.

2015-03-25 Thread Bhavesh Mistry
Hello Kafka Community,



What is the expected behavior on the producer side when the disk of one or
more brokers is full, but the retention limits for the topics (by size or by
time) have not been reached?

Does the producer keep sending data to those particular brokers, and/or does
the producer queue fill up and always throw a queue-full error, or does it
depend on configuration? (I have the producer in non-blocking mode when the
queue is full, with acks of 0 or 1 and retries set to 3.)



What is the expected behavior of the old (Scala-based) producer vs. the
pure-Java producer?
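
For reference, the non-blocking setup described above is roughly the
following (property names as I recall them from the 0.8.x docs; broker
addresses are placeholders, so please double-check against the version in
use):

import java.util.Properties;

public class ProducerConfigExamples {

    // Old (Scala) async producer: don't block when the internal queue is
    // full, ack level 1, up to 3 retries.
    static Properties oldProducerProps() {
        Properties p = new Properties();
        p.put("metadata.broker.list", "broker1:9092,broker2:9092");
        p.put("producer.type", "async");
        p.put("queue.enqueue.timeout.ms", "0"); // never block; a full queue surfaces as QueueFullException
        p.put("request.required.acks", "1");
        p.put("message.send.max.retries", "3");
        return p;
    }

    // New (Java) producer: comparable non-blocking behaviour when the
    // buffer is full.
    static Properties newProducerProps() {
        Properties p = new Properties();
        p.put("bootstrap.servers", "broker1:9092,broker2:9092");
        p.put("block.on.buffer.full", "false"); // throw instead of blocking when the buffer is exhausted
        p.put("acks", "1");
        p.put("retries", "3");
        return p;
    }
}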


Here is reference to past discussion:
http://grokbase.com/t/kafka/users/147h4958k8/how-to-recover-from-a-disk-full-situation-in-kafka-cluster


Is there a wiki or a cookbook of steps to recover from such a situation?



Thanks,

Bhavesh


Re: Kafka server relocation

2015-03-25 Thread Jiangjie Qin
If you want to do a seamless migration, I think a better way is to
temporarily build a cross-datacenter Kafka cluster. The process is:
1. Add several new Kafka brokers in your new datacenter and add them to
the old cluster.
2. Use the partition reassignment tool to reassign all the partitions to
brokers in the new datacenter.
3. Perform a controlled shutdown on the brokers in the old datacenter.

Jiangjie (Becket) Qin

On 3/25/15, 2:01 PM, nitin sharma kumarsharma.ni...@gmail.com wrote:

Hi Team,

in my project, we have built a new datacenter for Kafka brokers and wants
to migrate from current datacenter to new one.

Switching producers and consumers wont be a problem provided New
Datacenter
has all the messages of existing Datacenter.


i have only 1 topic with 2 partition that need to be migrated... that too
it is only 1 time activity.

Kindly suggest the best way to deal with this situation.


Regards,
Nitin Kumar Sharma.



Kafka server relocation

2015-03-25 Thread nitin sharma
Hi Team,

in my project, we have built a new datacenter for Kafka brokers and want
to migrate from the current datacenter to the new one.

Switching producers and consumers won't be a problem, provided the new
datacenter has all the messages of the existing datacenter.

I have only one topic, with two partitions, that needs to be migrated, and it
is only a one-time activity.

Kindly suggest the best way to deal with this situation.


Regards,
Nitin Kumar Sharma.


Re: Kafka server relocation

2015-03-25 Thread Mayuresh Gharat
You can use MirrorMaker to move data from one datacenter to the other, and
once all the data has been moved, you can shut down the brokers in the source
datacenter by doing a controlled shutdown.

Thanks,

Mayuresh

On Wed, Mar 25, 2015 at 2:35 PM, Jiangjie Qin j...@linkedin.com.invalid
wrote:

 If you want to do a seamless migration. I think a better way is to build a
 cross datacenter Kafka cluster temporarily. So the process is:
 1. Add several new Kafka brokers in your new datacenter and add them to
 the old cluster.
 2. Use replica assignment tool to reassign all the partitions to brokers
 in new datacenter.
 3. Perform controlled shutdown on the brokers in old datacenter.

 Jiangjie (Becket) Qin

 On 3/25/15, 2:01 PM, nitin sharma kumarsharma.ni...@gmail.com wrote:

 Hi Team,
 
 in my project, we have built a new datacenter for Kafka brokers and wants
 to migrate from current datacenter to new one.
 
 Switching producers and consumers wont be a problem provided New
 Datacenter
 has all the messages of existing Datacenter.
 
 
 i have only 1 topic with 2 partition that need to be migrated... that too
 it is only 1 time activity.
 
 Kindly suggest the best way to deal with this situation.
 
 
 Regards,
 Nitin Kumar Sharma.




-- 
-Regards,
Mayuresh R. Gharat
(862) 250-7125


Re: Get replication and partition count of a topic

2015-03-25 Thread srikannan
Ewen Cheslack-Postava ewen@... writes:

I'm also searching for the shortest way to find a topic's partition count, so
that the initialization code in the thread pool can set up the right number of
threads.

So far, the code below is the shortest way I have found.


import kafka.admin.AdminUtils;
import kafka.api.TopicMetadata;
import kafka.utils.ZKStringSerializer$;
import org.I0Itec.zkclient.ZkClient;

public class PartitionCountFromZk {

    public static void main(String[] args) {
        int sessionTimeoutMs = 1;
        int connectionTimeoutMs = 1;
        // Kafka stores plain strings in ZooKeeper, so use Kafka's ZK string serializer.
        ZkClient zkClient = new ZkClient("localhost:2181",
                sessionTimeoutMs, connectionTimeoutMs, ZKStringSerializer$.MODULE$);

        // Fetch the topic metadata from ZooKeeper and print the partition count.
        TopicMetadata metaData =
                AdminUtils.fetchTopicMetadataFromZk("forthtopic", zkClient);
        System.out.println(metaData.partitionsMetadata().size());
    }
}

Kindly reply if you find any other shorter way.
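
Another short option, if the 0.8.2 Java producer is on the classpath (the
broker address below is a placeholder), is to ask a producer for the
partition metadata directly instead of going through ZooKeeper; a minimal
sketch:

import java.util.Properties;

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.Producer;

public class PartitionCountViaProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("key.serializer",
                "org.apache.kafka.common.serialization.ByteArraySerializer");
        props.put("value.serializer",
                "org.apache.kafka.common.serialization.ByteArraySerializer");

        Producer<byte[], byte[]> producer = new KafkaProducer<>(props);
        // partitionsFor() returns one PartitionInfo per partition of the topic.
        System.out.println(producer.partitionsFor("forthtopic").size());
        producer.close();
    }
}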