[ 
https://issues.apache.org/jira/browse/CHUKWA-707?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

shreyas subramanya updated CHUKWA-707:
--------------------------------------

    Attachment: CHUKWA-707.patch1

I have created the first patch for the Kafka integration and uploaded it for 
review. It currently uses Kafka as a replacement for the in-memory chunk queue 
in Chukwa. The flow is as follows: 
Adaptor -> KafkaQueue -> KafkaBroker
KafkaConnector -> multiple KafkaConsumer threads -> PipelineWriters
(each KafkaConsumer sets up a pipeline)

The following configurations are needed:
 conf/chukwa-agent-conf.xml
  -> chukwaAgent.chunk.queue =
     org.apache.hadoop.chukwa.datacollection.agent.KafkaQueue
     (this sets up the Kafka producer)
  -> chukwa.agent.connector =
     org.apache.hadoop.chukwa.datacollection.connector.kafka.KafkaConnector
     (this sets up the Kafka consumer)
 conf/consumer.properties
 conf/producer.properties
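
For reference, the two agent settings above might look like this in 
conf/chukwa-agent-conf.xml. This is only a sketch: it assumes the file uses the 
usual Hadoop-style property layout and that the entries go inside the existing 
<configuration> element. The property names and class names are the ones listed 
above; nothing else is taken from the patch.

```xml
<!-- Sketch only: add inside the existing <configuration> element. -->

<!-- Use the Kafka-backed queue instead of the in-memory chunk queue
     (this is the Kafka producer side). -->
<property>
  <name>chukwaAgent.chunk.queue</name>
  <value>org.apache.hadoop.chukwa.datacollection.agent.KafkaQueue</value>
</property>

<!-- Use the Kafka connector, which starts the KafkaConsumer threads
     (this is the Kafka consumer side). -->
<property>
  <name>chukwa.agent.connector</name>
  <value>org.apache.hadoop.chukwa.datacollection.connector.kafka.KafkaConnector</value>
</property>
```

The contents of conf/consumer.properties and conf/producer.properties are not 
shown here, since they depend on the Kafka version and broker setup in use.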

Each data type will be a new topic on Kafka.

I am working on improving the following areas:
1. Partitioning the topics so that we can have parallelism in a consumer group
2. Making the key format configurable

> Replace Chukwa collector with Apache Kafka
> ------------------------------------------
>
>                 Key: CHUKWA-707
>                 URL: https://issues.apache.org/jira/browse/CHUKWA-707
>             Project: Chukwa
>          Issue Type: New Feature
>            Reporter: Eric Yang
>            Assignee: shreyas subramanya
>         Attachments: CHUKWA-707.patch1
>
>
> The Chukwa collector has not evolved since 2010.  Newer frameworks offer 
> better message-queue features, and Apache Kafka looks like a good 
> replacement for the Chukwa collector.
> The Chukwa agent can implement a connector to Apache Kafka to replace the 
> Chukwa collector, and an HBase consumer to write data to HBase.  The HICC 
> REST API would change to the new HBase storage format.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
