[
https://issues.apache.org/jira/browse/CHUKWA-707?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
shreyas subramanya updated CHUKWA-707:
--------------------------------------
Attachment: CHUKWA-707.patch1
I have created the first patch for the Kafka integration and uploaded it for
review. It currently uses Kafka as a replacement for the in-memory chunk queue
in Chukwa. The flow is as follows:
Adaptor -> KafkaQueue -> KafkaBroker
KafkaConnector -> multiple KafkaConsumer threads -> PipelineWriters
(each KafkaConsumer sets up a pipeline)
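As a rough illustration of the flow above (not code from the patch), the
sketch below uses an in-memory BlockingQueue as a stand-in for the Kafka
broker, with several consumer threads each handing chunks to a writer. The
class name FlowSketch, the STOP marker, and the shared "written" list are all
hypothetical and exist only for this example; a real setup would go through
the Kafka producer and consumer clients instead.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.LinkedBlockingQueue;

// Illustrative only: an in-memory queue stands in for the Kafka broker.
// None of these names come from the patch.
class FlowSketch {
    // Hypothetical end-of-stream marker, one per consumer thread.
    static final String STOP = "__stop__";

    // Publishes the chunks, drains them with `consumers` threads,
    // and returns everything the writer stand-in received.
    static List<String> run(List<String> chunks, int consumers) {
        BlockingQueue<String> broker = new LinkedBlockingQueue<>();
        List<String> written = new CopyOnWriteArrayList<>();
        List<Thread> threads = new ArrayList<>();

        // Consumer side: each thread plays the role of one consumer + pipeline.
        for (int i = 0; i < consumers; i++) {
            Thread t = new Thread(() -> {
                try {
                    while (true) {
                        String chunk = broker.take();
                        if (chunk.equals(STOP)) {
                            return;
                        }
                        written.add(chunk); // stand-in for a pipeline writer
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            t.start();
            threads.add(t);
        }

        try {
            // Adaptor side: chunks go into the queue instead of memory-only state.
            for (String c : chunks) {
                broker.put(c);
            }
            // One stop marker per consumer so every thread terminates.
            for (int i = 0; i < consumers; i++) {
                broker.put(STOP);
            }
            for (Thread t : threads) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while running sketch", e);
        }
        return written;
    }
}
```

Calling FlowSketch.run(chunks, 3) returns every chunk exactly once, in no
guaranteed order, which mirrors the lack of cross-consumer ordering one would
expect with several consumer threads.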
The following configurations are needed:
conf/chukwa-agent-conf.xml
-> chukwaAgent.chunk.queue =
   org.apache.hadoop.chukwa.datacollection.agent.KafkaQueue
   (this sets up the Kafka producer)
-> chukwa.agent.connector =
   org.apache.hadoop.chukwa.datacollection.connector.kafka.KafkaConnector
   (this sets up the Kafka consumer)
conf/consumer.properties
conf/producer.properties
Each data type will be a new topic on Kafka.
I am working on improving the following areas:
1. Partitioning the topics so that we can have parallelism in a consumer group
2. Making the key format configurable
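To make the topic-per-data-type idea and the two planned improvements more
concrete, here is a minimal sketch. It is an assumption-laden illustration, not
what the patch does: the class TopicSketch, its method names, the "%s/%s" key
format, and the sanitizing of data type names are all hypothetical, and the
hash used to pick a partition is a simple stand-in (Kafka's own partitioner
may hash keys differently).

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Illustrative only; none of these names come from the patch.
class TopicSketch {
    // One topic per data type. As an assumption, characters outside
    // [A-Za-z0-9._-] are replaced so the result is a legal topic name.
    static String topicFor(String dataType) {
        return dataType.replaceAll("[^A-Za-z0-9._-]", "_");
    }

    // Hypothetical configurable key format, e.g. "%s/%s" applied to
    // (source host, stream name).
    static String keyFor(String format, String source, String streamName) {
        return String.format(format, source, streamName);
    }

    // Stable key -> partition mapping, so chunks with the same key always
    // land in the same partition and keep their relative order.
    static int partitionFor(String key, int numPartitions) {
        int h = Arrays.hashCode(key.getBytes(StandardCharsets.UTF_8));
        return Math.floorMod(h, numPartitions); // never negative
    }
}
```

With more than one partition per topic, the consumers in a group can each take
a share of the partitions, while keyed chunks from the same source still
arrive in order within their partition.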
> Replace Chukwa collector with Apache Kafka
> ------------------------------------------
>
> Key: CHUKWA-707
> URL: https://issues.apache.org/jira/browse/CHUKWA-707
> Project: Chukwa
> Issue Type: New Feature
> Reporter: Eric Yang
> Assignee: shreyas subramanya
> Attachments: CHUKWA-707.patch1
>
>
> The Chukwa collector has not evolved since 2010. Newer frameworks offer
> better message queue features, and Apache Kafka looks like a good
> replacement for the Chukwa collector.
> The Chukwa agent can implement a connector to Apache Kafka to replace the
> Chukwa collector, and an HBase consumer to write data to HBase. The HICC
> REST API would change to use the new HBase storage format.
--
This message was sent by Atlassian JIRA
(v6.2#6252)