Repository: flume Updated Branches: refs/heads/flume-1.6 9cc850825 -> d8968a1f6
FLUME-2455. Kafka Sink Documentation. (Thilina Buddika, Gwen Shapira via Hari)

Project: http://git-wip-us.apache.org/repos/asf/flume/repo
Commit: http://git-wip-us.apache.org/repos/asf/flume/commit/d8968a1f
Tree: http://git-wip-us.apache.org/repos/asf/flume/tree/d8968a1f
Diff: http://git-wip-us.apache.org/repos/asf/flume/diff/d8968a1f

Branch: refs/heads/flume-1.6
Commit: d8968a1f6af4067a406dc95912ea083bd3c366b2
Parents: 9cc8508
Author: Hari Shreedharan <[email protected]>
Authored: Tue Sep 23 22:33:55 2014 -0700
Committer: Hari Shreedharan <[email protected]>
Committed: Tue Sep 23 22:36:50 2014 -0700

----------------------------------------------------------------------
 flume-ng-doc/sphinx/FlumeUserGuide.rst | 54 +++++++++++++++++++++++++++++
 1 file changed, 54 insertions(+)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/flume/blob/d8968a1f/flume-ng-doc/sphinx/FlumeUserGuide.rst
----------------------------------------------------------------------
diff --git a/flume-ng-doc/sphinx/FlumeUserGuide.rst b/flume-ng-doc/sphinx/FlumeUserGuide.rst
index 3a47560..11c1ad7 100644
--- a/flume-ng-doc/sphinx/FlumeUserGuide.rst
+++ b/flume-ng-doc/sphinx/FlumeUserGuide.rst
@@ -2137,6 +2137,60 @@ auth.proxyUser          --      The effective user for HDFS actions, if different from
                                 the kerberos principal
 ======================= ======= ===========================================================
+
+Kafka Sink
+~~~~~~~~~~
+This is a Flume Sink implementation that can publish data to a
+`Kafka <http://kafka.apache.org/>`_ topic. One objective is to integrate Flume
+with Kafka so that pull-based processing systems can process the data coming
+through various Flume sources. It currently supports the Kafka 0.8.x series of releases.
+
+Required properties are marked in bold font.
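+
+As a minimal sketch, a configuration that sets only the required properties
+could look like the following (the agent, sink and channel names ``a1``,
+``k1`` and ``c1``, as well as the broker hosts, are placeholders); every
+other property falls back to its default:
+
+.. code-block:: properties
+
+    # placeholder names; only the required Kafka Sink properties are set
+    a1.sinks.k1.type = org.apache.flume.sink.kafka.KafkaSink
+    a1.sinks.k1.kafka.metadata.broker.list = broker1:9092,broker2:9092
+    a1.sinks.k1.channel = c1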
+
+
+=============================== =================== =============================================================================================
+Property Name                   Default             Description
+=============================== =================== =============================================================================================
+**type**                        --                  Must be set to ``org.apache.flume.sink.kafka.KafkaSink``
+**kafka.metadata.broker.list**  --                  List of brokers the Kafka Sink will connect to in order to get the list of topic partitions.
+                                                    This can be a partial list of brokers, but we recommend at least two for HA.
+                                                    The format is a comma-separated list of hostname:port entries.
+topic                           default-flume-topic The topic in Kafka to which the messages will be published. If this parameter is configured,
+                                                    messages will be published to this topic.
+                                                    If the event header contains a "topic" field, the event will be published to that topic,
+                                                    overriding the topic configured here.
+batchSize                       100                 How many messages to process in one batch. Larger batches improve throughput while adding latency.
+kafka.request.required.acks     0                   How many replicas must acknowledge a message before it is considered successfully written.
+                                                    Accepted values are 0 (never wait for acknowledgement), 1 (wait for the leader only) and -1 (wait for all replicas).
+                                                    The default is the fastest option, but we *highly recommend* setting this to -1 to avoid data loss.
+kafka.producer.type             sync                Whether messages should be sent to the broker synchronously or using an asynchronous background thread.
+                                                    Accepted values are sync (safest) and async (faster but potentially unsafe).
+Other Kafka Producer Properties --                  These properties are used to configure the Kafka producer. Any producer property supported
+                                                    by Kafka can be used. The only requirement is to prepend the property name with the prefix ``kafka.``.
+=============================== =================== =============================================================================================
+
+.. note:: Kafka Sink uses the ``topic`` and ``key`` properties from the FlumeEvent headers to send events to Kafka.
+          If ``topic`` exists in the headers, the event will be sent to that specific topic, overriding the topic configured for the sink.
+          If ``key`` exists in the headers, the key will be used by Kafka to partition the data between the topic partitions. Events with the same key
+          will be sent to the same partition. If the key is null, events will be sent to random partitions.
+
+An example configuration of a Kafka sink is given below. Properties starting
+with the prefix ``kafka`` are used when instantiating the Kafka producer.
+The properties that are passed when creating the Kafka producer are not
+limited to the properties given in this example. It is also possible to
+include custom properties here and access them inside the preprocessor
+through the Flume Context object passed in as a method argument.
+
+.. code-block:: properties
+
+    a1.sinks.k1.type = org.apache.flume.sink.kafka.KafkaSink
+    a1.sinks.k1.topic = mytopic
+    a1.sinks.k1.kafka.metadata.broker.list = localhost:9092
+    a1.sinks.k1.kafka.request.required.acks = 1
+    a1.sinks.k1.batchSize = 20
+    a1.sinks.k1.channel = c1
+
 Custom Sink
 ~~~~~~~~~~~
