[
https://issues.apache.org/jira/browse/HADOOP-13633?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Huafeng Wang updated HADOOP-13633:
----------------------------------
Attachment: IntroduceApacheKafkaasaServiceinHadoop.pdf
Here is the draft design document.
Many thanks to [~zhz], [~drankye], [~rakeshr], [~umamaheswararao],
[~hayabusa], [~zhouwei] for their collaboration on this design.
Any advice or comment on the design is appreciated.
> Introduce Apache Kafka as a Service into Hadoop
> -----------------------------------------------
>
> Key: HADOOP-13633
> URL: https://issues.apache.org/jira/browse/HADOOP-13633
> Project: Hadoop Common
> Issue Type: New Feature
> Reporter: Huafeng Wang
> Assignee: Huafeng Wang
> Attachments: IntroduceApacheKafkaasaServiceinHadoop.pdf
>
>
> In HDFS-7343 we want to develop a comprehensive storage management solution
> that originated from community discussions. It aims to allow convenient,
> intelligent and effective use of various HDFS facilities such as
> erasure coding, HDFS cache and HSM offerings, based on valuable insights
> from events and data collected from namenodes, datanodes, frameworks and
> applications via a pub-sub messaging system. In HDFS-8940 it was discussed
> that the proposed large-scale inotify feature would be better
> implemented on top of Kafka, to allow thousands of consumers or inotify
> clients.
> Apache Kafka is a distributed messaging system that aims to provide a
> unified, high-throughput, low-latency platform for handling real-time data
> feeds, and it is currently widely used in the real-time stream processing field.
> Considering the two important use cases above, we’d like to
> propose introducing Kafka as a fundamental event pub-sub service in the Hadoop
> platform. Similar to the FileSystem offering, we’d like to provide a
> MessagingSystem in Hadoop style that conforms to Hadoop security, backed by
> an internal or an existing external Kafka cluster. The new service is intended
> to be convenient to use, and can be used to distribute and exchange various
> types of events across IO, storage, and computation that are produced by
> Hadoop itself, or by frameworks or applications on top of it. On this basis,
> valuable events can be analyzed in a centralized way so that meaningful
> applications and use cases can be developed.
> The design document is in progress and will be submitted in a week. Feedback
> is very welcome. Thanks!