[ https://issues.apache.org/jira/browse/ATLAS-801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15312047#comment-15312047 ]
Vimal Sharma commented on ATLAS-801:
------------------------------------
Below are my thoughts:
Distributed Store and Forward with Sort: We can store failed Kafka messages,
along with the timestamp at which each message was generated, in a distributed
store (HBase). Once the Kafka server is back up, we collect the messages and
sort them by timestamp (a MapReduce job?). The sorted messages can then be
published to the Kafka topic.
However, in the meantime, the next set of messages may also queue up. Depending
on the number of messages received while the sort is running, we can either
sort them in memory or run another MapReduce job.
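As a rough illustration of the in-memory variant only, here is a minimal
sketch. It is not Atlas or Kafka code: the class name StoreAndForwardBuffer,
the store/drain methods and the publish callback are all hypothetical, and a
local heap stands in for the distributed store. The heap keeps messages ordered
by creation timestamp, so messages that fail while a drain is in progress can
be added without a separate re-sort.

```python
import heapq
import itertools
import time
from typing import Callable, List, Optional, Tuple


class StoreAndForwardBuffer:
    """Hypothetical in-memory stand-in for the distributed store (e.g. HBase)."""

    def __init__(self) -> None:
        # Entries are (creation timestamp, sequence number, message).
        self._heap: List[Tuple[float, int, str]] = []
        # The sequence number breaks ties between equal timestamps.
        self._seq = itertools.count()

    def store(self, message: str, created_at: Optional[float] = None) -> None:
        """Keep a message that could not be sent, keyed by creation time."""
        ts = time.time() if created_at is None else created_at
        heapq.heappush(self._heap, (ts, next(self._seq), message))

    def __len__(self) -> int:
        return len(self._heap)

    def drain(self, publish: Callable[[str], bool]) -> int:
        """Publish stored messages oldest first.

        `publish` is an assumed callback that returns True on success.
        Draining stops at the first failure so ordering is preserved;
        the failed message stays in the buffer. Returns the number sent.
        """
        sent = 0
        while self._heap:
            _, _, message = self._heap[0]
            if not publish(message):
                break
            heapq.heappop(self._heap)
            sent += 1
        return sent
```

A hook would call store() when a send to Kafka fails and drain() once Kafka is
reachable again. For backlogs too large for memory, the MapReduce route
mentioned above would replace the heap; that part is not sketched here.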
> Atlas hooks would lose messages if Kafka is down for extended period of time
> ----------------------------------------------------------------------------
>
> Key: ATLAS-801
> URL: https://issues.apache.org/jira/browse/ATLAS-801
> Project: Atlas
> Issue Type: Improvement
> Reporter: Hemanth Yamijala
> Assignee: Hemanth Yamijala
>
> All integration hooks in Atlas write messages to Kafka which are picked up by
> the Atlas server. If communication to Kafka breaks, then this results in loss
> of metadata messages. This can be mitigated to some extent using multiple
> replicas for Kafka topics (see ATLAS-515). This JIRA is to see if we can make
> this even more robust and have some form of store and forward mechanism for
> increased fault tolerance.