[ 
https://issues.apache.org/jira/browse/ATLAS-801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15312047#comment-15312047
 ] 

Vimal Sharma commented on ATLAS-801:
------------------------------------

Below are my thoughts:

Distributed Store and Forward with Sort: We can store failed Kafka messages, 
along with the timestamp at which each message was generated, in a distributed 
store (HBase). When the Kafka server is back up, we collect the messages and 
sort them by timestamp (a MapReduce job?). The sorted messages can then be 
published to the Kafka topic.
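
A minimal sketch of the store-and-forward idea, assuming a plain in-memory 
list stands in for the distributed store (HBase) and a caller-supplied `send` 
function stands in for the Kafka producer. The names `StoredMessage`, 
`StoreAndForward`, `publish`, and `replay` are hypothetical and not part of 
Atlas or Kafka; `send` is assumed to raise `ConnectionError` when Kafka is 
unreachable.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass(order=True)
class StoredMessage:
    timestamp: float                     # when the hook generated the message
    payload: str = field(compare=False)  # excluded from ordering


class StoreAndForward:
    """Buffers messages that failed to send and replays them by timestamp."""

    def __init__(self, send: Callable[[str], None]):
        # Hypothetical publish function; assumed to raise ConnectionError
        # when the broker is down.
        self._send = send
        # Stand-in for the distributed store (HBase in the proposal).
        self._failed: List[StoredMessage] = []

    def publish(self, payload: str, timestamp: Optional[float] = None) -> bool:
        ts = time.time() if timestamp is None else timestamp
        try:
            self._send(payload)
            return True
        except ConnectionError:
            # Keep the generation timestamp so order can be restored later.
            self._failed.append(StoredMessage(ts, payload))
            return False

    def replay(self) -> int:
        """Publish stored messages in timestamp order; return how many were sent."""
        pending = sorted(self._failed)
        self._failed = []
        for i, msg in enumerate(pending):
            try:
                self._send(msg.payload)
            except ConnectionError:
                # Broker went down again: keep everything not yet sent.
                self._failed = pending[i:] + self._failed
                return i
        return len(pending)
```

In this sketch `replay()` would be called once Kafka is reachable again. A 
real implementation would replace the list with durable storage and the 
in-process sort with whatever job fits the backlog size.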

However, in the meantime, the next set of messages may also queue up. 
Depending on the number of messages received while the sort is running, we 
can either go with in-memory sorting or another MapReduce job.
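
The follow-up rounds could be sketched as a drain loop, assuming the backlog 
is a list of `(timestamp, payload)` tuples and that a simple count threshold 
decides between the two sorting approaches. `IN_MEMORY_LIMIT`, 
`choose_sort_strategy`, and `drain` are hypothetical names, and the threshold 
value is an arbitrary placeholder.

```python
from typing import Callable, List, Tuple

Message = Tuple[float, str]  # (generation timestamp, payload)

IN_MEMORY_LIMIT = 10_000  # assumed threshold; would need tuning in practice


def choose_sort_strategy(count: int, limit: int = IN_MEMORY_LIMIT) -> str:
    """Pick a sort approach from the number of queued messages."""
    return "in-memory" if count <= limit else "batch-job"


def drain(
    backlog: List[Message],
    publish: Callable[[str], None],
    limit: int = IN_MEMORY_LIMIT,
) -> List[str]:
    """Snapshot, sort, and publish the backlog until no messages remain.

    Returns the strategy chosen for each round.
    """
    rounds: List[str] = []
    while backlog:
        # Take a snapshot; messages arriving during this round wait for the next.
        batch = backlog[:]
        del backlog[: len(batch)]
        rounds.append(choose_sort_strategy(len(batch), limit))
        # Sketch only: both strategies sort in memory here. A real "batch-job"
        # round would hand the batch to an external job such as MapReduce.
        for _, payload in sorted(batch, key=lambda m: m[0]):
            publish(payload)
    return rounds
```

Note that this sketch only guarantees timestamp order within each round. A 
message that arrives late but carries an older timestamp than one already 
published would still be delivered out of order.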

> Atlas hooks would lose messages if Kafka is down for extended period of time
> ----------------------------------------------------------------------------
>
>                 Key: ATLAS-801
>                 URL: https://issues.apache.org/jira/browse/ATLAS-801
>             Project: Atlas
>          Issue Type: Improvement
>            Reporter: Hemanth Yamijala
>            Assignee: Hemanth Yamijala
>
> All integration hooks in Atlas write messages to Kafka which are picked up by 
> the Atlas server. If communication to Kafka breaks, then this results in loss 
> of metadata messages. This can be mitigated to some extent using multiple 
> replicas for Kafka topics (see ATLAS-515). This JIRA is to see if we can make 
> this even more robust and have some form of store and forward mechanism for 
> increased fault tolerance.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
