Ashutosh Mestry created ATLAS-2634:
--------------------------------------
Summary: Large Notification Messages: Avoid Processing of Already
Processed Messages
Key: ATLAS-2634
URL: https://issues.apache.org/jira/browse/ATLAS-2634
Project: Atlas
Issue Type: Bug
Components: atlas-core
Affects Versions: trunk
Reporter: Ashutosh Mestry
Assignee: Ashutosh Mestry
Fix For: trunk
*Scenario*
If a hook encounters a message that is larger than what Kafka can handle,
it compresses the message, splits it, or does both, to bring it down to a
size that Kafka can handle.
When Atlas encounters such a message while processing messages from the
hook, it uses the appropriate strategy to get the message back into the
correct format.
When a message of this type is processed, the processing may take longer
than the threshold Kafka mandates for a commit. If the processing exceeds
that threshold, Kafka resends the message, which causes the message to be
reprocessed.
As a result, the message may be stuck in the queue forever or, at the very
least, be processed several times (at least twice).
*Solution*
* Record the message IDs of large messages.
** For messages with no version number, calculate the MD5 hash of the message
and use that as the message ID.
* If a message with the same ID is encountered again, commit it without
processing it again.
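The proposed approach can be illustrated with a minimal Java sketch. This is not the Atlas implementation: the class name LargeMessageTracker, the handle and resolveId methods, the bound of 1000 tracked IDs, and the use of Runnable callbacks for "process" and "commit" are all assumptions made for illustration. It only shows the idea of remembering an ID (or an MD5 hash when the message carries no ID) and committing a repeated message without processing it.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of Atlas: remembers which large messages were processed.
public class LargeMessageTracker {
    // Assumed bound so the record of IDs cannot grow without limit.
    private static final int MAX_TRACKED_IDS = 1000;

    // Insertion-ordered map used as a bounded set; the oldest ID is evicted first.
    private final Map<String, Boolean> processedIds =
        new LinkedHashMap<String, Boolean>() {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, Boolean> eldest) {
                return size() > MAX_TRACKED_IDS;
            }
        };

    // Returns the message's own ID if it has one, otherwise the MD5 hex digest of the body.
    public static String resolveId(String msgId, String body) {
        if (msgId != null && !msgId.isEmpty()) {
            return msgId;
        }
        try {
            MessageDigest md5 = MessageDigest.getInstance("MD5");
            byte[] digest = md5.digest(body.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 digest is unavailable", e);
        }
    }

    // Processes the message only the first time its ID is seen; always commits.
    // Returns true if the message was processed, false if it was skipped as a repeat.
    public synchronized boolean handle(String msgId, String body,
                                       Runnable process, Runnable commit) {
        String id = resolveId(msgId, body);
        boolean firstTime = !processedIds.containsKey(id);
        if (firstTime) {
            process.run();
            // Recorded only after processing succeeds, so a failed attempt is retried.
            processedIds.put(id, Boolean.TRUE);
        }
        commit.run();
        return firstTime;
    }
}
```

In this sketch the ID is recorded after processing completes, so a message that Kafka resends because the commit came too late is recognized on redelivery and committed immediately. The record is kept in memory only, so it would not survive a restart; whether that is acceptable is a design decision outside this sketch.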
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)