Ashutosh Mestry created ATLAS-2634:
--------------------------------------

             Summary: Large Notification Messages: Avoid Processing of Already 
Processed Messages
                 Key: ATLAS-2634
                 URL: https://issues.apache.org/jira/browse/ATLAS-2634
             Project: Atlas
          Issue Type: Bug
          Components:  atlas-core
    Affects Versions: trunk
            Reporter: Ashutosh Mestry
            Assignee: Ashutosh Mestry
             Fix For: trunk


*Scenario*

If a hook encounters a message that is larger than what Kafka can handle, it 
compresses the message, splits it, or does both, to bring it down to a size 
that Kafka can handle.

When Atlas encounters such a message while processing messages from the hook, 
it uses the appropriate strategy to restore the message to its original 
format.

Processing a message of this type may take longer than the threshold mandated 
by Kafka for commit. If the processing exceeds the threshold, Kafka will 
resend that message.

This causes the message to be reprocessed. 

Given this, the message may be stuck in the queue forever or, at the very 
least, be processed two or more times.

 

*Solution*
 * Record the message ids of large messages.
 ** For messages with no version number, calculate the MD5 hash of the message 
and use that as the message id.
 * If a message with the same id is encountered again, commit it without 
processing it again.
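
The steps above can be sketched roughly as follows. This is a minimal 
illustration in Python, not the Atlas implementation: the names 
ProcessedMessageTracker and handle, the bounded capacity, and the process 
and commit callbacks are all assumptions made for the example. A message 
that carries no id falls back to the MD5 hash of its payload.

```python
import hashlib
from collections import OrderedDict


class ProcessedMessageTracker:
    """Hypothetical helper: remembers ids of recently processed large messages.

    The capacity bound is an assumption so the record cannot grow forever;
    the oldest id is evicted first.
    """

    def __init__(self, capacity=100):
        self._capacity = capacity
        self._seen = OrderedDict()

    @staticmethod
    def message_id(payload, msg_id=None):
        # Use the message's own id when it has one; otherwise fall back
        # to the MD5 hash of the payload, as the issue proposes.
        if msg_id:
            return msg_id
        return hashlib.md5(payload).hexdigest()

    def already_processed(self, key):
        return key in self._seen

    def record(self, key):
        self._seen[key] = True
        self._seen.move_to_end(key)
        while len(self._seen) > self._capacity:
            self._seen.popitem(last=False)  # evict the oldest id


def handle(tracker, payload, msg_id, process, commit):
    """Process a message once; on redelivery, only commit it.

    `process` and `commit` are stand-ins for the real entity processing
    and the Kafka offset commit. Returns True if the message was processed.
    """
    key = tracker.message_id(payload, msg_id)
    if tracker.already_processed(key):
        commit()  # seen before: commit without processing
        return False
    process(payload)
    tracker.record(key)
    commit()
    return True
```

In this sketch the id is recorded only after processing succeeds, so a 
message whose processing fails part-way is still processed on redelivery. 
The record is kept in memory only, so it would not survive a restart.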



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
