Ashutosh Mestry created ATLAS-4155:
--------------------------------------

             Summary: NotificationHookConsumer: Large Compressed Message 
Processing Problem
                 Key: ATLAS-4155
                 URL: https://issues.apache.org/jira/browse/ATLAS-4155
             Project: Atlas
          Issue Type: Bug
            Reporter: Ashutosh Mestry
            Assignee: Ashutosh Mestry


*Background*

Notification messages can be large. To work around Kafka's limit on message 
size, Atlas compresses and splits messages. If a message exceeds the 
stipulated size threshold, it is compressed. If the compressed message still 
exceeds that threshold, it is split into multiple messages.
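
The compress-then-split rule above can be sketched roughly as follows. This is 
only an illustration, not the Atlas implementation: the class name 
MessageSplitSketch, the prepare() method, the use of GZIP, and the threshold 
passed in as maxBytes are all assumptions made for the example.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.zip.GZIPOutputStream;

// Hypothetical sketch: names and the GZIP choice are assumptions, not Atlas code.
class MessageSplitSketch {

    // Compress a payload with GZIP (assumed codec for this example).
    static byte[] gzip(byte[] raw) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(out)) {
            gz.write(raw);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    // Returns the payload as-is if it fits; otherwise compresses it and,
    // if the compressed form is still too large, cuts it into chunks of
    // at most maxBytes each.
    static List<byte[]> prepare(byte[] raw, int maxBytes) {
        List<byte[]> parts = new ArrayList<>();
        if (raw.length <= maxBytes) {
            parts.add(raw);
            return parts;
        }
        byte[] compressed = gzip(raw);
        for (int start = 0; start < compressed.length; start += maxBytes) {
            int end = Math.min(start + maxBytes, compressed.length);
            parts.add(Arrays.copyOfRange(compressed, start, end));
        }
        return parts;
    }
}
```

The consumer has to do the reverse: collect every split, concatenate, and 
uncompress before it can process the message, which is where the time goes 
for very large payloads.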

*Situation*

Consider a message so large that uncompressing it takes longer than Kafka's 
message timeout. The offset of the large message is then not committed in 
time, so Kafka presents the same message again.

Message Description:
Number of splits: 8
Compressed message size: 7,452,640
Uncompressed message size: 520,803,946
Time taken to uncompress and stitch messages: > 90 seconds
 
Sequence:
1. 2021-02-10 14:57:24,221: first message received
2. 2021-02-10 14:58:36,052: all splits combined – 72 seconds
3. 2021-02-10 15:01:06,971: message processing completed – 90 seconds
4. 2021-02-10 15:01:17,158: Kafka commit failed. Elapsed time since first 
   message: 197 seconds
5. 2021-02-10 15:01:19,857: attempt #2: first message received
6. 2021-02-10 15:03:01,993: attempt #2: all splits combined – 102 seconds
7. 2021-02-10 15:04:44,896: attempt #2: Kafka commit failed. Elapsed time 
   since first message: 205 seconds
8. Back to #5

*Solution*

Maintain the last offset received. If Kafka presents the same offset again, 
commit that offset without reprocessing the message and move on to the next 
message.
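
A minimal sketch of that bookkeeping, under stated assumptions: the class 
LastOffsetTracker and its methods are hypothetical names for illustration, 
and tracking the offset per partition is an assumption on top of the 
description above. It is not the actual NotificationHookConsumer change.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper: remembers the last offset seen on each partition
// so that a re-delivered message can be recognized.
class LastOffsetTracker {
    private final Map<Integer, Long> lastOffsetByPartition = new HashMap<>();

    // True if this exact offset was the last one recorded for the partition,
    // i.e. Kafka is presenting the same message again.
    boolean isRedelivery(int partition, long offset) {
        Long last = lastOffsetByPartition.get(partition);
        return last != null && last == offset;
    }

    // Remember the offset of the message that was just received.
    void record(int partition, long offset) {
        lastOffsetByPartition.put(partition, offset);
    }
}
```

In a consumer loop, the idea would be to call isRedelivery() before 
processing: if it returns true, commit the offset and skip the message; 
otherwise call record() and process as usual.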

 

 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
