Ashutosh Mestry created ATLAS-4155:
--------------------------------------
Summary: NotificationHookConsumer: Large Compressed Message Processing Problem
Key: ATLAS-4155
URL: https://issues.apache.org/jira/browse/ATLAS-4155
Project: Atlas
Issue Type: Bug
Reporter: Ashutosh Mestry
Assignee: Ashutosh Mestry
*Background*
Notification messages can be large. To work around Kafka's limit on message
size, Atlas compresses and splits messages. If a message exceeds the
stipulated size threshold, it is compressed. If the compressed message still
exceeds the threshold, it is split into multiple messages.
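The compress-then-split rule described above can be sketched roughly as follows. This is a minimal illustration, not the Atlas implementation: the class name, the method names, the use of gzip, and the byte-based threshold are all assumptions made for the example.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.zip.GZIPOutputStream;

// Hypothetical sketch: names and the choice of gzip are assumptions, not Atlas code.
public class CompressAndSplitSketch {

    // Compress the raw bytes with gzip.
    static byte[] gzip(byte[] raw) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(buffer)) {
            gz.write(raw);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // Cut the payload into consecutive chunks of at most maxBytes each.
    static List<byte[]> split(byte[] payload, int maxBytes) {
        List<byte[]> parts = new ArrayList<>();
        for (int start = 0; start < payload.length; start += maxBytes) {
            int end = Math.min(start + maxBytes, payload.length);
            parts.add(Arrays.copyOfRange(payload, start, end));
        }
        return parts;
    }

    // Send as-is if small enough, else compress, else compress and split.
    static List<byte[]> prepare(byte[] raw, int maxBytes) {
        if (raw.length <= maxBytes) {
            return Collections.singletonList(raw);
        }
        byte[] compressed = gzip(raw);
        if (compressed.length <= maxBytes) {
            return Collections.singletonList(compressed);
        }
        return split(compressed, maxBytes);
    }
}
```

A real producer would also need to tag each part with a message id, a split index, and a split count so that the consumer can stitch the parts back together. Those fields are omitted here.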
*Situation*
Consider a message so large that uncompressing it takes longer than Kafka's
message timeout. The offset of the large message is then not committed in
time, so Kafka presents the same message again.
Message Description:
Number of splits: 8
Compressed message size: 7,452,640
Uncompressed message size: 520,803,946
Time taken to uncompress and stitch messages: > 90 seconds
Sequence:
1. 2021-02-10 14:57:24,221: first message received
2. 2021-02-10 14:58:36,052: all splits combined – 72 seconds
3. 2021-02-10 15:01:06,971: message processing completed – 90 seconds
4. 2021-02-10 15:01:17,158: Kafka commit failed. Elapsed time since first message: 197 seconds
5. 2021-02-10 15:01:19,857: attempt #2: first message received
6. 2021-02-10 15:03:01,993: attempt #2: all splits combined – 102 seconds
7. 2021-02-10 15:04:44,896: attempt #2: Kafka commit failed. Elapsed time since first message: 205 seconds
8. Back to #5
*Solution*
Track the last offset received. If Kafka presents the same offset again,
commit that offset and move on to the next message.
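One way the proposed check could look is sketched below. It is only an illustration under stated assumptions: the class and method names are hypothetical, offsets are tracked per partition, and the actual processing and Kafka commit are passed in as plain callbacks rather than real consumer calls.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of the "remember the last offset" idea; not Atlas code.
public class RedeliveryGuardSketch {

    // Last offset received on each partition.
    private final Map<Integer, Long> lastOffsetByPartition = new HashMap<>();

    // Records the offset and reports whether it repeats the previous one on that partition.
    public boolean isRepeat(int partition, long offset) {
        Long previous = lastOffsetByPartition.put(partition, offset);
        return previous != null && previous == offset;
    }

    // Processes new offsets; for a repeated offset, only commits and moves on.
    // 'process' and 'commit' stand in for the real message handling and Kafka commit.
    public String handle(int partition, long offset, Runnable process, Runnable commit) {
        if (isRepeat(partition, offset)) {
            commit.run();
            return "skipped";
        }
        process.run();
        commit.run();
        return "processed";
    }
}
```

Skipping is safe here only on the assumption that the first attempt finished processing and failed solely at the commit step, as in the sequence above. If processing itself could fail partway, a repeated offset would need to be reprocessed instead.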