[
https://issues.apache.org/jira/browse/ATLAS-4155?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sarath Subramanian updated ATLAS-4155:
--------------------------------------
Component/s: atlas-core
> NotificationHookConsumer: Large Compressed Message Processing Problem
> ---------------------------------------------------------------------
>
> Key: ATLAS-4155
> URL: https://issues.apache.org/jira/browse/ATLAS-4155
> Project: Atlas
> Issue Type: Bug
> Components: atlas-core
> Affects Versions: 2.1.0
> Reporter: Ashutosh Mestry
> Assignee: Ashutosh Mestry
> Priority: Major
> Attachments: ATLAS-4155-Kafka-commit-supplied-offset.patch
>
>
> *Background*
> Notification messages can be large. To work around Kafka's limit on message
> size, Atlas compresses and splits messages: if a message exceeds the
> stipulated size threshold, it is compressed, and if the compressed message
> still exceeds that threshold, it is split into multiple messages.
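The compress-then-split behaviour described under Background can be sketched roughly as follows. This is only an illustration, not the Atlas implementation: the class name `MessageSplitSketch`, the `MAX_BYTES` value, and the choice of GZIP are all assumptions made for the example.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.zip.GZIPOutputStream;

public class MessageSplitSketch {
    // Hypothetical size limit in bytes; the real threshold is not given in this issue.
    static final int MAX_BYTES = 1024;

    /** GZIP-compresses the payload (GZIP is an assumption, not necessarily Atlas's codec). */
    static byte[] compress(byte[] raw) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPOutputStream gzip = new GZIPOutputStream(out)) {
            gzip.write(raw);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // The stream is closed at this point, so the GZIP trailer has been written.
        return out.toByteArray();
    }

    /** Cuts data into consecutive chunks of at most maxBytes each. */
    static List<byte[]> split(byte[] data, int maxBytes) {
        List<byte[]> parts = new ArrayList<>();
        for (int start = 0; start < data.length; start += maxBytes) {
            int end = Math.min(data.length, start + maxBytes);
            parts.add(Arrays.copyOfRange(data, start, end));
        }
        return parts;
    }

    /** Small messages pass through; larger ones are compressed, then split if still too big. */
    static List<byte[]> prepare(byte[] raw) {
        if (raw.length <= MAX_BYTES) {
            return Collections.singletonList(raw);
        }
        byte[] compressed = compress(raw);
        if (compressed.length <= MAX_BYTES) {
            return Collections.singletonList(compressed);
        }
        return split(compressed, MAX_BYTES);
    }
}
```

The consumer has to reverse both steps (stitch the splits, then uncompress), which is where the time in the scenario below is spent.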
> *Situation*
> Consider a message so large that uncompressing it takes longer than Kafka's
> timeout for the message. The offset of the large message is then not
> committed in time, so Kafka presents the same message again.
> Message Description:
> Number of splits: 8
> Compressed message size: 7,452,640
> Uncompressed message size: 520,803,946
> Time taken to uncompress and stitch messages: > 90 seconds
>
> Sequence:
> 1. 2021-02-10 14:57:24,221: first message received
> 2. 2021-02-10 14:58:36,052: all splits combined – 72 seconds
> 3. 2021-02-10 15:01:06,971: message processing completed – 90 seconds
> 4. 2021-02-10 15:01:17,158: Kafka commit failed. Elapsed time since first
>    message: 197 seconds
> 5. 2021-02-10 15:01:19,857: attempt #2: first message received
> 6. 2021-02-10 15:03:01,993: attempt #2: all splits combined – 102 seconds
> 7. 2021-02-10 15:04:44,896: attempt #2: Kafka commit failed. Elapsed time
>    since first message: 205 seconds
> 8. Back to #5
> *Solution*
> Track the last offset received. If Kafka presents the same offset again,
> commit that offset and move on to the next message.
>
>
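A minimal sketch of the proposed solution, assuming a single consumer thread reading one partition. The class `RedeliveryGuardSketch` and its `handle` method are hypothetical names for illustration; the `commit` callback stands in for whatever Kafka commit call the consumer uses, and none of this is taken from the attached patch.

```java
import java.util.function.LongConsumer;

public class RedeliveryGuardSketch {
    // Offset of the most recently received message; -1 means nothing seen yet.
    private long lastSeenOffset = -1L;

    /**
     * Handles one message. Returns true if the message was processed,
     * false if it was recognised as a redelivery and only committed.
     */
    public boolean handle(long offset, Runnable process, LongConsumer commit) {
        if (offset == lastSeenOffset) {
            // Same offset presented again: commit it and move on.
            commit.accept(offset);
            return false;
        }
        // Record the offset before processing so that a late or failed
        // commit still lets the next delivery be detected as a repeat.
        lastSeenOffset = offset;
        process.run();
        commit.accept(offset);
        return true;
    }
}
```

A consumer that reads several partitions would presumably need one such offset per topic-partition rather than a single field.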
--
This message was sent by Atlassian Jira
(v8.3.4#803005)