[
https://issues.apache.org/jira/browse/ATLAS-3169?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ernani Pereira de Mattos Junior updated ATLAS-3169:
---------------------------------------------------
Description:
One of the sources is creating a long lag, and hence the data from other
sources is not processed until the lag is cleared. Is there a way to
make a queue take priority?
==
The customer would like Kafka to use some type of priority queue to avoid the
FIFO nature of Kafka's topics. This concerns the ATLAS_HOOK topic, where the
customer wants sources other than the Hive hook to be processed faster.
Basically, the idea is to let a source publish messages at different
levels of priority:
1) high_priority_queue
2) low_priority_queue
Atlas would then have to consume these messages according to the queue priority.
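A minimal sketch of the consumption order described above, assuming strict
priority between the two queues. It uses in-memory deques as stand-ins for
Kafka topics; the function drain_by_priority, the batch size, and the sample
messages are hypothetical and are not part of Atlas or the Kafka client API.

```python
from collections import deque


def drain_by_priority(high, low, batch_size=3):
    """Return up to batch_size messages, always taking from `high` before `low`."""
    batch = []
    while len(batch) < batch_size and (high or low):
        # Only fall back to the low-priority queue when the high one is empty.
        source = high if high else low
        batch.append(source.popleft())
    return batch


# Hypothetical stand-ins for the two proposed queues.
high_priority_queue = deque(["other-1", "other-2"])
low_priority_queue = deque(["hive-1", "hive-2", "hive-3"])

first = drain_by_priority(high_priority_queue, low_priority_queue)
second = drain_by_priority(high_priority_queue, low_priority_queue)
print(first)
print(second)
```

With strict priority like this, a steady stream of high-priority messages
would starve the low-priority queue, so a real consumer would probably need
a fairness limit as well.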
Business Case:
Topics or prioritizing messages for the ATLAS_HOOK topic.
HIVECLUSTER_ATLAS – publish --> KAFKA-ATLAS_HOOK – subscribed --> ATLAS-1.1
OTHER_Services – publish --> KAFKA-ATLAS_HOOK – subscribed --> ATLAS-1.1
Note: Hive is the biggest publisher. The messages produced by Hive currently
have a lag of about a week, and that lag impacts the other publishing
services.
Some References:
[https://stackoverflow.com/a/30686523]
[https://cwiki.apache.org/confluence/display/KAFKA/KIP-349%3A+Priorities+for+Source+Topics]
- KAFKA-6690
"Single-Event Processing" in the book Kafka: The Definitive Guide: Real-Time
Data and Stream Processing at Scale
was:
One of the sources is creating a long lag, and hence the data from other
sources is not processed until the lag is cleared. Is there a way to
make a queue take priority?
==
The customer would like Kafka to use some type of priority queue to avoid the
FIFO nature of Kafka's topics. This concerns the ATLAS_HOOK topic, where the
customer wants sources other than the Hive hook to be processed faster.
Basically, the idea is to let a source publish messages at different
levels of priority:
1) high_priority_queue
2) low_priority_queue
Atlas would then have to consume these messages according to the queue priority.
Business Case:
Topics or prioritizing messages for the ATLAS_HOOK topic.
<MELD>_HIVECLUSTER_ATLAS – publish --> KAFKA-ATLAS_HOOK – subscribed -->
ATLAS-1.1
OTHERCLUSTER_ATLAS-1.1 – publish --> KAFKA-ATLAS_HOOK – subscribed --> ATLAS-1.1
Note: MELD Hive is the biggest publisher. The messages produced by Hive
currently have a lag of about a week, and that lag impacts the other
publishing services.
Some References:
[https://stackoverflow.com/a/30686523]
[https://cwiki.apache.org/confluence/display/KAFKA/KIP-349%3A+Priorities+for+Source+Topics]
- KAFKA-6690
"Single-Event Processing" in the book Kafka: The Definitive Guide: Real-Time
Data and Stream Processing at Scale
> Create priority Queue for Atlas messages coming from 2 different sources;
> --------------------------------------------------------------------------
>
> Key: ATLAS-3169
> URL: https://issues.apache.org/jira/browse/ATLAS-3169
> Project: Atlas
> Issue Type: Improvement
> Components: atlas-intg
> Affects Versions: 1.1.0
> Reporter: Ernani Pereira de Mattos Junior
> Priority: Critical
>