[ 
https://issues.apache.org/jira/browse/ATLAS-3169?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ernani Pereira de Mattos Junior updated ATLAS-3169:
---------------------------------------------------
    Description: 
One of the sources is creating a long lag, so data from the other sources 
is not processed until the lag is cleared. Is there a way to make a queue 
take priority?
 ==

The customer would like Kafka to use some type of priority queue, to avoid 
the FIFO nature of Kafka's topics. This concerns the ATLAS_HOOK topic, where 
the customer wants sources other than the Hive hook to be processed faster.

Basically, the idea is to let each source publish messages at different 
levels of priority:

1) high_priority_queue
2) low_priority_queue

Atlas would have to consume these messages according to the queue priority.
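
The consumption order described above can be sketched as follows. This is only an illustration under stated assumptions: in-memory deques stand in for the two queues, and poll_by_priority is a hypothetical helper, not an Atlas or Kafka API.

```python
from collections import deque


def poll_by_priority(high, low, max_records):
    """Return up to max_records messages, always taking from `high` first.

    `high` and `low` are deques standing in for the two proposed queues
    (hypothetical; not real Kafka topics or consumers).
    """
    batch = []
    while len(batch) < max_records and (high or low):
        # Strict priority: only read the low queue when the high one is empty.
        source = high if high else low
        batch.append(source.popleft())
    return batch


# Example data: a Hive backlog in the low queue, other services in the high one.
high_priority_queue = deque(["other-1", "other-2"])
low_priority_queue = deque(["hive-1", "hive-2", "hive-3"])

first = poll_by_priority(high_priority_queue, low_priority_queue, 3)
second = poll_by_priority(high_priority_queue, low_priority_queue, 3)
```

With strict priority like this, the low-priority queue is only served when the high-priority queue is empty, so a steady stream of high-priority messages could starve it. A real design might need a cap on consecutive high-priority reads.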

Business Case:

Separate topics, or message prioritization, for the ATLAS_HOOK topic.

HIVECLUSTER_ATLAS – publish --> KAFKA-ATLAS_HOOK – subscribed --> ATLAS-1.1
OTHER_Services – publish --> KAFKA-ATLAS_HOOK – subscribed --> ATLAS-1.1

Note: Hive is the biggest publisher, and there is a lag of about a week on 
the messages produced by Hive. Currently the lag produced by Hive impacts 
the other services that publish to the topic.

Some References:
 [https://stackoverflow.com/a/30686523]
 [https://cwiki.apache.org/confluence/display/KAFKA/KIP-349%3A+Priorities+for+Source+Topics]
 - KAFKA-6690
 "Single-Event Processing" - book: Kafka: The Definitive Guide: Real-Time Data 
 and Stream Processing at Scale

  was:
One of the sources is creating a long lag, and hence the data from other 
sources are not getting processed until the lag is cleared. Is there a way to 
make a queue take priority?
==

The customer would like to have Kafka use some type of priority Queue so it 
would avoid the FIFO nature of Kafka's topics. This is regarding the ATLAS_HOOK 
in which the customer want other resources besides Hive hook to have a faster 
processing.

Basically the idea is to have some resource to publish messages in different 
levels of priority.

1) high_priority_queue
2) low_priority_queue

Atlas would have to consumer these messages accordingly to the Queue priority.

Business Case:

Topics or prioritizing messages for the ATLAS_HOOK topic.

<MELD>_HIVECLUSTER_ATLAS – publish --> KAFKA-ATLAS_HOOK – subscribed --> 
ATLAS-1.1
OTHERCLUSTER_ATLAS-1.1 – publish --> KAFKA-ATLAS_HOOK – subscribed --> ATLAS-1.1

note: MELD Hive is the bigger publisher, They have a LAG of about a week on the 
amount of messages produced by Hive. Currently the LAG produce by Hive impacts 
the other services publishers;

Some References:
[https://stackoverflow.com/a/30686523]
[https://cwiki.apache.org/confluence/display/KAFKA/KIP-349%3A+Priorities+for+Source+Topics]
 - KAFKA-6690
"Single-Event Processing" - book:Kafka: The Definitive Guide: Real-Time Data 
And Stream Processing At Scale


> Create priority Queue for Atlas messages coming from 2 different sources; 
> --------------------------------------------------------------------------
>
>                 Key: ATLAS-3169
>                 URL: https://issues.apache.org/jira/browse/ATLAS-3169
>             Project: Atlas
>          Issue Type: Improvement
>          Components: atlas-intg
>    Affects Versions: 1.1.0
>            Reporter: Ernani Pereira de Mattos Junior
>            Priority: Critical
>



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
