[
https://issues.apache.org/jira/browse/ATLAS-4204?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ashutosh Mestry updated ATLAS-4204:
-----------------------------------
Component/s: hive-integration
> Hive Hook: Improve HS2 Message Sending
> --------------------------------------
>
> Key: ATLAS-4204
> URL: https://issues.apache.org/jira/browse/ATLAS-4204
> Project: Atlas
> Issue Type: Improvement
> Components: hive-integration
> Reporter: Ashutosh Mestry
> Assignee: Ashutosh Mestry
> Priority: Major
>
> *Background*
> The HiveServer2 (HS2) hook for Atlas sends notification messages for both
> metadata (DDL operations) and lineage (DML operations).
> The Hive Metastore (HMS) hook already sends metadata information to Atlas;
> all of its messages are for DDL operations.
> As a result, duplicate messages about object updates are sent to Atlas,
> and Atlas processes them like any other message.
> This adds processing time and increases message volume. Atlas may also
> end up with incorrect data if the order of messages from HMS and HS2
> changes.
> *Solution*
> With this improvement, the HS2 hook will send only lineage messages. All DDL
> (schema definition) messages will continue to be sent from the HMS hook (no
> change here).
> This will reduce the volume of messages sent to Atlas from HiveServer2
> and improve performance by avoiding the processing of duplicate messages.
> The improvement is enabled via a configuration parameter, so existing
> behavior continues as is.
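
The filtering described under *Solution* can be sketched roughly as follows. This is only an illustration of the idea, not the Atlas hook code: the class `Hs2MessageFilter`, the `Kind` and `Message` types, and the property name `example.hs2.hook.lineage.only` are all hypothetical and do not come from the issue or from Atlas. The sketch assumes each outgoing message can be classified as DDL (metadata) or DML (lineage), and that a boolean switch decides whether DDL messages are dropped.

```java
import java.util.ArrayList;
import java.util.List;

public class Hs2MessageFilter {
    // Hypothetical property name, for illustration only; not an Atlas setting.
    static final String LINEAGE_ONLY_PROPERTY = "example.hs2.hook.lineage.only";

    // DDL = metadata (schema definition), DML = lineage.
    enum Kind { DDL, DML }

    // Minimal stand-in for a hook notification message (assumed shape).
    static final class Message {
        final Kind kind;
        final String description;

        Message(Kind kind, String description) {
            this.kind = kind;
            this.description = description;
        }
    }

    // With lineageOnly off, every message is kept (existing behavior).
    // With lineageOnly on, only DML (lineage) messages are kept.
    static List<Message> filter(List<Message> messages, boolean lineageOnly) {
        if (!lineageOnly) {
            return new ArrayList<>(messages);
        }
        List<Message> kept = new ArrayList<>();
        for (Message m : messages) {
            if (m.kind == Kind.DML) {
                kept.add(m);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // The switch defaults to false so that existing behavior is unchanged.
        boolean lineageOnly =
            Boolean.parseBoolean(System.getProperty(LINEAGE_ONLY_PROPERTY, "false"));

        List<Message> outgoing = new ArrayList<>();
        outgoing.add(new Message(Kind.DDL, "CREATE TABLE t (id INT)"));
        outgoing.add(new Message(Kind.DML, "INSERT INTO t SELECT id FROM s"));

        for (Message m : filter(outgoing, lineageOnly)) {
            System.out.println(m.kind + ": " + m.description);
        }
    }
}
```

Running it with `-Dexample.hs2.hook.lineage.only=true` keeps only the lineage message; without the flag, both messages pass through, which mirrors the requirement that existing behavior continues as is.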
--
This message was sent by Atlassian Jira
(v8.3.4#803005)