-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/73239/
-----------------------------------------------------------
(Updated April 9, 2021, 11:41 p.m.)


Review request for atlas, Madhan Neethiraj and Sarath Subramanian.


Changes
-------

Updates include: Addressed review comments.


Bugs: ATLAS-4204
    https://issues.apache.org/jira/browse/ATLAS-4204


Repository: atlas


Description
-------

**Background**

Please see JIRA.

Terms and abbreviations used:
- Database objects: Databases, tables, views, etc.
- DDL: Data Definition Language. Database parlance. Operations performed on a database server that result in the creation of objects that hold data.
- DML: Data Manipulation Language. Database parlance. Operations performed on a database server that result in manipulation (add, update, delete) of data. These are performed using database objects that have already been created.

Hive hook is implemented by:
- _HiveHook_
- _HiveMetastoreImpl_
- _BaseHiveEvent_ has 2 methods:
  - getHiveMetastoreEntities: Called by HMS events.
  - getHiveEntities: Called by HS2 events.

**Approach**
- Introduce a flag _skipDDL_ indicating whether entities generated as part of a DDL operation should be skipped by a hook. This flag is set via the configuration parameter: atlas.hive.hook.ignore.ddl.operations
- Continue using HMS event processing unchanged.
- New: _ActiveEntityFilter_: Initializes the entity filter based on the skipDDL setting.
- New: _LineageOnlyFilter_: Retains entities that participate in lineage.
- New: _PassthroughFilter_: Does not do any filtering.
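
The filter selection described in the approach can be sketched roughly as below. This is only an illustration, not the code in the patch: the `Entity` class, the `init`/`apply` method shapes, and the set of type names treated as "participating in lineage" are all assumptions made for the example. The class names follow the description above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for an Atlas entity; only the type name is modelled.
final class Entity {
    final String typeName;

    Entity(String typeName) {
        this.typeName = typeName;
    }
}

// Common contract for all filters (method shape is an assumption).
interface EntityFilter {
    List<Entity> apply(List<Entity> entities);
}

// Does not do any filtering: returns the input unchanged.
final class PassthroughFilter implements EntityFilter {
    @Override
    public List<Entity> apply(List<Entity> entities) {
        return entities;
    }
}

// Retains only entities that participate in lineage.
final class LineageOnlyFilter implements EntityFilter {
    // Assumption: lineage participants are recognised by type name.
    // This set is illustrative, not the list used by the patch.
    private static final Set<String> LINEAGE_TYPES =
            new HashSet<>(Arrays.asList("hive_table", "hive_process"));

    @Override
    public List<Entity> apply(List<Entity> entities) {
        List<Entity> kept = new ArrayList<>();
        for (Entity e : entities) {
            if (LINEAGE_TYPES.contains(e.typeName)) {
                kept.add(e);
            }
        }
        return kept;
    }
}

// Picks the active filter once, based on the skipDDL setting.
final class ActiveEntityFilter {
    private static EntityFilter active = new PassthroughFilter();

    private ActiveEntityFilter() {
    }

    static void init(boolean skipDDL) {
        active = skipDDL ? new LineageOnlyFilter() : new PassthroughFilter();
    }

    static List<Entity> apply(List<Entity> entities) {
        return active.apply(entities);
    }
}
```

In this sketch a hook would call `ActiveEntityFilter.init(skipDDL)` once at startup, with skipDDL read from atlas.hive.hook.ignore.ddl.operations, and then pass each batch of entities through `ActiveEntityFilter.apply(...)` before sending the notification.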


Diffs (updated)
-----

  addons/hive-bridge/src/main/java/org/apache/atlas/hive/hook/AtlasHiveHookContext.java 128647147
  addons/hive-bridge/src/main/java/org/apache/atlas/hive/hook/HiveHook.java 79e87c79d
  addons/hive-bridge/src/main/java/org/apache/atlas/hive/hook/events/BaseHiveEvent.java 7c269ce53
  addons/hive-bridge/src/main/java/org/apache/atlas/hive/hook/utils/ActiveEntityFilter.java PRE-CREATION
  addons/hive-bridge/src/main/java/org/apache/atlas/hive/hook/utils/EntityFilter.java PRE-CREATION
  addons/hive-bridge/src/main/java/org/apache/atlas/hive/hook/utils/HiveDDLEntityFilter.java PRE-CREATION
  addons/hive-bridge/src/main/java/org/apache/atlas/hive/hook/utils/PassthroughFilter.java PRE-CREATION
  addons/hive-bridge/src/test/java/org/apache/atlas/hive/hook/utils/ActiveEntityFilterTest.java PRE-CREATION
  addons/hive-bridge/src/test/resources/atlas-application.properties 898b69c99
  addons/hive-bridge/src/test/resources/json/hs2-alter-view-v2.json PRE-CREATION
  addons/hive-bridge/src/test/resources/json/hs2-alter-view.json PRE-CREATION
  addons/hive-bridge/src/test/resources/json/hs2-create-db-v2.json PRE-CREATION
  addons/hive-bridge/src/test/resources/json/hs2-create-db.json PRE-CREATION
  addons/hive-bridge/src/test/resources/json/hs2-create-process-v2.json PRE-CREATION
  addons/hive-bridge/src/test/resources/json/hs2-create-process.json PRE-CREATION
  addons/hive-bridge/src/test/resources/json/hs2-create-table-v2.json PRE-CREATION
  addons/hive-bridge/src/test/resources/json/hs2-create-table.json PRE-CREATION
  addons/hive-bridge/src/test/resources/json/hs2-drop-db-v2.json PRE-CREATION
  addons/hive-bridge/src/test/resources/json/hs2-drop-db.json PRE-CREATION
  addons/hive-bridge/src/test/resources/json/hs2-drop-table-v2.json PRE-CREATION
  addons/hive-bridge/src/test/resources/json/hs2-drop-table.json PRE-CREATION
  addons/hive-bridge/src/test/resources/json/hs2-table-rename-v2.json PRE-CREATION
  addons/hive-bridge/src/test/resources/json/hs2-table-rename.json PRE-CREATION


Diff:
https://reviews.apache.org/r/73239/diff/7/

Changes: https://reviews.apache.org/r/73239/diff/6-7/


Testing
-------

**Functional tests**
- Used specific queries to exercise the affected code paths.
- Verified the messages going out of the hooks, once posted on ATLAS_HOOK, and the entities created within Atlas.

**Test data**
```
create database cadb02;
use cadb02;
create external table hh6(col1 int) location '/tmp/external/hh6.csv';
ALTER TABLE hh6 RENAME TO hh6_renamed;
create view hh6_renamed_view as select * from hh6_renamed;
ALTER VIEW hh6_renamed_view RENAME TO hh6_renamed_view2;
create external table bb1(col1 int) location '/tmp/bb1.csv';
ALTER TABLE bb1 CHANGE COLUMN col1 col11 string;
ALTER TABLE bb1 SET SERDEPROPERTIES ('field.delim' = '|');
ALTER TABLE bb1 ADD COLUMNS (dept STRING COMMENT 'Department name');
ALTER TABLE bb1 RENAME TO cc1;
```

**Volume test**

+----------+-------------+--------------+
| Hook     | Baseline    | New          |
+----------+-------------+--------------+
| HMS      | 31 KB       | 31 KB        |
+----------+-------------+--------------+
| HS2      | 53 KB       | 12 KB        |
+----------+-------------+--------------+

Even though the payload size has been reduced, the overall time taken for the volume of entities created has not changed.

**Unit test**
New tests added.


Thanks,

Ashutosh Mestry