Chiran Ravani created ATLAS-3085:
------------------------------------
Summary: Capturing Hive DMLs through HiveHook can expose sensitive data
Key: ATLAS-3085
URL: https://issues.apache.org/jira/browse/ATLAS-3085
Project: Atlas
Issue Type: Bug
Components: atlas-core
Affects Versions: 1.1.0
Reporter: Chiran Ravani
When the Atlas HiveHook is enabled, it captures DML statements, which can
contain sensitive data. The entity is created under the type hive_process, with
the DML statement itself as its name.
Steps to reproduce:
{code:sql}
CREATE TABLE test_hive_atlas
(SSIN int,
name string)
CLUSTERED BY (SSIN) INTO 3 BUCKETS STORED AS ORC
TBLPROPERTIES('transactional'='true');
set hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager;
set hive.support.concurrency=true;
set hive.enforce.bucketing=true;
set hive.exec.dynamic.partition.mode=nonstrict;
set hive.txn.strict.locking.mode=true;
insert into test_hive_atlas values(12398431, 'Name1');
insert into test_hive_atlas values(342198432, 'Name2');{code}
After running the above statements, an entity is created in Atlas under
hive_process with the insert statement as its name. A configuration option to
disable capturing DMLs in HiveHook would help.
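
To make the exposure concrete, here is a minimal standalone sketch of how a statement could be classified as DML (so a hook could skip it) or, as a different mitigation, have its literals masked before the text is used as an entity name. It does not use the Atlas or HiveHook API. The class DmlNameSketch and its methods are hypothetical names chosen for illustration, and the regex-based matching is a simplification, not a HiveQL parser.

```java
import java.util.regex.Pattern;

// Hypothetical illustration only; not part of Atlas or its HiveHook.
class DmlNameSketch {
    // Statements starting with one of these keywords are treated as DML.
    private static final Pattern DML_PREFIX =
        Pattern.compile("^\\s*(insert|update|delete|merge)\\b", Pattern.CASE_INSENSITIVE);

    // Single-quoted string literals, allowing backslash escapes.
    private static final Pattern STRING_LITERAL =
        Pattern.compile("'(?:[^'\\\\]|\\\\.)*'");

    // Standalone integer or decimal literals; digits inside identifiers do not match.
    private static final Pattern NUMBER_LITERAL =
        Pattern.compile("\\b\\d+(?:\\.\\d+)?\\b");

    // Returns true when the statement looks like DML and could be skipped.
    static boolean isDml(String sql) {
        return DML_PREFIX.matcher(sql).find();
    }

    // Replaces string literals first, then numeric literals, with '?'.
    static String maskLiterals(String sql) {
        String withoutStrings = STRING_LITERAL.matcher(sql).replaceAll("?");
        return NUMBER_LITERAL.matcher(withoutStrings).replaceAll("?");
    }

    public static void main(String[] args) {
        String dml = "insert into test_hive_atlas values(342198432, 'Name2')";
        System.out.println(isDml(dml));        // prints: true
        System.out.println(maskLiterals(dml)); // prints: insert into test_hive_atlas values(?, ?)
    }
}
```

Skipping DML entirely matches the request in this issue. Masking is shown only as an alternative that would keep the process entity while hiding the inserted values.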