Alberto Romero created RANGER-2127:
--------------------------------------
Summary: Masking policy to allow querying actual data if masked
cols are on predicate
Key: RANGER-2127
URL: https://issues.apache.org/jira/browse/RANGER-2127
Project: Ranger
Issue Type: Improvement
Components: plugins, Ranger
Affects Versions: 0.7.1
Reporter: Alberto Romero
Enable querying datasets on actual values even if there are masking policies
for some of the columns, as long as those columns appear only in the predicate
of the statement (i.e. the actual values of the masked columns are not
returned, only used to evaluate the query).
For example,
Table #1: customers
Cols: ID, COMPANY_NAME, SECTOR, CUSTOMER_ID (masked)
Table #2: orders
Cols: ID, WAREHOUSE_ID, QUANTITY, VALUE
SELECT c.SECTOR, sum(o.QUANTITY)
FROM customers c JOIN orders o
ON (c.CUSTOMER_ID = o.ID)
GROUP BY c.SECTOR;
The query would not return the values of customers.CUSTOMER_ID, but would
still be evaluated against the actual values of that column.
A suggested approach would be to create a simple query analyzer that determines
whether any masked data would be returned, based on where the masked columns
appear in the query, and allows the query to run on actual values if none would be.
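To illustrate the idea, here is a minimal sketch of such an analyzer in Python. It is not Ranger code: the function names are hypothetical, and it assumes a single flat SELECT ... FROM statement with no subqueries, CTEs, or column aliases that shadow real column names. It only checks whether a masked column is referenced in the SELECT list.

```python
import re


def projected_masked_columns(sql, masked_columns):
    """Return the masked columns referenced in the SELECT list.

    Hypothetical helper: handles only one flat SELECT ... FROM statement.
    """
    match = re.search(r"\bselect\b(.*?)\bfrom\b", sql,
                      flags=re.IGNORECASE | re.DOTALL)
    if match is None:
        raise ValueError("not a simple SELECT ... FROM statement")
    select_list = match.group(1)
    masked = {column.upper() for column in masked_columns}
    if "*" in select_list:
        # A wildcard may expand to masked columns, so be conservative and
        # treat all of them as projected (this also trips on multiplication).
        return masked
    # Collect identifiers in the SELECT list, dropping any table-alias prefix.
    identifiers = {
        token.split(".")[-1].upper()
        for token in re.findall(r"[A-Za-z_][\w.]*", select_list)
    }
    return masked & identifiers


def can_run_on_actual_values(sql, masked_columns):
    """True if no masked column appears in the SELECT list."""
    return not projected_masked_columns(sql, masked_columns)


example_query = """
SELECT c.SECTOR, sum(o.QUANTITY)
FROM customers c JOIN orders o
ON (c.CUSTOMER_ID = o.ID)
GROUP BY c.SECTOR;
"""

allowed = can_run_on_actual_values(example_query, ["CUSTOMER_ID"])
blocked = can_run_on_actual_values(
    "SELECT c.CUSTOMER_ID FROM customers c", ["CUSTOMER_ID"])
```

For the example query above, CUSTOMER_ID appears only in the join condition, so the sketch would let it run on actual values; a query that selects c.CUSTOMER_ID directly would still be masked. A real implementation would presumably work on the parsed query plan rather than on the SQL text.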
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)