[ 
https://issues.apache.org/jira/browse/RANGER-2127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alberto Romero updated RANGER-2127:
-----------------------------------
    Summary: Hive masking policy to allow querying actual data if masked cols 
are on predicate  (was: Hive Masking policy to allow querying actual data if 
masked cols are on predicate)

> Hive masking policy to allow querying actual data if masked cols are on 
> predicate
> ---------------------------------------------------------------------------------
>
>                 Key: RANGER-2127
>                 URL: https://issues.apache.org/jira/browse/RANGER-2127
>             Project: Ranger
>          Issue Type: Improvement
>          Components: plugins, Ranger
>    Affects Versions: 0.7.1
>            Reporter: Alberto Romero
>            Priority: Major
>
> Enable queries to operate on actual values even when masking policies apply 
> to some of the columns, as long as those columns appear only in the predicate 
> of the statement (i.e., the actual values are used to evaluate the query but 
> are not returned).
> For example:
> Table #1: customers
> Columns: ID, COMPANY_NAME, SECTOR, CUSTOMER_ID (masked)
> Table #2: orders
> Columns: ID, WAREHOUSE_ID, QUANTITY, VALUE
>  
> SELECT c.SECTOR, SUM(o.QUANTITY)
> FROM customers c JOIN orders o
> ON (c.CUSTOMER_ID = o.ID)
> GROUP BY c.SECTOR;
>  
> The query would not return the values of customers.CUSTOMER_ID, but it would 
> still be evaluated against the actual (unmasked) values of that column.
>  
> A suggested approach is to create a simple query analyzer that determines, 
> based on where each masked column appears in the query, whether any masked 
> data would be returned, and allows the query to run on actual values if none 
> would be.
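
A minimal sketch of the kind of analyzer the description suggests, written in Python for illustration only. It is not Ranger or Hive code. The function name returns_masked_data and the regex-based parsing are assumptions made here. It handles only a single flat SELECT ... FROM statement (no subqueries, CTEs, or aliases that rename a masked column). It is deliberately conservative: any "*" in the select list, or any masked column name appearing there, counts as returning masked data.

```python
import re
from typing import Iterable

# Hypothetical helper, not part of Ranger or Hive.
# Matches the select list of a single flat "SELECT ... FROM" statement.
_SELECT_LIST = re.compile(r"\bselect\b(.*?)\bfrom\b", re.IGNORECASE | re.DOTALL)

# Matches an identifier, dropping an optional "alias." qualifier
# (so "c.SECTOR" yields "SECTOR").
_IDENTIFIER = re.compile(r"(?:[A-Za-z_]\w*\.)?([A-Za-z_]\w*)")


def returns_masked_data(sql: str, masked_columns: Iterable[str]) -> bool:
    """Return True if a masked column may appear in the query's output.

    Columns used only after FROM (JOIN ... ON, WHERE, GROUP BY) are
    treated as predicate-only and do not trigger a True result.
    """
    match = _SELECT_LIST.search(sql)
    if match is None:
        raise ValueError("expected a simple SELECT ... FROM statement")
    select_list = match.group(1)

    # Conservative: any "*" (including COUNT(*)) is treated as projecting
    # every column, masked ones included.
    if "*" in select_list:
        return True

    projected = {name.upper() for name in _IDENTIFIER.findall(select_list)}
    masked = {name.upper() for name in masked_columns}
    return bool(projected & masked)


if __name__ == "__main__":
    query = """
        SELECT c.SECTOR, sum(o.QUANTITY)
        FROM customers c JOIN orders o
        ON (c.CUSTOMER_ID = o.ID)
        group by c.SECTOR;
    """
    # CUSTOMER_ID appears only in the join condition, not the select list.
    print(returns_masked_data(query, ["CUSTOMER_ID"]))  # False
```

Under these assumptions, the example query from the description would be allowed to run on actual values, whereas SELECT c.CUSTOMER_ID FROM customers c would not. A real implementation would presumably work on Hive's parsed query tree rather than on regular expressions.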



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
