[ 
https://issues.apache.org/jira/browse/DRILL-540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16813007#comment-16813007
 ] 

Igor Guzenko commented on DRILL-540:
------------------------------------

Hi [~bbevens], 

Sounds very good, thank you. 

> Allow querying hive views in Drill
> ----------------------------------
>
>                 Key: DRILL-540
>                 URL: https://issues.apache.org/jira/browse/DRILL-540
>             Project: Apache Drill
>          Issue Type: New Feature
>          Components: Storage - Hive
>            Reporter: Ramana Inukonda Nagaraj
>            Assignee: Igor Guzenko
>            Priority: Major
>              Labels: doc-impacting, ready-to-commit
>             Fix For: 1.16.0
>
>
> Currently hive views cannot be queried from drill.
> This Jira aims to add support for Hive views in Drill.
> *Implementation details:*
>  # Drill persists it's views metadata in file with suffix .view.drill using 
> json format. For example: 
> {noformat}
> {
>  "name" : "view_from_calcite_1_4",
>  "sql" : "SELECT * FROM `cp`.`store.json`WHERE `store_id` = 0",
>  "fields" : [ {
>  "name" : "*",
>  "type" : "ANY",
>  "isNullable" : true
>  } ],
>  "workspaceSchemaPath" : [ "dfs", "tmp" ]
> }
> {noformat}
> Later Drill parses the metadata and uses it to treat view names in SQL as a 
> subquery.
>       2. In Apache Hive metadata about views is stored in similar way to 
> tables. Below is example from metastore.TBLS :
>  
> {noformat}
> TBL_ID |CREATE_TIME |DB_ID |LAST_ACCESS_TIME |OWNER |RETENTION |SD_ID 
> |TBL_NAME  |TBL_TYPE      |VIEW_EXPANDED_TEXT                         |
> -------|------------|------|-----------------|------|----------|------|----------|--------------|-------------------------------------------|
> 2      |1542111078  |1     |0                |mapr  |0         |2     |cview  
>    |VIRTUAL_VIEW  |SELECT COUNT(*) FROM `default`.`customers` |
> {noformat}
>       3. So in Hive metastore views are considered as tables of special type. 
> And main benefit is that we also have expanded SQL definition of views (just 
> like in view.drill files). Also reading of the metadata is already 
> implemented in Drill with help of thrift Metastore API.
>       4. To enable querying of Hive views we'll reuse existing code for Drill 
> views as much as possible. First in *_HiveSchemaFactory.getDrillTable_* for 
> _*HiveReadEntry*_ we'll convert the metadata to instance of _*View*_ (_which 
> is actually model for data persisted in .view.drill files_) and then based on 
> this instance return new _*DrillViewTable*_. Using this approach drill will 
> handle hive views the same way as if it was initially defined in Drill and 
> persisted in .view.drill file. 
>      5. For conversion of Hive types: from _*FieldSchema*_ to _*RelDataType*_ 
> we'll reuse existing code from _*DrillHiveTable*_, so the conversion 
> functionality will be extracted and used for both (table and view) fields 
> type conversions. 
> *Security implications*
> Consider simple example case where we have users, 
> {code:java}
> user0  user1 user2
>            \ /
>           group12
> {code}
> and  sample db where object names contains user or group who should access 
> them      
> {code:java}
> db_all
>     tbl_user0
>     vw_user0
>     tbl_group12
>     vw_group12
> {code}
> There are two Hive authorization modes supported  by Drill - SQL Standart and 
> Strorage Based  authorization. For SQL Standart authorization permissions 
> were granted using SQL: 
> {code:java}
> SET ROLE admin;
> GRANT SELECT ON db_all.tbl_user0 TO USER user0;
> GRANT SELECT ON db_all.vw_user0 TO USER user0;
> CREATE ROLE group12;
> GRANT ROLE group12 TO USER user1;
> GRANT ROLE group12 TO USER user2;
> GRANT SELECT ON db_all.tbl_group12 TO ROLE group12;
> GRANT SELECT ON db_all.vw_group12 TO ROLE group12;
> {code}
> And for Storage based authorization permissions were granted using commands: 
> {code:java}
> hadoop fs -chown user0:user0 /user/hive/warehouse/db_all.db/tbl_user0
> hadoop fs -chmod 700 /user/hive/warehouse/db_all.db/tbl_user0
> hadoop fs -chmod 750 /user/hive/warehouse/db_all.db/tbl_group12
> hadoop fs -chown user1:group12 
> /user/hive/warehouse/db_all.db/tbl_group12{code}
>  Then the following table shows us results of queries for both authorization 
> models. 
>                                                                               
>                           *SQL Standart     |            Storage Based 
> Authorization*
> ||SQL||user0||user1||user2||   ||user0||user1||user2||
> |*Queries executed using Drill :*| | | | | | | |
> |SHOW TABLES IN hive.db_all;|   all|    all|   all| |Accessibe tables + all 
> views|Accessibe tables + all views|Accessibe tables + all views|
> |SELECT * FROM hive.db_all.tbl_user0;|   (/)|   (x)|   (x)| |        (/)|     
>    (x)|         (x)|
> |SELECT * FROM hive.db_all.vw_user0;|   (/)|   (x)|   (x)| |        (/)|      
>   (x)|         (x)|
> |SELECT * FROM hive.db_all.tbl_group12;|   (x)|   (/)|   (/)| |        (x)|   
>      (/)|         (/)|
> |SELECT * FROM hive.db_all.vw_group12;|   (x)|   (/)|   (/)| |        (x)|    
>     (/)|         (/)|
> |SELECT * FROM INFORMATION_SCHEMA.`TABLES`
>  WHERE TABLE_SCHEMA = 'hive.db_all';|   all|  all|  all| |Accessibe tables + 
> all views|Accessibe tables + all views|Accessibe tables + all views|
> |DESCRIBE hive.db_all.tbl_user0;|   (/)|   (x)|   (x)| |        (/)|         
> (x)|         (x)|
> |DESCRIBE hive.db_all.vw_user0;|   (/)|   (x) |   (x)| |        (/)|         
> (/)|         (/)|
> |DESCRIBE hive.db_all.tbl_group12;|   (x)|   (/)|   (/)| |        (x)|        
>  (/) |         (/)|
> |DESCRIBE hive.db_all.vw_group12;|   (x)|   (/)|   (/)| |        (/)|         
> (/)|         (/)|
> |*Queries executed using Beeline and Hive CLI:*| | | | | | | |
> |SHOW TABLES IN db_all;|   all|   all|   all| |        all|          all|     
>      all|
> |SELECT * FROM hive.db_all.tbl_user0;|   (/)|   (x) |   (x)| |        (/)|    
>      (x)|         (x)|
> |SELECT * FROM db_all.vw_user0;|   (/)|   (x)|   (x)| |        (/)|         
> (x)|         (x)|
> |SELECT * FROM db_all.tbl_group12;|   (x)|   (/)|   (/)| |        (x)|        
>  (/)|         (/)|
> |SELECT * FROM db_all.vw_group12;|   (x)|   (/)|   (/)| |        (x)|         
> (/)|         (/)|
> | | | | | | | | |
> |DESCRIBE db_all.tbl_user0;|   (/)|   (x) |   (x)| |        (/)|         (x)| 
>         (x)|
> |DESCRIBE db_all.vw_user0;|   (/)|   (x)|   (x)| |        (/)|         (/)|   
>       (/)|
> |DESCRIBE db_all.tbl_group12;|   (x)|   (/)|   (/)| |        (x)|         
> (/)|         (/)|
> |DESCRIBE db_all.vw_group12;|   (x) |   (/)|   (/)| |        (/)|         
> (/)|         (/)|
>  (!)  *Warning:*  Because views in Hive aren't present as physical files and 
> access can't be granted using file system commands, then access to Hive views 
> for *Storage Based Authorization* is based on the underlying tables used in 
> view definition. For current example views were defined as selection over 
> appropriate tables. 
> *For documentation*
>  Remove note from 
> [https://drill.apache.org/docs/querying-the-information-schema/] that states:
> {noformat}
> NOTE
> Currently, Drill only supports querying Drill views; Hive views are not yet 
> supported.
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Reply via email to