Fang-Yu Rao created IMPALA-15094:
------------------------------------

             Summary: Support the resource type of DATA SOURCE once RANGER-4952 
is available
                 Key: IMPALA-15094
                 URL: https://issues.apache.org/jira/browse/IMPALA-15094
             Project: IMPALA
          Issue Type: Task
            Reporter: Fang-Yu Rao
            Assignee: Fang-Yu Rao


While reviewing [https://gerrit.cloudera.org/c/24442/1/SECURITY.md] 
(IMPALA-15092: Update threat model for URIs), I realized it would be good 
to register a privilege request for the data source referenced by a SELECT query 
against a table backed by a data source. Specifically, I found that a statement like select 
count(*) from test_db.alltypes_datasource registers only the following 
privilege requests.

 
{code:java}
privilegeReqs = {LinkedHashSet@8747}  size = 2
 0 = {PrivilegeRequest@8774}
  authorizable_ = {AuthorizableTable@8776}
   dbName_ = "test_db"
   tableName_ = "alltypes_datasource"
   ownerUser_ = "admin"
  privilege_ = {Privilege@8777} "SELECT"
  grantOption_ = false
 1 = {PrivilegeRequest@8775}
  authorizable_ = {AuthorizableDb@8782}
   dbName_ = "_impala_builtins"
   ownerUser_ = null
  privilege_ = {Privilege@8783} "VIEW_METADATA"
  grantOption_ = false
{code}
 

For completeness, the following is what I used to create the data source.
{code:sql}
CREATE DATA SOURCE data_src
LOCATION '/test-warehouse/data-sources/test-data-source.jar'
CLASS 'org.apache.impala.extdatasource.AllTypesDataSource'
API_VERSION 'V1';

CREATE TABLE test_db.alltypes_datasource (
  id INT,
  bool_col BOOLEAN,
  tinyint_col TINYINT,
  smallint_col SMALLINT,
  int_col INT,
  bigint_col BIGINT,
  float_col FLOAT,
  double_col DOUBLE,
  timestamp_col TIMESTAMP,
  string_col STRING,
  dec_col1 DECIMAL(9,0),
  dec_col2 DECIMAL(10,0),
  dec_col3 DECIMAL(20,10),
  dec_col4 DECIMAL(38,37),
  dec_col5 DECIMAL(10,5),
  date_col DATE)
PRODUCED BY DATA SOURCE data_src("TestInitString");
{code}
 

I found that executing the SELECT query above runs code in 
[https://github.com/apache/impala/blob/ec10428/java/ext-data-source/test/src/main/java/org/apache/impala/extdatasource/AllTypesDataSource.java].
So it looks to me like the SELECT query involves executing user-defined 
methods/functions. We already require the SELECT privilege to execute a UDF 
([https://github.com/apache/impala/commit/4cbc295c48de1b369a78ec266137dfc3b363e7ec]
 IMPALA-10986: Require the SELECT privilege to execute a UDF), so we should 
require it for data sources too.
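
To make the proposal concrete, here is a minimal sketch of what the registered privilege requests could look like with the extra entry. It does not use Impala's real PrivilegeRequest or Authorizable classes; the class DataSourcePrivilegeSketch, the DATA_SOURCE authorizable type, and the method requestsForSelect are hypothetical names introduced only for illustration, and the choice of SELECT as the required privilege is an assumption carried over from the UDF case.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative model only: these types are NOT Impala's PrivilegeRequest or
// Authorizable classes. They mirror the fields visible in the debugger dump.
public class DataSourcePrivilegeSketch {

  /** Kind of object a request targets. DATA_SOURCE is the hypothetical addition. */
  public enum AuthorizableType { DB, TABLE, DATA_SOURCE }

  /** Simplified privilege request: target type, target name, and privilege. */
  public static final class PrivilegeRequest {
    public final AuthorizableType type;
    public final String name;
    public final String privilege;

    public PrivilegeRequest(AuthorizableType type, String name, String privilege) {
      this.type = type;
      this.name = name;
      this.privilege = privilege;
    }

    @Override
    public String toString() {
      return privilege + " on " + type + " " + name;
    }
  }

  /**
   * Builds the requests for a SELECT against db.table. Pass the name of the
   * backing data source, or null when the table is not produced by one.
   */
  public static List<PrivilegeRequest> requestsForSelect(
      String db, String table, String dataSourceName) {
    List<PrivilegeRequest> reqs = new ArrayList<>();
    // The two requests that are registered today, per the dump above.
    reqs.add(new PrivilegeRequest(AuthorizableType.TABLE, db + "." + table, "SELECT"));
    reqs.add(new PrivilegeRequest(AuthorizableType.DB, "_impala_builtins", "VIEW_METADATA"));
    if (dataSourceName != null) {
      // Proposed extra request (assumption: SELECT, by analogy with UDFs).
      reqs.add(new PrivilegeRequest(AuthorizableType.DATA_SOURCE, dataSourceName, "SELECT"));
    }
    return reqs;
  }

  public static void main(String[] args) {
    for (PrivilegeRequest r : requestsForSelect("test_db", "alltypes_datasource", "data_src")) {
      System.out.println(r);
    }
  }
}
```

With the data source name supplied, the list gains a third request on data_src; with null it matches the two requests observed today.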

 

This ticket also depends on 
https://issues.apache.org/jira/browse/RANGER-4952, which added support for 
DATA CONNECTOR in Apache Hive, a concept very similar to DATA SOURCE in 
Apache Impala.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)