Fang-Yu Rao created IMPALA-15094:
------------------------------------
Summary: Support the resource type of DATA SOURCE once RANGER-4952
is available
Key: IMPALA-15094
URL: https://issues.apache.org/jira/browse/IMPALA-15094
Project: IMPALA
Issue Type: Task
Reporter: Fang-Yu Rao
Assignee: Fang-Yu Rao
While reviewing [https://gerrit.cloudera.org/c/24442/1/SECURITY.md]
(IMPALA-15092: Update threat model for URIs), I realized it would be good
to register a privilege request for the underlying data source when a SELECT
query reads from a table backed by a data source. Specifically, I found that a
statement like select count(*) from test_db.alltypes_datasource registers only
the following privilege requests.
{code:java}
privilegeReqs = {LinkedHashSet@8747} size = 2
  0 = {PrivilegeRequest@8774}
    authorizable_ = {AuthorizableTable@8776}
      dbName_ = "test_db"
      tableName_ = "alltypes_datasource"
      ownerUser_ = "admin"
    privilege_ = {Privilege@8777} "SELECT"
    grantOption_ = false
  1 = {PrivilegeRequest@8775}
    authorizable_ = {AuthorizableDb@8782}
      dbName_ = "_impala_builtins"
      ownerUser_ = null
    privilege_ = {Privilege@8783} "VIEW_METADATA"
    grantOption_ = false
{code}
For completeness, the following is what I used to create the data source.
{code:sql}
CREATE DATA SOURCE data_src
LOCATION '/test-warehouse/data-sources/test-data-source.jar'
CLASS 'org.apache.impala.extdatasource.AllTypesDataSource'
API_VERSION 'V1';

CREATE TABLE test_db.alltypes_datasource (
  id INT,
  bool_col BOOLEAN,
  tinyint_col TINYINT,
  smallint_col SMALLINT,
  int_col INT,
  bigint_col BIGINT,
  float_col FLOAT,
  double_col DOUBLE,
  timestamp_col TIMESTAMP,
  string_col STRING,
  dec_col1 DECIMAL(9,0),
  dec_col2 DECIMAL(10,0),
  dec_col3 DECIMAL(20,10),
  dec_col4 DECIMAL(38,37),
  dec_col5 DECIMAL(10,5),
  date_col DATE)
PRODUCED BY DATA SOURCE data_src("TestInitString");
{code}
I found that during the execution of the SELECT query above, code in
[https://github.com/apache/impala/blob/ec10428/java/ext-data-source/test/src/main/java/org/apache/impala/extdatasource/AllTypesDataSource.java]
is executed. So it looks to me that the SELECT query involves the execution
of user-defined methods. I recall that we require the SELECT privilege to
execute a UDF
([https://github.com/apache/impala/commit/4cbc295c48de1b369a78ec266137dfc3b363e7ec]
IMPALA-10986: Require the SELECT privilege to execute a UDF), so we should
require the same for a data source too.
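To make the proposal concrete, here is a minimal, self-contained sketch of the privilege requests I would expect once the data source is covered. It does not use Impala's real classes: DataSourcePrivilegeSketch, its nested PrivilegeRequest type, the "DATA_SOURCE" authorizable type, and requestsForSelect() are all hypothetical names made up for illustration. The first two requests mirror the debugger output above; the third is the assumed new one.

```java
import java.util.LinkedHashSet;
import java.util.Objects;
import java.util.Set;

// Hypothetical stand-ins for illustration only; these are NOT Impala's classes.
class DataSourcePrivilegeSketch {
  enum Privilege { SELECT, VIEW_METADATA }

  static final class PrivilegeRequest {
    final String authorizableType; // "TABLE", "DB", or the assumed "DATA_SOURCE"
    final String name;
    final Privilege privilege;

    PrivilegeRequest(String authorizableType, String name, Privilege privilege) {
      this.authorizableType = authorizableType;
      this.name = name;
      this.privilege = privilege;
    }

    @Override
    public boolean equals(Object o) {
      if (!(o instanceof PrivilegeRequest)) return false;
      PrivilegeRequest other = (PrivilegeRequest) o;
      return authorizableType.equals(other.authorizableType)
          && name.equals(other.name)
          && privilege == other.privilege;
    }

    @Override
    public int hashCode() {
      return Objects.hash(authorizableType, name, privilege);
    }

    @Override
    public String toString() {
      return authorizableType + ":" + name + " -> " + privilege;
    }
  }

  // Builds the requests for a SELECT against a table. dataSource is null for
  // a regular table and the data source name for a data source table.
  static Set<PrivilegeRequest> requestsForSelect(
      String db, String table, String dataSource) {
    Set<PrivilegeRequest> reqs = new LinkedHashSet<>();
    // The two requests observed today.
    reqs.add(new PrivilegeRequest("TABLE", db + "." + table, Privilege.SELECT));
    reqs.add(new PrivilegeRequest("DB", "_impala_builtins", Privilege.VIEW_METADATA));
    // Assumed addition: also require SELECT on the data source itself,
    // analogous to what IMPALA-10986 did for UDFs.
    if (dataSource != null) {
      reqs.add(new PrivilegeRequest("DATA_SOURCE", dataSource, Privilege.SELECT));
    }
    return reqs;
  }

  public static void main(String[] args) {
    for (PrivilegeRequest r :
        requestsForSelect("test_db", "alltypes_datasource", "data_src")) {
      System.out.println(r);
    }
  }
}
```

Whether SELECT is the right privilege for the new request, and how the data source would be represented as an authorizable, are open questions for this ticket; the sketch only shows the shape of the change.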
This ticket also depends on
https://issues.apache.org/jira/browse/RANGER-4952, which added support in
Apache Hive for DATA CONNECTOR, a concept very similar to DATA SOURCE in
Apache Impala.