[jira] [Created] (HIVE-24247) StorageBasedAuthorizationProvider does not look into Hadoop ACL while check for access

2020-10-09 Thread Adesh Kumar Rao (Jira)
Adesh Kumar Rao created HIVE-24247:
--

 Summary: StorageBasedAuthorizationProvider does not look into 
Hadoop ACL while check for access
 Key: HIVE-24247
 URL: https://issues.apache.org/jira/browse/HIVE-24247
 Project: Hive
  Issue Type: Bug
Affects Versions: 4.0.0
Reporter: Adesh Kumar Rao
Assignee: Adesh Kumar Rao
 Fix For: 4.0.0


StorageBasedAuthorizationProvider uses
{noformat}
FileSystem.access(Path, Action)
{noformat}
method to check the access.

This method gets the FileStatus object and checks access based on that. ACLs 
are not present in FileStatus.

 

Instead, Hive should use
{noformat}
FileSystem.get(path.toUri(), conf).access(Path, Action)
{noformat}
so that the concrete file system implementation can do the access checks.
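
To illustrate the gap, here is a minimal toy model in plain Java. It does not use Hadoop's API: the class AclSketch, its fields, and the users alice and bob are all hypothetical, and the ACL handling is simplified (named-user entries only, no mask, no named groups). It only shows how a check that sees owner, group, and mode bits can deny a user whom an ACL entry would allow.

```java
import java.util.Map;
import java.util.Set;

/** Toy model only: none of these names come from Hadoop or Hive. */
final class AclSketch {
    static final int READ = 4, WRITE = 2, EXECUTE = 1;

    final String owner;
    final String group;
    final int mode;                      // permission bits, e.g. 0750
    final Map<String, Integer> userAcl;  // named-user ACL entries: user -> rwx bits

    AclSketch(String owner, String group, int mode, Map<String, Integer> userAcl) {
        this.owner = owner;
        this.group = group;
        this.mode = mode;
        this.userAcl = userAcl;
    }

    /** Check that sees only owner, group, and mode bits (what a status-like object carries). */
    boolean allowedByModeBits(String user, Set<String> groups, int action) {
        int bits;
        if (user.equals(owner)) {
            bits = (mode >> 6) & 7;
        } else if (groups.contains(group)) {
            bits = (mode >> 3) & 7;
        } else {
            bits = mode & 7;
        }
        return (bits & action) == action;
    }

    /** Check that also consults named-user ACL entries before falling back to mode bits. */
    boolean allowedWithAcl(String user, Set<String> groups, int action) {
        if (!user.equals(owner)) {
            Integer aclBits = userAcl.get(user);
            if (aclBits != null) {
                return (aclBits & action) == action;
            }
        }
        return allowedByModeBits(user, groups, action);
    }

    public static void main(String[] args) {
        // Hypothetical file: owner alice, group staff, mode 0750, ACL entry user:bob:r-x.
        AclSketch file = new AclSketch("alice", "staff", 0750, Map.of("bob", READ | EXECUTE));
        Set<String> bobGroups = Set.of("users");
        System.out.println("mode bits only: " + file.allowedByModeBits("bob", bobGroups, READ));
        System.out.println("with ACL:       " + file.allowedWithAcl("bob", bobGroups, READ));
    }
}
```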



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HIVE-24248) TestMiniLlapLocalCliDriver[subquery_join_rewrite] is flaky

2020-10-09 Thread Zhihua Deng (Jira)
Zhihua Deng created HIVE-24248:
--

 Summary: TestMiniLlapLocalCliDriver[subquery_join_rewrite] is flaky
 Key: HIVE-24248
 URL: https://issues.apache.org/jira/browse/HIVE-24248
 Project: Hive
  Issue Type: Bug
Reporter: Zhihua Deng


[http://ci.hive.apache.org/blue/organizations/jenkins/hive-precommit/detail/PR-1205/26/tests]
{code:java}
java.lang.AssertionError:
Client Execution succeeded but contained differences (error code = 1) after 
executing subquery_join_rewrite.q
241,244d240
< 1 1
< 1 2
< 2 1
< 2 2
245a242,243
> 2 2
{code}
 
 





[jira] [Created] (HIVE-24246) Fix for Ranger Deny policy overriding policy with same resource name

2020-10-09 Thread Aasha Medhi (Jira)
Aasha Medhi created HIVE-24246:
--

 Summary: Fix for Ranger Deny policy overriding policy with same 
resource name 
 Key: HIVE-24246
 URL: https://issues.apache.org/jira/browse/HIVE-24246
 Project: Hive
  Issue Type: Task
Reporter: Aasha Medhi
Assignee: Aasha Medhi








[jira] [Created] (HIVE-24251) Improve bloom filter size estimation for multi column semijoin reducers

2020-10-09 Thread Stamatis Zampetakis (Jira)
Stamatis Zampetakis created HIVE-24251:
--

 Summary: Improve bloom filter size estimation for multi column 
semijoin reducers
 Key: HIVE-24251
 URL: https://issues.apache.org/jira/browse/HIVE-24251
 Project: Hive
  Issue Type: Improvement
  Components: Query Planning
Reporter: Stamatis Zampetakis
Assignee: Stamatis Zampetakis


There are various cases where the expected size of the bloom filter is greatly 
underestimated, making the semijoin reducer completely ineffective. This is more 
relevant for multi-column semijoin reducers, since the current 
[code|https://github.com/apache/hive/blob/d61c9160ffa5afbd729887c3db690eccd7ef8238/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFBloomFilter.java#L273]
 does not take them into account.
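
As a rough illustration of why an underestimated entry count hurts, the sketch below uses the textbook bloom filter formulas (optimal bits m = -n ln p / (ln 2)^2, hash count k = (m / n) ln 2, false-positive rate (1 - e^(-kn/m))^k). The class BloomSizing and the numbers in main are hypothetical and are not taken from Hive's code.

```java
/** Textbook bloom filter sizing; not Hive's implementation. */
final class BloomSizing {

    /** Optimal number of bits for n expected entries at false-positive probability p. */
    static long optimalBits(long n, double p) {
        return (long) Math.ceil(-n * Math.log(p) / (Math.log(2) * Math.log(2)));
    }

    /** Optimal number of hash functions for n entries stored in m bits. */
    static int optimalHashes(long n, long m) {
        return Math.max(1, (int) Math.round((double) m / n * Math.log(2)));
    }

    /** Approximate false-positive rate after inserting n entries into m bits with k hashes. */
    static double falsePositiveRate(long n, long m, int k) {
        return Math.pow(1 - Math.exp(-(double) k * n / m), k);
    }

    public static void main(String[] args) {
        long estimated = 1_000_000L;   // hypothetical planner estimate
        long actual = 10_000_000L;     // hypothetical real number of distinct keys
        long m = optimalBits(estimated, 0.05);
        int k = optimalHashes(estimated, m);
        System.out.println("fpp at estimated size: " + falsePositiveRate(estimated, m, k));
        System.out.println("fpp at actual size:    " + falsePositiveRate(actual, m, k));
    }
}
```

With the filter sized for the estimate, a tenfold larger actual key count pushes the false-positive rate close to 1, at which point the filter passes nearly every probed row.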





[jira] [Created] (HIVE-24252) Improve decision model for using semijoin reducers

2020-10-09 Thread Stamatis Zampetakis (Jira)
Stamatis Zampetakis created HIVE-24252:
--

 Summary: Improve decision model for using semijoin reducers
 Key: HIVE-24252
 URL: https://issues.apache.org/jira/browse/HIVE-24252
 Project: Hive
  Issue Type: Improvement
Reporter: Stamatis Zampetakis
Assignee: Stamatis Zampetakis


After a few experiments with TPC-DS 10TB dataset, we observed that in some 
cases semijoin reducers were not effective; they didn't reduce the number of 
records or they reduced the relation only a tiny bit. 

In some cases we can make the semijoin reducer more effective by adding more 
columns, but this also requires a bigger bloom filter, so the decision on the 
number of columns to include in the bloom filter becomes more delicate.

The current decision model always chooses multi-column semijoin reducers if 
they are available, but this may not always be beneficial if a single column 
can significantly reduce the target relation.





[jira] [Created] (HIVE-24255) StorageHandler with select-limit query is returning 0 rows

2020-10-09 Thread Naresh P R (Jira)
Naresh P R created HIVE-24255:
-

 Summary: StorageHandler with select-limit query is returning 0 rows
 Key: HIVE-24255
 URL: https://issues.apache.org/jira/browse/HIVE-24255
 Project: Hive
  Issue Type: Bug
Reporter: Naresh P R
Assignee: Naresh P R


 
{code:java}
CREATE EXTERNAL TABLE test_table(db_id bigint, db_location_uri string, name 
string, owner_name string, owner_type string)
STORED BY 'org.apache.hive.storage.jdbc.JdbcStorageHandler'
TBLPROPERTIES ('hive.sql.database.type'='METASTORE', 'hive.sql.query'='SELECT 
`DB_ID`, `DB_LOCATION_URI`, `NAME`, `OWNER_NAME`, `OWNER_TYPE` FROM `DBS`');
==> Wrong Result <==
set hive.limit.optimize.enable=true;
select * from test_table limit 1;
--
 VERTICES MODE STATUS TOTAL COMPLETED RUNNING PENDING FAILED KILLED
--
Map 1 .. container SUCCEEDED 0 0 0 0 0 0
--
VERTICES: 01/01 [==>>] 100% ELAPSED TIME: 0.91 s
--
++--+---+-+-+
| dbs.db_id | dbs.db_location_uri | dbs.name | dbs.owner_name | dbs.owner_type |
++--+---+-+-+
++--+---+-+-+
==> Correct Result <==
set hive.limit.optimize.enable=false;
select * from test_table limit 1;
--
 VERTICES MODE STATUS TOTAL COMPLETED RUNNING PENDING FAILED KILLED
--
Map 1 .. container SUCCEEDED 1 1 0 0 0 0
--
VERTICES: 01/01 [==>>] 100% ELAPSED TIME: 4.11 s
--
+++---+-+-+
| dbs.db_id | dbs.db_location_uri | dbs.name | dbs.owner_name | dbs.owner_type |
+++---+-+-+
| 1 | hdfs://abcd:8020/warehouse/tablespace/managed/hive | default | public | ROLE |
+++---+-+-+
{code}





[jira] [Created] (HIVE-24250) CREATE DATABASE with MANAGEDLOCATION set requires super user priv on location

2020-10-09 Thread Viacheslav Avramenko (Jira)
Viacheslav Avramenko created HIVE-24250:
---

 Summary: CREATE DATABASE with MANAGEDLOCATION set requires super 
user priv on location
 Key: HIVE-24250
 URL: https://issues.apache.org/jira/browse/HIVE-24250
 Project: Hive
  Issue Type: Bug
  Components: Metastore
Affects Versions: 4.0.0
Reporter: Viacheslav Avramenko


At 
[https://github.com/apache/hive/blob/master/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java#L1485]
the folder for the database is created as the superuser instead of the impersonated user.

This leads to issues when impersonation is on (doAs=true) and the superuser has 
no access to the location specified in MANAGEDLOCATION of CREATE DATABASE.

If I, as a user, specify a location in which to create my database, then I am 
the one who should need write access there.

 





[jira] [Created] (HIVE-24249) Create View fails if a materialized view exists with the same query

2020-10-09 Thread Krisztian Kasa (Jira)
Krisztian Kasa created HIVE-24249:
-

 Summary: Create View fails if a materialized view exists with the 
same query
 Key: HIVE-24249
 URL: https://issues.apache.org/jira/browse/HIVE-24249
 Project: Hive
  Issue Type: Bug
Reporter: Krisztian Kasa
Assignee: Krisztian Kasa


{code:java}
create table t1(col0 int) STORED AS ORC
  TBLPROPERTIES ('transactional'='true');

create materialized view mv1 as
select * from t1 where col0 > 2;

create view mv1 as
select sub.* from (select * from t1 where col0 > 2) sub
where sub.col0 = 10;
{code}
The planner realizes that the view definition has a subquery which matches the 
materialized view query and replaces it with a materialized view scan.
{code:java}
HiveProject($f0=[CAST(10):INTEGER])
  HiveFilter(condition=[=(10, $0)])
    HiveTableScan(table=[[default, mv1]], table:alias=[default.mv1])
{code}
Then an exception is thrown:
{code:java}
org.apache.hadoop.hive.ql.parse.SemanticException: View definition references materialized view default.mv1
at org.apache.hadoop.hive.ql.ddl.view.create.CreateViewAnalyzer.validateCreateView(CreateViewAnalyzer.java:211)
at org.apache.hadoop.hive.ql.ddl.view.create.CreateViewAnalyzer.analyzeInternal(CreateViewAnalyzer.java:99)
at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:301)
at org.apache.hadoop.hive.ql.Compiler.analyze(Compiler.java:223)
at org.apache.hadoop.hive.ql.Compiler.compile(Compiler.java:104)
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:174)
at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:415)
at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:364)
at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:358)
at org.apache.hadoop.hive.ql.reexec.ReExecDriver.compileAndRespond(ReExecDriver.java:125)
at org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:229)
at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:258)
at org.apache.hadoop.hive.cli.CliDriver.processCmd1(CliDriver.java:203)
at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:129)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:424)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:355)
at org.apache.hadoop.hive.ql.QTestUtil.executeClientInternal(QTestUtil.java:744)
at org.apache.hadoop.hive.ql.QTestUtil.executeClient(QTestUtil.java:714)
at org.apache.hadoop.hive.cli.control.CoreCliDriver.runTest(CoreCliDriver.java:170)
at org.apache.hadoop.hive.cli.control.CliAdapter.runTest(CliAdapter.java:157)
at org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver(TestMiniLlapLocalCliDriver.java:62)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at org.apache.hadoop.hive.cli.control.CliAdapter$2$1.evaluate(CliAdapter.java:135)
at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
at org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100)
at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:366)
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103)
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63)
at org.junit.runners.ParentRunner$4.run(ParentRunner.java:331)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329)
at org.junit.runners.ParentRunner.access$100(ParentRunner.java:66)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293)
at org.junit.runners.ParentRunner.run(ParentRunner.java:413)
at org.junit.runners.Suite.runChild(Suite.java:128)
at org.junit.runners.Suite.runChild(Suite.java:27)
at org.junit.runners.ParentRunner$4.run(ParentRunner.java:331)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79)
at 

[jira] [Created] (HIVE-24253) HMS needs to support keystore/truststores types besides JKS

2020-10-09 Thread Yongzhi Chen (Jira)
Yongzhi Chen created HIVE-24253:
---

 Summary: HMS needs to support keystore/truststores types besides 
JKS
 Key: HIVE-24253
 URL: https://issues.apache.org/jira/browse/HIVE-24253
 Project: Hive
  Issue Type: Bug
  Components: Standalone Metastore
Reporter: Yongzhi Chen
Assignee: Yongzhi Chen


When HiveMetaStoreClient connects to HMS with SSL enabled, HMS should support 
the default keystore type specified for the JDK and not always use JKS. As with 
HIVE-23958 for Hive, HMS should support setting additional keystore/truststore 
types used by different applications, for example for FIPS crypto algorithms.
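
A minimal sketch of the idea, using only the JDK's java.security.KeyStore API: prefer an explicitly configured type and otherwise fall back to the JDK default from KeyStore.getDefaultType() rather than hard-coding JKS. The class KeyStoreTypeSketch and the configuredType parameter are hypothetical; the actual HMS configuration keys are not shown here.

```java
import java.security.KeyStore;
import java.security.KeyStoreException;

/** Hypothetical helper; not part of Hive or HMS. */
final class KeyStoreTypeSketch {

    /** Returns the configured type if set, otherwise the JDK's default keystore type. */
    static String resolveType(String configuredType) {
        if (configuredType == null || configuredType.trim().isEmpty()) {
            return KeyStore.getDefaultType();
        }
        return configuredType.trim();
    }

    /** Creates an (unloaded) KeyStore instance of the resolved type. */
    static KeyStore newKeyStore(String configuredType) throws KeyStoreException {
        return KeyStore.getInstance(resolveType(configuredType));
    }

    public static void main(String[] args) throws Exception {
        System.out.println("JDK default type: " + resolveType(null));
        System.out.println("explicit type:    " + newKeyStore("PKCS12").getType());
    }
}
```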





[jira] [Created] (HIVE-24254) Remove setOwner call in ReplChangeManager

2020-10-09 Thread Aasha Medhi (Jira)
Aasha Medhi created HIVE-24254:
--

 Summary: Remove setOwner call in ReplChangeManager
 Key: HIVE-24254
 URL: https://issues.apache.org/jira/browse/HIVE-24254
 Project: Hive
  Issue Type: Task
Reporter: Aasha Medhi
Assignee: Aasha Medhi








failed-to-read failures in precommit

2020-10-09 Thread Zoltan Haindrich

Hey All!

Lately an interesting error has started to appear in precommit runs as 
"TEST-org.apache.hadoop.hive.cli.split0.TestMiniLlapLocalCliDriver.xml.[failed-to-read]";
see [1] for an example.

I've dug into it - it was caused by a failed test (it was acid_stats2 in 
the one I was looking into), which could happen.
But the resulting junit xml is malformed because of some surefire issue (I've 
opened [2] to investigate it). The junit result xml writing is essentially 
interrupted in the middle...

...and in the end for some reason jenkins doesn't consider the read failure "serious 
enough" to change the state to yellow...so in the end the test results are green.

I'm trying to upgrade surefire to 3.0.0-M5 - but I'm a bit skeptical that it 
will solve the problem..

The best would be to fix this issue - however if that doesn't seem to be easy, 
I'll add some hack to at least fail the build in these cases.
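
One possible shape for such a check, as a rough sketch only: try to parse each report with the JDK's XML parser and treat a parse failure as a broken report. The class ReportCheck and how it would be wired into the build are hypothetical; this is not what the precommit job currently does.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical helper that detects truncated or otherwise malformed XML reports. */
final class ReportCheck {

    /** Returns true if the bytes parse as well-formed XML, false if parsing fails. */
    static boolean isWellFormed(byte[] xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors without printing them to stderr.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new ByteArrayInputStream(xml));
            return true;
        } catch (SAXException e) {
            return false; // e.g. the document ends in the middle of an element
        } catch (IOException | ParserConfigurationException e) {
            throw new IllegalStateException("could not run the XML parser", e);
        }
    }

    public static void main(String[] args) {
        byte[] complete = "<testsuite><testcase name=\"a\"/></testsuite>".getBytes();
        byte[] truncated = "<testsuite><testcase name=\"a\"".getBytes();
        System.out.println("complete report ok:  " + isWellFormed(complete));
        System.out.println("truncated report ok: " + isWellFormed(truncated));
    }
}
```

A build step could run this over the generated report files and exit non-zero when any of them fails to parse.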

cheers,
Zoltan

[1] http://ci.hive.apache.org/job/hive-precommit/job/master/274/testReport/
[2] https://issues.apache.org/jira/browse/SUREFIRE-1852