[jira] [Created] (HIVE-24247) StorageBasedAuthorizationProvider does not look into Hadoop ACL while check for access
Adesh Kumar Rao created HIVE-24247: -- Summary: StorageBasedAuthorizationProvider does not look into Hadoop ACL while check for access Key: HIVE-24247 URL: https://issues.apache.org/jira/browse/HIVE-24247 Project: Hive Issue Type: Bug Affects Versions: 4.0.0 Reporter: Adesh Kumar Rao Assignee: Adesh Kumar Rao Fix For: 4.0.0 StorageBasedAuthorizationProvider uses the {noformat} FileSystem.access(Path, Action) {noformat} method to check access. This method gets the FileStatus object and checks access based on that. ACLs are not present in FileStatus, so permissions granted only through an ACL are not seen by this check. Instead, Hive should call {noformat} FileSystem.get(path.toUri(), conf).access(Path, Action) {noformat} so that the implementing file system can do the access checks. -- This message was sent by Atlassian Jira (v8.3.4#803005)
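To make the gap described in HIVE-24247 concrete, here is a small self-contained sketch. It does not use Hadoop classes: Status, AclEntry, checkByStatus and checkWithAcls are hypothetical stand-ins invented for illustration, and the ACL logic is deliberately simplified (named-user entries only, no mask, no named groups). It only shows why a check that sees nothing but owner/group/other bits can deny a user that an ACL entry would allow.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class AclCheckSketch {
    enum Action { READ, WRITE }

    /** Hypothetical stand-in for what a file status carries: owner, group, nine mode bits. */
    static final class Status {
        final String owner, group, mode;
        Status(String owner, String group, String mode) {
            this.owner = owner; this.group = group; this.mode = mode;
        }
    }

    /** Hypothetical named-user ACL entry, in the spirit of "user:etl:rwx". */
    static final class AclEntry {
        final String user, perms;
        AclEntry(String user, String perms) { this.user = user; this.perms = perms; }
    }

    static boolean allows(String rwx, Action a) {
        return a == Action.READ ? rwx.charAt(0) == 'r' : rwx.charAt(1) == 'w';
    }

    /** Check based only on the mode bits, i.e. on what the status object knows. */
    static boolean checkByStatus(Status s, String user, List<String> groups, Action a) {
        if (user.equals(s.owner)) return allows(s.mode.substring(0, 3), a);
        if (groups.contains(s.group)) return allows(s.mode.substring(3, 6), a);
        return allows(s.mode.substring(6, 9), a);
    }

    /** Simplified ACL-aware check: a named-user entry wins for non-owners. */
    static boolean checkWithAcls(Status s, List<AclEntry> acls, String user,
                                 List<String> groups, Action a) {
        if (!user.equals(s.owner)) {
            for (AclEntry e : acls) {
                if (e.user.equals(user)) return allows(e.perms, a);
            }
        }
        return checkByStatus(s, user, groups, a);
    }

    public static void main(String[] args) {
        Status dir = new Status("hive", "hadoop", "rwxr-x---");
        List<AclEntry> acls = Arrays.asList(new AclEntry("etl", "rwx"));
        List<String> noGroups = Collections.emptyList();
        // Mode bits alone: "etl" falls into "other" and is denied.
        System.out.println(checkByStatus(dir, "etl", noGroups, Action.WRITE));
        // With the ACL entry consulted, the same request is allowed.
        System.out.println(checkWithAcls(dir, acls, "etl", noGroups, Action.WRITE));
    }
}
```

In a real deployment the second kind of check would be whatever the concrete file system does when its own access(Path, Action) is called, which is what the issue proposes to rely on.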
[jira] [Created] (HIVE-24248) TestMiniLlapLocalCliDriver[subquery_join_rewrite] is flaky
Zhihua Deng created HIVE-24248: -- Summary: TestMiniLlapLocalCliDriver[subquery_join_rewrite] is flaky Key: HIVE-24248 URL: https://issues.apache.org/jira/browse/HIVE-24248 Project: Hive Issue Type: Bug Reporter: Zhihua Deng
[http://ci.hive.apache.org/blue/organizations/jenkins/hive-precommit/detail/PR-1205/26/tests]
{code:java}
java.lang.AssertionError: Client Execution succeeded but contained differences (error code = 1) after executing subquery_join_rewrite.q
241,244d240
< 1 1
< 1 2
< 2 1
< 2 2
245a242,243
> 2 2
{code}
[jira] [Created] (HIVE-24246) Fix for Ranger Deny policy overriding policy with same resource name
Aasha Medhi created HIVE-24246: -- Summary: Fix for Ranger Deny policy overriding policy with same resource name Key: HIVE-24246 URL: https://issues.apache.org/jira/browse/HIVE-24246 Project: Hive Issue Type: Task Reporter: Aasha Medhi Assignee: Aasha Medhi
[jira] [Created] (HIVE-24251) Improve bloom filter size estimation for multi column semijoin reducers
Stamatis Zampetakis created HIVE-24251: -- Summary: Improve bloom filter size estimation for multi column semijoin reducers Key: HIVE-24251 URL: https://issues.apache.org/jira/browse/HIVE-24251 Project: Hive Issue Type: Improvement Components: Query Planning Reporter: Stamatis Zampetakis Assignee: Stamatis Zampetakis There are various cases where the expected size of the bloom filter is largely underestimated, making the semijoin reducer completely ineffective. This is more relevant for multi-column semijoin reducers, since the current [code|https://github.com/apache/hive/blob/d61c9160ffa5afbd729887c3db690eccd7ef8238/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFBloomFilter.java#L273] does not take them into account.
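Why underestimation makes the reducer ineffective can be seen from the standard bloom filter formulas: for n expected entries and target false-positive rate p, the filter needs m = -n ln(p) / (ln 2)^2 bits and about k = (m/n) ln 2 hash functions, and after inserting n' keys the false-positive rate is roughly (1 - e^(-k n'/m))^k. The sketch below just evaluates these formulas; it is not Hive's GenericUDAFBloomFilter code. The class and method names, the 5% target rate, and the multi-column estimate (product of per-column distinct counts capped by the row count) are assumptions made for illustration.

```java
public class BloomSizingSketch {
    /** Bits needed for n entries at false-positive rate p: m = -n ln(p) / (ln 2)^2. */
    static long optimalBits(long n, double p) {
        double ln2 = Math.log(2);
        return (long) Math.ceil(-n * Math.log(p) / (ln2 * ln2));
    }

    /** Number of hash functions: k = (m / n) ln 2, at least 1. */
    static int optimalHashes(long bits, long n) {
        return Math.max(1, (int) Math.round((double) bits / n * Math.log(2)));
    }

    /** Approximate false-positive rate once `inserted` distinct keys are in the filter. */
    static double actualFpp(long bits, int k, long inserted) {
        return Math.pow(1 - Math.exp(-(double) k * inserted / bits), k);
    }

    /**
     * Hypothetical entry estimate for a multi-column key: the product of the
     * per-column distinct counts, capped by the row count of the source relation.
     */
    static long multiColumnEntries(long rowCount, long... ndvs) {
        long product = 1;
        for (long ndv : ndvs) {
            long d = Math.max(1, ndv);
            if (product > rowCount / d) return rowCount; // product would exceed the cap
            product *= d;
        }
        return Math.min(product, rowCount);
    }

    public static void main(String[] args) {
        long expected = 1_000_000L;          // entries the filter was sized for
        long bits = optimalBits(expected, 0.05);
        int k = optimalHashes(bits, expected);
        System.out.println("bits=" + bits + " hashes=" + k);
        System.out.println("fpp at expected load: " + actualFpp(bits, k, expected));
        // Ten times more keys than estimated: almost every probe passes the filter.
        System.out.println("fpp at 10x load:      " + actualFpp(bits, k, 10 * expected));
        System.out.println("two-column estimate:  " + multiColumnEntries(1_000_000L, 1000, 500));
    }
}
```

A filter sized for one column's distinct count but fed the combinations of several columns lands in the "10x load" regime, where it filters out almost nothing.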
[jira] [Created] (HIVE-24252) Improve decision model for using semijoin reducers
Stamatis Zampetakis created HIVE-24252: -- Summary: Improve decision model for using semijoin reducers Key: HIVE-24252 URL: https://issues.apache.org/jira/browse/HIVE-24252 Project: Hive Issue Type: Improvement Reporter: Stamatis Zampetakis Assignee: Stamatis Zampetakis After a few experiments with the TPC-DS 10TB dataset, we observed that in some cases semijoin reducers were not effective; they didn't reduce the number of records or they reduced the relation only a tiny bit. In some cases we can make the semijoin reducer more effective by adding more columns, but this also requires a bigger bloom filter, so the decision on the number of columns to include in the bloom filter becomes more delicate. The current decision model always chooses multi-column semijoin reducers if they are available, but this may not always be beneficial if a single column can significantly reduce the target relation.
[jira] [Created] (HIVE-24255) StorageHandler with select-limit query is returning 0 rows
Naresh P R created HIVE-24255: - Summary: StorageHandler with select-limit query is returning 0 rows Key: HIVE-24255 URL: https://issues.apache.org/jira/browse/HIVE-24255 Project: Hive Issue Type: Bug Reporter: Naresh P R Assignee: Naresh P R
{code:java}
CREATE EXTERNAL TABLE test_table(db_id bigint, db_location_uri string, name string, owner_name string, owner_type string)
STORED BY 'org.apache.hive.storage.jdbc.JdbcStorageHandler'
TBLPROPERTIES ('hive.sql.database.type'='METASTORE', 'hive.sql.query'='SELECT `DB_ID`, `DB_LOCATION_URI`, `NAME`, `OWNER_NAME`, `OWNER_TYPE` FROM `DBS`');

==> Wrong Result <==
set hive.limit.optimize.enable=true;
select * from test_table limit 1;
--
VERTICES MODE STATUS TOTAL COMPLETED RUNNING PENDING FAILED KILLED
--
Map 1 .. container SUCCEEDED 0 0 0 0 0 0
--
VERTICES: 01/01 [==>>] 100% ELAPSED TIME: 0.91 s
--
+-----------+---------------------+----------+----------------+----------------+
| dbs.db_id | dbs.db_location_uri | dbs.name | dbs.owner_name | dbs.owner_type |
+-----------+---------------------+----------+----------------+----------------+
+-----------+---------------------+----------+----------------+----------------+

==> Correct Result <==
set hive.limit.optimize.enable=false;
select * from test_table limit 1;
--
VERTICES MODE STATUS TOTAL COMPLETED RUNNING PENDING FAILED KILLED
--
Map 1 .. container SUCCEEDED 1 1 0 0 0 0
--
VERTICES: 01/01 [==>>] 100% ELAPSED TIME: 4.11 s
--
+-----------+----------------------------------------------------+----------+----------------+----------------+
| dbs.db_id | dbs.db_location_uri                                | dbs.name | dbs.owner_name | dbs.owner_type |
+-----------+----------------------------------------------------+----------+----------------+----------------+
| 1         | hdfs://abcd:8020/warehouse/tablespace/managed/hive | default  | public         | ROLE           |
+-----------+----------------------------------------------------+----------+----------------+----------------+
{code}
[jira] [Created] (HIVE-24250) CREATE DATABASE with MANAGEDLOCATION set requires super user priv on location
Viacheslav Avramenko created HIVE-24250: --- Summary: CREATE DATABASE with MANAGEDLOCATION set requires super user priv on location Key: HIVE-24250 URL: https://issues.apache.org/jira/browse/HIVE-24250 Project: Hive Issue Type: Bug Components: Metastore Affects Versions: 4.0.0 Reporter: Viacheslav Avramenko At [https://github.com/apache/hive/blob/master/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java#L1485] the folder for the database is created as the superuser instead of the impersonated user. This leads to issues when impersonation is on (doAs=true) and the superuser has no access to the location specified in MANAGEDLOCATION of CREATE DATABASE. If I, as a user, specify a location where my database should be created, I should have write access there.
[jira] [Created] (HIVE-24249) Create View fails if a materialized view exists with the same query
Krisztian Kasa created HIVE-24249: - Summary: Create View fails if a materialized view exists with the same query Key: HIVE-24249 URL: https://issues.apache.org/jira/browse/HIVE-24249 Project: Hive Issue Type: Bug Reporter: Krisztian Kasa Assignee: Krisztian Kasa
{code:java}
create table t1(col0 int) STORED AS ORC TBLPROPERTIES ('transactional'='true');
create materialized view mv1 as select * from t1 where col0 > 2;
create view mv1 as select sub.* from (select * from t1 where col0 > 2) sub where sub.col0 = 10;
{code}
The planner realizes that the view definition has a subquery which matches the materialized view query and replaces it with the materialized view scan.
{code:java}
HiveProject($f0=[CAST(10):INTEGER])
  HiveFilter(condition=[=(10, $0)])
    HiveTableScan(table=[[default, mv1]], table:alias=[default.mv1])
{code}
Then an exception is thrown:
{code:java}
org.apache.hadoop.hive.ql.parse.SemanticException: View definition references materialized view default.mv1
	at org.apache.hadoop.hive.ql.ddl.view.create.CreateViewAnalyzer.validateCreateView(CreateViewAnalyzer.java:211)
	at org.apache.hadoop.hive.ql.ddl.view.create.CreateViewAnalyzer.analyzeInternal(CreateViewAnalyzer.java:99)
	at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:301)
	at org.apache.hadoop.hive.ql.Compiler.analyze(Compiler.java:223)
	at org.apache.hadoop.hive.ql.Compiler.compile(Compiler.java:104)
	at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:174)
	at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:415)
	at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:364)
	at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:358)
	at org.apache.hadoop.hive.ql.reexec.ReExecDriver.compileAndRespond(ReExecDriver.java:125)
	at org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:229)
	at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:258)
	at org.apache.hadoop.hive.cli.CliDriver.processCmd1(CliDriver.java:203)
	at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:129)
	at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:424)
	at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:355)
	at org.apache.hadoop.hive.ql.QTestUtil.executeClientInternal(QTestUtil.java:744)
	at org.apache.hadoop.hive.ql.QTestUtil.executeClient(QTestUtil.java:714)
	at org.apache.hadoop.hive.cli.control.CoreCliDriver.runTest(CoreCliDriver.java:170)
	at org.apache.hadoop.hive.cli.control.CliAdapter.runTest(CliAdapter.java:157)
	at org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver(TestMiniLlapLocalCliDriver.java:62)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
	at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
	at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
	at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
	at org.apache.hadoop.hive.cli.control.CliAdapter$2$1.evaluate(CliAdapter.java:135)
	at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
	at org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100)
	at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:366)
	at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103)
	at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63)
	at org.junit.runners.ParentRunner$4.run(ParentRunner.java:331)
	at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79)
	at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329)
	at org.junit.runners.ParentRunner.access$100(ParentRunner.java:66)
	at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293)
	at org.junit.runners.ParentRunner.run(ParentRunner.java:413)
	at org.junit.runners.Suite.runChild(Suite.java:128)
	at org.junit.runners.Suite.runChild(Suite.java:27)
	at org.junit.runners.ParentRunner$4.run(ParentRunner.java:331)
	at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79)
	at
[jira] [Created] (HIVE-24253) HMS needs to support keystore/truststores types besides JKS
Yongzhi Chen created HIVE-24253: --- Summary: HMS needs to support keystore/truststores types besides JKS Key: HIVE-24253 URL: https://issues.apache.org/jira/browse/HIVE-24253 Project: Hive Issue Type: Bug Components: Standalone Metastore Reporter: Yongzhi Chen Assignee: Yongzhi Chen When HiveMetaStoreClient connects to HMS with SSL enabled, HMS should support the default keystore type specified for the JDK and not always use JKS. As HIVE-23958 did for Hive, HMS should support setting additional keystore/truststore types for different applications, such as FIPS crypto algorithms.
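The behaviour requested in HIVE-24253 roughly amounts to "use the configured type if there is one, otherwise the JDK default" instead of a hard-coded "JKS". A minimal sketch with the standard java.security.KeyStore API follows; the class name, the method names and the idea of passing the configured value in as a plain string are assumptions for illustration, not the actual HMS configuration keys or code.

```java
import java.io.IOException;
import java.security.GeneralSecurityException;
import java.security.KeyStore;

public class KeystoreTypeSketch {
    /**
     * Picks the keystore type: the explicitly configured one if present,
     * otherwise the JDK default (the "keystore.type" security property).
     */
    static String resolveType(String configuredType) {
        if (configuredType == null || configuredType.trim().isEmpty()) {
            return KeyStore.getDefaultType();
        }
        return configuredType.trim();
    }

    /** Creates an empty, initialized keystore of the resolved type. */
    static KeyStore emptyStore(String configuredType) {
        try {
            KeyStore ks = KeyStore.getInstance(resolveType(configuredType));
            ks.load(null, null); // null stream: initialize an empty store
            return ks;
        } catch (GeneralSecurityException | IOException e) {
            throw new IllegalStateException("Cannot create keystore", e);
        }
    }

    public static void main(String[] args) {
        // No type configured: falls back to whatever this JDK declares as default.
        System.out.println(emptyStore(null).getType());
        // Type configured explicitly.
        System.out.println(emptyStore("PKCS12").getType());
    }
}
```

Types supplied by additional security providers (for example FIPS-oriented ones) would go through the same KeyStore.getInstance call, provided the provider is installed in the JVM.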
[jira] [Created] (HIVE-24254) Remove setOwner call in ReplChangeManager
Aasha Medhi created HIVE-24254: -- Summary: Remove setOwner call in ReplChangeManager Key: HIVE-24254 URL: https://issues.apache.org/jira/browse/HIVE-24254 Project: Hive Issue Type: Task Reporter: Aasha Medhi Assignee: Aasha Medhi
failed-to-read failures in precommit
Hey All!

Lately an interesting error has started to appear in precommit runs as "TEST-org.apache.hadoop.hive.cli.split0.TestMiniLlapLocalCliDriver.xml.[failed-to-read]"; see [1] for an example.

I dug into it, and it was caused by a failed test (acid_stats2 in the run I was looking at), which can happen. But the resulting junit xml is malformed because of some surefire issue (I've opened [2] to investigate it). The writing of the junit result xml is essentially interrupted in the middle...

...and in the end, for some reason, jenkins doesn't consider the read failure "serious enough" to change the state to yellow, so the test results end up green.

I'm trying to upgrade surefire to 3.0.0-M5, but I'm a bit skeptical that it will solve the problem. The best would be to fix this issue; however, if that doesn't seem to be easy, I'll add some hack to at least fail the build in these cases.

cheers,
Zoltan

[1] http://ci.hive.apache.org/job/hive-precommit/job/master/274/testReport/
[2] https://issues.apache.org/jira/browse/SUREFIRE-1852
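One possible shape for such a "fail the build" check, sketched with the JDK's own XML parser: try to parse each junit result file and treat any parse failure as an error. This is only an illustration of the idea, not what was added to the Hive build; the class and method names are made up, and the sample strings stand in for real report files.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class JunitXmlCheckSketch {
    /** Returns true only if the given report content parses as well-formed XML. */
    static boolean isWellFormed(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors without printing them to stderr.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            return true;
        } catch (SAXException | IOException e) {
            return false; // truncated or otherwise unreadable report
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String complete = "<testsuite><testcase name=\"a\"/></testsuite>";
        String interrupted = "<testsuite><testcase name=\"a\"/><testcase na";
        System.out.println(isWellFormed(complete));
        System.out.println(isWellFormed(interrupted));
        if (!isWellFormed(interrupted)) {
            // In a build step, a non-zero exit code here would turn the run red.
            System.out.println("malformed junit report found");
        }
    }
}
```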