[jira] [Created] (HIVE-26458) Add explicit dependency to commons-dbcp2 in hive-exec module
Stamatis Zampetakis created HIVE-26458:

Summary: Add explicit dependency to commons-dbcp2 in hive-exec module
Key: HIVE-26458
URL: https://issues.apache.org/jira/browse/HIVE-26458
Project: Hive
Issue Type: Task
Reporter: Stamatis Zampetakis
Assignee: Stamatis Zampetakis

Hive CBO relies on Calcite, so the hive-exec module has a direct dependency on Calcite. In turn, Calcite needs the commons-dbcp2 dependency in order to compile and run properly:
https://github.com/apache/calcite/blob/b9c2099ea92a575084b55a206efc5dd341c0df62/core/build.gradle.kts#L69

In particular, the dependency is necessary in order to use the JDBC adapter. Some of its usages are shown below:
* https://github.com/apache/calcite/blob/257c81b5cac35e29598a246463356fea7e0b0336/core/src/main/java/org/apache/calcite/adapter/jdbc/JdbcUtils.java#L29
* https://github.com/apache/calcite/blob/257c81b5cac35e29598a246463356fea7e0b0336/core/src/main/java/org/apache/calcite/adapter/jdbc/JdbcUtils.java#L262

However, due to the [shading of Calcite|https://github.com/apache/hive/blob/778c838317c952dcd273fd6c7a51491746a1d807/ql/pom.xml#L1075] inside the hive-exec module, all the transitive dependencies coming from Calcite must be defined explicitly; otherwise they will not make it to the classpath.

At the moment this does not pose a problem in master, since the {{commons-dbcp2}} dependency comes in transitively from other modules. In certain Hive branches with slightly different dependencies between modules, however, we have seen failures like the one shown below:
{noformat}
java.lang.BootstrapMethodError: java.lang.NoClassDefFoundError: org/apache/commons/dbcp2/BasicDataSource
 at org.apache.calcite.adapter.jdbc.JdbcUtils$DataSourcePool.(JdbcUtils.java:213)
 at org.apache.calcite.adapter.jdbc.JdbcUtils$DataSourcePool.(JdbcUtils.java:210)
 at org.apache.calcite.adapter.jdbc.JdbcSchema.dataSource(JdbcSchema.java:207)
 at org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.genTableLogicalPlan(CalcitePlanner.java:3331)
 at org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.genLogicalPlan(CalcitePlanner.java:5324)
 at org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:1815)
 at org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:1750)
 at org.apache.calcite.tools.Frameworks.lambda$withPlanner$0(Frameworks.java:130)
 at org.apache.calcite.prepare.CalcitePrepareImpl.perform(CalcitePrepareImpl.java:915)
 at org.apache.calcite.tools.Frameworks.withPrepare(Frameworks.java:179)
 at org.apache.calcite.tools.Frameworks.withPlanner(Frameworks.java:125)
 at org.apache.hadoop.hive.ql.parse.CalcitePlanner.plan(CalcitePlanner.java:1411)
 at org.apache.hadoop.hive.ql.parse.CalcitePlanner.genOPTree(CalcitePlanner.java:588)
 at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:13071)
 at org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:472)
 at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:312)
 at org.apache.hadoop.hive.ql.Compiler.analyze(Compiler.java:223)
 at org.apache.hadoop.hive.ql.Compiler.compile(Compiler.java:105)
 at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:201)
 at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:650)
 at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:596)
 at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:590)
 at org.apache.hadoop.hive.ql.reexec.ReExecDriver.compileAndRespond(ReExecDriver.java:127)
 at org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:231)
 at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:256)
 at org.apache.hadoop.hive.cli.CliDriver.processCmd1(CliDriver.java:203)
 at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:129)
 at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:421)
 at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:352)
 at org.apache.hadoop.hive.ql.QTestUtil.executeClientInternal(QTestUtil.java:867)
 at org.apache.hadoop.hive.ql.QTestUtil.executeClient(QTestUtil.java:837)
 at org.apache.hadoop.hive.cli.control.CoreCliDriver.runTest(CoreCliDriver.java:178)
 at org.apache.hadoop.hive.cli.control.CliAdapter.runTest(CliAdapter.java:173)
 at org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver(TestMiniLlapLocalCliDriver.java:62)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at sun.reflect.NativeMethodAccesso
[jira] [Created] (HIVE-26457) Upgrade package jetty to version 9.4.39+ to avoid CVE-2021-28165, CVE-2020-27216
Sai Hemanth Gantasala created HIVE-26457:

Summary: Upgrade package jetty to version 9.4.39+ to avoid CVE-2021-28165, CVE-2020-27216
Key: HIVE-26457
URL: https://issues.apache.org/jira/browse/HIVE-26457
Project: Hive
Issue Type: Bug
Reporter: Sai Hemanth Gantasala

--
This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (HIVE-26456) Remove stringifyException Method From Storage Handlers
David Mollitor created HIVE-26456:

Summary: Remove stringifyException Method From Storage Handlers
Key: HIVE-26456
URL: https://issues.apache.org/jira/browse/HIVE-26456
Project: Hive
Issue Type: Sub-task
Reporter: David Mollitor
Assignee: David Mollitor
[jira] [Created] (HIVE-26455) Remove PowerMockito from hive-exec
Zsolt Miskolczi created HIVE-26455:

Summary: Remove PowerMockito from hive-exec
Key: HIVE-26455
URL: https://issues.apache.org/jira/browse/HIVE-26455
Project: Hive
Issue Type: Improvement
Components: Hive
Reporter: Zsolt Miskolczi

PowerMockito is a Mockito extension that introduces some pain points. The main reason for using it is to be able to do static mocking. Since PowerMockito was released, mockito-inline has been released as part of mockito-core. It does not require the vintage test runner in order to run, and it can mock objects that have their own thread.

The goal is to stop using PowerMockito and use mockito-inline instead.

The affected packages are:
* org.apache.hadoop.hive.ql.exec.repl
* org.apache.hadoop.hive.ql.exec.repl.bootstrap.load
* org.apache.hadoop.hive.ql.exec.repl.ranger
* org.apache.hadoop.hive.ql.exec.util
* org.apache.hadoop.hive.ql.parse.repl
* org.apache.hadoop.hive.ql.parse.repl.load.message
* org.apache.hadoop.hive.ql.parse.repl.metric
* org.apache.hadoop.hive.ql.txn.compactor
[jira] [Created] (HIVE-26454) BINARY types within complex types are not quoted
Csaba Ringhofer created HIVE-26454:

Summary: BINARY types within complex types are not quoted
Key: HIVE-26454
URL: https://issues.apache.org/jira/browse/HIVE-26454
Project: Hive
Issue Type: Bug
Reporter: Csaba Ringhofer

While STRINGs are quoted and escaped, this is not done for BINARY members:

select named_struct("s", "a", "b", cast("a" as binary));
result: {"s":"a","b":a}

This is mainly problematic if special characters are involved, as it can lead to totally unparseable JSON:

select named_struct("s", "a \"{", "b", cast("a \"{" as binary));
result: {"s":"a \"{","b":a "{}

As existing workloads may rely on the current behavior, I think it would be best to add a configuration option for this.
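The difference between the quoted STRING member and the unquoted BINARY member can be reproduced with a small standalone Java sketch. This is not Hive's serialization code: the class `JsonQuoteSketch`, its methods, and the `quoteBinary` flag are hypothetical names used only for illustration, and the escaping is simplified to backslashes and double quotes.

```java
// Hypothetical illustration only; not Hive's actual serialization logic.
final class JsonQuoteSketch {

    /** Wraps a value in double quotes, escaping backslashes and double quotes. */
    static String quote(String value) {
        StringBuilder sb = new StringBuilder("\"");
        for (char c : value.toCharArray()) {
            if (c == '\\' || c == '"') {
                sb.append('\\');
            }
            sb.append(c);
        }
        return sb.append('"').toString();
    }

    /**
     * Renders a two-field struct. Field "s" is always quoted; field "b" is
     * quoted only when quoteBinary is true (an assumed, hypothetical switch).
     */
    static String render(String s, String b, boolean quoteBinary) {
        return "{\"s\":" + quote(s) + ",\"b\":" + (quoteBinary ? quote(b) : b) + "}";
    }

    public static void main(String[] args) {
        String v = "a \"{";
        // Unquoted binary member: same shape as the result reported above.
        System.out.println(render(v, v, false));
        // Quoted and escaped binary member: the output is parseable JSON.
        System.out.println(render(v, v, true));
    }
}
```

With `quoteBinary` set to false the sketch prints the same broken shape as the second query above; with it set to true both members are quoted and escaped the same way.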
[jira] [Created] (HIVE-26453) INSERT ... VALUES with BINARY type returns error
Csaba Ringhofer created HIVE-26453:

Summary: INSERT ... VALUES with BINARY type returns error
Key: HIVE-26453
URL: https://issues.apache.org/jira/browse/HIVE-26453
Project: Hive
Issue Type: Bug
Reporter: Csaba Ringhofer

To reproduce:

create table bint (b binary);
insert into table bint values (cast("a" as binary));

Error: Error while compiling statement: FAILED: RuntimeException Error invoking signature method (state=42000,code=4)

The same DML works if I use SELECT instead of VALUES:

insert into table bint select cast("a" as binary);
[jira] [Created] (HIVE-26452) NPE when converting join to mapjoin and join column referenced more than once
Krisztian Kasa created HIVE-26452:

Summary: NPE when converting join to mapjoin and join column referenced more than once
Key: HIVE-26452
URL: https://issues.apache.org/jira/browse/HIVE-26452
Project: Hive
Issue Type: Bug
Reporter: Krisztian Kasa
Assignee: Krisztian Kasa

{code}
explain
select count(*)
from LU_CUSTOMER pa11
join ORDER_FACT a15 on (pa11.CUSTOMER_ID = a15.CUSTOMER_ID)
join LU_CUSTOMER a16 on (a15.CUSTOMER_ID = a16.CUSTOMER_ID and pa11.CUSTOMER_ID = a16.CUSTOMER_ID);
{code}
{{a16.CUSTOMER_ID}} is referenced more than once in the join condition.
Hive generates Reduce sink operators for the join's children, and the row schema of one of the RS operators contains only one instance of the join keys (customer_id).
{code}
RS[13]
result = {HashMap@16092} size = 2
 "KEY.reducesinkkey0" -> {ExprNodeColumnDesc@16083} "Column[_col0]"
 "KEY.reducesinkkey1" -> {ExprNodeColumnDesc@16102} "Column[_col0]"
result = {RowSchema@16104} "(KEY.reducesinkkey0: int|{$hdt$_2}customer_id)"
 signature = {ArrayList@16110} size = 1
  0 = {ColumnInfo@16087} "KEY.reducesinkkey0: int"
{code}
{{KEY.reducesinkkey1}} is missing from the schema.

When converting the join to a mapjoin, the converter algorithm fails while looking up both join key column instances.
https://github.com/apache/hive/blob/2aaba3c79e740ef27fc263b5a8ff33ad679c5a12/ql/src/java/org/apache/hadoop/hive/ql/plan/ExprNodeDescUtils.java#L538
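The mismatch in the dump above (two entries in the column expression map, one entry in the row schema) can be modelled with plain Java collections. This is only a sketch under assumptions: the class `MissingJoinKeySketch` and the method `firstMissingKey` are hypothetical, the maps stand in for Hive's real structures, and the actual lookup in ExprNodeDescUtils may differ.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical model of the reported state; not Hive's actual classes.
final class MissingJoinKeySketch {

    /** Returns the first key of colExprMap that has no entry in rowSchema, or null if none is missing. */
    static String firstMissingKey(Map<String, String> colExprMap, Map<String, String> rowSchema) {
        for (String key : colExprMap.keySet()) {
            if (!rowSchema.containsKey(key)) {
                return key;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // Both reduce sink keys point at the same source column, as in RS[13].
        Map<String, String> colExprMap = new LinkedHashMap<>();
        colExprMap.put("KEY.reducesinkkey0", "Column[_col0]");
        colExprMap.put("KEY.reducesinkkey1", "Column[_col0]");

        // The row schema signature holds only one of the two keys.
        Map<String, String> rowSchema = new LinkedHashMap<>();
        rowSchema.put("KEY.reducesinkkey0", "int");

        System.out.println(firstMissingKey(colExprMap, rowSchema));
        // An unchecked lookup of the missing key returns null; any code that
        // dereferences the result without a null check would throw an NPE.
        System.out.println(rowSchema.get("KEY.reducesinkkey1"));
    }
}
```

A consistency check of this kind makes the missing {{KEY.reducesinkkey1}} entry visible before any lookup result is dereferenced.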