[jira] [Created] (HIVE-26458) Add explicit dependency to commons-dbcp2 in hive-exec module

2022-08-05 Thread Stamatis Zampetakis (Jira)
Stamatis Zampetakis created HIVE-26458:
--

 Summary: Add explicit dependency to commons-dbcp2 in hive-exec module
 Key: HIVE-26458
 URL: https://issues.apache.org/jira/browse/HIVE-26458
 Project: Hive
  Issue Type: Task
Reporter: Stamatis Zampetakis
Assignee: Stamatis Zampetakis


Hive CBO relies on Calcite, so there is a direct dependency on Calcite in the 
hive-exec module. In turn, Calcite needs the commons-dbcp2 dependency in order 
to compile and run properly:

https://github.com/apache/calcite/blob/b9c2099ea92a575084b55a206efc5dd341c0df62/core/build.gradle.kts#L69

In particular, the dependency is necessary in order to use the JDBC adapter; 
some of its usages are shown below:
* https://github.com/apache/calcite/blob/257c81b5cac35e29598a246463356fea7e0b0336/core/src/main/java/org/apache/calcite/adapter/jdbc/JdbcUtils.java#L29
* https://github.com/apache/calcite/blob/257c81b5cac35e29598a246463356fea7e0b0336/core/src/main/java/org/apache/calcite/adapter/jdbc/JdbcUtils.java#L262


However, due to the [shading of Calcite|https://github.com/apache/hive/blob/778c838317c952dcd273fd6c7a51491746a1d807/ql/pom.xml#L1075] 
inside the hive-exec module, all the transitive dependencies coming from Calcite 
must be defined explicitly; otherwise they will not make it to the classpath.
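
One quick way to check whether a given class actually made it to the classpath of a particular build is a small probe like the one below. This is only a minimal sketch: the class name {{ClasspathProbe}} and the helper {{isOnClasspath}} are hypothetical and not part of Hive or Calcite; only the probed class name comes from this issue.

```java
// Hypothetical helper, not part of Hive or Calcite.
final class ClasspathProbe {

    // Returns true if the named class can be located by this class's loader.
    // The class is looked up without being initialized.
    static boolean isOnClasspath(String className) {
        try {
            Class.forName(className, false, ClasspathProbe.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Class that Calcite's JDBC adapter needs at runtime.
        String name = "org.apache.commons.dbcp2.BasicDataSource";
        System.out.println(name + " on classpath: " + isOnClasspath(name));
    }
}
```

Running it with the same classpath as the failing test should show whether commons-dbcp2 is present in that environment.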

At the moment this does not pose a problem in master since the {{commons-dbcp2}} 
dependency comes transitively from other modules. But in certain Hive branches 
with slightly different dependencies between modules we have seen failures like 
the one shown below:

{noformat}
java.lang.BootstrapMethodError: java.lang.NoClassDefFoundError: org/apache/commons/dbcp2/BasicDataSource
at org.apache.calcite.adapter.jdbc.JdbcUtils$DataSourcePool.(JdbcUtils.java:213)
at org.apache.calcite.adapter.jdbc.JdbcUtils$DataSourcePool.(JdbcUtils.java:210)
at org.apache.calcite.adapter.jdbc.JdbcSchema.dataSource(JdbcSchema.java:207)
at org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.genTableLogicalPlan(CalcitePlanner.java:3331)
at org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.genLogicalPlan(CalcitePlanner.java:5324)
at org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:1815)
at org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:1750)
at org.apache.calcite.tools.Frameworks.lambda$withPlanner$0(Frameworks.java:130)
at org.apache.calcite.prepare.CalcitePrepareImpl.perform(CalcitePrepareImpl.java:915)
at org.apache.calcite.tools.Frameworks.withPrepare(Frameworks.java:179)
at org.apache.calcite.tools.Frameworks.withPlanner(Frameworks.java:125)
at org.apache.hadoop.hive.ql.parse.CalcitePlanner.plan(CalcitePlanner.java:1411)
at org.apache.hadoop.hive.ql.parse.CalcitePlanner.genOPTree(CalcitePlanner.java:588)
at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:13071)
at org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:472)
at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:312)
at org.apache.hadoop.hive.ql.Compiler.analyze(Compiler.java:223)
at org.apache.hadoop.hive.ql.Compiler.compile(Compiler.java:105)
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:201)
at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:650)
at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:596)
at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:590)
at org.apache.hadoop.hive.ql.reexec.ReExecDriver.compileAndRespond(ReExecDriver.java:127)
at org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:231)
at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:256)
at org.apache.hadoop.hive.cli.CliDriver.processCmd1(CliDriver.java:203)
at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:129)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:421)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:352)
at org.apache.hadoop.hive.ql.QTestUtil.executeClientInternal(QTestUtil.java:867)
at org.apache.hadoop.hive.ql.QTestUtil.executeClient(QTestUtil.java:837)
at org.apache.hadoop.hive.cli.control.CoreCliDriver.runTest(CoreCliDriver.java:178)
at org.apache.hadoop.hive.cli.control.CliAdapter.runTest(CliAdapter.java:173)
at org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver(TestMiniLlapLocalCliDriver.java:62)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 

[jira] [Created] (HIVE-26457) Upgrade package jetty to version 9.4.39+ to avoid CVE-2021-28165, CVE-2020-27216

2022-08-05 Thread Sai Hemanth Gantasala (Jira)
Sai Hemanth Gantasala created HIVE-26457:


 Summary: Upgrade package jetty to version 9.4.39+ to avoid CVE-2021-28165, CVE-2020-27216
 Key: HIVE-26457
 URL: https://issues.apache.org/jira/browse/HIVE-26457
 Project: Hive
  Issue Type: Bug
Reporter: Sai Hemanth Gantasala






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (HIVE-26456) Remove stringifyException Method From Storage Handlers

2022-08-05 Thread David Mollitor (Jira)
David Mollitor created HIVE-26456:
-

 Summary: Remove stringifyException Method From Storage Handlers
 Key: HIVE-26456
 URL: https://issues.apache.org/jira/browse/HIVE-26456
 Project: Hive
  Issue Type: Sub-task
Reporter: David Mollitor
Assignee: David Mollitor








[jira] [Created] (HIVE-26455) Remove PowerMockito from hive-exec

2022-08-05 Thread Zsolt Miskolczi (Jira)
Zsolt Miskolczi created HIVE-26455:
--

 Summary: Remove PowerMockito from hive-exec
 Key: HIVE-26455
 URL: https://issues.apache.org/jira/browse/HIVE-26455
 Project: Hive
  Issue Type: Improvement
  Components: Hive
Reporter: Zsolt Miskolczi


PowerMockito is a Mockito extension that introduces several pain points. 

Its main purpose is to enable static mocking. Since PowerMockito was released, 
mockito-inline has become available as part of mockito-core. 
It does not require the vintage test runner, and it can mock 
objects with their own thread. 

The goal is to stop using PowerMockito and use mockito-inline instead.

 

The affected packages are: 
 * org.apache.hadoop.hive.ql.exec.repl
 * org.apache.hadoop.hive.ql.exec.repl.bootstrap.load
 * org.apache.hadoop.hive.ql.exec.repl.ranger
 * org.apache.hadoop.hive.ql.exec.util
 * org.apache.hadoop.hive.ql.parse.repl
 * org.apache.hadoop.hive.ql.parse.repl.load.message
 * org.apache.hadoop.hive.ql.parse.repl.metric
 * org.apache.hadoop.hive.ql.txn.compactor

 

 





[jira] [Created] (HIVE-26454) BINARY types within complex types are not quoted

2022-08-05 Thread Csaba Ringhofer (Jira)
Csaba Ringhofer created HIVE-26454:
--

 Summary: BINARY types within complex types are not quoted
 Key: HIVE-26454
 URL: https://issues.apache.org/jira/browse/HIVE-26454
 Project: Hive
  Issue Type: Bug
Reporter: Csaba Ringhofer


While STRINGs are quoted and escaped, this is not done for BINARY members:
select named_struct("s", "a", "b", cast("a" as binary));
result: {"s":"a","b":a}

This is mainly problematic if special characters are involved, as this can lead 
to completely unparseable JSON:
select named_struct("s", "a \"{", "b", cast("a \"{" as binary));
result: {"s":"a \"{","b":a "{}

As existing workloads may rely on the current behavior, I think it would 
be best to add a configuration option for this.
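
The difference between the two renderings can be sketched in plain Java. This is only an illustration under stated assumptions, not Hive's serialization code: the class {{BinaryQuoting}}, the {{quoteBinary}} flag and the choice to render the bytes as an escaped UTF-8 string are all hypothetical, and the escaping only covers double quotes and backslashes.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical illustration, not Hive's serializer.
final class BinaryQuoting {

    // Wraps s in double quotes and escapes '"' and '\'.
    // Control characters are deliberately not handled in this sketch.
    static String quote(String s) {
        StringBuilder sb = new StringBuilder("\"");
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == '"' || c == '\\') {
                sb.append('\\');
            }
            sb.append(c);
        }
        return sb.append('"').toString();
    }

    // Renders a struct with a string member "s" and a binary member "b".
    // With quoteBinary == false the bytes are emitted raw, as in the report.
    static String render(String s, byte[] b, boolean quoteBinary) {
        String raw = new String(b, StandardCharsets.UTF_8);
        return "{\"s\":" + quote(s) + ",\"b\":" + (quoteBinary ? quote(raw) : raw) + "}";
    }

    public static void main(String[] args) {
        String value = "a \"{";
        byte[] bytes = value.getBytes(StandardCharsets.UTF_8);
        System.out.println(render(value, bytes, false)); // {"s":"a \"{","b":a "{}
        System.out.println(render(value, bytes, true));  // {"s":"a \"{","b":"a \"{"}
    }
}
```

A flag like the hypothetical {{quoteBinary}} above is one way to keep the current output as the default while letting users opt in to parseable JSON.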





[jira] [Created] (HIVE-26453) INSERT ... VALUES with BINARY type returns error

2022-08-05 Thread Csaba Ringhofer (Jira)
Csaba Ringhofer created HIVE-26453:
--

 Summary: INSERT ... VALUES with BINARY type returns error
 Key: HIVE-26453
 URL: https://issues.apache.org/jira/browse/HIVE-26453
 Project: Hive
  Issue Type: Bug
Reporter: Csaba Ringhofer


To reproduce:
create table bint (b binary);
insert into table bint values (cast("a" as binary));
Error: Error while compiling statement: FAILED: RuntimeException Error invoking 
signature method (state=42000,code=4)

The same DML works if I use SELECT instead of VALUES:
insert into table bint select cast("a" as binary);





[jira] [Created] (HIVE-26452) NPE when converting join to mapjoin and join column referenced more than once

2022-08-05 Thread Krisztian Kasa (Jira)
Krisztian Kasa created HIVE-26452:
-

 Summary: NPE when converting join to mapjoin and join column referenced more than once
 Key: HIVE-26452
 URL: https://issues.apache.org/jira/browse/HIVE-26452
 Project: Hive
  Issue Type: Bug
Reporter: Krisztian Kasa
Assignee: Krisztian Kasa


{code}
explain
select count(*)
from LU_CUSTOMER pa11
  join ORDER_FACT a15
  on (pa11.CUSTOMER_ID = a15.CUSTOMER_ID)
  join LU_CUSTOMER a16
  on (a15.CUSTOMER_ID = a16.CUSTOMER_ID and pa11.CUSTOMER_ID = a16.CUSTOMER_ID);
{code}
{{a16.CUSTOMER_ID}} is referenced more than once in the join condition.

Hive generates Reduce Sink (RS) operators for the join's children, and the row 
schema of one of them contains only one instance of the join key (customer_id).
{code}
RS[13]
result = {HashMap@16092}  size = 2
 "KEY.reducesinkkey0" -> {ExprNodeColumnDesc@16083} "Column[_col0]"
 "KEY.reducesinkkey1" -> {ExprNodeColumnDesc@16102} "Column[_col0]"

result = {RowSchema@16104} "(KEY.reducesinkkey0: int|{$hdt$_2}customer_id)"
 signature = {ArrayList@16110}  size = 1
  0 = {ColumnInfo@16087} "KEY.reducesinkkey0: int"
{code}

{{KEY.reducesinkkey1}} is missing from the schema.

When converting the join to mapjoin, the converter algorithm fails when looking 
up both instances of the join key column.

https://github.com/apache/hive/blob/2aaba3c79e740ef27fc263b5a8ff33ad679c5a12/ql/src/java/org/apache/hadoop/hive/ql/plan/ExprNodeDescUtils.java#L538
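
The mismatch can be modelled with plain collections. The sketch below is a simplified illustration only and does not use Hive's {{RowSchema}} or {{ExprNodeDescUtils}} classes: {{MissingKeyLookup}} and {{missingFromSchema}} are hypothetical names, and the map and list merely mirror the debugger output above.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Simplified, hypothetical model of the mismatch; not Hive code.
final class MissingKeyLookup {

    // Returns the key names of the column expression map that have
    // no matching entry in the schema signature.
    static List<String> missingFromSchema(Map<String, String> columnExprMap,
                                          List<String> signature) {
        List<String> missing = new ArrayList<>();
        for (String name : columnExprMap.keySet()) {
            if (!signature.contains(name)) {
                missing.add(name);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Two reduce sink keys that both point at the same column.
        Map<String, String> columnExprMap = new LinkedHashMap<>();
        columnExprMap.put("KEY.reducesinkkey0", "_col0");
        columnExprMap.put("KEY.reducesinkkey1", "_col0");

        // The schema signature holds only one of the two keys.
        List<String> signature = List.of("KEY.reducesinkkey0");

        System.out.println(missingFromSchema(columnExprMap, signature));
        // [KEY.reducesinkkey1]
    }
}
```

Any code that assumes every key of the expression map resolves to a column in the schema would get no result for the second key, which is consistent with the NPE described here.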


