[jira] [Created] (HIVE-21921) Support for correlated quantified predicates

2019-06-24 Thread Vineet Garg (JIRA)
Vineet Garg created HIVE-21921:
--

 Summary: Support for correlated quantified predicates
 Key: HIVE-21921
 URL: https://issues.apache.org/jira/browse/HIVE-21921
 Project: Hive
  Issue Type: New Feature
  Components: Query Planning
Reporter: Vineet Garg
Assignee: Vineet Garg
 Attachments: HIVE-21921.1.patch





--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HIVE-21920) Extract authorisation from the Driver

2019-06-24 Thread Miklos Gergely (JIRA)
Miklos Gergely created HIVE-21920:
-

 Summary: Extract authorisation from the Driver
 Key: HIVE-21920
 URL: https://issues.apache.org/jira/browse/HIVE-21920
 Project: Hive
  Issue Type: Sub-task
  Components: Hive
Affects Versions: 3.1.1
Reporter: Miklos Gergely
Assignee: Miklos Gergely
 Fix For: 4.0.0


There are ~400 lines of command authorisation code in the Driver class, which 
are also used by ExplainTask. Extract them into a separate package under 
org.apache.hadoop.hive.ql.security.authorization.





[jira] [Created] (HIVE-21919) Refactor Driver

2019-06-24 Thread Miklos Gergely (JIRA)
Miklos Gergely created HIVE-21919:
-

 Summary: Refactor Driver
 Key: HIVE-21919
 URL: https://issues.apache.org/jira/browse/HIVE-21919
 Project: Hive
  Issue Type: Bug
  Components: Hive
Affects Versions: 3.1.1
Reporter: Miklos Gergely
Assignee: Miklos Gergely
 Fix For: 4.0.0


The Driver class is 3000+ lines long. It does many things, and its structure 
is hard to follow. It needs to be put into a cleaner, more readable form by 
cutting it into separate classes, one for each subtask.





[jira] [Created] (HIVE-21918) handle each Alter Database types in a separate desc / operation

2019-06-24 Thread Miklos Gergely (JIRA)
Miklos Gergely created HIVE-21918:
-

 Summary: handle each Alter Database types in a separate desc / 
operation
 Key: HIVE-21918
 URL: https://issues.apache.org/jira/browse/HIVE-21918
 Project: Hive
  Issue Type: Bug
  Components: Hive
Affects Versions: 3.1.1
Reporter: Miklos Gergely
Assignee: Miklos Gergely
 Fix For: 4.0.0


AlterDatabaseDesc / AlterDatabaseOperation handle all kinds of alter database 
commands. Following the logic of the DDL handling, each command should have its 
own desc / operation classes, with a mini framework provided by abstract 
ancestors.
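
The split described above could look roughly like the following sketch. It is only an illustration under assumptions: every class name here (AbstractAlterDatabaseDesc, SetOwnerDesc, SetLocationDesc, the matching operations and the Database stand-in) is hypothetical and not taken from the Hive code base, and the "catalog" is a plain in-memory map rather than the metastore.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: every class name below is invented for illustration.
public class AlterDatabaseSketch {

    /** Stand-in for a database record held by the metastore. */
    static final class Database {
        String owner;
        String location;
    }

    /** Abstract ancestor holding what every alter database command shares. */
    abstract static class AbstractAlterDatabaseDesc {
        final String databaseName;

        AbstractAlterDatabaseDesc(String databaseName) {
            this.databaseName = databaseName;
        }
    }

    /** One desc per command type, carrying only that command's arguments. */
    static final class SetOwnerDesc extends AbstractAlterDatabaseDesc {
        final String owner;

        SetOwnerDesc(String databaseName, String owner) {
            super(databaseName);
            this.owner = owner;
        }
    }

    static final class SetLocationDesc extends AbstractAlterDatabaseDesc {
        final String location;

        SetLocationDesc(String databaseName, String location) {
            super(databaseName);
            this.location = location;
        }
    }

    /** The ancestor does the shared lookup; each subclass applies one change. */
    abstract static class AbstractAlterDatabaseOperation<T extends AbstractAlterDatabaseDesc> {
        final Map<String, Database> catalog;
        final T desc;

        AbstractAlterDatabaseOperation(Map<String, Database> catalog, T desc) {
            this.catalog = catalog;
            this.desc = desc;
        }

        final void execute() {
            Database db = catalog.get(desc.databaseName);
            if (db == null) {
                throw new IllegalArgumentException("Database does not exist: " + desc.databaseName);
            }
            alter(db);
        }

        abstract void alter(Database db);
    }

    static final class SetOwnerOperation extends AbstractAlterDatabaseOperation<SetOwnerDesc> {
        SetOwnerOperation(Map<String, Database> catalog, SetOwnerDesc desc) {
            super(catalog, desc);
        }

        @Override
        void alter(Database db) {
            db.owner = desc.owner;
        }
    }

    static final class SetLocationOperation extends AbstractAlterDatabaseOperation<SetLocationDesc> {
        SetLocationOperation(Map<String, Database> catalog, SetLocationDesc desc) {
            super(catalog, desc);
        }

        @Override
        void alter(Database db) {
            db.location = desc.location;
        }
    }

    public static void main(String[] args) {
        Map<String, Database> catalog = new HashMap<>();
        catalog.put("sales", new Database());
        new SetOwnerOperation(catalog, new SetOwnerDesc("sales", "alice")).execute();
        new SetLocationOperation(catalog, new SetLocationDesc("sales", "/warehouse/sales.db")).execute();
        System.out.println(catalog.get("sales").owner + " " + catalog.get("sales").location);
    }
}
```

With this shape, adding a new alter database command means adding one desc and one operation class, without touching the existing ones.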





[jira] [Created] (HIVE-21917) COMPLETED_TXN_COMPONENTS table is never cleaned up unless Compactor runs

2019-06-24 Thread Craig Condit (JIRA)
Craig Condit created HIVE-21917:
---

 Summary: COMPLETED_TXN_COMPONENTS table is never cleaned up unless 
Compactor runs
 Key: HIVE-21917
 URL: https://issues.apache.org/jira/browse/HIVE-21917
 Project: Hive
  Issue Type: Bug
  Components: Transactions
Affects Versions: 3.1.1, 3.1.0
Reporter: Craig Condit


The Initiator thread in the metastore repeatedly loops over entries in the 
COMPLETED_TXN_COMPONENTS table to determine which partitions / tables might 
need to be compacted. However, entries are never removed from this table except 
by a completed Compactor run.

In a cluster where most tables / partitions are write-once read-many, this 
results in stale entries in this table never being cleaned up. In a small test 
cluster, we have observed approximately 45k entries in this table (virtually 
equal to the number of partitions in the cluster) while < 100 of these tables 
have delta files at all. Since most of the tables will never get enough writes 
to trigger a compaction (and in fact have only ever been written to once), the 
initiator thread keeps trying to evaluate them on every loop.

On this test cluster, it takes approximately 10 minutes to loop through all the 
entries and results in severe performance degradation on metastore operations. 
With the default run timing of 5 minutes, the initiator basically never stops 
running.

On a production cluster with 2M partitions, this would be a non-starter.

The initiator thread should proactively remove entries from 
COMPLETED_TXN_COMPONENTS when it determines that a compaction is not needed, so 
that they are not evaluated again on the next loop.
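
A minimal sketch of the proposed behaviour, under stated assumptions: the Entry class, the needsCompaction check and the delta-count threshold are hypothetical simplifications, and the in-memory list stands in for the COMPLETED_TXN_COMPONENTS table. It is not the Initiator's actual code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical sketch: names and the threshold rule are invented for illustration.
public class InitiatorPruneSketch {

    /** Simplified stand-in for one COMPLETED_TXN_COMPONENTS row. */
    static final class Entry {
        final String table;
        final String partition;
        final int deltaFileCount;

        Entry(String table, String partition, int deltaFileCount) {
            this.table = table;
            this.partition = partition;
            this.deltaFileCount = deltaFileCount;
        }
    }

    /** Assumed decision rule; the real check involves more inputs. */
    static boolean needsCompaction(Entry entry, int deltaThreshold) {
        return entry.deltaFileCount >= deltaThreshold;
    }

    /**
     * One Initiator pass. Entries that need compaction are returned and stay
     * in 'pending' until a Compactor run completes; entries that do not need
     * compaction are removed so the next pass skips them.
     */
    static List<Entry> runOnce(List<Entry> pending, int deltaThreshold) {
        List<Entry> toCompact = new ArrayList<>();
        Iterator<Entry> it = pending.iterator();
        while (it.hasNext()) {
            Entry entry = it.next();
            if (needsCompaction(entry, deltaThreshold)) {
                toCompact.add(entry);
            } else {
                it.remove(); // proactive cleanup proposed in this issue
            }
        }
        return toCompact;
    }

    public static void main(String[] args) {
        List<Entry> pending = new ArrayList<>(Arrays.asList(
                new Entry("write_once", "p=1", 0),
                new Entry("busy", "p=1", 12)));
        List<Entry> toCompact = runOnce(pending, 10);
        System.out.println(toCompact.size() + " to compact, " + pending.size() + " still pending");
    }
}
```

In this model the cost of each pass depends on the number of entries that still need compaction rather than on the total number of partitions ever written.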

 





Re: Review Request 70934: HIVE-18735: Create table like loses transactional attribute.

2019-06-24 Thread Denys Kuzmenko via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/70934/#review216090
---




ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java
Lines 13593 (patched)


You do not need a return statement here; the tblProps reference is used inside 
updateDefaultTblProps.


- Denys Kuzmenko


On June 24, 2019, 1:01 p.m., Laszlo Pinter wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/70934/
> ---
> 
> (Updated June 24, 2019, 1:01 p.m.)
> 
> 
> Review request for hive, Eugene Koifman, Marta Kuczora, Peter Vary, and Adam 
> Szita.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> HIVE-18735: Create table like loses transactional attribute.
> 
> 
> Diffs
> -
> 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java d395db1b59d021789b1bb47c7f09ff337cba2dd0 
> 
> 
> Diff: https://reviews.apache.org/r/70934/diff/1/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Laszlo Pinter
> 
>



Review Request 70934: HIVE-18735: Create table like loses transactional attribute.

2019-06-24 Thread Laszlo Pinter via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/70934/
---

Review request for hive, Eugene Koifman, Marta Kuczora, Peter Vary, and Adam 
Szita.


Repository: hive-git


Description
---

HIVE-18735: Create table like loses transactional attribute.


Diffs
-

  ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java d395db1b59d021789b1bb47c7f09ff337cba2dd0 


Diff: https://reviews.apache.org/r/70934/diff/1/


Testing
---


Thanks,

Laszlo Pinter



[jira] [Created] (HIVE-21916) Avoid overflow because of casting in case of the "ceil", "ceiling" and "floor" SQL functions

2019-06-24 Thread Attila Zsolt Piros (JIRA)
Attila Zsolt Piros created HIVE-21916:
-

 Summary: Avoid overflow because of casting in case of the "ceil", 
"ceiling" and "floor" SQL functions
 Key: HIVE-21916
 URL: https://issues.apache.org/jira/browse/HIVE-21916
 Project: Hive
  Issue Type: Improvement
Affects Versions: 4.0.0
Reporter: Attila Zsolt Piros


The return type of the ceil, ceiling and floor SQL functions is long, which 
leads to overflow:
{code:java}
hive> select version(), ceil(1.2345678901234e+200), ceiling(1.2345678901234e+200), floor(1.2345678901234e+200);
OK
4.0.0-SNAPSHOT r11f78562ab36333cc1d0a3f6051d9846c9c92132    9223372036854775807    9223372036854775807    9223372036854775807


{code}
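
The value 9223372036854775807 above is Long.MAX_VALUE. The sketch below shows how Java's narrowing conversion from double to long produces exactly that value, assuming the functions ultimately cast a double result to long. The class and method names are hypothetical, this is not Hive's UDF code, and the BigDecimal variant is only one possible way to avoid the clamp.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

// Hypothetical sketch: illustrates Java cast semantics, not Hive's UDF code.
public class CeilOverflowSketch {

    /** Casting an out-of-range double to long clamps to Long.MAX_VALUE or Long.MIN_VALUE. */
    static long ceilAsLong(double value) {
        return (long) Math.ceil(value);
    }

    static long floorAsLong(double value) {
        return (long) Math.floor(value);
    }

    /** An arbitrary-precision result type keeps the magnitude of the input. */
    static BigDecimal ceilAsDecimal(double value) {
        return new BigDecimal(value).setScale(0, RoundingMode.CEILING);
    }

    public static void main(String[] args) {
        double big = 1.2345678901234e+200;
        System.out.println(ceilAsLong(big));    // clamped to Long.MAX_VALUE
        System.out.println(floorAsLong(-big));  // clamped to Long.MIN_VALUE
        System.out.println(ceilAsDecimal(big)); // full-length integer value
    }
}
```

The clamping is silent: no exception is raised, so the query returns a wrong value without any warning.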
 

Meanwhile, in other SQL engines:



*PostgreSQL*:

 

{{postgres=# select version(), ceil(1.2345678901234e+200), ceiling(1.2345678901234e+200), floor(1.2345678901234e+200);
 version | ceil | ceiling | floor
---------+------+---------+-------
 PostgreSQL 11.3 (Debian 11.3-1.pgdg90+1) on x86_64-pc-linux-gnu, compiled by gcc (Debian 6.3.0-18+deb9u1) 6.3.0 20170516, 64-bit | 12345678901234000... | 12345678901234000... | 12345678901234000...
(1 row)}}


*MySQL*:

 
{code:java}
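mysql> select version(), ceil(1.2345678901234e+200), ceiling(1.2345678901234e+200), floor(1.2345678901234e+200);
+-----------+----------------------------+-------------------------------+-----------------------------+
| version() | ceil(1.2345678901234e+200) | ceiling(1.2345678901234e+200) | floor(1.2345678901234e+200) |
+-----------+----------------------------+-------------------------------+-----------------------------+
| 5.7.26    | 12345678901234000...       | 12345678901234000...          | 12345678901234000...        |
{code}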

[jira] [Created] (HIVE-21915) Hive with TEZ UNION ALL and UDTF results in data loss

2019-06-24 Thread Wei Zhang (JIRA)
Wei Zhang created HIVE-21915:


 Summary: Hive with TEZ UNION ALL and UDTF results in data loss
 Key: HIVE-21915
 URL: https://issues.apache.org/jira/browse/HIVE-21915
 Project: Hive
  Issue Type: Bug
Affects Versions: 1.2.1
Reporter: Wei Zhang


The HQL syntax is like this:

CREATE TEMPORARY TABLE tez_union_all_loss_data AS
SELECT xxx, yyy, zzz, 1 AS tag
FROM ods_1

UNION ALL

SELECT xxx, yyy, zzz, tag
FROM
(
    SELECT xxx
          ,get_json_object(get_json_object(tb, '$.a'), '$.b') AS yyy
          ,zzz
          ,2 AS tag
    FROM ods_2
    LATERAL VIEW EXPLODE(some_udf(uuu)) team_number AS tb
) tbl
;

 

With the above HQL, we expect rows with both tag = 1 and tag = 2 to appear. In 
our case, however, all the rows with tag = 1 are lost.

Digging deeper, we found that the two generated maps have identical task tmp 
paths. This happens because, when a UDTF is present, the FileSinkOperator is 
processed twice while the tmp path is generated in 
GenTezUtils.removeUnionOperators().
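
Why identical tmp paths lose one branch's rows can be shown with a toy model. The sketch assumes that committing output to a path replaces whatever was already there. The class, the paths and the rows are hypothetical, and this is not how Tez or the FileSinkOperator is implemented.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical toy model: a map from tmp path to rows stands in for the file system.
public class TmpPathCollisionSketch {

    /** Runs both UNION ALL branches and collects whatever survives in the tmp paths. */
    static List<String> runUnion(String tmpPathBranch1, String tmpPathBranch2) {
        Map<String, List<String>> fileSystem = new LinkedHashMap<>();
        // Each branch writes its rows to its own tmp path; put() replaces existing content.
        fileSystem.put(tmpPathBranch1, Arrays.asList("row-a,tag=1", "row-b,tag=1"));
        fileSystem.put(tmpPathBranch2, Arrays.asList("row-c,tag=2"));

        List<String> result = new ArrayList<>();
        for (List<String> rows : fileSystem.values()) {
            result.addAll(rows);
        }
        return result;
    }

    public static void main(String[] args) {
        // Distinct paths: rows from both branches survive.
        System.out.println(runUnion("_tmp/1", "_tmp/2"));
        // Identical paths: the tag = 1 rows are overwritten by the second branch.
        System.out.println(runUnion("_tmp/1", "_tmp/1"));
    }
}
```

In this model the fix amounts to making sure each branch of the union ends up with a distinct tmp path.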

 





Re: Review Request 70929: Move Function and Macro related DDL operations into the DDL framework

2019-06-24 Thread Miklos Gergely

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/70929/
---

(Updated June 24, 2019, 8:41 a.m.)


Review request for hive and Zoltan Haindrich.


Bugs: HIVE-21914
https://issues.apache.org/jira/browse/HIVE-21914


Repository: hive-git


Description
---

Some Function and Macro related operations are handled by FunctionTask and 
FunctionWork, although they belong to the DDL framework.


Diffs (updated)
-

  
  llap-server/src/java/org/apache/hadoop/hive/llap/daemon/impl/FunctionLocalizer.java 136fe2a3b3 
  ql/src/java/org/apache/hadoop/hive/ql/ddl/function/CreateFunctionOperation.java PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/ddl/function/CreateMacroDesc.java PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/ddl/function/CreateMacroOperation.java PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/ddl/function/DropFunctionOperation.java PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/ddl/function/DropMacroOperation.java PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/ddl/function/ReloadFunctionsOperation.java PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionTask.java 2061cf4577 
  ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionUtils.java 200e26c310 
  ql/src/java/org/apache/hadoop/hive/ql/exec/Registry.java f4a46e62cf 
  ql/src/java/org/apache/hadoop/hive/ql/exec/TaskFactory.java 7eeca5f90a 
  ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java 2ae1db57aa 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/QueryPlanPostProcessor.java 74a4be4535 
  ql/src/java/org/apache/hadoop/hive/ql/parse/FunctionSemanticAnalyzer.java 2cfcc6b611 
  ql/src/java/org/apache/hadoop/hive/ql/parse/HiveParser.g e0230055ef 
  ql/src/java/org/apache/hadoop/hive/ql/parse/MacroSemanticAnalyzer.java 88b6068941 
  ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzerFactory.java 51a6b2a918 
  ql/src/java/org/apache/hadoop/hive/ql/parse/repl/load/message/CreateFunctionHandler.java 3a32885d1d 
  ql/src/java/org/apache/hadoop/hive/ql/parse/repl/load/message/DropFunctionHandler.java fee2bb5dbf 
  ql/src/java/org/apache/hadoop/hive/ql/plan/CreateFunctionDesc.java 92c00ca027 
  ql/src/java/org/apache/hadoop/hive/ql/plan/CreateMacroDesc.java f34c02c77e 
  ql/src/java/org/apache/hadoop/hive/ql/plan/DropFunctionDesc.java d3415a5799 
  ql/src/java/org/apache/hadoop/hive/ql/plan/DropMacroDesc.java 2b0c683bc2 
  ql/src/java/org/apache/hadoop/hive/ql/plan/FunctionWork.java cacbe62183 
  ql/src/java/org/apache/hadoop/hive/ql/plan/ReloadFunctionDesc.java ac1fba210e 
  ql/src/test/org/apache/hadoop/hive/ql/parse/TestMacroSemanticAnalyzer.java 2189ad6bee 
  ql/src/test/org/apache/hadoop/hive/ql/plan/TestCreateMacroDesc.java 48906f70dd 
  ql/src/test/org/apache/hadoop/hive/ql/plan/TestDropMacroDesc.java 2ee27dc89b 
  ql/src/test/queries/clientpositive/create_func1.q eef07f64e5 
  ql/src/test/results/clientnegative/create_function_nonexistent_class.q.out 55e66f8ab8 
  ql/src/test/results/clientnegative/create_function_nonudf_class.q.out 6d5427ef59 
  ql/src/test/results/clientnegative/create_unknown_genericudf.q.out ad1371d6a5 
  ql/src/test/results/clientnegative/create_unknown_udf_udaf.q.out bfb72b4415 
  ql/src/test/results/clientnegative/ivyDownload.q.out e1fe8233cb 
  ql/src/test/results/clientnegative/udf_function_does_not_implement_udf.q.out ab42da766a 
  ql/src/test/results/clientnegative/udf_local_resource.q.out 62664c9c9a 
  ql/src/test/results/clientnegative/udf_nonexistent_resource.q.out 4751761a8c 
  ql/src/test/results/clientpositive/create_func1.q.out d4afc83144 
  ql/src/test/results/clientpositive/create_genericudaf.q.out ca877bf90d 
  ql/src/test/results/clientpositive/create_genericudf.q.out cfe14f5c4a 
  ql/src/test/results/clientpositive/create_udaf.q.out 8e20b30651 
  ql/src/test/results/clientpositive/drop_udf.q.out 27dd986f21 
  ql/src/test/results/clientpositive/tez/explainanalyze_3.q.out 6925f58af1 
  ql/src/test/results/clientpositive/tez/explainuser_3.q.out 26eae7e8d0 
  ql/src/test/results/clientpositive/udf_compare_java_string.q.out 75d01249d7 
  ql/src/test/results/clientpositive/udf_logic_java_boolean.q.out 0d63db7f55 
  ql/src/test/results/clientpositive/udf_testlength.q.out 23a25d05be 
  ql/src/test/results/clientpositive/udf_testlength2.q.out 1a67685bf5 


Diff: https://reviews.apache.org/r/70929/diff/2/

Changes: https://reviews.apache.org/r/70929/diff/1-2/


Testing
---

All the existing tests still run; some new q tests were also added.


Thanks,

Miklos Gergely