[jira] [Created] (HIVE-22457) Hive MetaStore CachedStore UpdateCache Error

2019-11-04 Thread fanshilun (Jira)
fanshilun created HIVE-22457:


 Summary: Hive MetaStore CachedStore UpdateCache Error
 Key: HIVE-22457
 URL: https://issues.apache.org/jira/browse/HIVE-22457
 Project: Hive
  Issue Type: Bug
Reporter: fanshilun


2019-11-05T05:14:58,408 INFO [CachedStore-CacheUpdateService: Thread-22] cached2.CachedStore: Update Prewarm
2019-11-05T05:30:59,034 ERROR [CachedStore-CacheUpdateService: Thread-22] bonecp.ConnectionHandle: Database access problem. Killing off this connection and all remaining connections in the connection pool. SQL State = 08S01

CachedStore UpdateThread Exit



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HIVE-22456) When one branch from inner join produces max 0 rows, then we can introduce limit 0 on the other branch.

2019-11-04 Thread Steve Carlin (Jira)
Steve Carlin created HIVE-22456:
---

 Summary: When one branch from inner join produces max 0 rows, then 
we can introduce limit 0 on the other branch.
 Key: HIVE-22456
 URL: https://issues.apache.org/jira/browse/HIVE-22456
 Project: Hive
  Issue Type: Improvement
Reporter: Steve Carlin


Exactly as the title says.

This can be found in infer_join_preds.q.out





[jira] [Created] (HIVE-22455) Union branch removal rule does not kick in.

2019-11-04 Thread Steve Carlin (Jira)
Steve Carlin created HIVE-22455:
---

 Summary: Union branch removal rule does not kick in.
 Key: HIVE-22455
 URL: https://issues.apache.org/jira/browse/HIVE-22455
 Project: Hive
  Issue Type: Improvement
Reporter: Steve Carlin


After the Calcite upgrade to 1.21, there is a plan in which two branches of a 
union have limit 0, yet the union branch removal rule does not fire. The plan 
can be simplified.

This can be found in: union_assertion_type.q.out





[jira] [Created] (HIVE-22454) Remove subplan under "Limit 0"

2019-11-04 Thread Steve Carlin (Jira)
Steve Carlin created HIVE-22454:
---

 Summary: Remove subplan under "Limit 0"
 Key: HIVE-22454
 URL: https://issues.apache.org/jira/browse/HIVE-22454
 Project: Hive
  Issue Type: Improvement
Reporter: Steve Carlin


After the Calcite upgrade and some fixes, there are plans that have a "limit 0" 
but still perform further work after zero rows have been selected.

An example is in filter_union.q.out:
...
      Reduce Operator Tree:
        Group By Operator
          aggregations: count(VALUE._col0)
          keys: KEY._col0 (type: string)
          mode: mergepartial
          outputColumnNames: _col0, _col1
          Statistics: Num rows: 250 Data size: 23750 Basic stats: COMPLETE Column stats: COMPLETE
          Limit
            Number of rows: 0
...
We can simplify the plan by maybe creating a dummy table after the Limit 0.
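
The simplification proposed above can be sketched on a toy plan tree. This is only an illustration under stated assumptions: `PlanNode`, `prune`, and the `DummyEmptyScan` operator name are hypothetical and are not Hive or Calcite classes. Children here are the inputs of an operator, so the subtree feeding a "Limit 0" is what gets dropped.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class LimitZeroPruning {

    /** Hypothetical, minimal plan node: operator name, optional row limit, input operators. */
    static final class PlanNode {
        final String name;
        final Integer limit;            // non-null only for "Limit" nodes
        final List<PlanNode> children;  // inputs of this operator

        PlanNode(String name, Integer limit, List<PlanNode> children) {
            this.name = name;
            this.limit = limit;
            this.children = children;
        }

        static PlanNode op(String name, PlanNode... children) {
            return new PlanNode(name, null, Arrays.asList(children));
        }

        static PlanNode limit(int rows, PlanNode child) {
            return new PlanNode("Limit", rows, Collections.singletonList(child));
        }

        boolean isLimitZero() {
            return limit != null && limit == 0;
        }
    }

    /** Returns a copy of the plan in which every "Limit 0" subtree is replaced by an empty dummy scan. */
    static PlanNode prune(PlanNode node) {
        if (node.isLimitZero()) {
            // Nothing below a Limit 0 can contribute rows, so the whole subtree is dropped.
            return PlanNode.op("DummyEmptyScan");
        }
        List<PlanNode> rewritten = new ArrayList<>();
        for (PlanNode child : node.children) {
            rewritten.add(prune(child));
        }
        return new PlanNode(node.name, node.limit, rewritten);
    }

    /** Counts the operators in a plan, including the root. */
    static int countNodes(PlanNode node) {
        int n = 1;
        for (PlanNode child : node.children) {
            n += countNodes(child);
        }
        return n;
    }

    public static void main(String[] args) {
        PlanNode plan = PlanNode.op("FileSink",
            PlanNode.limit(0, PlanNode.op("GroupBy", PlanNode.op("TableScan"))));
        PlanNode pruned = prune(plan);
        System.out.println(countNodes(plan) + " operators before, " + countNodes(pruned) + " after");
    }
}
```

A real rule would also have to preserve the output schema of the removed subtree; the sketch ignores that.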





Re: Review Request 71645: HIVE-22292

2019-11-04 Thread Jesús Camacho Rodríguez


> On Nov. 1, 2019, 6:06 a.m., Jesús Camacho Rodríguez wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/exec/WindowFunctionInfo.java
> > Line 30 (original), 30 (patched)
> > 
> >
> > Does _orderedAggregate_ duplicate the functionality of _impliesOrder_?
> 
> Krisztian Kasa wrote:
> `impliesOrder` indicates that the function is a window function and that an 
> OVER clause is required when it is invoked. Example:
> ```
> SELECT val, rank() OVER (ORDER BY val DESC) FROM t_table;
> ```
> 
> `orderedAggregate` means that the function is an Ordered-Set Aggregate or 
> a Hypothetical-Set Aggregate function and a `WITHIN GROUP` clause is required 
> when invoked:
> ```
> SELECT rank(1) WITHIN GROUP (ORDER BY val) FROM t_table;
> ```
> 
> When the semantic check finds that the WITHIN GROUP keywords were provided, 
> an extra check is performed: whether the function allows `WITHIN GROUP`, that 
> is, whether it is an Ordered-Set Aggregate or a Hypothetical-Set Aggregate 
> function. If neither a WITHIN GROUP nor an ORDER clause was specified but the 
> function cannot be used without them, a MISSING_OVER_CLAUSE exception is thrown.

Can we add this as a comment to the code for clarity? The names are a bit 
confusing at this point. Thanks
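
The distinction between the two flags quoted above could be captured roughly as follows. This is a sketch, not the code under review: `FunctionFlags`, `Invocation`, `check`, and the `WITHIN_GROUP_NOT_ALLOWED` label are hypothetical, and the flag values assigned to `rank` and `count` in `main` are assumptions for illustration. Only `MISSING_OVER_CLAUSE` is taken from the discussion.

```java
public class OrderFlagsSketch {

    /** Hypothetical stand-in for the two flags; not Hive's actual WindowFunctionInfo. */
    static final class FunctionFlags {
        final String name;
        /** The function is a window function: it cannot be invoked without an ordering clause. */
        final boolean impliesOrder;
        /** The function is an ordered-set or hypothetical-set aggregate: WITHIN GROUP is allowed. */
        final boolean orderedAggregate;

        FunctionFlags(String name, boolean impliesOrder, boolean orderedAggregate) {
            this.name = name;
            this.impliesOrder = impliesOrder;
            this.orderedAggregate = orderedAggregate;
        }
    }

    /** How the function was written in the query. */
    enum Invocation { PLAIN, OVER, WITHIN_GROUP }

    /** Returns "OK" or the name of the error a semantic check might raise. */
    static String check(FunctionFlags f, Invocation how) {
        if (how == Invocation.WITHIN_GROUP && !f.orderedAggregate) {
            return "WITHIN_GROUP_NOT_ALLOWED"; // hypothetical error label
        }
        if (how == Invocation.PLAIN && f.impliesOrder) {
            return "MISSING_OVER_CLAUSE";
        }
        return "OK";
    }

    public static void main(String[] args) {
        FunctionFlags rank = new FunctionFlags("rank", true, true);    // assumed flag values
        FunctionFlags count = new FunctionFlags("count", false, false); // assumed flag values
        System.out.println("rank, no clause: " + check(rank, Invocation.PLAIN));
        System.out.println("rank WITHIN GROUP: " + check(rank, Invocation.WITHIN_GROUP));
        System.out.println("count WITHIN GROUP: " + check(count, Invocation.WITHIN_GROUP));
    }
}
```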


- Jesús


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71645/#review218480
---


On Nov. 4, 2019, 11:01 a.m., Krisztian Kasa wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/71645/
> ---
> 
> (Updated Nov. 4, 2019, 11:01 a.m.)
> 
> 
> Review request for hive, Jesús Camacho Rodríguez, Zoltan Haindrich, and 
> Vineet Garg.
> 
> 
> Bugs: HIVE-22292
> https://issues.apache.org/jira/browse/HIVE-22292
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> Implement Hypothetical-Set Aggregate Functions
> ==
> 1. rank, dense_rank, percent_rank, cume_dist
> 2. Allow unlimited column references in `WITHIN GROUP` clause
> 3. Refactor the implementation of the functions `percentile_cont` and 
> `percentile_disc`: 
>  - validate that only one parameter and column reference is passed to 
> these two functions. 
>  - since the semantics of the `WITHIN GROUP` clause allow multiple 
> column references, the parameter order had to be changed, and this affects 
> backward compatibility.
> 
> 
> Diffs
> -
> 
>   ql/src/java/org/apache/hadoop/hive/ql/ErrorMsg.java 5e88f30cab 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java 059919710e 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/WindowFunctionDescription.java 
> 48645dc3f2 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/WindowFunctionInfo.java 
> a0b0e48f4c 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/IdentifiersParser.g 55c6863f67 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 
> 0198c0f724 
>   ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFCumeDist.java 
> d0c155ff2d 
>   ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFDenseRank.java 
> 992f5bfd21 
>   
> ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFPercentRank.java 
> 64e9c8b7ca 
>   
> ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFPercentileCont.java
>  ad61410180 
>   
> ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFPercentileDisc.java
>  c8d3c12c80 
>   ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFRank.java 
> 13e2f537cd 
>   ql/src/java/org/apache/hadoop/hive/ql/util/NullOrdering.java PRE-CREATION 
>   ql/src/test/org/apache/hadoop/hive/ql/exec/TestFunctionRegistry.java 
> dead3ec472 
>   ql/src/test/org/apache/hadoop/hive/ql/parse/TestParseWithinGroupClause.java 
> 9d44ed87e9 
>   ql/src/test/queries/clientpositive/hypothetical_set_aggregates.q 
> PRE-CREATION 
>   ql/src/test/results/clientpositive/hypothetical_set_aggregates.q.out 
> PRE-CREATION 
>   ql/src/test/results/clientpositive/udaf_percentile_cont.q.out f12cb6cd5e 
>   ql/src/test/results/clientpositive/udaf_percentile_disc.q.out d10fee577c 
> 
> 
> Diff: https://reviews.apache.org/r/71645/diff/3/
> 
> 
> Testing
> ---
> 
> New q test added for testing Hypothetical-Set Aggregate Functions: 
> hypothetical_set_aggregates.q
> Run q tests: hypothetical_set_aggregates.q, udaf_percentile_cont.q, 
> udaf_percentile_disc.q
> Run unit test: TestParseWithinGroupClause.java
> 
> 
> Thanks,
> 
> Krisztian Kasa
> 
>



[jira] [Created] (HIVE-22453) Describe table unnecessarily fetches partitions

2019-11-04 Thread Toshihiko Uchida (Jira)
Toshihiko Uchida created HIVE-22453:
---

 Summary: Describe table unnecessarily fetches partitions
 Key: HIVE-22453
 URL: https://issues.apache.org/jira/browse/HIVE-22453
 Project: Hive
  Issue Type: Bug
Affects Versions: 2.3.6, 3.1.2
Reporter: Toshihiko Uchida


The simple describe table command without EXTENDED and FORMATTED (i.e., 
DESCRIBE table_name) fetches all partitions when no partition is specified, 
even though it does not display partition statistics.
The command should not fetch partitions, since doing so can take a long time 
for a table with a large number of partitions.
For instance, in our environment, the command takes around 8 seconds for a 
table with 8760 (24 * 365) partitions.
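
One way to avoid the unnecessary fetch is to make it lazy, so the metastore is only queried when the output needs partition information. The sketch below is an assumption-laden illustration, not Hive's describe implementation: `DescribeTableSketch`, `Mode`, `describe`, and `fetchesFor` are hypothetical names, and it assumes only EXTENDED and FORMATTED output uses partition-level data.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

public class DescribeTableSketch {

    enum Mode { SIMPLE, EXTENDED, FORMATTED }

    /**
     * Hypothetical describe routine. The partition list is passed as a Supplier so that
     * the potentially slow metastore call only happens when the output needs it.
     */
    static String describe(String table, Mode mode, Supplier<List<String>> partitions) {
        StringBuilder out = new StringBuilder("columns of ").append(table);
        if (mode != Mode.SIMPLE) {
            // Assumption: only EXTENDED and FORMATTED output uses partition-level information.
            out.append(", partitions: ").append(partitions.get().size());
        }
        return out.toString();
    }

    /** Runs describe once and reports how many times the partitions were fetched. */
    static int fetchesFor(Mode mode) {
        AtomicInteger fetches = new AtomicInteger();
        Supplier<List<String>> slowFetch = () -> {
            fetches.incrementAndGet();
            return Arrays.asList("dt=2019-11-04/hr=00", "dt=2019-11-04/hr=01");
        };
        describe("t", mode, slowFetch);
        return fetches.get();
    }

    public static void main(String[] args) {
        System.out.println("SIMPLE fetches: " + fetchesFor(Mode.SIMPLE));
        System.out.println("FORMATTED fetches: " + fetchesFor(Mode.FORMATTED));
    }
}
```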





[jira] [Created] (HIVE-22452) CTAS query failure at DDL task stage doesn't clean out the target directory

2019-11-04 Thread Riju Trivedi (Jira)
Riju Trivedi created HIVE-22452:
---

 Summary: CTAS query failure at DDL task stage doesn't clean out 
the target directory
 Key: HIVE-22452
 URL: https://issues.apache.org/jira/browse/HIVE-22452
 Project: Hive
  Issue Type: Bug
  Components: Hive
Affects Versions: 3.1.2, 3.1.0
Reporter: Riju Trivedi
Assignee: Riju Trivedi


A CTAS query that fails at the DDL task stage because of an HMS connection 
issue leaves the output file in the target directory. Since the DDL task stage 
runs after Tez DAG completion and the MOVE task, the output file has already 
been moved to the target directory and is not cleaned up after the query failure.

Re-executing the same query creates a duplicate file under the table location 
and hence duplicate data.
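
The ordering problem described above, and one possible cleanup, can be sketched with plain `java.nio.file` calls. This is not Hive's MoveTask or DDLTask code: `CtasCleanupSketch`, `DdlStep`, and `moveThenCreateTable` are hypothetical, and the sketch assumes the target directory belongs exclusively to the new table, so removing it on failure is safe.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.stream.Stream;

public class CtasCleanupSketch {

    /** Hypothetical stand-in for the final DDL step (creating the table in the metastore). */
    interface DdlStep {
        void run() throws Exception;
    }

    /** Recursively deletes a directory, children before parents. */
    static void deleteRecursively(Path dir) throws IOException {
        if (!Files.exists(dir)) {
            return;
        }
        try (Stream<Path> paths = Files.walk(dir)) {
            paths.sorted(Comparator.reverseOrder()).forEach(p -> {
                try {
                    Files.delete(p);
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
        }
    }

    /** Moves the staged output into the target directory, runs the DDL step, and cleans up if it fails. */
    static boolean moveThenCreateTable(Path stagedFile, Path targetDir, DdlStep ddl) throws IOException {
        Files.createDirectories(targetDir);
        Files.move(stagedFile, targetDir.resolve(stagedFile.getFileName()));
        try {
            ddl.run();
            return true;
        } catch (Exception e) {
            // Without this step the moved file stays behind and a retry duplicates the data.
            deleteRecursively(targetDir);
            return false;
        }
    }

    /** Simulates a DDL failure after the move and reports whether the target directory was removed. */
    static boolean demoCleansUpOnFailure() {
        try {
            Path tmp = Files.createTempDirectory("ctas-sketch");
            Path staged = Files.write(tmp.resolve("000000_0"), "row".getBytes());
            Path target = tmp.resolve("warehouse").resolve("t");
            boolean created = moveThenCreateTable(staged, target,
                () -> { throw new IllegalStateException("simulated HMS connection failure"); });
            boolean cleaned = !Files.exists(target);
            deleteRecursively(tmp);
            return !created && cleaned;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("target cleaned after failed DDL: " + demoCleansUpOnFailure());
    }
}
```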





Re: Review Request 71705: HIVE-22420: DbTxnManager.stopHeartbeat() should be thread-safe

2019-11-04 Thread Denys Kuzmenko via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71705/#review218491
---



LGTM, just minor comments. Also, you mentioned a unit test, but I do not see it 
in the patch.

- Denys Kuzmenko


On Nov. 4, 2019, 10:02 a.m., Aron Hamvas wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/71705/
> ---
> 
> (Updated Nov. 4, 2019, 10:02 a.m.)
> 
> 
> Review request for hive, Denys Kuzmenko, Marta Kuczora, Laszlo Pinter, and 
> Peter Vary.
> 
> 
> Bugs: HIVE-22420
> https://issues.apache.org/jira/browse/HIVE-22420
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> DbTxnManager.stopHeartbeat() should be thread-safe. rollbackTxn() should 
> check whether the txn has already been closed before calling the MS client, 
> to avoid sending 0 as txnId.
> 
> 
> Diffs
> -
> 
>   ql/src/java/org/apache/hadoop/hive/ql/lockmgr/DbTxnManager.java 943aa383bb 
> 
> 
> Diff: https://reviews.apache.org/r/71705/diff/2/
> 
> 
> Testing
> ---
> 
> unit test
> 
> 
> Thanks,
> 
> Aron Hamvas
> 
>



Re: Review Request 71705: HIVE-22420: DbTxnManager.stopHeartbeat() should be thread-safe

2019-11-04 Thread Denys Kuzmenko via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71705/#review218490
---




ql/src/java/org/apache/hadoop/hive/ql/lockmgr/DbTxnManager.java
Lines 539 (patched)


Should it be thread-safe?



ql/src/java/org/apache/hadoop/hive/ql/lockmgr/DbTxnManager.java
Lines 677 (patched)


Magic number. Why exactly 31 sec?


- Denys Kuzmenko





Re: Review Request 71645: HIVE-22292

2019-11-04 Thread Krisztian Kasa

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71645/
---

(Updated Nov. 4, 2019, 11:01 a.m.)


Review request for hive, Jesús Camacho Rodríguez, Zoltan Haindrich, and Vineet 
Garg.


Bugs: HIVE-22292
https://issues.apache.org/jira/browse/HIVE-22292


Repository: hive-git


Description
---

Implement Hypothetical-Set Aggregate Functions
==
1. rank, dense_rank, percent_rank, cume_dist
2. Allow unlimited column references in `WITHIN GROUP` clause
3. Refactor the implementation of the functions `percentile_cont` and 
`percentile_disc`: 
 - validate that only one parameter and column reference is passed to these 
two functions. 
 - since the semantics of the `WITHIN GROUP` clause allow multiple column 
references, the parameter order had to be changed, and this affects backward 
compatibility.


Diffs (updated)
-

  ql/src/java/org/apache/hadoop/hive/ql/ErrorMsg.java 5e88f30cab 
  ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java 059919710e 
  ql/src/java/org/apache/hadoop/hive/ql/exec/WindowFunctionDescription.java 
48645dc3f2 
  ql/src/java/org/apache/hadoop/hive/ql/exec/WindowFunctionInfo.java a0b0e48f4c 
  ql/src/java/org/apache/hadoop/hive/ql/parse/IdentifiersParser.g 55c6863f67 
  ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 0198c0f724 
  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFCumeDist.java 
d0c155ff2d 
  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFDenseRank.java 
992f5bfd21 
  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFPercentRank.java 
64e9c8b7ca 
  
ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFPercentileCont.java
 ad61410180 
  
ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFPercentileDisc.java
 c8d3c12c80 
  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFRank.java 
13e2f537cd 
  ql/src/java/org/apache/hadoop/hive/ql/util/NullOrdering.java PRE-CREATION 
  ql/src/test/org/apache/hadoop/hive/ql/exec/TestFunctionRegistry.java 
dead3ec472 
  ql/src/test/org/apache/hadoop/hive/ql/parse/TestParseWithinGroupClause.java 
9d44ed87e9 
  ql/src/test/queries/clientpositive/hypothetical_set_aggregates.q PRE-CREATION 
  ql/src/test/results/clientpositive/hypothetical_set_aggregates.q.out 
PRE-CREATION 
  ql/src/test/results/clientpositive/udaf_percentile_cont.q.out f12cb6cd5e 
  ql/src/test/results/clientpositive/udaf_percentile_disc.q.out d10fee577c 


Diff: https://reviews.apache.org/r/71645/diff/3/

Changes: https://reviews.apache.org/r/71645/diff/2-3/


Testing
---

New q test added for testing Hypothetical-Set Aggregate Functions: 
hypothetical_set_aggregates.q
Run q tests: hypothetical_set_aggregates.q, udaf_percentile_cont.q, 
udaf_percentile_disc.q
Run unit test: TestParseWithinGroupClause.java


Thanks,

Krisztian Kasa



Re: Review Request 71705: HIVE-22420: DbTxnManager.stopHeartbeat() should be thread-safe

2019-11-04 Thread Aron Hamvas via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71705/
---

(Updated Nov. 4, 2019, 10:02 a.m.)


Review request for hive, Denys Kuzmenko, Marta Kuczora, Laszlo Pinter, and 
Peter Vary.


Changes
---

Do not allow starting multiple heartbeater tasks.


Bugs: HIVE-22420
https://issues.apache.org/jira/browse/HIVE-22420


Repository: hive-git


Description
---

DbTxnManager.stopHeartbeat() should be thread-safe. rollbackTxn() should check 
whether the txn has already been closed before calling the MS client, to avoid 
sending 0 as txnId.
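
The intent of the change can be illustrated with a small, self-contained sketch. It is not the DbTxnManager patch: `HeartbeatSketch` and all of its members are hypothetical, and it assumes a single lock guarding both the heartbeat task and the transaction id, with 0 meaning "no open transaction".

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

public class HeartbeatSketch {

    // Daemon thread so that a forgotten shutdown() does not keep the JVM alive.
    private final ScheduledExecutorService pool =
        Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "heartbeater-sketch");
            t.setDaemon(true);
            return t;
        });

    private final Object lock = new Object();
    private ScheduledFuture<?> heartbeatTask; // guarded by lock
    private long txnId;                        // 0 means no open transaction; guarded by lock
    private int rollbackCallsSent;             // counts simulated metastore calls; guarded by lock

    void openTxn(long id) {
        synchronized (lock) {
            txnId = id;
        }
    }

    /** Starts the heartbeater unless one is already running; returns false if it was. */
    boolean startHeartbeat(long periodMs) {
        synchronized (lock) {
            if (heartbeatTask != null) {
                return false; // do not allow multiple heartbeater tasks
            }
            heartbeatTask = pool.scheduleAtFixedRate(
                () -> { /* a real client would send a heartbeat here */ },
                periodMs, periodMs, TimeUnit.MILLISECONDS);
            return true;
        }
    }

    /** Safe to call from several threads and more than once. */
    void stopHeartbeat() {
        synchronized (lock) {
            if (heartbeatTask == null) {
                return;
            }
            heartbeatTask.cancel(true);
            heartbeatTask = null;
        }
    }

    /** Returns false without calling the (simulated) metastore if the txn is already closed. */
    boolean rollbackTxn() {
        synchronized (lock) {
            if (txnId == 0) {
                return false; // avoids sending 0 as txnId
            }
            stopHeartbeat(); // the monitor is reentrant, so nesting is fine
            rollbackCallsSent++;
            txnId = 0;
            return true;
        }
    }

    int rollbackCallsSent() {
        synchronized (lock) {
            return rollbackCallsSent;
        }
    }

    void shutdown() {
        pool.shutdownNow();
    }

    public static void main(String[] args) {
        HeartbeatSketch mgr = new HeartbeatSketch();
        mgr.openTxn(42L);
        System.out.println("first start accepted: " + mgr.startHeartbeat(1000L));
        System.out.println("second start accepted: " + mgr.startHeartbeat(1000L));
        System.out.println("first rollback sent: " + mgr.rollbackTxn());
        System.out.println("second rollback sent: " + mgr.rollbackTxn());
        mgr.shutdown();
    }
}
```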


Diffs (updated)
-

  ql/src/java/org/apache/hadoop/hive/ql/lockmgr/DbTxnManager.java 943aa383bb 


Diff: https://reviews.apache.org/r/71705/diff/2/

Changes: https://reviews.apache.org/r/71705/diff/1-2/


Testing
---

unit test


Thanks,

Aron Hamvas



[jira] [Created] (HIVE-22451) Secure LLAP configurations are still deemed unsecure in Tez AM processes

2019-11-04 Thread Ádám Szita (Jira)
Ádám Szita created HIVE-22451:
-

 Summary: Secure LLAP configurations are still deemed unsecure in 
Tez AM processes
 Key: HIVE-22451
 URL: https://issues.apache.org/jira/browse/HIVE-22451
 Project: Hive
  Issue Type: Bug
  Components: llap
Reporter: Ádám Szita


Due to the changes in HIVE-22354 and HIVE-22195, ZooKeeper discovery of LLAP 
workers does not work when invoked from within a Tez AM process: a Tez AM 
process does not log in using Kerberos even in secure environments, hence
{code:java}
 UserGroupInformation.getLoginUser().hasKerberosCredentials() {code}
returns false on security-enabled clusters too.


