Re: Review Request 71044: ACID: explore how we can avoid a move step during inserts/compaction

2019-07-10 Thread Vaibhav Gumashta

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71044/
---

(Updated July 11, 2019, 5:02 a.m.)


Review request for hive and Gopal V.


Bugs: HIVE-21164
https://issues.apache.org/jira/browse/HIVE-21164


Repository: hive-git


Description
---

https://jira.apache.org/jira/browse/HIVE-21164


Diffs (updated)
-

  
  itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/history/TestHiveHistory.java 5fd0ef9161
  itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/txn/compactor/TestCrudCompactorOnTez.java d59cfe51e9
  ql/src/java/org/apache/hadoop/hive/ql/exec/AbstractFileMergeOperator.java bb89f803d5
  ql/src/java/org/apache/hadoop/hive/ql/exec/FileSinkOperator.java 9ad4e71482
  ql/src/java/org/apache/hadoop/hive/ql/exec/MoveTask.java 695d08bbe2
  ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java 437266355a
  ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java 295fe7cbd0
  ql/src/java/org/apache/hadoop/hive/ql/io/RecordUpdater.java 737e6774b7
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcOutputFormat.java c4c56f8477
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcRecordUpdater.java 3fa61d3560
  ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java 691f3ee2e9
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMapRedUtils.java 5d6143d6a4
  ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java d47457857c
  ql/src/java/org/apache/hadoop/hive/ql/parse/spark/GenSparkUtils.java 757cb7af4d
  ql/src/java/org/apache/hadoop/hive/ql/plan/FileSinkDesc.java 61ea28a5f5
  ql/src/java/org/apache/hadoop/hive/ql/plan/LoadTableDesc.java bed05819b5
  ql/src/test/org/apache/hadoop/hive/ql/exec/TestExecDriver.java 78f25856a4
  ql/src/test/org/apache/hadoop/hive/ql/exec/TestFileSinkOperator.java a75103d60d


Diff: https://reviews.apache.org/r/71044/diff/2/

Changes: https://reviews.apache.org/r/71044/diff/1-2/


Testing
---


Thanks,

Vaibhav Gumashta



Re: Review Request 71040: HIVE-21923 Vectorized MapJoin may miss results when only the join key is selected

2019-07-10 Thread Jesús Camacho Rodríguez

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71040/#review216516
---




ql/src/java/org/apache/hadoop/hive/ql/exec/vector/mapjoin/VectorMapJoinInnerBigOnlyGenerateResultOperator.java
Line 255 (original), 255 (patched)


Can we update this comment since it is not only the big table? Feel free to 
add any more info to understand better what is going on.



ql/src/test/queries/clientpositive/hybridgrace_hashjoin_2.q
Line 10 (original), 9 (patched)


Why is this disabled now? It prevents map join conversion from being 
triggered below.



ql/src/test/results/clientpositive/llap/hybridgrace_hashjoin_2.q.out
Line 83 (original), 84 (patched)


Map Join conversion not being triggered.



ql/src/test/results/clientpositive/llap/hybridgrace_hashjoin_2.q.out
Line 241 (original), 255 (patched)


Map Join conversion not being triggered.



ql/src/test/results/clientpositive/spark/bucket_map_join_tez1.q.out
Line 5149 (original), 5149 (patched)


Cool.


- Jesús Camacho Rodríguez


On July 9, 2019, 4:12 p.m., Zoltan Haindrich wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/71040/
> ---
> 
> (Updated July 9, 2019, 4:12 p.m.)
> 
> 
> Review request for hive and Jesús Camacho Rodríguez.
> 
> 
> Bugs: HIVE-21923
> https://issues.apache.org/jira/browse/HIVE-21923
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> HIVE-21923
> Vectorized MapJoin may miss results when only the join key is selected
> 
> 
> Diffs
> -
> 
>   common/src/test/org/apache/hadoop/hive/common/format/datetime/package-info.java 70ee4266f45219fd81bf0d0df0a2c4380334e307
>   ql/src/java/org/apache/hadoop/hive/ql/exec/vector/mapjoin/VectorMapJoinInnerBigOnlyGenerateResultOperator.java 35db844f236f24d2f17f4a43d064c9ebaf8c
>   ql/src/test/queries/clientpositive/hybridgrace_hashjoin_2.q d989ca7dc883fa071cf5772f358c68bff78f659f
>   ql/src/test/results/clientpositive/llap/correlationoptimizer4.q.out 45a646c948ec8b72710a6b8a3949fbe0203dd68e
>   ql/src/test/results/clientpositive/llap/hybridgrace_hashjoin_2.q.out 2305f87e45bd65152a6c77ce04f7b8efad4724d7
>   ql/src/test/results/clientpositive/spark/auto_join14.q.out 0c80c13889d134abe82bde30c98300620b1fd432
>   ql/src/test/results/clientpositive/spark/bucket_map_join_tez1.q.out 4ee669fa7dd50e0373910030b35c8860383a3a70
>   ql/src/test/results/clientpositive/tez/hybridgrace_hashjoin_2.q.out e28b15044503ea4bb5bd12b7caed6b105f337efd
> 
> 
> Diff: https://reviews.apache.org/r/71040/diff/1/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Zoltan Haindrich
> 
>



[jira] [Created] (HIVE-21986) HiveServer Web UI: Setting the Strict-Transport-Security in default response header

2019-07-10 Thread Rajkumar Singh (JIRA)
Rajkumar Singh created HIVE-21986:
-

 Summary: HiveServer Web UI: Setting the Strict-Transport-Security in default response header
 Key: HIVE-21986
 URL: https://issues.apache.org/jira/browse/HIVE-21986
 Project: Hive
  Issue Type: Improvement
  Components: HiveServer2
Affects Versions: 3.1.1
Reporter: Rajkumar Singh
Assignee: Rajkumar Singh


Currently, the HiveServer UI HTTP response header does not set 
Strict-Transport-Security, so this change adds it to the default headers.
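
As an illustration only, the header value could be assembled as in the sketch below. The class and method names are hypothetical and are not HiveServer2 code; the one-year max-age in the usage line is an assumed value, not something the issue specifies. The directive syntax (max-age in seconds, optional includeSubDomains) follows RFC 6797.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper for illustration; not part of HiveServer2. */
final class HstsHeaderSketch {
    static final String HEADER_NAME = "Strict-Transport-Security";

    // Builds the header value: max-age in seconds, optionally followed
    // by the includeSubDomains directive.
    static String headerValue(long maxAgeSeconds, boolean includeSubDomains) {
        if (maxAgeSeconds < 0) {
            throw new IllegalArgumentException("max-age must be >= 0: " + maxAgeSeconds);
        }
        String value = "max-age=" + maxAgeSeconds;
        return includeSubDomains ? value + "; includeSubDomains" : value;
    }

    // Returns a copy of the default response headers with the HSTS header added.
    static Map<String, String> withHsts(Map<String, String> defaults,
                                        long maxAgeSeconds,
                                        boolean includeSubDomains) {
        Map<String, String> headers = new LinkedHashMap<>(defaults);
        headers.put(HEADER_NAME, headerValue(maxAgeSeconds, includeSubDomains));
        return headers;
    }

    public static void main(String[] args) {
        // Assumed value: one year, applied to subdomains as well.
        System.out.println(withHsts(new LinkedHashMap<>(), 31536000L, true));
    }
}
```

Browsers honour this header only on responses served over HTTPS, so it matters only when SSL is enabled for the web UI.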



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HIVE-21985) LLAP IO: Log schema evolution incompatibilities at WARN level always

2019-07-10 Thread Gopal V (JIRA)
Gopal V created HIVE-21985:
--

 Summary: LLAP IO: Log schema evolution incompatibilities at WARN level always
 Key: HIVE-21985
 URL: https://issues.apache.org/jira/browse/HIVE-21985
 Project: Hive
  Issue Type: Bug
Reporter: Gopal V


When reading incompatible schema, LLAP IO simply skips over the file and does 
not cache it.

The logging at WARN level would be useful and simplify the root-cause via logs.





[jira] [Created] (HIVE-21984) Clean up TruncateTable operation and desc

2019-07-10 Thread Miklos Gergely (JIRA)
Miklos Gergely created HIVE-21984:
-

 Summary: Clean up TruncateTable operation and desc
 Key: HIVE-21984
 URL: https://issues.apache.org/jira/browse/HIVE-21984
 Project: Hive
  Issue Type: Sub-task
  Components: Hive
Affects Versions: 3.1.1
Reporter: Miklos Gergely
Assignee: Miklos Gergely
 Fix For: 4.0.0


TruncateTableDesc is not immutable; this should be fixed. The DDLSemanticAnalyzer 
has a long and complicated function for parsing truncate commands; it should be 
cut into smaller pieces.
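
For readers unfamiliar with the pattern, an immutable descriptor usually looks like the sketch below. This is a generic illustration, not the actual TruncateTableDesc: the class name and both fields (table name, partition spec) are assumptions chosen for the example.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Objects;

/** Hypothetical immutable descriptor; the fields are assumed for illustration. */
final class TruncateDescSketch {
    private final String tableName;
    private final Map<String, String> partitionSpec;

    TruncateDescSketch(String tableName, Map<String, String> partitionSpec) {
        this.tableName = Objects.requireNonNull(tableName, "tableName");
        // Defensive copy so later changes to the caller's map cannot leak in.
        this.partitionSpec = partitionSpec == null
            ? Collections.<String, String>emptyMap()
            : Collections.unmodifiableMap(new LinkedHashMap<>(partitionSpec));
    }

    // Getters only: all state is fixed at construction time.
    String getTableName() {
        return tableName;
    }

    Map<String, String> getPartitionSpec() {
        return partitionSpec;
    }
}
```

All fields are final, there are no setters, and the map is copied and wrapped, so an instance cannot change after the semantic analyzer creates it.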





[jira] [Created] (HIVE-21983) Cut DropTableDesc/Operation to drop table, view and materialized view

2019-07-10 Thread Miklos Gergely (JIRA)
Miklos Gergely created HIVE-21983:
-

 Summary: Cut DropTableDesc/Operation to drop table, view and materialized view
 Key: HIVE-21983
 URL: https://issues.apache.org/jira/browse/HIVE-21983
 Project: Hive
  Issue Type: Sub-task
  Components: Hive
Affects Versions: 3.1.1
Reporter: Miklos Gergely
Assignee: Miklos Gergely
 Fix For: 4.0.0








Review Request 71038: HIVE-21965

2019-07-10 Thread Krisztian Kasa

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71038/
---

Review request for hive, Jesús Camacho Rodríguez, Jason Dere, Zoltan Haindrich, 
and Vineet Garg.


Bugs: HIVE-21965
https://issues.apache.org/jira/browse/HIVE-21965


Repository: hive-git


Description
---

Implement parallel processing in HiveStrictManagedMigration
===

1. Process databases and tables in parallel, using a separate thread pool for 
each. The pool sizes can be set with command line options. If no options are 
given, the default pool size is the number of CPU cores / 2.
2. Add an option for filtering by table type.
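
The pool sizing rule in item 1 can be sketched roughly as follows. This is not the code from the patch: the class, the method names and the way the option value is passed in are all hypothetical, and the "work" per table is a placeholder string.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.Future;

/** Hypothetical sketch of the sizing rule; not HiveStrictManagedMigration code. */
final class MigrationPoolSketch {
    // Default from the description: number of CPU cores / 2, but never below 1.
    static int defaultPoolSize() {
        return Math.max(1, Runtime.getRuntime().availableProcessors() / 2);
    }

    // optionValue is the raw command line value, or null when the option is absent.
    static int poolSize(String optionValue) {
        if (optionValue == null) {
            return defaultPoolSize();
        }
        int size = Integer.parseInt(optionValue.trim());
        if (size < 1) {
            throw new IllegalArgumentException("pool size must be >= 1: " + size);
        }
        return size;
    }

    // Runs one placeholder task per table on a dedicated pool and
    // returns the results in input order.
    static List<String> processAll(List<String> tables, int poolSize) {
        List<Callable<String>> tasks = new ArrayList<>();
        for (String table : tables) {
            tasks.add(() -> "processed:" + table);
        }
        ForkJoinPool pool = new ForkJoinPool(poolSize);
        try {
            List<String> results = new ArrayList<>();
            for (Future<String> future : pool.invokeAll(tasks)) {
                results.add(future.get());
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new RuntimeException(e);
        } catch (ExecutionException e) {
            throw new RuntimeException(e.getCause());
        } finally {
            pool.shutdown();
        }
    }
}
```

Clamping the default to at least 1 is an addition of this sketch: on a single-core machine, cores / 2 would otherwise be 0, which ForkJoinPool rejects.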


Diffs
-

  ql/src/java/org/apache/hadoop/hive/ql/util/CloseableThreadLocal.java PRE-CREATION
  ql/src/java/org/apache/hadoop/hive/ql/util/HiveStrictManagedMigration.java 80025b7046
  ql/src/java/org/apache/hadoop/hive/ql/util/NamedForkJoinWorkerThreadFactory.java PRE-CREATION
  ql/src/test/org/apache/hadoop/hive/ql/TxnCommandsBaseForTests.java 5f39fdccb5
  ql/src/test/org/apache/hadoop/hive/ql/util/CloseableThreadLocalTest.java PRE-CREATION
  ql/src/test/org/apache/hadoop/hive/ql/util/TestHiveStrictManagedMigration.java PRE-CREATION


Diff: https://reviews.apache.org/r/71038/diff/1/


Testing
---

New UT for testing migration.


Thanks,

Krisztian Kasa



[jira] [Created] (HIVE-21982) hive does not use stats even after analyzing the table

2019-07-10 Thread shin chen (JIRA)
shin chen created HIVE-21982:


 Summary: hive does not use stats even after analyzing the table
 Key: HIVE-21982
 URL: https://issues.apache.org/jira/browse/HIVE-21982
 Project: Hive
  Issue Type: Bug
  Components: Hive
 Environment: HDP

Hive 1.2.1000.2.6.5.0-292
Reporter: shin chen


 

setting:
{code:java}
set hive.cbo.enable=true;
set hive.compute.query.using.stats=true;
set hive.stats.fetch.column.stats=true;
set hive.stats.fetch.partition.stats=true;
set hive.vectorized.execution.enabled=true;
set hive.vectorized.execution.reduce.enabled=true;
{code}
{code:java}
// desc extended **.** partition(month=**,day=**,hour=**);
. parameters:{transient_lastDdlTime=1561958282, totalSize=16413917810, numFiles=3}
{code}
This table has not been analyzed yet, so even a simple query scans the whole table.
{code:java}
SELECT count(*) FROM **.** WHERE month='**' AND day='**' AND hour='**';
 1 row selected (52.756 seconds){code}
After analyzing the table
{code:java}
// Analyze first
analyze table **.** partition(month='**',day='**',hour='**') compute statistics;
// Then runs the last count(*) query
SELECT count(*) FROM **.** WHERE month='**' AND day='**' AND hour='**';
 1 row selected (58.326 seconds){code}
Hive still does not use the stats metadata: the query takes about as long as before.

Describe the table again:
{code:java}

parameters:{totalSize=16413917811, numRows=37975264, rawDataSize=4670957472, COLUMN_STATS_ACCURATE={"BASIC_STATS":"true"}, numFiles=3, transient_lastDdlTime=1562669873})
{code}
Any advice here?





Review Request 71045: HIVE-21948

2019-07-10 Thread Krisztian Kasa

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71045/
---

Review request for hive, Jesús Camacho Rodríguez, Zoltan Haindrich, and Vineet 
Garg.


Bugs: HIVE-21948
https://issues.apache.org/jira/browse/HIVE-21948


Repository: hive-git


Description
---

Implement parallel processing in Pre Upgrade Tool
=
Process databases and tables in parallel, using a separate thread pool for 
each. The pool sizes can be set with command line options. If no options are 
given, the default pool size is the number of CPU cores / 2.
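
With one pool for databases and another for tables, giving each pool's worker threads a recognisable name makes the logs easier to follow. The diff below lists a NamedForkJoinWorkerThreadFactory, but its contents are not shown in this message, so the following is only a guess at the general technique, with hypothetical class names and name prefixes.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.ForkJoinWorkerThread;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical thread-naming factory; not the class from the patch. */
final class NamedWorkerFactorySketch implements ForkJoinPool.ForkJoinWorkerThreadFactory {
    private final String prefix;
    private final AtomicInteger counter = new AtomicInteger();

    NamedWorkerFactorySketch(String prefix) {
        this.prefix = prefix;
    }

    @Override
    public ForkJoinWorkerThread newThread(ForkJoinPool pool) {
        // Delegate thread creation to the JDK default, then rename the thread.
        ForkJoinWorkerThread thread =
            ForkJoinPool.defaultForkJoinWorkerThreadFactory.newThread(pool);
        thread.setName(prefix + "-" + counter.incrementAndGet());
        return thread;
    }

    // Runs a trivial task on a pool built with this factory and reports
    // the name of the worker thread that executed it.
    static String workerNameFor(String prefix, int parallelism) {
        ForkJoinPool pool = new ForkJoinPool(
            parallelism, new NamedWorkerFactorySketch(prefix), null, false);
        try {
            Callable<String> task = () -> Thread.currentThread().getName();
            return pool.submit(task).get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new RuntimeException(e);
        } catch (ExecutionException e) {
            throw new RuntimeException(e.getCause());
        } finally {
            pool.shutdown();
        }
    }
}
```

A database pool and a table pool would then be created with two different prefixes, for example "Database" and "Table" (both assumed names).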


Diffs
-

  
  upgrade-acid/pre-upgrade/src/main/java/org/apache/hadoop/hive/upgrade/acid/CloseableThreadLocal.java PRE-CREATION
  upgrade-acid/pre-upgrade/src/main/java/org/apache/hadoop/hive/upgrade/acid/CompactTablesState.java PRE-CREATION
  upgrade-acid/pre-upgrade/src/main/java/org/apache/hadoop/hive/upgrade/acid/CompactionMetaInfo.java PRE-CREATION
  upgrade-acid/pre-upgrade/src/main/java/org/apache/hadoop/hive/upgrade/acid/NamedForkJoinWorkerThreadFactory.java PRE-CREATION
  upgrade-acid/pre-upgrade/src/main/java/org/apache/hadoop/hive/upgrade/acid/PreUpgradeTool.java 0a7354d12b
  upgrade-acid/pre-upgrade/src/main/java/org/apache/hadoop/hive/upgrade/acid/RunOptions.java 66213d424c
  upgrade-acid/pre-upgrade/src/test/java/org/apache/hadoop/hive/upgrade/acid/TestCloseableThreadLocal.java PRE-CREATION
  upgrade-acid/pre-upgrade/src/test/java/org/apache/hadoop/hive/upgrade/acid/TestRunOptions.java PRE-CREATION


Diff: https://reviews.apache.org/r/71045/diff/1/


Testing
---

TestPreUpgradeTool UT passed.

Manually: 
1. Deploy Ambari and an HDP 2.6.5 cluster.
2. Create some Hive tables and insert/update values.
3. Run PreUpgradeTool and check the logs and the output script.


Thanks,

Krisztian Kasa



[jira] [Created] (HIVE-21981) When LlapDaemon capacity is set to 0 and the waitqueue is not empty then the queries are stuck

2019-07-10 Thread Peter Vary (JIRA)
Peter Vary created HIVE-21981:
-

 Summary: When LlapDaemon capacity is set to 0 and the waitqueue is not empty then the queries are stuck
 Key: HIVE-21981
 URL: https://issues.apache.org/jira/browse/HIVE-21981
 Project: Hive
  Issue Type: Sub-task
  Components: llap
Reporter: Peter Vary
Assignee: Peter Vary


When an LlapDaemon's executor capacity is set to 0, the already queued tasks 
are not handled, causing the queries to get stuck.





Review Request 71044: ACID: explore how we can avoid a move step during inserts/compaction

2019-07-10 Thread Vaibhav Gumashta

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71044/
---

Review request for hive and Gopal V.


Bugs: HIVE-21164
https://issues.apache.org/jira/browse/HIVE-21164


Repository: hive-git


Description
---

https://jira.apache.org/jira/browse/HIVE-21164


Diffs
-

  
  itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/history/TestHiveHistory.java 5fd0ef9161
  itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/txn/compactor/TestCrudCompactorOnTez.java d59cfe51e9
  ql/src/java/org/apache/hadoop/hive/ql/exec/AbstractFileMergeOperator.java bb89f803d5
  ql/src/java/org/apache/hadoop/hive/ql/exec/FileSinkOperator.java 9ad4e71482
  ql/src/java/org/apache/hadoop/hive/ql/exec/MoveTask.java 695d08bbe2
  ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java 1346bed5a7
  ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java 295fe7cbd0
  ql/src/java/org/apache/hadoop/hive/ql/io/RecordUpdater.java 737e6774b7
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcOutputFormat.java c4c56f8477
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcRecordUpdater.java 3fa61d3560
  ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java 691f3ee2e9
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMapRedUtils.java 5d6143d6a4
  ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 7c58072413
  ql/src/java/org/apache/hadoop/hive/ql/parse/spark/GenSparkUtils.java 757cb7af4d
  ql/src/java/org/apache/hadoop/hive/ql/plan/FileSinkDesc.java 61ea28a5f5
  ql/src/java/org/apache/hadoop/hive/ql/plan/LoadTableDesc.java bed05819b5
  ql/src/test/org/apache/hadoop/hive/ql/exec/TestExecDriver.java 78f25856a4
  ql/src/test/org/apache/hadoop/hive/ql/exec/TestFileSinkOperator.java a75103d60d


Diff: https://reviews.apache.org/r/71044/diff/1/


Testing
---


Thanks,

Vaibhav Gumashta



[jira] [Created] (HIVE-21980) Parsing time can be high in case of deeply nested subqueries

2019-07-10 Thread Zoltan Haindrich (JIRA)
Zoltan Haindrich created HIVE-21980:
---

 Summary: Parsing time can be high in case of deeply nested subqueries
 Key: HIVE-21980
 URL: https://issues.apache.org/jira/browse/HIVE-21980
 Project: Hive
  Issue Type: Improvement
  Components: Parser
Reporter: Zoltan Haindrich
Assignee: Zoltan Haindrich
 Attachments: HIVE-21980.01.patch

For queries that recursively nest subqueries and joins like this:

{code}
SELECT ...
FROM (SELECT ...
 FROM ( [...]
 ) JOIN
(SELECT ...
FROM ( [...] )
JOIN ( [...]
)
{code}
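
To experiment with this query shape, a generator along the following lines could produce test input of a chosen depth. It only builds the query text and does not call the Hive parser. The table `t`, the column `k`, the aliases and the join condition are assumptions made for the sketch; the issue does not give the real queries.

```java
/** Hypothetical generator for deeply nested test queries; builds text only. */
final class NestedQuerySketch {
    // Each level joins two copies of the previous level, so the number of
    // SELECTs grows as 2^(depth + 1) - 1.
    static String nested(int depth) {
        if (depth < 0) {
            throw new IllegalArgumentException("depth must be >= 0: " + depth);
        }
        if (depth == 0) {
            return "SELECT k FROM t";
        }
        String inner = nested(depth - 1);
        return "SELECT a.k FROM (" + inner + ") a JOIN (" + inner + ") b ON a.k = b.k";
    }

    // Counts non-overlapping occurrences of token in text.
    static int countOccurrences(String text, String token) {
        int count = 0;
        int from = text.indexOf(token);
        while (from >= 0) {
            count++;
            from = text.indexOf(token, from + token.length());
        }
        return count;
    }

    public static void main(String[] args) {
        for (int depth = 0; depth <= 4; depth++) {
            String query = nested(depth);
            System.out.println("depth " + depth + ": " + query.length() + " chars, "
                + countOccurrences(query, "SELECT") + " SELECTs");
        }
    }
}
```

Because the text doubles at every level, modest depths already give large inputs; keep the depth small when timing the parser.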





[jira] [Created] (HIVE-21979) TestReplication tests time out regularly

2019-07-10 Thread Zoltan Haindrich (JIRA)
Zoltan Haindrich created HIVE-21979:
---

 Summary: TestReplication tests time out regularly
 Key: HIVE-21979
 URL: https://issues.apache.org/jira/browse/HIVE-21979
 Project: Hive
  Issue Type: Improvement
  Components: Test
Reporter: Zoltan Haindrich
Assignee: Zoltan Haindrich


I think TestTableLevelReplicationScenarios and the related replication tests 
should be executed in isolation.

From a recent ptest execution:
{code}
[INFO] Running org.apache.hadoop.hive.ql.TestCreateUdfEntities
[INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 150.413 s - in org.apache.hadoop.hive.ql.TestCreateUdfEntities
[INFO] Running org.apache.hadoop.hive.ql.txn.compactor.TestCleanerWithReplication
[INFO] Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 32.084 s - in org.apache.hadoop.hive.ql.txn.compactor.TestCleanerWithReplication
[INFO] Running org.apache.hadoop.hive.ql.txn.compactor.TestCrudCompactorOnTez
[INFO] Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 61.062 s - in org.apache.hadoop.hive.ql.txn.compactor.TestCrudCompactorOnTez
[INFO] Running org.apache.hadoop.hive.ql.TestWarehouseExternalDir
[INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 57.568 s - in org.apache.hadoop.hive.ql.TestWarehouseExternalDir
[INFO] Running org.apache.hadoop.hive.ql.parse.TestReplicationOfHiveStreaming
[INFO] Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 348.769 s - in org.apache.hadoop.hive.ql.parse.TestReplicationOfHiveStreaming
[INFO] Running org.apache.hadoop.hive.ql.parse.TestExportImport
[INFO] Tests run: 6, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 50.089 s - in org.apache.hadoop.hive.ql.parse.TestExportImport
[INFO] Running org.apache.hadoop.hive.ql.parse.TestTableLevelReplicationScenarios
[INFO] Tests run: 20, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 1,044.666 s - in org.apache.hadoop.hive.ql.parse.TestTableLevelReplicationScenarios
[INFO] Running org.apache.hadoop.hive.ql.parse.TestReplicationScenariosExternalTables
[INFO] Tests run: 9, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 225.734 s - in org.apache.hadoop.hive.ql.parse.TestReplicationScenariosExternalTables
[INFO] Running org.apache.hadoop.hive.ql.parse.TestReplAcrossInstancesWithJsonMessageFormat
{code}




