[jira] [Created] (HIVE-23516) Store hive replication policy execution metrics

2020-05-19 Thread Aasha Medhi (Jira)
Aasha Medhi created HIVE-23516:
--

 Summary: Store hive replication policy execution metrics
 Key: HIVE-23516
 URL: https://issues.apache.org/jira/browse/HIVE-23516
 Project: Hive
  Issue Type: Task
Reporter: Aasha Medhi
Assignee: Aasha Medhi






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HIVE-23515) Hive2.1.1 use 'union all' merge subquery,the result an exception

2020-05-19 Thread charles (Jira)
charles created HIVE-23515:
--

 Summary: Hive2.1.1 use 'union all' merge subquery,the result an 
exception
 Key: HIVE-23515
 URL: https://issues.apache.org/jira/browse/HIVE-23515
 Project: Hive
  Issue Type: Bug
  Components: Clients
Affects Versions: 2.1.1
Reporter: charles
 Attachments: hive.sql, query-hive-5928.csv, query-hive-5931.csv

hive version:2.1.1

cdh version:6.1.1

When I use 'union all' to merge subqueries into a table, the result set changes 
depending on the order of the subqueries. What is the cause of this?





[jira] [Created] (HIVE-23514) Add Atlas metadata replication metrics

2020-05-19 Thread PRAVIN KUMAR SINHA (Jira)
PRAVIN KUMAR SINHA created HIVE-23514:
-

 Summary: Add Atlas metadata replication metrics
 Key: HIVE-23514
 URL: https://issues.apache.org/jira/browse/HIVE-23514
 Project: Hive
  Issue Type: Task
Reporter: PRAVIN KUMAR SINHA
Assignee: PRAVIN KUMAR SINHA








[jira] [Created] (HIVE-23513) Fix Json output for SHOW TABLES and SHOW MATERIALIZED VIEWS

2020-05-19 Thread Miklos Gergely (Jira)
Miklos Gergely created HIVE-23513:
-

 Summary: Fix Json output for SHOW TABLES and SHOW MATERIALIZED 
VIEWS
 Key: HIVE-23513
 URL: https://issues.apache.org/jira/browse/HIVE-23513
 Project: Hive
  Issue Type: Bug
Reporter: Miklos Gergely
Assignee: Miklos Gergely








[jira] [Created] (HIVE-23512) ReplDumpTask: Adding debug to print opentxn for debugging perspective

2020-05-19 Thread Rajkumar Singh (Jira)
Rajkumar Singh created HIVE-23512:
-

 Summary: ReplDumpTask: Adding debug to print opentxn for debugging 
perspective
 Key: HIVE-23512
 URL: https://issues.apache.org/jira/browse/HIVE-23512
 Project: Hive
  Issue Type: Improvement
  Components: HiveServer2
Affects Versions: 3.2.0
Reporter: Rajkumar Singh
Assignee: Rajkumar Singh


We often see ReplDumpTask waiting for 
hive.repl.bootstrap.dump.open.txn.timeout (1h) before it kills open txns and 
makes progress. The only way to find out which txns it is waiting on is to query 
the Metastore DB and backtrack the txns in the HS2 logs, to tell whether they 
have genuinely been open this long or whether something else is wrong.
I am adding a debug log that prints these txns, which should help in debugging 
such issues.





[jira] [Created] (HIVE-23511) percentile_approx throws error when using CTAS statement

2020-05-19 Thread Chris Veregge (Jira)
Chris Veregge created HIVE-23511:


 Summary: percentile_approx throws error when using CTAS statement
 Key: HIVE-23511
 URL: https://issues.apache.org/jira/browse/HIVE-23511
 Project: Hive
  Issue Type: Bug
  Components: Hive
Affects Versions: 2.1.0
 Environment: [vereggcadmin@ip-10-40-51-103 ~]$ hive --version
Hive 2.1.0-amzn-0
Subversion 
git://ip-10-169-254-27/workspace/workspace/bigtop.release-rpm-5.2.0/build/hive/rpm/BUILD/apache-hive-2.1.0-amzn-0-src
 -r 418fa8c602f2a4b153c1a89806305f6b5a27a524
Compiled by ec2-user on Wed Nov 16 03:10:37 UTC 2016
From source with checksum 64a5b18bfaf894a6b2f1cd14a0654e92

Reporter: Chris Veregge


CTAS statements appear to fail with percentile_approx when using a float array 
as the second argument.

Here's example code that demonstrates the issue.

This statement works
select
percentile_approx(num,array(0.1,0.5,0.9)) as ptile
from sample;

but adding a CTAS statement to the same query results in an error
create table ptile_table as
select
percentile_approx(num,array(0.1,0.5,0.9)) as ptile
from sample;

FAILED: UDFArgumentTypeException The second argument must be a constant, but 
array was passed instead.


Here's verbose log output, including a statement to create the table "sample", 
which is just a column of float values.

Logging initialized using configuration in 
file:/etc/hive/conf.dist/hive-log4j2.properties Async: false
set hive.cli.print.header=true
set hive.resultset.use.unique.column.names=false
set hive.exec.parallel=false
set hive.groupby.orderby.position.alias = true
set mapreduce.job.reduce.slowstart.completedmaps = 0.95
set hive.execution.engine=tez
set hive.tez.auto.reducer.parallelism=true
set hive.default.fileformat=orc
set hive.default.fileformat.managed=orc


create table if not exists sample as
select rand() as num
from ucp.dim_date limit 100
OK
Time taken: 0.99 seconds


select
percentile_approx(num,array(0.1,0.5,0.9)) as ptile
from sample
Query ID = vereggcadmin_20200519172814_e2cabf47-d8e4-45a9-b5c5-87e323ee8668
Total jobs = 1
Launching Job 1 out of 1
Waiting for Tez session and AM to be ready...


Status: Running (Executing on YARN cluster with App id 
application_1577992969986_117744)

Map 1: 0/1  Reducer 2: 0/1  
Map 1: 0/1  Reducer 2: 0/1  
Map 1: 0(+1)/1  Reducer 2: 0/1  
Map 1: 1/1  Reducer 2: 0/1  
Map 1: 1/1  Reducer 2: 0(+1)/1  
Map 1: 1/1  Reducer 2: 1/1  
OK
ptile
[0.0539687133111435,0.5168283485290134,0.8464088546353761]
Time taken: 14.694 seconds, Fetched: 1 row(s)


create table ptile_table as
select
percentile_approx(num,array(0.1,0.5,0.9)) as ptile
from sample
FAILED: UDFArgumentTypeException The second argument must be a constant, but 
array was passed instead.






[jira] [Created] (HIVE-23510) TestMiniLLA

2020-05-19 Thread Miklos Gergely (Jira)
Miklos Gergely created HIVE-23510:
-

 Summary: TestMiniLLA
 Key: HIVE-23510
 URL: https://issues.apache.org/jira/browse/HIVE-23510
 Project: Hive
  Issue Type: Sub-task
Reporter: Miklos Gergely








[jira] [Created] (HIVE-23509) MapJoin AssertionError: Capacity must be power of 2

2020-05-19 Thread Shashank Pedamallu (Jira)
Shashank Pedamallu created HIVE-23509:
-

 Summary: MapJoin AssertionError: Capacity must be power of 2
 Key: HIVE-23509
 URL: https://issues.apache.org/jira/browse/HIVE-23509
 Project: Hive
  Issue Type: Bug
 Environment: Hive-2.3.6
Reporter: Shashank Pedamallu
Assignee: Shashank Pedamallu


Observed AssertionErrors in a Hive query when the rowCount for the join is 
(2^x)+(2^(x+1)).
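
A row count of (2^x)+(2^(x+1)) equals 3 * 2^x, which has two bits set and so can never be a power of two. The sketch below only illustrates that arithmetic and one common way of rounding up. `CapacityCheck` and its methods are hypothetical names, not Hive's `BytesBytesMultiHashMap` code, and whether rounding up is the appropriate fix in Hive is an assumption the report does not confirm.

```java
final class CapacityCheck {
    private CapacityCheck() {}

    /** True when n is a positive power of two (exactly one bit set). */
    static boolean isPowerOfTwo(long n) {
        return n > 0 && (n & (n - 1)) == 0;
    }

    /** Smallest power of two that is >= n, for 1 <= n <= 2^62. */
    static long nextPowerOfTwo(long n) {
        if (n < 1 || n > (1L << 62)) {
            throw new IllegalArgumentException("out of range: " + n);
        }
        long highest = Long.highestOneBit(n);
        return highest == n ? n : highest << 1;
    }

    public static void main(String[] args) {
        int x = 10; // arbitrary exponent chosen for illustration
        long rowCount = (1L << x) + (1L << (x + 1)); // 2^x + 2^(x+1) = 3 * 2^x
        System.out.println(rowCount + " is power of two: " + isPowerOfTwo(rowCount));
        System.out.println("rounded up: " + nextPowerOfTwo(rowCount));
    }
}
```

With x = 10 the row count is 3072, which fails the check, and the next power of two is 4096.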

Following is the stacktrace:
{noformat}
[2020-05-11 05:43:12,135] {base_task_runner.py:95} INFO - Subtask: ERROR : Vertex failed, vertexName=Map 4, vertexId=vertex_1588729523139_51702_1_06, diagnostics=[Task failed, taskId=task_1588729523139_51702_1_06_001286, diagnostics=[TaskAttempt 0 failed, info=[Error: Error while running task ( failure ) : attempt_1588729523139_51702_1_06_001286_0:java.lang.RuntimeException: java.lang.AssertionError: Capacity must be a power of two
[2020-05-11 05:43:12,136] {base_task_runner.py:95} INFO - Subtask: at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:211)
[2020-05-11 05:43:12,136] {base_task_runner.py:95} INFO - Subtask: at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:168)
[2020-05-11 05:43:12,136] {base_task_runner.py:95} INFO - Subtask: at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:374)
[2020-05-11 05:43:12,136] {base_task_runner.py:95} INFO - Subtask: at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:73)
[2020-05-11 05:43:12,136] {base_task_runner.py:95} INFO - Subtask: at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:61)
[2020-05-11 05:43:12,136] {base_task_runner.py:95} INFO - Subtask: at java.security.AccessController.doPrivileged(Native Method)
[2020-05-11 05:43:12,136] {base_task_runner.py:95} INFO - Subtask: at javax.security.auth.Subject.doAs(Subject.java:422)
[2020-05-11 05:43:12,136] {base_task_runner.py:95} INFO - Subtask: at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1893)
[2020-05-11 05:43:12,136] {base_task_runner.py:95} INFO - Subtask: at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:61)
[2020-05-11 05:43:12,136] {base_task_runner.py:95} INFO - Subtask: at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:37)
[2020-05-11 05:43:12,136] {base_task_runner.py:95} INFO - Subtask: at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
[2020-05-11 05:43:12,136] {base_task_runner.py:95} INFO - Subtask: at java.util.concurrent.FutureTask.run(FutureTask.java:266)
[2020-05-11 05:43:12,136] {base_task_runner.py:95} INFO - Subtask: at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
[2020-05-11 05:43:12,136] {base_task_runner.py:95} INFO - Subtask: at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
[2020-05-11 05:43:12,137] {base_task_runner.py:95} INFO - Subtask: at java.lang.Thread.run(Thread.java:748)
[2020-05-11 05:43:12,137] {base_task_runner.py:95} INFO - Subtask: Caused by: java.lang.AssertionError: Capacity must be a power of two
[2020-05-11 05:43:12,137] {base_task_runner.py:95} INFO - Subtask: at org.apache.hadoop.hive.ql.exec.persistence.BytesBytesMultiHashMap.validateCapacity(BytesBytesMultiHashMap.java:552)
[2020-05-11 05:43:12,137] {base_task_runner.py:95} INFO - Subtask: at org.apache.hadoop.hive.ql.exec.persistence.BytesBytesMultiHashMap.expandAndRehashImpl(BytesBytesMultiHashMap.java:731)
[2020-05-11 05:43:12,137] {base_task_runner.py:95} INFO - Subtask: at org.apache.hadoop.hive.ql.exec.persistence.BytesBytesMultiHashMap.expandAndRehashToTarget(BytesBytesMultiHashMap.java:545)
[2020-05-11 05:43:12,137] {base_task_runner.py:95} INFO - Subtask: at org.apache.hadoop.hive.ql.exec.persistence.HybridHashTableContainer$HashPartition.getHashMapFromDisk(HybridHashTableContainer.java:183)
[2020-05-11 05:43:12,137] {base_task_runner.py:95} INFO - Subtask: at org.apache.hadoop.hive.ql.exec.MapJoinOperator.reloadHashTable(MapJoinOperator.java:641)
[2020-05-11 05:43:12,137] {base_task_runner.py:95} INFO - Subtask: at org.apache.hadoop.hive.ql.exec.MapJoinOperator.continueProcess(MapJoinOperator.java:603)
[2020-05-11 05:43:12,137] {base_task_runner.py:95} INFO - Subtask: at org.apache.hadoop.hive.ql.exec.MapJoinOperator.closeOp(MapJoinOperator.java:539)
[2020-05-11 05:43:12,137] {base_task_runner.py:95} INFO - Subtask: at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:697)
[2020-05-11 05:43:12,137] {base_task_runner.py:95} INFO - Subtask: at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:711)
[2020-05-11 05:43:12,137] {base_task_runner.py:95} INFO - Subtask: at

Re: Review Request 72526: HIVE-23493

2020-05-19 Thread Krisztian Kasa

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72526/
---

(Updated May 19, 2020, 4:14 p.m.)


Review request for hive and Jesús Camacho Rodríguez.


Bugs: HIVE-23493
https://issues.apache.org/jira/browse/HIVE-23493


Repository: hive-git


Description
---

Rewrite plan to join back tables with many projected columns joined multiple 
times


Diffs (updated)
-

  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 5a39006d8a 
  data/scripts/q_perf_test_init_constraints.sql 3b3f503ee4 
  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveCardinalityPreservingJoinOptimization.java
 PRE-CREATION 
  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveCardinalityPreservingJoinRule.java
 PRE-CREATION 
  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveRelFieldTrimmer.java
 19ce3ea223 
  ql/src/java/org/apache/hadoop/hive/ql/parse/CalcitePlanner.java 32ad4c1539 
  ql/src/test/results/clientpositive/perf/tez/constraints/cbo_query11.q.out 
127003c78b 
  ql/src/test/results/clientpositive/perf/tez/constraints/cbo_query74.q.out 
ee232fa4e3 


Diff: https://reviews.apache.org/r/72526/diff/2/

Changes: https://reviews.apache.org/r/72526/diff/1-2/


Testing
---

mvn test -Dtest.output.overwrite -DskipSparkTests 
-Dtest=TestMiniLlapLocalCliDriver 
-Dqfile=cardinality_preserving_join_opt.q,cardinality_preserving_join_opt_q4.q,cardinality_preserving_join_opt_q11.q,cardinality_preserving_join_opt_q74.q
 -pl itests/qtest -Pitests


Thanks,

Krisztian Kasa



Re: Review Request 72526: HIVE-23493

2020-05-19 Thread Krisztian Kasa


> On May 18, 2020, 11:45 p.m., Jesús Camacho Rodríguez wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveCardinalityPreservingJoinOptimization.java
> > Lines 183 (patched)
> > 
> >
> > It seems you use `projectExpr.getIndex`. I think we should use the 
> > position in the `List<> projectExpressions` rather than the index.

These indices are not always the same:
```
create table if not exists customer
(
c_customer_sk bigint,
c_customer_id int,
c_first_name  string,
c_last_name   string
);

create table store_sales
(
ss_customer_sk int,
ss_customer_id int,
ss_quantity   int,
ss_list_price float
);


select ss_customer_sk, ss_customer_id, c_last_name, ss_list_price, 
c_customer_sk, c_customer_id, c_first_name, ss_quantity
from store_sales ss
join customer c on ss.ss_customer_sk = c.c_customer_sk and ss_customer_id = 
c_customer_id;

```

In this case the third element in the project list is the `c_last_name` column 
from the `customer` table which is the 8th element in the rowtype of the input 
of the Project operator. 
`((RexSlot)rootProject.getProjects().get(2)).getIndex()` is `8`
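
The difference between the position in the project list and the input index can be shown without Calcite at all. The sketch below is a simplified model with assumptions stated up front: `ProjectIndexDemo` and its methods are hypothetical, the input row type is modelled as the left table's columns followed by the right table's, and indices are zero-based. Under those assumptions `c_last_name` sits at position 2 in the project list and at input index 7 (the eighth field). The exact index in a real plan depends on how the input row type is laid out, so it may differ from this model.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class ProjectIndexDemo {
    private ProjectIndexDemo() {}

    /** Row type of the join input: left columns followed by right columns. */
    static List<String> joinRowType(List<String> left, List<String> right) {
        List<String> all = new ArrayList<>(left);
        all.addAll(right);
        return all;
    }

    /** For each projected column, the zero-based index it refers to in the input row type. */
    static int[] inputIndexes(List<String> projected, List<String> inputRowType) {
        int[] result = new int[projected.size()];
        for (int pos = 0; pos < projected.size(); pos++) {
            result[pos] = inputRowType.indexOf(projected.get(pos));
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> storeSales = Arrays.asList(
            "ss_customer_sk", "ss_customer_id", "ss_quantity", "ss_list_price");
        List<String> customer = Arrays.asList(
            "c_customer_sk", "c_customer_id", "c_first_name", "c_last_name");
        List<String> projected = Arrays.asList(
            "ss_customer_sk", "ss_customer_id", "c_last_name", "ss_list_price",
            "c_customer_sk", "c_customer_id", "c_first_name", "ss_quantity");

        int[] indexes = inputIndexes(projected, joinRowType(storeSales, customer));
        for (int pos = 0; pos < indexes.length; pos++) {
            System.out.println("position " + pos + " -> input index " + indexes[pos]);
        }
    }
}
```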


> On May 18, 2020, 11:45 p.m., Jesús Camacho Rodríguez wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveCardinalityPreservingJoinOptimization.java
> > Lines 269 (patched)
> > 
> >
> > You can just use a Pair here.

That was my first intention, but after a while I completely lost track of what 
`left` and `right` meant in this context. So I decided to introduce this 
`ProjectMapping` class with more descriptive field names.


> On May 18, 2020, 11:45 p.m., Jesús Camacho Rodríguez wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveCardinalityPreservingJoinOptimization.java
> > Lines 325 (patched)
> > 
> >
> > Do you need to override this method?

I only added collecting the `HiveTableScan` instances so that they can be copied 
when the Join operators are created.


- Krisztian


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72526/#review220811
---


On May 18, 2020, 6:31 p.m., Krisztian Kasa wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/72526/
> ---
> 
> (Updated May 18, 2020, 6:31 p.m.)
> 
> 
> Review request for hive and Jesús Camacho Rodríguez.
> 
> 
> Bugs: HIVE-23493
> https://issues.apache.org/jira/browse/HIVE-23493
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> Rewrite plan to join back tables with many projected columns joined multiple 
> times
> 
> 
> Diffs
> -
> 
>   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java f5ad3a882b 
>   
> ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveCardinalityPreservingJoinOptimization.java
>  PRE-CREATION 
>   
> ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveRelFieldTrimmer.java
>  19ce3ea223 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/CalcitePlanner.java 32ad4c1539 
>   ql/src/test/queries/clientpositive/cardinality_preserving_join_opt.q 
> PRE-CREATION 
>   ql/src/test/queries/clientpositive/cardinality_preserving_join_opt_q11.q 
> PRE-CREATION 
>   ql/src/test/queries/clientpositive/cardinality_preserving_join_opt_q4.q 
> PRE-CREATION 
>   ql/src/test/queries/clientpositive/cardinality_preserving_join_opt_q74.q 
> PRE-CREATION 
>   
> ql/src/test/results/clientpositive/llap/cardinality_preserving_join_opt.q.out 
> PRE-CREATION 
>   
> ql/src/test/results/clientpositive/llap/cardinality_preserving_join_opt_q11.q.out
>  PRE-CREATION 
>   
> ql/src/test/results/clientpositive/llap/cardinality_preserving_join_opt_q4.q.out
>  PRE-CREATION 
>   
> ql/src/test/results/clientpositive/llap/cardinality_preserving_join_opt_q74.q.out
>  PRE-CREATION 
> 
> 
> Diff: https://reviews.apache.org/r/72526/diff/1/
> 
> 
> Testing
> ---
> 
> mvn test -Dtest.output.overwrite -DskipSparkTests 
> -Dtest=TestMiniLlapLocalCliDriver 
> -Dqfile=cardinality_preserving_join_opt.q,cardinality_preserving_join_opt_q4.q,cardinality_preserving_join_opt_q11.q,cardinality_preserving_join_opt_q74.q
>  -pl itests/qtest -Pitests
> 
> 
> Thanks,
> 
> Krisztian Kasa
> 
>



[jira] [Created] (HIVE-23508) Do not show parameters column for non-extended desc database

2020-05-19 Thread Miklos Gergely (Jira)
Miklos Gergely created HIVE-23508:
-

 Summary: Do not show parameters column for non-extended desc 
database
 Key: HIVE-23508
 URL: https://issues.apache.org/jira/browse/HIVE-23508
 Project: Hive
  Issue Type: Bug
  Components: Hive
Reporter: Miklos Gergely
Assignee: Miklos Gergely


The "parameters" column for desc database is only filled if the request is 
"EXTENDED"; there is no point in showing the column otherwise.





[jira] [Created] (HIVE-23507) Deprecate IOUtils copyBytes

2020-05-19 Thread David Mollitor (Jira)
David Mollitor created HIVE-23507:
-

 Summary: Deprecate IOUtils copyBytes
 Key: HIVE-23507
 URL: https://issues.apache.org/jira/browse/HIVE-23507
 Project: Hive
  Issue Type: Improvement
Reporter: David Mollitor
Assignee: David Mollitor


Only used in a single unit test and can easily be replaced with o.a.commons





Re: Review Request 72112: HIVE-22869 - Add locking benchmark to metastore-tools/metastore-benchmarks

2020-05-19 Thread Denys Kuzmenko via Review Board


> On May 19, 2020, 7:31 a.m., Denys Kuzmenko wrote:
> > Ship It!

Zoli, I think you mentioned that hms-benchmark doesn't support running tests in 
multiple threads. Have you tried:  
-T, --threads=   number of concurrent threads


- Denys


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72112/#review220820
---


On May 18, 2020, 4:59 p.m., Zoltan Chovan wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/72112/
> ---
> 
> (Updated May 18, 2020, 4:59 p.m.)
> 
> 
> Review request for hive, Denys Kuzmenko, Aron Hamvas, Marton Bod, and Peter 
> Vary.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> Add the possibility to run benchmarks on opening lock in the HMS. Currently 
> this change only introduces single-threaded/single client testing. I'm 
> planning to add multi-client support in a separate change.
> 
> Example parametrisation is as follows:
> hbench -M ".*Lock.*" -N 10 -d hive_test --params 10 --params 100 -d hive_test
> 
> This will create N number (10) of locks for first --params number of tables 
> (10) with second --params number of partitions (100) on T (8) threads where 
> each thread will start an HMS client and it'll use -d (hive_test) database;
> 
> 
> Diffs
> -
> 
>   
> standalone-metastore/metastore-tools/metastore-benchmarks/src/main/java/org/apache/hadoop/hive/metastore/tools/BenchmarkTool.java
>  2ab9388301 
>   
> standalone-metastore/metastore-tools/metastore-benchmarks/src/main/java/org/apache/hadoop/hive/metastore/tools/HMSBenchmarks.java
>  d80c290b60 
>   
> standalone-metastore/metastore-tools/tools-common/src/main/java/org/apache/hadoop/hive/metastore/tools/BenchmarkSuite.java
>  5211082a7d 
>   
> standalone-metastore/metastore-tools/tools-common/src/main/java/org/apache/hadoop/hive/metastore/tools/HMSClient.java
>  4e75edeae6 
>   
> standalone-metastore/metastore-tools/tools-common/src/main/java/org/apache/hadoop/hive/metastore/tools/Util.java
>  101d6759c5 
> 
> 
> Diff: https://reviews.apache.org/r/72112/diff/2/
> 
> 
> Testing
> ---
> 
> 
> File Attachments
> 
> 
> HIVE-22869.2.patch
>   
> https://reviews.apache.org/media/uploaded/files/2020/04/02/5e35e835-f383-495f-9964-e66773fd6a90__HIVE-22869.2.patch
> HIVE-22869.3.patch
>   
> https://reviews.apache.org/media/uploaded/files/2020/04/09/458beaa7-4743-40fb-a213-1ae4527be823__HIVE-22869.3.patch
> HIVE-22869.4.patch
>   
> https://reviews.apache.org/media/uploaded/files/2020/04/23/423c45d7-911e-4dd2-80b8-c6d3ad90633c__HIVE-22869.4.patch
> HIVE-22869.5.patch
>   
> https://reviews.apache.org/media/uploaded/files/2020/05/12/a06f3b8c-f4ca-4067-a079-e0b6185266d4__HIVE-22869.5.patch
> HIVE-22869.6.patch
>   
> https://reviews.apache.org/media/uploaded/files/2020/05/15/01254e94-1a8d-496d-ab31-628bd5584193__HIVE-22869.6.patch
> HIVE-22869.7.patch
>   
> https://reviews.apache.org/media/uploaded/files/2020/05/18/343e8c4e-1f7c-4638-849a-15b448bc2515__HIVE-22869.7.patch
> 
> 
> Thanks,
> 
> Zoltan Chovan
> 
>



[jira] [Created] (HIVE-23506) Move getAcidVersionFrom...File utility methods to TestTxnCommands

2020-05-19 Thread Karen Coppage (Jira)
Karen Coppage created HIVE-23506:


 Summary: Move getAcidVersionFrom...File utility methods to 
TestTxnCommands
 Key: HIVE-23506
 URL: https://issues.apache.org/jira/browse/HIVE-23506
 Project: Hive
  Issue Type: Improvement
Reporter: Karen Coppage
Assignee: Karen Coppage


They're only used in test, and since they contain expensive file accesses, it's 
best to remove the temptation to use them





[jira] [Created] (HIVE-23505) Unable to run HivePreparedStatement with "?" in the column name

2020-05-19 Thread Somesh Dhal (Jira)
Somesh Dhal created HIVE-23505:
--

 Summary: Unable to run HivePreparedStatement with "?" in the 
column name
 Key: HIVE-23505
 URL: https://issues.apache.org/jira/browse/HIVE-23505
 Project: Hive
  Issue Type: Bug
 Environment: I am seeing this issue in hive 1.1. Though the version is 
old, looking at the updatesql code at 
[https://github.com/apache/hive/blob/master/jdbc/src/java/org/apache/hive/jdbc/HivePreparedStatement.java]
 it should be reproducible in the latest versions as well. 

The problem is that the ? is treated as a parameter rather than as a literal 
part of the column name. 
Reporter: Somesh Dhal


I have the below table.

hive> desc spc_tbl;
OK
val?ue int

As you can see, I have "?" in the column name. When I run the query 
"select `val?ue` from `default`.`spc_tbl`" through HivePreparedStatement, 
executing it throws the SQLException "parameter #1 is unset".

Please let me know if there is a workaround for this.
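
One way to tell a parameter marker from a literal "?" is to track quoting while scanning the statement. The sketch below is a hypothetical illustration of that idea, not the code in `HivePreparedStatement`. `PlaceholderScanner` is an invented name, and the scanner is deliberately simplified: it treats single quotes, double quotes and backticks as quote characters and does not handle backslash escapes or comments.

```java
import java.util.ArrayList;
import java.util.List;

final class PlaceholderScanner {
    private PlaceholderScanner() {}

    /** Returns the offsets of '?' characters that sit outside quoted regions. */
    static List<Integer> parameterPositions(String sql) {
        List<Integer> positions = new ArrayList<>();
        char quote = 0; // 0 means "not inside a quoted region"
        for (int i = 0; i < sql.length(); i++) {
            char c = sql.charAt(i);
            if (quote != 0) {
                if (c == quote) {
                    quote = 0; // matching closing quote ends the region
                }
            } else if (c == '\'' || c == '"' || c == '`') {
                quote = c; // opening quote starts a region
            } else if (c == '?') {
                positions.add(i); // unquoted '?' is a parameter marker
            }
        }
        return positions;
    }

    public static void main(String[] args) {
        String sql = "select `val?ue` from `default`.`spc_tbl` where `val?ue` = ?";
        System.out.println("parameter markers at: " + parameterPositions(sql));
    }
}
```

With this approach the query from the report contains no parameter markers, so nothing would be left unset.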





[jira] [Created] (HIVE-23504) Propagate query cancellations to druid when a hive query is cancelled

2020-05-19 Thread Miklos Gergely (Jira)
Miklos Gergely created HIVE-23504:
-

 Summary: Propagate query cancellations to druid when a hive query 
is cancelled
 Key: HIVE-23504
 URL: https://issues.apache.org/jira/browse/HIVE-23504
 Project: Hive
  Issue Type: Bug
  Components: Druid integration, Hive
Reporter: Miklos Gergely
Assignee: Miklos Gergely


See Query cancellation here - 
[https://druid.apache.org/docs/latest/querying/querying.html]

Queries can be cancelled explicitly using their unique identifier. If the query 
identifier is set at the time of query, or is otherwise known, the following 
endpoint can be used on the Broker or Router to cancel the query.

 

{{DELETE /druid/v2/{queryId}}}

 
For example, if the query ID is {{abc123}}, the query can be cancelled as 
follows:

 

{{curl -X DELETE "http://host:port/druid/v2/abc123"}}
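
The same request can be issued from Java with the standard `HttpURLConnection`. This is a minimal sketch and not the change proposed for Hive: `DruidCancelSketch` is a hypothetical name, the broker address passed in is a placeholder, and the query ID is assumed to be URL-safe so it is appended without encoding.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URI;

final class DruidCancelSketch {
    private DruidCancelSketch() {}

    /** Builds the cancellation URI: {brokerBase}/druid/v2/{queryId}. */
    static URI cancelUri(String brokerBase, String queryId) {
        String base = brokerBase.endsWith("/")
            ? brokerBase.substring(0, brokerBase.length() - 1)
            : brokerBase;
        // Assumption: queryId contains only URL-safe characters.
        return URI.create(base + "/druid/v2/" + queryId);
    }

    /** Sends the DELETE request and returns the HTTP status code. */
    static int cancel(String brokerBase, String queryId) throws IOException {
        HttpURLConnection conn =
            (HttpURLConnection) cancelUri(brokerBase, queryId).toURL().openConnection();
        try {
            conn.setRequestMethod("DELETE");
            return conn.getResponseCode();
        } finally {
            conn.disconnect();
        }
    }

    public static void main(String[] args) {
        // Only prints the URI; calling cancel(...) needs a reachable Broker or Router.
        System.out.println(cancelUri("http://host:8082", "abc123"));
    }
}
```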





Review Request 72528: ValidTxnManager doesn't consider txns opened and committed between snapshot generation and locking when evaluating ValidTxnListState

2020-05-19 Thread Denys Kuzmenko via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72528/
---

Review request for hive, Peter Varga and Peter Vary.


Bugs: HIVE-23503
https://issues.apache.org/jira/browse/HIVE-23503


Repository: hive-git


Description
---

ValidTxnManager doesn't consider txns opened and committed between snapshot 
generation and locking when evaluating ValidTxnListState. This causes issues 
such as duplicate inserts when a merge insert and an insert run concurrently.


Diffs
-

  ql/src/java/org/apache/hadoop/hive/ql/Driver.java e70c92eef4 
  ql/src/java/org/apache/hadoop/hive/ql/DriverContext.java a8c83fc504 
  ql/src/java/org/apache/hadoop/hive/ql/ValidTxnManager.java 7d49c57dda 
  ql/src/java/org/apache/hadoop/hive/ql/lockmgr/DbTxnManager.java 71afcbdc68 
  ql/src/java/org/apache/hadoop/hive/ql/lockmgr/DummyTxnManager.java 0383881acc 
  ql/src/java/org/apache/hadoop/hive/ql/lockmgr/HiveTxnManager.java 600289f837 
  ql/src/test/org/apache/hadoop/hive/ql/lockmgr/TestDbTxnManager2.java 
8a15b7cc5d 
  
standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.java
 65df9c2ba9 
  
standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/IMetaStoreClient.java
 887d4303f4 
  
standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClientPreCatalog.java
 312936efa8 
  storage-api/src/java/org/apache/hadoop/hive/common/ValidReadTxnList.java 
b8ff03f9c4 
  storage-api/src/java/org/apache/hadoop/hive/common/ValidTxnList.java 
d4c3b09730 


Diff: https://reviews.apache.org/r/72528/diff/1/


Testing
---

DbTxnManager tests.


Thanks,

Denys Kuzmenko



[jira] [Created] (HIVE-23503) ValidTxnManager doesn't consider txns opened and committed between snapshot generation and locking when evaluating ValidTxnListState

2020-05-19 Thread Denys Kuzmenko (Jira)
Denys Kuzmenko created HIVE-23503:
-

 Summary: ValidTxnManager doesn't consider txns opened and 
committed between snapshot generation and locking when evaluating 
ValidTxnListState
 Key: HIVE-23503
 URL: https://issues.apache.org/jira/browse/HIVE-23503
 Project: Hive
  Issue Type: Bug
  Components: Transactions
Reporter: Denys Kuzmenko








[jira] [Created] (HIVE-23502) 【hive on spark】 return code 3 from org.apache.hadoop.hive.ql.exec.spark.SparkTask

2020-05-19 Thread tom (Jira)
tom created HIVE-23502:
--

 Summary: 【hive on spark】 return code 3 from 
org.apache.hadoop.hive.ql.exec.spark.SparkTask
 Key: HIVE-23502
 URL: https://issues.apache.org/jira/browse/HIVE-23502
 Project: Hive
  Issue Type: Bug
 Environment: hadoop 2.7.2   hive 1.2.1  scala 2.9.x   spark 1.3.1
Reporter: tom


Spark UI Log:

 

20/05/19 17:07:11 INFO exec.Utilities: No plan file found: hdfs://mycluster/tmp/hive/root/a3b20597-61d1-47a9-86b1-dde289fded78/hive_2020-05-19_17-06-53_394_4024151029162597012-1/-mr-10003/c586ae6a-eefb-49fd-92b6-7593e57f0a93/map.xml
20/05/19 17:07:11 ERROR executor.Executor: Exception in task 0.0 in stage 0.0 (TID 0)
java.lang.NullPointerException
 at org.apache.hadoop.hive.ql.io.HiveInputFormat.init(HiveInputFormat.java:255)
 at org.apache.hadoop.hive.ql.io.HiveInputFormat.pushProjectionsAndFilters(HiveInputFormat.java:437)
 at org.apache.hadoop.hive.ql.io.HiveInputFormat.pushProjectionsAndFilters(HiveInputFormat.java:430)
 at org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getRecordReader(CombineHiveInputFormat.java:587)
 at org.apache.spark.rdd.HadoopRDD$$anon$1.<init>(HadoopRDD.scala:236)
 at org.apache.spark.rdd.HadoopRDD.compute(HadoopRDD.scala:212)
 at org.apache.spark.rdd.HadoopRDD.compute(HadoopRDD.scala:101)
 at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:277)
 at org.apache.spark.rdd.RDD.iterator(RDD.scala:244)
 at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:35)
 at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:277)
 at org.apache.spark.rdd.RDD.iterator(RDD.scala:244)
 at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:68)
 at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
 at org.apache.spark.scheduler.Task.run(Task.scala:64)
 at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:203)
 at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
 at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
 at java.lang.Thread.run(Thread.java:748)
20/05/19 17:07:11 INFO executor.CoarseGrainedExecutorBackend: Got assigned task 1
20/05/19 17:07:11 INFO executor.Executor: Running task 0.1 in stage 0.0 (TID 1)
20/05/19 17:07:11 INFO rdd.HadoopRDD: Input split: Paths:/user/hive/warehouse/orginfobig_fq/nd=2014/frcode=410503/fqdate=2014-01-01/part-m-0:0+100InputFormatClass: org.apache.hadoop.mapred.TextInputFormat

20/05/19 17:07:11 INFO exec.Utilities: No plan file found: hdfs://mycluster/tmp/hive/root/a3b20597-61d1-47a9-86b1-dde289fded78/hive_2020-05-19_17-06-53_394_4024151029162597012-1/-mr-10003/c586ae6a-eefb-49fd-92b6-7593e57f0a93/map.xml
20/05/19 17:07:11 ERROR executor.Executor: Exception in task 0.1 in stage 0.0 (TID 1)
java.lang.NullPointerException
 at org.apache.hadoop.hive.ql.io.HiveInputFormat.init(HiveInputFormat.java:255)
 at org.apache.hadoop.hive.ql.io.HiveInputFormat.pushProjectionsAndFilters(HiveInputFormat.java:437)
 at org.apache.hadoop.hive.ql.io.HiveInputFormat.pushProjectionsAndFilters(HiveInputFormat.java:430)
 at org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getRecordReader(CombineHiveInputFormat.java:587)
 at org.apache.spark.rdd.HadoopRDD$$anon$1.<init>(HadoopRDD.scala:236)
 at org.apache.spark.rdd.HadoopRDD.compute(HadoopRDD.scala:212)
 at org.apache.spark.rdd.HadoopRDD.compute(HadoopRDD.scala:101)
 at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:277)
 at org.apache.spark.rdd.RDD.iterator(RDD.scala:244)
 at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:35)
 at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:277)
 at org.apache.spark.rdd.RDD.iterator(RDD.scala:244)
 at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:68)
 at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
 at org.apache.spark.scheduler.Task.run(Task.scala:64)
 at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:203)
 at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
 at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
 at java.lang.Thread.run(Thread.java:748)
20/05/19 17:19:19 INFO storage.BlockManager: Removing broadcast 1
20/05/19 17:19:19 INFO storage.BlockManager: Removing block broadcast_1
20/05/19 17:19:19 INFO storage.MemoryStore: Block broadcast_1 of size 189144 dropped from memory (free 1665525606)
20/05/19 17:19:19 INFO storage.BlockManager: Removing block broadcast_1_piece0
20/05/19 17:19:19 INFO storage.MemoryStore: Block broadcast_1_piece0 of size 55965 dropped from memory (free 1665581571)
20/05/19 17:19:19 INFO storage.BlockManagerMaster: Updated info of block broadcast_1_piece0
20/05/19 17:19:19 INFO storage.BlockManager: Removing broadcast 0
20/05/19 17:19:19 INFO storage.BlockManager: Removing block 

[jira] [Created] (HIVE-23501) AOOB in VectorDeserializeRow when complex types are converted to primitive types

2020-05-19 Thread Ramesh Kumar Thangarajan (Jira)
Ramesh Kumar Thangarajan created HIVE-23501:
---

 Summary: AOOB in VectorDeserializeRow when complex types are 
converted to primitive types
 Key: HIVE-23501
 URL: https://issues.apache.org/jira/browse/HIVE-23501
 Project: Hive
  Issue Type: Bug
Reporter: Ramesh Kumar Thangarajan
Assignee: Ramesh Kumar Thangarajan


AOOB in VectorDeserializeRow when complex types are converted to primitive types





[jira] [Created] (HIVE-23500) [Kubernetes] Use Extend NodeId for LLAP registration

2020-05-19 Thread Attila Magyar (Jira)
Attila Magyar created HIVE-23500:


 Summary: [Kubernetes] Use Extend NodeId for LLAP registration
 Key: HIVE-23500
 URL: https://issues.apache.org/jira/browse/HIVE-23500
 Project: Hive
  Issue Type: Bug
  Components: llap
Reporter: Attila Magyar
Assignee: Attila Magyar
 Fix For: 4.0.0


In a Kubernetes environment, where pods can have the same host name and port, 
node trackers can end up retaining an old instance of a pod in their cache. 
In Hive LLAP, the LLAP Tez task scheduler maintains node membership based on 
ZooKeeper registry events, so a NODE_ADDED event followed by a NODE_REMOVED 
event can remove the node/host from the node trackers because the hostname and 
service port are stable. The NODE_REMOVED event in this case is a stale event 
for the already dead pod, which ZK sends only after the session timeout (in 
the case of a non-graceful shutdown). If this sequence of events happens, a 
node/host is completely lost from the scheduler's perspective. 

To support this scenario, Tez can extend YARN's NodeId to include a 
uniqueIdentifier. The LLAP task scheduler can construct the container object 
with this new NodeId, which includes the uniqueIdentifier, so that stale events 
like the one above only remove the host/node that matches the old 
uniqueIdentifier. 
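
The idea can be sketched roughly as follows. This is a minimal illustration 
only, not the Tez or YARN API: UniqueNodeId and NodeTracker are hypothetical 
classes, and keying the tracker by host and port is an assumption made for 
the example.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical stand-in for a NodeId that also carries a unique identifier.
// It is NOT the real YARN NodeId class.
final class UniqueNodeId {
    final String host;
    final int port;
    final String uniqueIdentifier;

    UniqueNodeId(String host, int port, String uniqueIdentifier) {
        this.host = host;
        this.port = port;
        this.uniqueIdentifier = uniqueIdentifier;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof UniqueNodeId)) return false;
        UniqueNodeId other = (UniqueNodeId) o;
        return port == other.port
            && host.equals(other.host)
            && uniqueIdentifier.equals(other.uniqueIdentifier);
    }

    @Override
    public int hashCode() {
        return Objects.hash(host, port, uniqueIdentifier);
    }
}

// Hypothetical node tracker, keyed by the stable host:port pair.
final class NodeTracker {
    private final Map<String, UniqueNodeId> nodes = new HashMap<>();

    private static String key(String host, int port) {
        return host + ":" + port;
    }

    // A newer pod with the same host and port replaces the older entry.
    void onNodeAdded(UniqueNodeId id) {
        nodes.put(key(id.host, id.port), id);
    }

    // Removes the entry only if the event refers to the instance currently
    // tracked; a stale event for an older pod is ignored. Returns true if
    // an entry was removed.
    boolean onNodeRemoved(UniqueNodeId id) {
        return nodes.remove(key(id.host, id.port), id);
    }

    boolean contains(String host, int port) {
        return nodes.containsKey(key(host, port));
    }
}
```

With this shape, a NODE_REMOVED event that carries the old pod's identifier 
leaves the replacement pod registered, because the two-argument Map.remove 
only deletes the entry when the stored value equals the given one.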





Re: Review Request 72281: HIVE-22971: Eliminate file rename in insert-only compactor

2020-05-19 Thread Karen Coppage via Review Board


> On May 18, 2020, 12:51 p.m., Peter Vary wrote:
> > Minor comments only.
> > Thanks for the patch!

Thanks for the review!!


> On May 18, 2020, 12:51 p.m., Peter Vary wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/CompactorMR.java
> > Lines 305 (patched)
> > 
> >
> > Might want to add asserts here to check for a non-null argument
> 
> Karen Coppage wrote:
> I think if the StorageDescriptor is null, an NPE *should* be thrown 
> because that would be a huge problem, "non-null" is in the JavaDoc contract, 
> this method is used once, and I will make it a private method in the next 
> patch.
> 
> Peter Vary wrote:
> Since it becomes private, it is even better! :)
> Normally for public methods I prefer using asserts, so that if someone tries 
> to use this, it is easier to identify what went wrong. Like this:
> ```
> assert sd != null : "Non-null sd is required";
> ```

I see. Thanks!


- Karen


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72281/#review220805
---


On May 19, 2020, 5:58 a.m., Karen Coppage wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/72281/
> ---
> 
> (Updated May 19, 2020, 5:58 a.m.)
> 
> 
> Review request for hive, Laszlo Pinter and Peter Vary.
> 
> 
> Bugs: HIVE-22971
> https://issues.apache.org/jira/browse/HIVE-22971
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> File rename is expensive for object stores, so MM (insert-only) compaction 
> should skip that step when committing and write directly to base_x_cZ or 
> delta_x_y_cZ.
> 
> This also fixes the issue that for MM QB compaction the temp tables were 
> stored under the table directory, and these temp dirs were never cleaned up.
> 
> 
> Diffs
> -
> 
>   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java f5ad3a882b7 
>   itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/txn/compactor/TestCompactor.java b9db1d1bb98 
>   itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/txn/compactor/TestCrudCompactorOnTez.java 89920ccebf4 
>   ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/CompactorMR.java 9410a963518 
>   ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/MajorQueryCompactor.java c70d4f33a80 
>   ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/MinorQueryCompactor.java 4d0e5f703e7 
>   ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/MmMajorQueryCompactor.java 724a4375b75 
>   ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/MmMinorQueryCompactor.java 1cd95f80155 
>   ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/QueryCompactor.java 7f3ccfa04ed 
>   ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/QueryCompactorFactory.java 6542eef58af 
> 
> 
> Diff: https://reviews.apache.org/r/72281/diff/2/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Karen Coppage
> 
>



Re: Review Request 72480: HIVE-23242 Fix flaky tests testHouseKeepingThreadExistence

2020-05-19 Thread Peter Vary via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72480/#review220824
---




standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
Line 10130 (original), 10130 (patched)


Could you please add a Javadoc comment here? I think it is especially 
important as startedBackGroundThreads is a test-only parameter. (Maybe rename 
it to startedBackgroundThread.)



standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
Lines 10468 (patched)


nit: extra space



standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/MetaStoreTestUtils.java
Lines 234 (patched)


Question: Is this log line printed when the HMS is started, but the threads 
are not yet started? Maybe extend the log line with this info?



standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/MetaStoreTestUtils.java
Lines 242 (patched)


Question: Is this again the case when the HMS is started, but the HK 
threads are not started? Maybe extend the log line to say that the HMS is 
started?


- Peter Vary


On May 8, 2020, 9:46 a.m., Peter Varga wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/72480/
> ---
> 
> (Updated May 8, 2020, 9:46 a.m.)
> 
> 
> Review request for hive, Miklos Gergely and Peter Vary.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> Fix the timing to avoid flakiness.
> 
> 
> Diffs
> -
> 
>   itests/hive-unit/src/test/java/org/apache/hadoop/hive/metastore/MetastoreHousekeepingLeaderTestBase.java a39a9c8e04 
>   itests/hive-unit/src/test/java/org/apache/hadoop/hive/metastore/MetastoreTaskThreadAlwaysTestImpl.java 4cd2c58896 
>   itests/hive-unit/src/test/java/org/apache/hadoop/hive/metastore/RemoteMetastoreTaskThreadTestImpl1.java c590b6aad5 
>   itests/hive-unit/src/test/java/org/apache/hadoop/hive/metastore/RemoteMetastoreTaskThreadTestImpl2.java 5b50f66c51 
>   itests/hive-unit/src/test/java/org/apache/hadoop/hive/metastore/TestMetastoreHousekeepingLeader.java 03a8161ea4 
>   itests/hive-unit/src/test/java/org/apache/hadoop/hive/metastore/TestMetastoreHousekeepingLeaderEmptyConfig.java 75ea637503 
>   itests/hive-unit/src/test/java/org/apache/hadoop/hive/metastore/TestMetastoreHousekeepingNonLeader.java 0341d3c03b 
>   standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java 7bba8d6ee6 
>   standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/MetaStoreTestUtils.java 2702e69f86 
> 
> 
> Diff: https://reviews.apache.org/r/72480/diff/2/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Peter Varga
> 
>



Re: Review Request 72281: HIVE-22971: Eliminate file rename in insert-only compactor

2020-05-19 Thread Peter Vary via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72281/#review220822
---


Ship it!

- Peter Vary





Re: Review Request 72281: HIVE-22971: Eliminate file rename in insert-only compactor

2020-05-19 Thread Peter Vary via Review Board


> On May 18, 2020, 12:51 p.m., Peter Vary wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/CompactorMR.java
> > Lines 305 (patched)
> > 
> >
> > Might want to add asserts here to check for a non-null argument
> 
> Karen Coppage wrote:
> I think if the StorageDescriptor is null, an NPE *should* be thrown 
> because that would be a huge problem, "non-null" is in the JavaDoc contract, 
> this method is used once, and I will make it a private method in the next 
> patch.

Since it becomes private, it is even better! :)
Normally for public methods I prefer using asserts, so that if someone tries 
to use this, it is easier to identify what went wrong. Like this:
```
assert sd != null : "Non-null sd is required";
```
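
For comparison, here is a small sketch of the two checks discussed in this 
thread. It assumes a hypothetical helper named describe and a plain Object 
standing in for the StorageDescriptor; neither comes from the patch. Java 
assert statements are only evaluated when assertions are enabled (for example 
with -ea), whereas Objects.requireNonNull always checks:

```java
import java.util.Objects;

// Hypothetical example class; not part of CompactorMR.
final class NullCheckSketch {
    // 'sd' stands in for the StorageDescriptor argument.
    static String describe(Object sd) {
        // Evaluated only when the JVM runs with assertions enabled (-ea);
        // fails with an AssertionError carrying this message.
        assert sd != null : "Non-null sd is required";
        // Always evaluated; fails with a NullPointerException carrying
        // the same message.
        Objects.requireNonNull(sd, "Non-null sd is required");
        return sd.toString();
    }
}
```

Either way the failure names the offending argument, which is the point of 
the suggestion above.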


- Peter


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72281/#review220805
---





Re: Review Request 72112: HIVE-22869 - Add locking benchmark to metastore-tools/metastore-benchmarks

2020-05-19 Thread Denys Kuzmenko via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72112/#review220820
---


Ship it!

- Denys Kuzmenko


On May 18, 2020, 4:59 p.m., Zoltan Chovan wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/72112/
> ---
> 
> (Updated May 18, 2020, 4:59 p.m.)
> 
> 
> Review request for hive, Denys Kuzmenko, Aron Hamvas, Marton Bod, and Peter 
> Vary.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> Add the possibility to run benchmarks on opening locks in the HMS. Currently 
> this change only introduces single-threaded/single-client testing. I'm 
> planning to add multi-client support in a separate change.
> 
> Example parametrisation is as follows:
> hbench -M ".*Lock.*" -N 10 -d hive_test --params 10 --params 100 -d hive_test
> 
> This will create N (10) locks for the first --params number of tables 
> (10), with the second --params number of partitions (100), on T (8) threads, 
> where each thread will start an HMS client and use the -d (hive_test) database.
> 
> 
> Diffs
> -
> 
>   standalone-metastore/metastore-tools/metastore-benchmarks/src/main/java/org/apache/hadoop/hive/metastore/tools/BenchmarkTool.java 2ab9388301 
>   standalone-metastore/metastore-tools/metastore-benchmarks/src/main/java/org/apache/hadoop/hive/metastore/tools/HMSBenchmarks.java d80c290b60 
>   standalone-metastore/metastore-tools/tools-common/src/main/java/org/apache/hadoop/hive/metastore/tools/BenchmarkSuite.java 5211082a7d 
>   standalone-metastore/metastore-tools/tools-common/src/main/java/org/apache/hadoop/hive/metastore/tools/HMSClient.java 4e75edeae6 
>   standalone-metastore/metastore-tools/tools-common/src/main/java/org/apache/hadoop/hive/metastore/tools/Util.java 101d6759c5 
> 
> 
> Diff: https://reviews.apache.org/r/72112/diff/2/
> 
> 
> Testing
> ---
> 
> 
> File Attachments
> 
> 
> HIVE-22869.2.patch
>   https://reviews.apache.org/media/uploaded/files/2020/04/02/5e35e835-f383-495f-9964-e66773fd6a90__HIVE-22869.2.patch
> HIVE-22869.3.patch
>   https://reviews.apache.org/media/uploaded/files/2020/04/09/458beaa7-4743-40fb-a213-1ae4527be823__HIVE-22869.3.patch
> HIVE-22869.4.patch
>   https://reviews.apache.org/media/uploaded/files/2020/04/23/423c45d7-911e-4dd2-80b8-c6d3ad90633c__HIVE-22869.4.patch
> HIVE-22869.5.patch
>   https://reviews.apache.org/media/uploaded/files/2020/05/12/a06f3b8c-f4ca-4067-a079-e0b6185266d4__HIVE-22869.5.patch
> HIVE-22869.6.patch
>   https://reviews.apache.org/media/uploaded/files/2020/05/15/01254e94-1a8d-496d-ab31-628bd5584193__HIVE-22869.6.patch
> HIVE-22869.7.patch
>   https://reviews.apache.org/media/uploaded/files/2020/05/18/343e8c4e-1f7c-4638-849a-15b448bc2515__HIVE-22869.7.patch
> 
> 
> Thanks,
> 
> Zoltan Chovan
> 
>



Re: Review Request 72112: HIVE-22869 - Add locking benchmark to metastore-tools/metastore-benchmarks

2020-05-19 Thread Denys Kuzmenko via Review Board


> On May 19, 2020, 7:28 a.m., Denys Kuzmenko wrote:
> > standalone-metastore/metastore-tools/tools-common/src/main/java/org/apache/hadoop/hive/metastore/tools/Util.java
> > Line 24 (original), 26 (patched)
> > 
> >
> > I can still see wildcard import

However, it's fixed in the latest patch.


- Denys


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72112/#review220818
---





Re: Review Request 72112: HIVE-22869 - Add locking benchmark to metastore-tools/metastore-benchmarks

2020-05-19 Thread Denys Kuzmenko via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72112/#review220818
---




standalone-metastore/metastore-tools/tools-common/src/main/java/org/apache/hadoop/hive/metastore/tools/Util.java
Line 24 (original), 26 (patched)


I can still see wildcard import



standalone-metastore/metastore-tools/tools-common/src/main/java/org/apache/hadoop/hive/metastore/tools/Util.java
Line 40 (original), 37 (patched)


same here


- Denys Kuzmenko

