date:20200114

[jira] [Created] (HIVE-22728) Limit the scope of uniqueness of constraint name to database

2020-01-14 Thread Jesus Camacho Rodriguez (Jira)

Jesus Camacho Rodriguez created HIVE-22728:
--

 Summary: Limit the scope of uniqueness of constraint name to 
database
 Key: HIVE-22728
 URL: https://issues.apache.org/jira/browse/HIVE-22728
 Project: Hive
  Issue Type: Wish
Reporter: Jesus Camacho Rodriguez


Currently, constraint names are globally unique across all databases 
(the assumption is that this may have been done by design). Although this 
behavior seems to be implementation-specific, it would be interesting to limit 
the scope of uniqueness to a single database.
Currently we do not store database information with the constraints. To change 
the scope to one database, we would need to store the DB_ID in the KEY_CONSTRAINTS 
table in the metastore when we create a constraint, and add the DB_ID to the PRIMARY 
KEY of that table. Some minor changes to the error messages would be needed 
too, since otherwise it would be difficult to identify the correct violation in 
queries that span multiple databases. Additionally, the SQL scripts will 
need to be updated to populate the DB_ID when we upgrade to a new version.
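
As a rough illustration of the scoping change described above, the following Java sketch contrasts a registry keyed by constraint name alone with one keyed by (database id, constraint name). `ConstraintScopeSketch`, `Key`, `registerGlobal`, and `registerPerDb` are hypothetical names, and the in-memory sets only model the uniqueness rule; this is not metastore code.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

/** Hypothetical model of constraint-name uniqueness; not Hive metastore code. */
class ConstraintScopeSketch {

    /** Composite key standing in for (DB_ID, constraint name). */
    static final class Key {
        final long dbId;
        final String name;

        Key(long dbId, String name) {
            this.dbId = dbId;
            this.name = name;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof Key)) return false;
            Key other = (Key) o;
            return dbId == other.dbId && name.equals(other.name);
        }

        @Override
        public int hashCode() {
            return Objects.hash(dbId, name);
        }
    }

    private final Set<String> globalNames = new HashSet<>();
    private final Set<Key> perDbNames = new HashSet<>();

    /** Models the current behaviour: a name may be used once across all databases. */
    boolean registerGlobal(String name) {
        return globalNames.add(name);
    }

    /** Models the proposed behaviour: a name may be used once per database. */
    boolean registerPerDb(long dbId, String name) {
        return perDbNames.add(new Key(dbId, name));
    }
}
```

With the composite key, registering the same name under two different database ids succeeds, while a second registration under the same id is rejected.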



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

Re: Review Request 71761: HIVE-22489

2020-01-14 Thread Krisztian Kasa


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71761/
---

(Updated Jan. 14, 2020, 8:39 p.m.)


Review request for hive, Jesús Camacho Rodríguez and Zoltan Haindrich.


Bugs: HIVE-22489
https://issues.apache.org/jira/browse/HIVE-22489


Repository: hive-git


Description
---

Reduce Sink operator orders nulls first
===
1. Set the default null sort order from the Hive config when creating the Reduce Sink 
Desc.
2. Hash join uses 
`org.apache.hadoop.hive.serde2.binarysortable.fast.BinarySortableSerializeWrite`
 or `BinarySortableDeserializeRead` for serializing keys. For bigtable keys, 
ascending order with nulls first was always hardcoded. This patch changes 
this behaviour to use `Operator.getConf().TableDesc.getProperties()` (in 
this case on `MapJoinOperator`) to set up the ordering in `BinarySortableSerializeWrite`.
3. Use the null ordering set in ReduceRecordSource at the Reduce phase when comparing 
keys in `CommonMergeJoinOperator`. (This is the null ordering of the child 
Reduce Sink operators.)
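
To make the effect of a configurable null sort order concrete, here is a small standalone Java sketch built on `java.util.Comparator`. The class and method names (`NullOrderSketch`, `keyComparator`, `sortKeys`) are hypothetical; the sketch does not use `BinarySortableSerializeWrite` or any other Hive API, and only shows how the same keys sort under nulls-first versus nulls-last.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

/** Hypothetical illustration of null ordering; independent of Hive's serializers. */
class NullOrderSketch {

    /** Builds an ascending key comparator with the requested null placement. */
    static Comparator<Integer> keyComparator(boolean nullsFirst) {
        Comparator<Integer> natural = Comparator.naturalOrder();
        return nullsFirst ? Comparator.nullsFirst(natural) : Comparator.nullsLast(natural);
    }

    /** Returns a sorted copy of the keys; the input list is left untouched. */
    static List<Integer> sortKeys(List<Integer> keys, boolean nullsFirst) {
        List<Integer> copy = new ArrayList<>(keys);
        copy.sort(keyComparator(nullsFirst));
        return copy;
    }

    public static void main(String[] args) {
        List<Integer> keys = Arrays.asList(3, null, 1);
        System.out.println(sortKeys(keys, true));   // [null, 1, 3]
        System.out.println(sortKeys(keys, false));  // [1, 3, null]
    }
}
```

If two sides of a merge join compared keys with different null placements, they would disagree on where null keys fall in the sorted stream, which is why the ordering has to be taken from one shared setting.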


Diffs (updated)
-

  accumulo-handler/src/test/results/positive/accumulo_queries.q.out 7c552621f2
  contrib/src/test/results/clientpositive/udaf_example_group_concat.q.out 6846720d95
  hbase-handler/src/test/results/positive/hbase_queries.q.out a32ef81a7b
  itests/hive-blobstore/src/test/results/clientpositive/write_final_output_blobstore.q.out e997fa65cf
  kudu-handler/src/test/results/positive/kudu_complex_queries.q.out 73fc3e514f
  ql/src/java/org/apache/hadoop/hive/ql/exec/CommonMergeJoinOperator.java 3974627a24
  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/ReduceRecordSource.java 72446afeda
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/mapjoin/VectorMapJoinCommonOperator.java 2380d936f2
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/mapjoin/VectorMapJoinInnerBigOnlyMultiKeyOperator.java f587517b08
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/mapjoin/VectorMapJoinInnerMultiKeyOperator.java cdee3fd957
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/mapjoin/VectorMapJoinLeftSemiMultiKeyOperator.java e5d9fdae19
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/mapjoin/VectorMapJoinOuterMultiKeyOperator.java 29c531bd51
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/mapjoin/fast/VectorMapJoinFastLongHashMap.java a4cda921a5
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/mapjoin/fast/VectorMapJoinFastLongHashMultiSet.java 43f093d906
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/mapjoin/fast/VectorMapJoinFastLongHashSet.java 8dce5b82d3
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/mapjoin/fast/VectorMapJoinFastLongHashTable.java a35401d9b2
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/mapjoin/fast/VectorMapJoinFastStringCommon.java 1b108a8c14
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/mapjoin/fast/VectorMapJoinFastStringHashMap.java 446feb2526
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/mapjoin/fast/VectorMapJoinFastStringHashMultiSet.java c28ef9be2b
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/mapjoin/fast/VectorMapJoinFastStringHashSet.java 17bd5fda93
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/mapjoin/fast/VectorMapJoinFastTableContainer.java 4ab8902a3f
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/mapjoin/optimized/VectorMapJoinOptimizedCreateHashTable.java 21c355cb42
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/mapjoin/optimized/VectorMapJoinOptimizedLongCommon.java de1ee15c3b
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/mapjoin/optimized/VectorMapJoinOptimizedLongHashMap.java 42573f0898
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/mapjoin/optimized/VectorMapJoinOptimizedLongHashMultiSet.java 829a03737d
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/mapjoin/optimized/VectorMapJoinOptimizedLongHashSet.java 18e1435019
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/mapjoin/optimized/VectorMapJoinOptimizedStringCommon.java da0e8365b1
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/mapjoin/optimized/VectorMapJoinOptimizedStringHashMap.java 6c4d8a81d1
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/mapjoin/optimized/VectorMapJoinOptimizedStringHashMultiSet.java a6b754c7eb
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/mapjoin/optimized/VectorMapJoinOptimizedStringHashSet.java fdcd83dde7
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/reducesink/VectorReduceSinkCommonOperator.java 5c409e4573
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/CountDistinctRewriteProc.java a50ad78e8f
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/DynamicPartitionPruningOptimization.java 0f95d7788c
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/ReduceSinkMapJoinProc.java 89b55001f0
  
ql/src/java/org/apache/hadoop/hi

[jira] [Created] (HIVE-22727) Add hive db schema changes introduced in HIVE-21884 to the schema upgrade scripts

2020-01-14 Thread Zoltan Chovan (Jira)

Zoltan Chovan created HIVE-22727:


 Summary: Add hive db schema changes introduced in HIVE-21884 to 
the schema upgrade scripts
 Key: HIVE-22727
 URL: https://issues.apache.org/jira/browse/HIVE-22727
 Project: Hive
  Issue Type: Bug
Reporter: Zoltan Chovan
Assignee: Zoltan Chovan







Review Request 71995: TopN Key optimizer should use array instead of priority queue

2020-01-14 Thread Attila Magyar


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71995/
---

Review request for hive, Gopal V, Jesús Camacho Rodríguez, and Krisztian Kasa.


Bugs: HIVE-22726
https://issues.apache.org/jira/browse/HIVE-22726


Repository: hive-git


Description
---

The TopN key optimizer currently uses a priority queue to keep track of the 
largest/smallest rows. Its maximum size is the same as the user-specified limit. 
The queue should be replaced with a more cache-line-friendly array with a small 
(128) maximum size, to see how much performance is gained.
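
As a minimal sketch of the array-based idea described above (not the patch itself), the following Java class keeps the n smallest `long` keys in a small sorted array, using binary search plus `System.arraycopy` instead of a heap. The class name `ArrayTopNSketch` and its methods are hypothetical; real keys in Hive are not plain longs, and this simplified version stores duplicates as separate entries and rejects ties with the largest kept key.

```java
import java.util.Arrays;

/** Hypothetical sketch: keep the n smallest long keys in a small sorted array. */
class ArrayTopNSketch {
    private final long[] keys;   // sorted ascending; only keys[0..size) is valid
    private int size;

    ArrayTopNSketch(int capacity) {
        if (capacity <= 0) throw new IllegalArgumentException("capacity must be positive");
        this.keys = new long[capacity];
    }

    /** Offers a key; returns false when the array is full and the key is not smaller than the largest kept key. */
    boolean offer(long key) {
        if (size == keys.length && key >= keys[size - 1]) {
            return false;
        }
        int pos = Arrays.binarySearch(keys, 0, size, key);
        if (pos < 0) pos = -(pos + 1);            // insertion point for a missing key
        int last = Math.min(size, keys.length - 1);
        // Shift the tail right by one slot; when full this drops the largest key.
        System.arraycopy(keys, pos, keys, pos + 1, last - pos);
        keys[pos] = key;
        if (size < keys.length) size++;
        return true;
    }

    /** Returns a copy of the keys currently kept, in ascending order. */
    long[] snapshot() {
        return Arrays.copyOf(keys, size);
    }
}
```

Insertion is linear in the array length, but with a small capacity the whole array stays contiguous in memory, which is the cache-friendliness argument made in the description.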


Diffs
-

  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java e7724f9084f
  ql/src/java/org/apache/hadoop/hive/ql/exec/TopNKeyFilter.java 4998766f064
  ql/src/java/org/apache/hadoop/hive/ql/exec/TopNKeyOperator.java b7c12502204
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorTopNKeyOperator.java 5faa038c18d
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/topnkey/TopNKeyProcessor.java ce6efa49192
  ql/src/java/org/apache/hadoop/hive/ql/parse/TezCompiler.java ff815434f0c


Diff: https://reviews.apache.org/r/71995/diff/1/


Testing
---

Tested with the following query:


use tpcds_bin_partitioned_orc_100;
set hive.optimize.topnkey=true;
set hive.optimize.topnkey.max=5;

select  i_item_id,
s_state, grouping(s_state) g_state,
avg(ss_quantity) agg1,
avg(ss_list_price) agg2,
avg(ss_coupon_amt) agg3,
avg(ss_sales_price) agg4
 from store_sales, customer_demographics, date_dim, store, item
 where ss_sold_date_sk = d_date_sk and
   ss_item_sk = i_item_sk and
   ss_store_sk = s_store_sk and
   ss_cdemo_sk = cd_demo_sk
 group by rollup (i_item_id, s_state)
 order by i_item_id
 ,s_state
 limit 5;


Results:
  enabled:   5 rows selected (715.26 seconds)
  enabled:   5 rows selected (605.888 seconds)
  disabled:  5 rows selected (1208.168 seconds)
  disabled:  5 rows selected (1219.482 seconds)


Thanks,

Attila Magyar

Re: Review Request 71888: HIVE-22568: Process compaction candidates in parallel by the Initiator

2020-01-14 Thread Peter Vary via Review Board


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71888/#review219251
---


Ship it!





- Peter Vary


On Dec. 6, 2019, 12:54 p.m., Denys Kuzmenko wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/71888/
> ---
> 
> (Updated Dec. 6, 2019, 12:54 p.m.)
> 
> 
> Review request for hive, Laszlo Pinter and Peter Vary.
> 
> 
> Bugs: HIVE-22568
> https://issues.apache.org/jira/browse/HIVE-22568
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> `checkForCompaction` includes many file metadata checks and may be expensive. 
> Therefore, it makes sense to use a thread pool here and run 
> `checkForCompaction` in parallel.
> 
> 
> Diffs
> -
> 
>   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 4393a2825e
>   ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/Initiator.java 7a0e32463d
>   ql/src/test/org/apache/hadoop/hive/ql/txn/compactor/TestInitiator.java 564839324f
> 
> 
> Diff: https://reviews.apache.org/r/71888/diff/1/
> 
> 
> Testing
> ---
> 
> unit test
> 
> 
> Thanks,
> 
> Denys Kuzmenko
> 
>
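
The idea in the quoted description, running an expensive per-candidate check on a thread pool, can be sketched roughly as follows. This is an assumption-laden illustration rather than the `Initiator` change: `ParallelCheckSketch`, `filterInParallel`, and the `Predicate` standing in for `checkForCompaction` are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Predicate;

/** Hypothetical sketch of checking candidates in parallel; not the Initiator code. */
class ParallelCheckSketch {

    /**
     * Runs the (possibly expensive) check for every candidate on a fixed-size
     * pool and returns the candidates that passed, in input order.
     */
    static <T> List<T> filterInParallel(List<T> candidates, Predicate<T> check, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Boolean>> results = new ArrayList<>();
            for (T candidate : candidates) {
                results.add(pool.submit(() -> check.test(candidate)));
            }
            List<T> accepted = new ArrayList<>();
            for (int i = 0; i < candidates.size(); i++) {
                if (results.get(i).get()) {      // blocks until that check has finished
                    accepted.add(candidates.get(i));
                }
            }
            return accepted;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();  // preserve the interrupt status
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdown();
        }
    }
}
```

Collecting the futures in submission order keeps the output deterministic even though the checks themselves may finish in any order.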

[jira] [Created] (HIVE-22726) TopN Key optimizer should use array instead of priority queue

2020-01-14 Thread Attila Magyar (Jira)

Attila Magyar created HIVE-22726:


 Summary: TopN Key optimizer should use array instead of priority 
queue
 Key: HIVE-22726
 URL: https://issues.apache.org/jira/browse/HIVE-22726
 Project: Hive
  Issue Type: Bug
  Components: Hive
Reporter: Attila Magyar
Assignee: Attila Magyar
 Fix For: 4.0.0


The TopN key optimizer currently uses a priority queue to keep track of the 
largest/smallest rows. Its maximum size is the same as the user-specified limit. 
The queue should be replaced with a more cache-line-friendly array with a small 
(128) maximum size, to see how much performance is gained.



