[jira] [Created] (HIVE-22728) Limit the scope of uniqueness of constraint name to database
Jesus Camacho Rodriguez created HIVE-22728:
----------------------------------------------

Summary: Limit the scope of uniqueness of constraint name to database
Key: HIVE-22728
URL: https://issues.apache.org/jira/browse/HIVE-22728
Project: Hive
Issue Type: Wish
Reporter: Jesus Camacho Rodriguez

Currently, constraint names are globally unique across all databases (the assumption is that this may have been done by design). Nevertheless, though the behavior seems to be implementation-specific, it would be interesting to limit the scope of uniqueness to a single database.

Currently we do not store database information with the constraints. To change the scope to one database, we would need to store the DB_ID in the KEY_CONSTRAINTS table in the metastore when we create a constraint, and add the DB_ID to the PRIMARY KEY of that table. Some minor changes to the error messages would be needed too, since otherwise it would be difficult to identify the correct violation in queries that span multiple databases. Additionally, the SQL scripts will need to be updated to populate the DB_ID when we upgrade to a new version.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
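To make the difference between the two scopes concrete, here is a minimal in-memory sketch. It is not metastore code: the class `ConstraintNameScopeSketch`, its methods, and the `ScopedName` key are hypothetical names chosen for illustration, and the only thing taken from the issue is the idea of keying on (DB_ID, constraint name) instead of the name alone.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

// Hypothetical in-memory model; not the metastore's actual schema or API.
final class ConstraintNameScopeSketch {

    /** Composite key mirroring the proposed (DB_ID, constraint name) primary key. */
    static final class ScopedName {
        final long dbId;
        final String constraintName;

        ScopedName(long dbId, String constraintName) {
            this.dbId = dbId;
            this.constraintName = constraintName;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof ScopedName)) {
                return false;
            }
            ScopedName other = (ScopedName) o;
            return dbId == other.dbId && constraintName.equals(other.constraintName);
        }

        @Override
        public int hashCode() {
            return Objects.hash(dbId, constraintName);
        }
    }

    private final Set<String> globalNames = new HashSet<>();
    private final Set<ScopedName> scopedNames = new HashSet<>();

    /** Global scope: returns false if the name is already used in any database. */
    boolean addGloballyUnique(String constraintName) {
        return globalNames.add(constraintName);
    }

    /** Per-database scope: returns false only if the same database already uses the name. */
    boolean addUniquePerDatabase(long dbId, String constraintName) {
        return scopedNames.add(new ScopedName(dbId, constraintName));
    }
}
```

With the per-database key, the same constraint name can be registered once in each database, while a second registration in the same database is still rejected.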
Re: Review Request 71761: HIVE-22489
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71761/
-----------------------------------------------------------

(Updated Jan. 14, 2020, 8:39 p.m.)

Review request for hive, Jesús Camacho Rodríguez and Zoltan Haindrich.

Bugs: HIVE-22489
    https://issues.apache.org/jira/browse/HIVE-22489

Repository: hive-git

Description
-------

Reduce Sink operator orders nulls first
===
1. Set the default null sort order from the Hive config when creating the Reduce Sink Desc.
2. Hash join uses `org.apache.hadoop.hive.serde2.binarysortable.fast.BinarySortableSerializeWrite` or `BinarySortableDeserializeRead` for serializing keys. For bigtable keys, ascending order with nulls first was always hardcoded. This patch changes this behaviour to use `Operator.getConf().TableDesc.getProperties()` (in this case `MapJoinOperator`) to set up the ordering in `BinarySortableSerializeWrite`.
3. Use the null ordering set in ReduceRecordSource at the Reduce phase when comparing keys in `CommonMergeJoinOperator` (this is the null ordering of the children Reduce Sink operators).

Diffs (updated)
-----

  accumulo-handler/src/test/results/positive/accumulo_queries.q.out 7c552621f2
  contrib/src/test/results/clientpositive/udaf_example_group_concat.q.out 6846720d95
  hbase-handler/src/test/results/positive/hbase_queries.q.out a32ef81a7b
  itests/hive-blobstore/src/test/results/clientpositive/write_final_output_blobstore.q.out e997fa65cf
  kudu-handler/src/test/results/positive/kudu_complex_queries.q.out 73fc3e514f
  ql/src/java/org/apache/hadoop/hive/ql/exec/CommonMergeJoinOperator.java 3974627a24
  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/ReduceRecordSource.java 72446afeda
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/mapjoin/VectorMapJoinCommonOperator.java 2380d936f2
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/mapjoin/VectorMapJoinInnerBigOnlyMultiKeyOperator.java f587517b08
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/mapjoin/VectorMapJoinInnerMultiKeyOperator.java cdee3fd957
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/mapjoin/VectorMapJoinLeftSemiMultiKeyOperator.java e5d9fdae19
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/mapjoin/VectorMapJoinOuterMultiKeyOperator.java 29c531bd51
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/mapjoin/fast/VectorMapJoinFastLongHashMap.java a4cda921a5
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/mapjoin/fast/VectorMapJoinFastLongHashMultiSet.java 43f093d906
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/mapjoin/fast/VectorMapJoinFastLongHashSet.java 8dce5b82d3
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/mapjoin/fast/VectorMapJoinFastLongHashTable.java a35401d9b2
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/mapjoin/fast/VectorMapJoinFastStringCommon.java 1b108a8c14
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/mapjoin/fast/VectorMapJoinFastStringHashMap.java 446feb2526
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/mapjoin/fast/VectorMapJoinFastStringHashMultiSet.java c28ef9be2b
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/mapjoin/fast/VectorMapJoinFastStringHashSet.java 17bd5fda93
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/mapjoin/fast/VectorMapJoinFastTableContainer.java 4ab8902a3f
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/mapjoin/optimized/VectorMapJoinOptimizedCreateHashTable.java 21c355cb42
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/mapjoin/optimized/VectorMapJoinOptimizedLongCommon.java de1ee15c3b
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/mapjoin/optimized/VectorMapJoinOptimizedLongHashMap.java 42573f0898
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/mapjoin/optimized/VectorMapJoinOptimizedLongHashMultiSet.java 829a03737d
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/mapjoin/optimized/VectorMapJoinOptimizedLongHashSet.java 18e1435019
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/mapjoin/optimized/VectorMapJoinOptimizedStringCommon.java da0e8365b1
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/mapjoin/optimized/VectorMapJoinOptimizedStringHashMap.java 6c4d8a81d1
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/mapjoin/optimized/VectorMapJoinOptimizedStringHashMultiSet.java a6b754c7eb
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/mapjoin/optimized/VectorMapJoinOptimizedStringHashSet.java fdcd83dde7
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/reducesink/VectorReduceSinkCommonOperator.java 5c409e4573
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/CountDistinctRewriteProc.java a50ad78e8f
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/DynamicPartitionPruningOptimization.java 0f95d7788c
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/ReduceSinkMapJoinProc.java 89b55001f0
  ql/src/java/org/apache/hadoop/hi
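The description in Review Request 71761 turns on one point: every operator that sorts or compares join keys has to use the same, configurable null ordering rather than a hardcoded "nulls first". The sketch below illustrates only that idea with the JDK's `Comparator.nullsFirst` and `Comparator.nullsLast`. It does not use Hive's `BinarySortableSerializeWrite` or any other Hive class, and the names `NullOrderingSketch`, `NullOrder`, `keyComparator`, and `sortKeys` are hypothetical.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Illustrative only: plain JDK comparators standing in for a configured null sort order.
final class NullOrderingSketch {

    /** Hypothetical stand-in for a null sort order read from configuration. */
    enum NullOrder { NULLS_FIRST, NULLS_LAST }

    /** Builds an ascending key comparator that places nulls according to the given order. */
    static <T extends Comparable<T>> Comparator<T> keyComparator(NullOrder order) {
        Comparator<T> natural = Comparator.naturalOrder();
        if (order == NullOrder.NULLS_FIRST) {
            return Comparator.nullsFirst(natural);
        }
        return Comparator.nullsLast(natural);
    }

    /** Returns a sorted copy of the keys; the input list is left untouched. */
    static List<Integer> sortKeys(List<Integer> keys, NullOrder order) {
        List<Integer> copy = new ArrayList<>(keys);
        Comparator<Integer> comparator = keyComparator(order);
        copy.sort(comparator);
        return copy;
    }
}
```

If one side of a merge join were sorted with `NULLS_FIRST` and the other compared with `NULLS_LAST`, the two sides would disagree on where null keys sit in the stream, which is the kind of mismatch the patch description aims to avoid.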
[jira] [Created] (HIVE-22727) Add hive db schema changes introduced in HIVE-21884 to the schema upgrade scripts
Zoltan Chovan created HIVE-22727:
------------------------------------

Summary: Add hive db schema changes introduced in HIVE-21884 to the schema upgrade scripts
Key: HIVE-22727
URL: https://issues.apache.org/jira/browse/HIVE-22727
Project: Hive
Issue Type: Bug
Reporter: Zoltan Chovan
Assignee: Zoltan Chovan
Review Request 71995: TopN Key optimizer should use array instead of priority queue
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71995/
-----------------------------------------------------------

Review request for hive, Gopal V, Jesús Camacho Rodríguez, and Krisztian Kasa.

Bugs: HIVE-22726
    https://issues.apache.org/jira/browse/HIVE-22726

Repository: hive-git

Description
-------

The TopN key optimizer currently uses a priority queue to keep track of the largest/smallest rows. Its maximum size is the same as the user-specified limit. This should be replaced with a more cache-line-friendly array with a small (128) maximum size, to see how much performance is gained.

Diffs
-----

  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java e7724f9084f
  ql/src/java/org/apache/hadoop/hive/ql/exec/TopNKeyFilter.java 4998766f064
  ql/src/java/org/apache/hadoop/hive/ql/exec/TopNKeyOperator.java b7c12502204
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorTopNKeyOperator.java 5faa038c18d
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/topnkey/TopNKeyProcessor.java ce6efa49192
  ql/src/java/org/apache/hadoop/hive/ql/parse/TezCompiler.java ff815434f0c

Diff: https://reviews.apache.org/r/71995/diff/1/

Testing
-------

Tested with the following query:

  use tpcds_bin_partitioned_orc_100;
  set hive.optimize.topnkey=true;
  set hive.optimize.topnkey.max=5;

  select i_item_id, s_state, grouping(s_state) g_state,
         avg(ss_quantity) agg1, avg(ss_list_price) agg2,
         avg(ss_coupon_amt) agg3, avg(ss_sales_price) agg4
  from store_sales, customer_demographics, date_dim, store, item
  where ss_sold_date_sk = d_date_sk
    and ss_item_sk = i_item_sk
    and ss_store_sk = s_store_sk
    and ss_cdemo_sk = cd_demo_sk
  group by rollup (i_item_id, s_state)
  order by i_item_id, s_state
  limit 5;

Results:

  enabled: 5 rows selected (715.26 seconds)
  enabled: 5 rows selected (605.888 seconds)
  disabled: 5 rows selected (1208.168 seconds)
  disabled: 5 rows selected (1219.482 seconds)

Thanks,

Attila Magyar
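As a rough illustration of the idea in this review request, the following sketch tracks the best N distinct keys in a small sorted array instead of a priority queue. It is a generic example under stated assumptions, not the code in the patch: the class `ArrayTopNSketch` and its `canForward` method are hypothetical names, and the real `TopNKeyFilter` may differ in how it copies, compares, and evicts keys.

```java
import java.util.Comparator;

// Generic sketch of a bounded, sorted-array top-N key tracker; not Hive's TopNKeyFilter.
final class ArrayTopNSketch<T> {
    private final Object[] keys;                 // sorted, best key at index 0
    private final Comparator<? super T> comparator;
    private int size;                            // number of slots in use

    ArrayTopNSketch(int topN, Comparator<? super T> comparator) {
        if (topN <= 0) {
            throw new IllegalArgumentException("topN must be positive");
        }
        this.keys = new Object[topN];
        this.comparator = comparator;
    }

    /** Returns true if the key is among the best N distinct keys seen so far. */
    @SuppressWarnings("unchecked")
    boolean canForward(T key) {
        // Binary search over the contiguous array for the key or its insertion point.
        int lo = 0;
        int hi = size;
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            int c = comparator.compare(key, (T) keys[mid]);
            if (c == 0) {
                return true;                     // key is already tracked
            }
            if (c < 0) {
                hi = mid;
            } else {
                lo = mid + 1;
            }
        }
        if (lo == keys.length) {
            return false;                        // array is full and the key is worse than all tracked keys
        }
        // Shift the tail right by one slot; when the array is full the worst key falls off the end.
        int last = Math.min(size, keys.length - 1);
        System.arraycopy(keys, lo, keys, lo + 1, last - lo);
        keys[lo] = key;
        if (size < keys.length) {
            size++;
        }
        return true;
    }
}
```

Insertion costs a shift of at most N elements, which is why the approach only makes sense with a small cap such as the 128 mentioned in the description. In exchange, lookups touch one contiguous block of memory.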
Re: Review Request 71888: HIVE-22568: Process compaction candidates in parallel by the Initiator
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71888/#review219251
-----------------------------------------------------------

Ship it!

Ship It!

- Peter Vary

On Dec. 6, 2019, 12:54 p.m., Denys Kuzmenko wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/71888/
> -----------------------------------------------------------
>
> (Updated Dec. 6, 2019, 12:54 p.m.)
>
> Review request for hive, Laszlo Pinter and Peter Vary.
>
> Bugs: HIVE-22568
>     https://issues.apache.org/jira/browse/HIVE-22568
>
> Repository: hive-git
>
> Description
> -------
>
> `checkForCompaction` includes many file metadata checks and may be expensive.
> Therefore, it makes sense to use a thread pool here and run
> `checkForCompaction` in parallel.
>
> Diffs
> -----
>
>   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 4393a2825e
>   ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/Initiator.java 7a0e32463d
>   ql/src/test/org/apache/hadoop/hive/ql/txn/compactor/TestInitiator.java 564839324f
>
> Diff: https://reviews.apache.org/r/71888/diff/1/
>
> Testing
> -------
>
> unit test
>
> Thanks,
>
> Denys Kuzmenko
>
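The approach described in the quoted request, running an expensive per-candidate check on a thread pool, can be sketched with the JDK's `ExecutorService` and `CompletableFuture`. This is an assumption-laden illustration rather than the patch itself: `ParallelCandidateCheckSketch`, `selectInParallel`, and the `expensiveCheck` predicate are hypothetical, and the actual `Initiator` change may manage its pool, errors, and configuration differently.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Predicate;

// Generic sketch: evaluate an expensive check for each candidate on a fixed-size pool.
final class ParallelCandidateCheckSketch {

    /**
     * Runs expensiveCheck for every candidate in parallel and returns, in input order,
     * the candidates for which the check returned true.
     */
    static <C> List<C> selectInParallel(List<C> candidates, Predicate<C> expensiveCheck, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            // Submit one task per candidate; the pool bounds how many run at once.
            List<CompletableFuture<Boolean>> futures = new ArrayList<>();
            for (C candidate : candidates) {
                futures.add(CompletableFuture.supplyAsync(() -> expensiveCheck.test(candidate), pool));
            }
            // Collect results in submission order so the output order is deterministic.
            List<C> selected = new ArrayList<>();
            for (int i = 0; i < candidates.size(); i++) {
                if (futures.get(i).join()) {
                    selected.add(candidates.get(i));
                }
            }
            return selected;
        } finally {
            pool.shutdown();
        }
    }
}
```

A long-running service would more likely create the pool once and reuse it across cycles; creating and shutting it down per call keeps this sketch self-contained.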
[jira] [Created] (HIVE-22726) TopN Key optimizer should use array instead of priority queue
Attila Magyar created HIVE-22726:
------------------------------------

Summary: TopN Key optimizer should use array instead of priority queue
Key: HIVE-22726
URL: https://issues.apache.org/jira/browse/HIVE-22726
Project: Hive
Issue Type: Bug
Components: Hive
Reporter: Attila Magyar
Assignee: Attila Magyar
Fix For: 4.0.0

The TopN key optimizer currently uses a priority queue to keep track of the largest/smallest rows. Its maximum size is the same as the user-specified limit. This should be replaced with a more cache-line-friendly array with a small (128) maximum size, to see how much performance is gained.