[jira] [Created] (HIVE-20903) Cleanup code inspection issue on the druid adapter.
slim bouguerra created HIVE-20903:

Summary: Cleanup code inspection issue on the druid adapter.
Key: HIVE-20903
URL: https://issues.apache.org/jira/browse/HIVE-20903
Project: Hive
Issue Type: Improvement
Reporter: slim bouguerra
Assignee: slim bouguerra

This is a simple cleanup of the code and minor refactor. I did not change any of the behavior.

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HIVE-20902) Math.abs(rand.nextInt()) or Math.abs(rand.nextLong()) can return a negative number
slim bouguerra created HIVE-20902:

Summary: Math.abs(rand.nextInt()) or Math.abs(rand.nextLong()) can return a negative number
Key: HIVE-20902
URL: https://issues.apache.org/jira/browse/HIVE-20902
Project: Hive
Issue Type: Bug
Reporter: slim bouguerra
Assignee: slim bouguerra

I see a lot of Math.abs(rand.nextInt()) in the code base, and this can return a negative number: when nextInt() returns Integer.MIN_VALUE, its absolute value is not representable as an int, so Math.abs returns Integer.MIN_VALUE unchanged.
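To make the report concrete, here is a minimal Java sketch of the pitfall and a few common ways around it. The class name AbsPitfall, the helper bucketFor, and the bounds 1000 and 16 are hypothetical and chosen only for illustration; none of this is taken from the Hive code base or from the eventual patch.

```java
import java.util.Random;

public class AbsPitfall {
    /** Maps any int to a bucket index in [0, buckets) without using Math.abs. */
    public static int bucketFor(int value, int buckets) {
        // For a positive divisor, floorMod never returns a negative result.
        return Math.floorMod(value, buckets);
    }

    public static void main(String[] args) {
        // In two's complement, MIN_VALUE has no positive counterpart,
        // so Math.abs returns the argument unchanged.
        System.out.println(Math.abs(Integer.MIN_VALUE) < 0); // true
        System.out.println(Math.abs(Long.MIN_VALUE) < 0);    // true

        Random rand = new Random();
        // Alternative 1: ask for a bounded value directly; result is in [0, 1000).
        int bounded = rand.nextInt(1000);
        // Alternative 2: clear the sign bit; result is in [0, Integer.MAX_VALUE].
        int nonNegative = rand.nextInt() & Integer.MAX_VALUE;
        // Alternative 3: reduce an arbitrary int to a bucket index.
        int bucket = bucketFor(rand.nextInt(), 16);
        System.out.println(bounded >= 0 && nonNegative >= 0 && bucket >= 0); // true
    }
}
```

Which alternative fits depends on the call site: a bounded nextInt is the simplest when only a range is needed, while floorMod suits code that reduces an existing hash or random value to a bucket.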
[jira] [Created] (HIVE-20901) running compactor when there is nothing to do produces duplicate data
Eugene Koifman created HIVE-20901:

Summary: running compactor when there is nothing to do produces duplicate data
Key: HIVE-20901
URL: https://issues.apache.org/jira/browse/HIVE-20901
Project: Hive
Issue Type: Bug
Components: Transactions
Affects Versions: 4.0.0
Reporter: Eugene Koifman
Assignee: Eugene Koifman

Suppose we run minor compaction twice via alter table. The second compaction request should have nothing to do, but I don't think there is a check for that. It's visible in the context of HIVE-20823, where each compactor run produces a delta with a new visibility suffix, so we end up with something like

{noformat}
target/tmp/org.apache.hadoop.hive.ql.TestTxnCommands3-1541810844849/warehouse/t/
├── delete_delta_001_002_v019
│   ├── _orc_acid_version
│   └── bucket_0
├── delete_delta_001_002_v021
│   ├── _orc_acid_version
│   └── bucket_0
├── delta_001_001_
│   ├── _orc_acid_version
│   └── bucket_0
├── delta_001_002_v019
│   ├── _orc_acid_version
│   └── bucket_0
├── delta_001_002_v021
│   ├── _orc_acid_version
│   └── bucket_0
└── delta_002_002_
    ├── _orc_acid_version
    └── bucket_0
{noformat}

i.e. two deltas with the same write ID range, which is bad. This probably happens today as well, but the new run produces a delta with the same name and clobbers the previous one, which may interfere with writers. Need to investigate.
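The condition described above (more than one delta directory for the same write ID range) can be spotted mechanically. The following is a rough Java sketch, not Hive's own directory parser: the class DuplicateDeltaCheck, the method duplicates, and the regular expression are assumptions that only follow the directory names exactly as they appear in the listing.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class DuplicateDeltaCheck {
    // Assumed name shape, taken from the listing: <prefix>_<min>_<max>,
    // optionally followed by "_" or "_v<visibility id>".
    private static final Pattern DELTA =
        Pattern.compile("(delete_delta|delta)_(\\d+)_(\\d+)(_(v\\d+)?)?");

    /** Groups names by prefix and write ID range; keeps only groups with 2+ entries. */
    public static Map<String, List<String>> duplicates(List<String> dirNames) {
        Map<String, List<String>> byRange = new LinkedHashMap<>();
        for (String name : dirNames) {
            Matcher m = DELTA.matcher(name);
            if (!m.matches()) {
                continue; // not a delta directory under the assumed naming
            }
            String key = m.group(1) + "[" + Long.parseLong(m.group(2))
                + "," + Long.parseLong(m.group(3)) + "]";
            byRange.computeIfAbsent(key, k -> new ArrayList<>()).add(name);
        }
        // Drop ranges that are represented by a single directory.
        byRange.values().removeIf(group -> group.size() < 2);
        return byRange;
    }

    public static void main(String[] args) {
        List<String> dirs = Arrays.asList(
            "delete_delta_001_002_v019", "delete_delta_001_002_v021",
            "delta_001_001_", "delta_001_002_v019",
            "delta_001_002_v021", "delta_002_002_");
        // Reports the two ranges that appear twice in the listing.
        duplicates(dirs).forEach((range, names) -> System.out.println(range + " -> " + names));
    }
}
```

A check along these lines would only detect the duplicate after the fact; whether the compactor should instead skip the second request is the open question in the report.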
[jira] [Created] (HIVE-20900) serde2.JsonSerDe no longer supports timestamp.formats
Jason Dere created HIVE-20900:

Summary: serde2.JsonSerDe no longer supports timestamp.formats
Key: HIVE-20900
URL: https://issues.apache.org/jira/browse/HIVE-20900
Project: Hive
Issue Type: Bug
Components: Serializers/Deserializers
Reporter: Jason Dere

Looks like HIVE-18545 broke this. Also, json_serde_tsformat.q only tested the hcat version of JsonSerDe, and the format used in that test was the ISO timestamp format, which apparently is now handled by the default timestamp parsing, so the test was too simple.
Re: Review Request 69294: HIVE-20826 Enhance HiveSemiJoin rule to convert join + group by on left side to Left Semi Join
---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69294/
---

(Updated Nov. 9, 2018, 10:45 p.m.)

Review request for hive, Ashutosh Chauhan and Zoltan Haindrich.

Changes
---
Fixed failing tests + checkstyle fixes

Bugs: HIVE-20826
    https://issues.apache.org/jira/browse/HIVE-20826

Repository: hive-git

Description
---
See jira

Diffs (updated)
---
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveSemiJoinRule.java 7799090d43
  ql/src/java/org/apache/hadoop/hive/ql/parse/CalcitePlanner.java 11c8f5f02c
  ql/src/test/queries/clientpositive/semijoin.q 144069bbe6
  ql/src/test/results/clientpositive/llap/explainuser_1.q.out a87890f89e
  ql/src/test/results/clientpositive/llap/optimize_join_ptp.q.out af55d4d5f3
  ql/src/test/results/clientpositive/llap/semijoin.q.out 531ef46c78
  ql/src/test/results/clientpositive/llap/subquery_in.q.out 3222e2f616
  ql/src/test/results/clientpositive/llap/subquery_views.q.out 4c723dce6b
  ql/src/test/results/clientpositive/llap/vector_mapjoin_reduce.q.out ad57fcd666
  ql/src/test/results/clientpositive/perf/tez/cbo_query14.q.out 9bb4f2e7f2
  ql/src/test/results/clientpositive/perf/tez/query14.q.out c078c271ec
  ql/src/test/results/clientpositive/spark/semijoin.q.out a787bce4b4
  ql/src/test/results/clientpositive/spark/spark_explainuser_1.q.out 0bdc44be8f
  ql/src/test/results/clientpositive/spark/subquery_in.q.out 7063a794cb
  ql/src/test/results/clientpositive/spark/subquery_views.q.out 15400893b2
  ql/src/test/results/clientpositive/spark/vector_mapjoin_reduce.q.out 6634a047f6

Diff: https://reviews.apache.org/r/69294/diff/2/

Changes: https://reviews.apache.org/r/69294/diff/1-2/

Testing
---

Thanks,
Vineet Garg
Re: Review Request 69257: HIVE-20842 Fix logic introduced in HIVE-20660 to estimate statistics for group by
---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69257/
---

(Updated Nov. 9, 2018, 10:44 p.m.)

Review request for hive and Ashutosh Chauhan.

Changes
---
Uploaded wrong patch before

Bugs: HIVE-20842
    https://issues.apache.org/jira/browse/HIVE-20842

Repository: hive-git

Description
---
See jIRA

Diffs (updated)
---
  ql/src/java/org/apache/hadoop/hive/ql/exec/OperatorUtils.java f0b41f36f3
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/stats/annotation/StatsRulesProcFactory.java 32fba6c8ff
  ql/src/test/results/clientpositive/annotate_stats_groupby.q.out fe30d3197c
  ql/src/test/results/clientpositive/cbo_rp_annotate_stats_groupby.q.out 91369679aa
  ql/src/test/results/clientpositive/llap/bucket_groupby.q.out 4e6885d9a2
  ql/src/test/results/clientpositive/llap/bucket_map_join_tez2.q.out 8fe30f0cc5
  ql/src/test/results/clientpositive/llap/check_constraint.q.out 72736807cc
  ql/src/test/results/clientpositive/llap/constraints_optimization.q.out 23bab345ca
  ql/src/test/results/clientpositive/llap/correlationoptimizer1.q.out efa2dd818a
  ql/src/test/results/clientpositive/llap/correlationoptimizer6.q.out 93a3017696
  ql/src/test/results/clientpositive/llap/dynpart_sort_opt_vectorization.q.out bd3c7769a4
  ql/src/test/results/clientpositive/llap/dynpart_sort_optimization2.q.out 30074abaf2
  ql/src/test/results/clientpositive/llap/enforce_constraint_notnull.q.out dca0ebdab8
  ql/src/test/results/clientpositive/llap/except_distinct.q.out c155a73c96
  ql/src/test/results/clientpositive/llap/explainuser_1.q.out a87890f89e
  ql/src/test/results/clientpositive/llap/explainuser_2.q.out 51465324d2
  ql/src/test/results/clientpositive/llap/explainuser_4.q.out bf20c3d8dc
  ql/src/test/results/clientpositive/llap/intersect_all.q.out dbb77d1abc
  ql/src/test/results/clientpositive/llap/intersect_distinct.q.out 604c7bbd63
  ql/src/test/results/clientpositive/llap/intersect_merge.q.out b19fd2c4ec
  ql/src/test/results/clientpositive/llap/limit_pushdown.q.out a84a7b3db3
  ql/src/test/results/clientpositive/llap/limit_pushdown3.q.out 8d5848bcd3
  ql/src/test/results/clientpositive/llap/mrr.q.out a8aceea293
  ql/src/test/results/clientpositive/llap/offset_limit_ppd_optimizer.q.out 133d8af9e9
  ql/src/test/results/clientpositive/llap/parallel.q.out 692bb8ca74
  ql/src/test/results/clientpositive/llap/parallel_colstats.q.out 91a450accf
  ql/src/test/results/clientpositive/llap/ptf.q.out b719f73566
  ql/src/test/results/clientpositive/llap/reduce_deduplicate_distinct.q.out 8d04800040
  ql/src/test/results/clientpositive/llap/reduce_deduplicate_extended.q.out 54dc0f7b8f
  ql/src/test/results/clientpositive/llap/selectDistinctStar.q.out 17601acc2d
  ql/src/test/results/clientpositive/llap/sharedworkext.q.out ca2b4d6750
  ql/src/test/results/clientpositive/llap/sqlmerge_stats.q.out cd178cff4c
  ql/src/test/results/clientpositive/llap/subquery_in.q.out 3222e2f616
  ql/src/test/results/clientpositive/llap/subquery_in_having.q.out 3839696882
  ql/src/test/results/clientpositive/llap/subquery_multi.q.out 7b00d69754
  ql/src/test/results/clientpositive/llap/subquery_notin.q.out 37e7562818
  ql/src/test/results/clientpositive/llap/subquery_scalar.q.out c72e4b2097
  ql/src/test/results/clientpositive/llap/subquery_select.q.out 6870ad1873
  ql/src/test/results/clientpositive/llap/subquery_views.q.out 4c723dce6b
  ql/src/test/results/clientpositive/llap/tez_join_hash.q.out 2ac8400576
  ql/src/test/results/clientpositive/llap/tez_union2.q.out ef0d4bd71a
  ql/src/test/results/clientpositive/llap/tez_union_multiinsert.q.out 05d259b0d9
  ql/src/test/results/clientpositive/llap/unionDistinct_1.q.out b1eec43b72
  ql/src/test/results/clientpositive/llap/unionDistinct_3.q.out 5337820ab0
  ql/src/test/results/clientpositive/llap/vector_adaptor_usage_mode.q.out 52b17cf36b
  ql/src/test/results/clientpositive/llap/vector_char_2.q.out 1ba0ab6920
  ql/src/test/results/clientpositive/llap/vector_distinct_2.q.out e72e398e4b
  ql/src/test/results/clientpositive/llap/vector_groupby_3.q.out 3ea544e4b8
  ql/src/test/results/clientpositive/llap/vector_groupby_grouping_sets2.q.out 7bee405977
  ql/src/test/results/clientpositive/llap/vector_groupby_reduce.q.out 3696cad941
  ql/src/test/results/clientpositive/llap/vector_outer_reference_windowed.q.out fcde000739
  ql/src/test/results/clientpositive/llap/vector_windowing.q.out 8e8c445af7
  ql/src/test/results/clientpositive/llap/vector_windowing_gby2.q.out 5943548a6c
  ql/src/test/results/clientpositive/llap/vectorization_limit.q.out 3dc640a300
  ql/src/test/results/clientpositive/llap/vectorization_short_regress.q.out f929706757
  ql/src/test/results/clientpositive/llap/vectorized_ptf.q.out 56e81aa819
Re: Review Request 69257: HIVE-20842 Fix logic introduced in HIVE-20660 to estimate statistics for group by
---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69257/
---

(Updated Nov. 9, 2018, 10:42 p.m.)

Review request for hive and Ashutosh Chauhan.

Changes
---
Updated tests + checkstyle fixes

Bugs: HIVE-20842
    https://issues.apache.org/jira/browse/HIVE-20842

Repository: hive-git

Description
---
See jIRA

Diffs (updated)
---
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveSemiJoinRule.java 7799090d43
  ql/src/java/org/apache/hadoop/hive/ql/parse/CalcitePlanner.java 11c8f5f02c
  ql/src/test/queries/clientpositive/semijoin.q 144069bbe6
  ql/src/test/results/clientpositive/llap/explainuser_1.q.out a87890f89e
  ql/src/test/results/clientpositive/llap/optimize_join_ptp.q.out af55d4d5f3
  ql/src/test/results/clientpositive/llap/semijoin.q.out 531ef46c78
  ql/src/test/results/clientpositive/llap/subquery_in.q.out 3222e2f616
  ql/src/test/results/clientpositive/llap/subquery_views.q.out 4c723dce6b
  ql/src/test/results/clientpositive/llap/vector_mapjoin_reduce.q.out ad57fcd666
  ql/src/test/results/clientpositive/perf/tez/cbo_query14.q.out 9bb4f2e7f2
  ql/src/test/results/clientpositive/perf/tez/query14.q.out c078c271ec
  ql/src/test/results/clientpositive/spark/semijoin.q.out a787bce4b4
  ql/src/test/results/clientpositive/spark/spark_explainuser_1.q.out 0bdc44be8f
  ql/src/test/results/clientpositive/spark/subquery_in.q.out 7063a794cb
  ql/src/test/results/clientpositive/spark/subquery_views.q.out 15400893b2
  ql/src/test/results/clientpositive/spark/vector_mapjoin_reduce.q.out 6634a047f6

Diff: https://reviews.apache.org/r/69257/diff/4/

Changes: https://reviews.apache.org/r/69257/diff/3-4/

Testing
---

Thanks,
Vineet Garg
Re: Review Request 69257: HIVE-20842 Fix logic introduced in HIVE-20660 to estimate statistics for group by
---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69257/
---

(Updated Nov. 9, 2018, 10:16 p.m.)

Review request for hive and Ashutosh Chauhan.

Changes
---
Updated the logic to fix tests

Bugs: HIVE-20842
    https://issues.apache.org/jira/browse/HIVE-20842

Repository: hive-git

Description
---
See jIRA

Diffs (updated)
---
  ql/src/java/org/apache/hadoop/hive/ql/exec/OperatorUtils.java f0b41f36f3
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/stats/annotation/StatsRulesProcFactory.java 32fba6c8ff
  ql/src/test/results/clientpositive/annotate_stats_groupby.q.out fe30d3197c
  ql/src/test/results/clientpositive/cbo_rp_annotate_stats_groupby.q.out 91369679aa
  ql/src/test/results/clientpositive/llap/bucket_groupby.q.out 4e6885d9a2
  ql/src/test/results/clientpositive/llap/bucket_map_join_tez2.q.out 8fe30f0cc5
  ql/src/test/results/clientpositive/llap/check_constraint.q.out 72736807cc
  ql/src/test/results/clientpositive/llap/constraints_optimization.q.out 23bab345ca
  ql/src/test/results/clientpositive/llap/correlationoptimizer1.q.out efa2dd818a
  ql/src/test/results/clientpositive/llap/correlationoptimizer6.q.out 93a3017696
  ql/src/test/results/clientpositive/llap/dynpart_sort_opt_vectorization.q.out bd3c7769a4
  ql/src/test/results/clientpositive/llap/dynpart_sort_optimization2.q.out 30074abaf2
  ql/src/test/results/clientpositive/llap/enforce_constraint_notnull.q.out dca0ebdab8
  ql/src/test/results/clientpositive/llap/except_distinct.q.out c155a73c96
  ql/src/test/results/clientpositive/llap/explainuser_1.q.out a87890f89e
  ql/src/test/results/clientpositive/llap/explainuser_2.q.out 51465324d2
  ql/src/test/results/clientpositive/llap/explainuser_4.q.out bf20c3d8dc
  ql/src/test/results/clientpositive/llap/intersect_all.q.out dbb77d1abc
  ql/src/test/results/clientpositive/llap/intersect_distinct.q.out 604c7bbd63
  ql/src/test/results/clientpositive/llap/intersect_merge.q.out b19fd2c4ec
  ql/src/test/results/clientpositive/llap/limit_pushdown.q.out a84a7b3db3
  ql/src/test/results/clientpositive/llap/limit_pushdown3.q.out 8d5848bcd3
  ql/src/test/results/clientpositive/llap/mrr.q.out a8aceea293
  ql/src/test/results/clientpositive/llap/offset_limit_ppd_optimizer.q.out 133d8af9e9
  ql/src/test/results/clientpositive/llap/parallel.q.out 692bb8ca74
  ql/src/test/results/clientpositive/llap/parallel_colstats.q.out 91a450accf
  ql/src/test/results/clientpositive/llap/ptf.q.out b719f73566
  ql/src/test/results/clientpositive/llap/reduce_deduplicate_distinct.q.out 8d04800040
  ql/src/test/results/clientpositive/llap/reduce_deduplicate_extended.q.out 54dc0f7b8f
  ql/src/test/results/clientpositive/llap/selectDistinctStar.q.out 17601acc2d
  ql/src/test/results/clientpositive/llap/sharedworkext.q.out ca2b4d6750
  ql/src/test/results/clientpositive/llap/sqlmerge_stats.q.out cd178cff4c
  ql/src/test/results/clientpositive/llap/subquery_in.q.out 3222e2f616
  ql/src/test/results/clientpositive/llap/subquery_in_having.q.out 3839696882
  ql/src/test/results/clientpositive/llap/subquery_multi.q.out 7b00d69754
  ql/src/test/results/clientpositive/llap/subquery_notin.q.out 37e7562818
  ql/src/test/results/clientpositive/llap/subquery_scalar.q.out c72e4b2097
  ql/src/test/results/clientpositive/llap/subquery_select.q.out 6870ad1873
  ql/src/test/results/clientpositive/llap/subquery_views.q.out 4c723dce6b
  ql/src/test/results/clientpositive/llap/tez_join_hash.q.out 2ac8400576
  ql/src/test/results/clientpositive/llap/tez_union2.q.out ef0d4bd71a
  ql/src/test/results/clientpositive/llap/tez_union_multiinsert.q.out 05d259b0d9
  ql/src/test/results/clientpositive/llap/unionDistinct_1.q.out b1eec43b72
  ql/src/test/results/clientpositive/llap/unionDistinct_3.q.out 5337820ab0
  ql/src/test/results/clientpositive/llap/vector_adaptor_usage_mode.q.out 52b17cf36b
  ql/src/test/results/clientpositive/llap/vector_char_2.q.out 1ba0ab6920
  ql/src/test/results/clientpositive/llap/vector_distinct_2.q.out e72e398e4b
  ql/src/test/results/clientpositive/llap/vector_groupby_3.q.out 3ea544e4b8
  ql/src/test/results/clientpositive/llap/vector_groupby_grouping_sets2.q.out 7bee405977
  ql/src/test/results/clientpositive/llap/vector_groupby_reduce.q.out 3696cad941
  ql/src/test/results/clientpositive/llap/vector_outer_reference_windowed.q.out fcde000739
  ql/src/test/results/clientpositive/llap/vector_windowing.q.out 8e8c445af7
  ql/src/test/results/clientpositive/llap/vector_windowing_gby2.q.out 5943548a6c
  ql/src/test/results/clientpositive/llap/vectorization_limit.q.out 3dc640a300
  ql/src/test/results/clientpositive/llap/vectorization_short_regress.q.out f929706757
  ql/src/test/results/clientpositive/llap/vectorized_ptf.q.out 56e81aa819
[jira] [Created] (HIVE-20899) Keytab URI for LLAP YARN Service is restrictive to support HDFS only
Gour Saha created HIVE-20899:

Summary: Keytab URI for LLAP YARN Service is restrictive to support HDFS only
Key: HIVE-20899
URL: https://issues.apache.org/jira/browse/HIVE-20899
Project: Hive
Issue Type: Bug
Components: llap
Affects Versions: 3.1.1
Reporter: Gour Saha

llap-server/src/main/resources/package.py restricts the keytab URI to support HDFS only and hence fails for other FileSystem API conforming FSs like s3a, wasb, gs, etc.
Re: Hive ACID files compacted directory rename on cloud blob stores.
>To me it looks like this problem will be solved by
>https://issues.apache.org/jira/browse/HIVE-20823, but until then, is this
>broken or I have missed a crucial detail?

Yes, S3Guard.

https://www.slideshare.net/hortonworks/s3guard-whats-in-your-consistency-model

However, that's another daemon you need to run (+ provision DynamoDB etc). It is not the most convenient of setups to run on S3.

Cheers,
Gopal
[jira] [Created] (HIVE-20898) For time related functions arguments may not be casted to a non nullable type
Zoltan Haindrich created HIVE-20898:

Summary: For time related functions arguments may not be casted to a non nullable type
Key: HIVE-20898
URL: https://issues.apache.org/jira/browse/HIVE-20898
Project: Hive
Issue Type: Bug
Reporter: Zoltan Haindrich
Assignee: Zoltan Haindrich

{code}
create table t (a string);
insert into t values (null),('1988-11-11');
set hive.cbo.enable=true;
select 'expected 1 (second)', count(1) from t where second(a) is null;
{code}

This may only cause trouble if Calcite exploits the nullability of the datatype.
[jira] [Created] (HIVE-20897) TestJdbcDriver2#testSelectExecAsync2 fails with result set not present error
mahesh kumar behera created HIVE-20897:

Summary: TestJdbcDriver2#testSelectExecAsync2 fails with result set not present error
Key: HIVE-20897
URL: https://issues.apache.org/jira/browse/HIVE-20897
Project: Hive
Issue Type: Bug
Components: Hive
Reporter: mahesh kumar behera
Assignee: mahesh kumar behera

If async prepare is enabled, control is returned to the client before the driver has set whether the query has a result set. However, the current code does not check whether the result-set field has been set when it generates the response for the query.
Hive ACID files compacted directory rename on cloud blob stores.
Hi,

I have a question about using Hive ACID on Hive 3.x against cloud blob stores, and would be much obliged if someone could answer it.

As I understand it, the results of a compaction (major or minor) need to become visible atomically, so that when both uncompacted and compacted directories are present, the reader can pick the compacted ones. To illustrate my question, please consider the following example.

Two delta directories exist:

delta_41_41
delta_40_40

After minor compaction:

delta_40_41
delta_41_41
delta_40_40

The reader will pick delta_40_41, as its range encompasses the rest, and ignore delta_41_41 and delta_40_40.

However, for this to work correctly, the compacted directory must become visible atomically, i.e. it should never be the case that some files in the compacted directory are visible while others are not. This works fine on HDFS, where the rename of a directory is atomic. But on cloud blob stores, where a rename is actually a copy followed by a delete, wouldn't the compacted directory be visible when only a subset (even just one) of its files has been copied, and wouldn't that lead to wrong results, since the reader would pick the incompletely copied compacted delta directory? Or have I understood this incorrectly?

To me it looks like this problem will be solved by https://issues.apache.org/jira/browse/HIVE-20823, but until then, is this broken, or have I missed a crucial detail?

PS: I found https://issues.apache.org/jira/browse/HIVE-20392. Is this trying to solve this exact problem?

Thanks,
Abhishek
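
The selection rule described in the question (prefer the delta whose write ID range encompasses the others) can be sketched in Java roughly as follows. This is a simplified illustration and not Hive's actual reader logic: the class DeltaSelection, the nested type Delta, and the method select are hypothetical names, and directories are modelled only by their [min, max] write ID range.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class DeltaSelection {
    /** A delta directory covering write IDs [min, max]; hypothetical type for this sketch. */
    public static final class Delta {
        final long min;
        final long max;

        public Delta(long min, long max) {
            this.min = min;
            this.max = max;
        }

        public String name() {
            return "delta_" + min + "_" + max;
        }
    }

    /** Returns the deltas that are not fully covered by a wider delta. */
    public static List<Delta> select(List<Delta> all) {
        List<Delta> sorted = new ArrayList<>(all);
        // Lowest min first; for equal min, widest range first.
        sorted.sort(Comparator.comparingLong((Delta d) -> d.min)
            .thenComparing(Comparator.comparingLong((Delta d) -> d.max).reversed()));
        List<Delta> chosen = new ArrayList<>();
        long coveredUpTo = Long.MIN_VALUE;
        for (Delta d : sorted) {
            if (d.max > coveredUpTo) { // contributes write IDs not yet covered
                chosen.add(d);
                coveredUpTo = d.max;
            }
        }
        return chosen;
    }

    public static void main(String[] args) {
        List<Delta> dirs = Arrays.asList(new Delta(41, 41), new Delta(40, 40), new Delta(40, 41));
        for (Delta d : select(dirs)) {
            System.out.println(d.name()); // prints delta_40_41
        }
    }
}
```

Note that the choice here depends only on directory names. That is the crux of the question: if delta_40_41 were listed while only some of its files had been copied, a rule of this shape would still prefer it over the complete delta_40_40 and delta_41_41.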