HIVE-14367 : Estimated size for constant nulls is 0 (Ashutosh Chauhan via Jesus Camacho Rodriguez)
Signed-off-by: Ashutosh Chauhan <[email protected]> Project: http://git-wip-us.apache.org/repos/asf/hive/repo Commit: http://git-wip-us.apache.org/repos/asf/hive/commit/8efe6f7f Tree: http://git-wip-us.apache.org/repos/asf/hive/tree/8efe6f7f Diff: http://git-wip-us.apache.org/repos/asf/hive/diff/8efe6f7f Branch: refs/heads/master Commit: 8efe6f7f7f1ad391bd9b7689eb8fd36c7eda0dd9 Parents: 56a30f1 Author: Ashutosh Chauhan <[email protected]> Authored: Wed Jul 27 19:21:11 2016 -0700 Committer: Ashutosh Chauhan <[email protected]> Committed: Mon Aug 1 00:45:15 2016 -0700 ---------------------------------------------------------------------- .../clientpositive/udaf_example_avg.q.out | 8 +- .../clientpositive/udaf_example_max.q.out | 8 +- .../clientpositive/udaf_example_max_n.q.out | 8 +- .../clientpositive/udaf_example_min.q.out | 8 +- .../clientpositive/udaf_example_min_n.q.out | 8 +- .../stats/annotation/StatsRulesProcFactory.java | 2 +- .../apache/hadoop/hive/ql/stats/StatsUtils.java | 49 +-- .../hive/ql/udf/generic/GenericUDAFAverage.java | 10 +- .../hive/ql/udf/generic/GenericUDAFMax.java | 11 +- .../hive/ql/udf/generic/GenericUDAFMin.java | 10 + .../queries/clientpositive/vector_coalesce.q | 1 + .../clientnegative/udf_assert_true.q.out | 24 +- .../clientpositive/acid_table_stats.q.out | 8 +- .../annotate_stats_deep_filters.q.out | 6 +- .../clientpositive/annotate_stats_filter.q.out | 112 +++--- .../clientpositive/annotate_stats_groupby.q.out | 12 +- .../clientpositive/annotate_stats_join.q.out | 16 +- .../annotate_stats_join_pkfk.q.out | 74 ++-- .../clientpositive/annotate_stats_limit.q.out | 20 +- .../clientpositive/annotate_stats_select.q.out | 4 +- .../clientpositive/annotate_stats_union.q.out | 20 +- .../clientpositive/autoColumnStats_4.q.out | 8 +- .../clientpositive/autoColumnStats_7.q.out | 8 +- .../clientpositive/autoColumnStats_9.q.out | 10 +- .../cbo_rp_annotate_stats_groupby.q.out | 12 +- .../clientpositive/cbo_rp_auto_join0.q.out | 24 +- .../cbo_rp_groupby3_noskew_multi_distinct.q.out | 6 +- .../results/clientpositive/cbo_rp_join0.q.out | 50 +-- .../cbo_rp_udaf_percentile_approx_23.q.out | 16 +- .../clientpositive/constantfolding.q.out | 4 +- .../clientpositive/create_genericudaf.q.out | 8 +- .../clientpositive/decimal_precision.q.out | 8 +- .../results/clientpositive/decimal_stats.q.out | 12 +- .../results/clientpositive/decimal_udf.q.out | 8 +- .../clientpositive/fetch_aggregation.q.out | 4 +- .../test/results/clientpositive/fold_case.q.out | 6 +- .../test/results/clientpositive/groupby3.q.out | 10 +- .../results/clientpositive/groupby3_map.q.out | 6 +- .../groupby3_map_multi_distinct.q.out | 6 +- .../clientpositive/groupby3_map_skew.q.out | 10 +- .../clientpositive/groupby3_noskew.q.out | 6 +- .../groupby3_noskew_multi_distinct.q.out | 6 +- .../clientpositive/interval_arithmetic.q.out | 8 +- .../clientpositive/list_bucket_dml_12.q.out | 2 +- .../clientpositive/literal_decimal.q.out | 4 +- .../llap/dynamic_partition_pruning.q.out | 156 ++++----- .../vectorized_dynamic_partition_pruning.q.out | 156 ++++----- .../results/clientpositive/metadataonly1.q.out | 34 +- .../clientpositive/num_op_type_conv.q.out | 4 +- .../results/clientpositive/perf/query13.q.out | 4 +- .../results/clientpositive/perf/query28.q.out | 18 +- .../reduceSinkDeDuplication_pRS_key_empty.q.out | 14 +- .../clientpositive/remove_exprs_stats.q.out | 58 ++-- .../spark/annotate_stats_join.q.out | 16 +- .../results/clientpositive/spark/groupby3.q.out | 10 +- .../clientpositive/spark/groupby3_map.q.out | 6 +- .../spark/groupby3_map_multi_distinct.q.out | 6 +- .../spark/groupby3_map_skew.q.out | 10 +- .../clientpositive/spark/groupby3_noskew.q.out | 6 +- .../spark/groupby3_noskew_multi_distinct.q.out | 6 +- .../clientpositive/spark/subquery_in.q.out | 12 +- .../spark/union_remove_6_subq.q.out | 8 +- .../clientpositive/spark/vectorization_0.q.out | 46 +-- .../spark/vectorization_pushdown.q.out | 8 +- .../spark/vectorization_short_regress.q.out | 40 +-- .../spark/vectorized_mapjoin.q.out | 8 +- .../spark/vectorized_shufflejoin.q.out | 12 +- .../spark/vectorized_timestamp_funcs.q.out | 10 +- .../clientpositive/stats_list_bucket.q.out | 2 +- .../results/clientpositive/subquery_in.q.out | 12 +- .../clientpositive/subquery_multiinsert.q.out | 144 ++++---- .../results/clientpositive/subquery_notin.q.out | 38 +- .../tez/dynamic_partition_pruning.q.out | 156 ++++----- .../clientpositive/tez/explainuser_1.q.out | 346 +++++++++---------- .../clientpositive/tez/explainuser_3.q.out | 4 +- .../results/clientpositive/tez/groupby3.q.out | 10 +- .../clientpositive/tez/metadataonly1.q.out | 40 +-- .../clientpositive/tez/subquery_in.q.out | 12 +- .../clientpositive/tez/vector_aggregate_9.q.out | 8 +- .../tez/vector_aggregate_without_gby.q.out | 4 +- .../clientpositive/tez/vector_coalesce.q.out | 70 ++-- .../tez/vector_decimal_precision.q.out | 8 +- .../clientpositive/tez/vector_decimal_udf.q.out | 8 +- .../tez/vector_interval_arithmetic.q.out | 16 +- .../tez/vector_null_projection.q.out | 26 +- .../clientpositive/tez/vectorization_0.q.out | 46 +-- .../tez/vectorization_pushdown.q.out | 8 +- .../tez/vectorization_short_regress.q.out | 40 +-- .../tez/vectorized_distinct_gby.q.out | 8 +- .../vectorized_dynamic_partition_pruning.q.out | 156 ++++----- .../clientpositive/tez/vectorized_mapjoin.q.out | 8 +- .../tez/vectorized_shufflejoin.q.out | 12 +- .../tez/vectorized_timestamp_funcs.q.out | 10 +- .../clientpositive/udaf_number_format.q.out | 4 +- .../udaf_percentile_approx_23.q.out | 16 +- ql/src/test/results/clientpositive/udf3.q.out | 4 +- ql/src/test/results/clientpositive/udf4.q.out | 4 +- ql/src/test/results/clientpositive/udf7.q.out | 2 +- ql/src/test/results/clientpositive/udf8.q.out | 8 +- .../test/results/clientpositive/udf_case.q.out | 2 +- .../results/clientpositive/udf_coalesce.q.out | 2 +- .../test/results/clientpositive/udf_elt.q.out | 2 +- .../results/clientpositive/udf_greatest.q.out | 2 +- ql/src/test/results/clientpositive/udf_if.q.out | 2 +- .../test/results/clientpositive/udf_instr.q.out | 2 +- .../test/results/clientpositive/udf_least.q.out | 2 +- .../results/clientpositive/udf_locate.q.out | 2 +- .../test/results/clientpositive/udf_trunc.q.out | 4 +- .../test/results/clientpositive/udf_when.q.out | 2 +- .../results/clientpositive/udtf_stack.q.out | 6 +- .../clientpositive/union_remove_6_subq.q.out | 8 +- .../clientpositive/vector_aggregate_9.q.out | 8 +- .../vector_aggregate_without_gby.q.out | 8 +- .../clientpositive/vector_coalesce.q.out | 80 ++--- .../vector_decimal_precision.q.out | 8 +- .../clientpositive/vector_decimal_udf.q.out | 8 +- .../results/clientpositive/vector_elt.q.out | 4 +- .../vector_interval_arithmetic.q.out | 16 +- .../clientpositive/vector_null_projection.q.out | 30 +- .../results/clientpositive/vector_nvl.q.out | 4 +- .../results/clientpositive/vector_udf1.q.out | 24 +- .../clientpositive/vectorization_0.q.out | 46 +-- .../clientpositive/vectorization_pushdown.q.out | 8 +- .../vectorization_short_regress.q.out | 40 +-- .../vectorized_distinct_gby.q.out | 4 +- .../clientpositive/vectorized_mapjoin.q.out | 8 +- .../clientpositive/vectorized_shufflejoin.q.out | 12 +- .../vectorized_timestamp_funcs.q.out | 10 +- 128 files changed, 1452 insertions(+), 1441 deletions(-) ---------------------------------------------------------------------- http://git-wip-us.apache.org/repos/asf/hive/blob/8efe6f7f/contrib/src/test/results/clientpositive/udaf_example_avg.q.out ---------------------------------------------------------------------- diff --git a/contrib/src/test/results/clientpositive/udaf_example_avg.q.out b/contrib/src/test/results/clientpositive/udaf_example_avg.q.out index 61926d4..4e6cb99 100644 --- a/contrib/src/test/results/clientpositive/udaf_example_avg.q.out +++ b/contrib/src/test/results/clientpositive/udaf_example_avg.q.out @@ -33,20 +33,20 @@ STAGE PLANS: aggregations: example_avg(_col0), example_avg(_col1) mode: hash outputColumnNames: _col0, _col1 - Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE + Statistics: Num rows: 1 Data size: 128 Basic stats: COMPLETE Column stats: NONE Reduce Output Operator sort order: - Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE + Statistics: Num rows: 1 Data size: 128 Basic stats: COMPLETE Column stats: NONE value expressions: _col0 (type: struct<mcount:bigint,msum:double>), _col1 (type: struct<mcount:bigint,msum:double>) Reduce Operator Tree: Group By Operator aggregations: example_avg(VALUE._col0), example_avg(VALUE._col1) mode: mergepartial outputColumnNames: _col0, _col1 - Statistics: Num rows: 1 Data size: 16 Basic stats: COMPLETE Column stats: NONE + Statistics: Num rows: 1 Data size: 128 Basic stats: COMPLETE Column stats: NONE File Output Operator compressed: false - Statistics: Num rows: 1 Data size: 16 Basic stats: COMPLETE Column stats: NONE + Statistics: Num rows: 1 Data size: 128 Basic stats: COMPLETE Column stats: NONE table: input format: org.apache.hadoop.mapred.SequenceFileInputFormat output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat http://git-wip-us.apache.org/repos/asf/hive/blob/8efe6f7f/contrib/src/test/results/clientpositive/udaf_example_max.q.out ---------------------------------------------------------------------- diff --git a/contrib/src/test/results/clientpositive/udaf_example_max.q.out b/contrib/src/test/results/clientpositive/udaf_example_max.q.out index 932d8df..fc5a896 100644 --- a/contrib/src/test/results/clientpositive/udaf_example_max.q.out +++ b/contrib/src/test/results/clientpositive/udaf_example_max.q.out @@ -38,20 +38,20 @@ STAGE PLANS: aggregations: example_max(_col0), example_max(_col1) mode: hash outputColumnNames: _col0, _col1 - Statistics: Num rows: 1 Data size: 168 Basic stats: COMPLETE Column stats: NONE + Statistics: Num rows: 1 Data size: 368 Basic stats: COMPLETE Column stats: NONE Reduce Output Operator sort order: - Statistics: Num rows: 1 Data size: 168 Basic stats: COMPLETE Column stats: NONE + Statistics: Num rows: 1 Data size: 368 Basic stats: COMPLETE Column stats: NONE value expressions: _col0 (type: string), _col1 (type: string) Reduce Operator Tree: Group By Operator aggregations: example_max(VALUE._col0), example_max(VALUE._col1) mode: mergepartial outputColumnNames: _col0, _col1 - Statistics: Num rows: 1 Data size: 168 Basic stats: COMPLETE Column stats: NONE + Statistics: Num rows: 1 Data size: 368 Basic stats: COMPLETE Column stats: NONE File Output Operator compressed: false - Statistics: Num rows: 1 Data size: 168 Basic stats: COMPLETE Column stats: NONE + Statistics: Num rows: 1 Data size: 368 Basic stats: COMPLETE Column stats: NONE table: input format: org.apache.hadoop.mapred.SequenceFileInputFormat output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat http://git-wip-us.apache.org/repos/asf/hive/blob/8efe6f7f/contrib/src/test/results/clientpositive/udaf_example_max_n.q.out ---------------------------------------------------------------------- diff --git a/contrib/src/test/results/clientpositive/udaf_example_max_n.q.out b/contrib/src/test/results/clientpositive/udaf_example_max_n.q.out index 16ae212..47d8f52 100644 --- a/contrib/src/test/results/clientpositive/udaf_example_max_n.q.out +++ b/contrib/src/test/results/clientpositive/udaf_example_max_n.q.out @@ -33,20 +33,20 @@ STAGE PLANS: aggregations: example_max_n(_col0, 10), example_max_n(_col2, 10) mode: hash outputColumnNames: _col0, _col1 - Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE + Statistics: Num rows: 1 Data size: 424 Basic stats: COMPLETE Column stats: NONE Reduce Output Operator sort order: - Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE + Statistics: Num rows: 1 Data size: 424 Basic stats: COMPLETE Column stats: NONE value expressions: _col0 (type: struct<a:array<double>,n:int>), _col1 (type: struct<a:array<double>,n:int>) Reduce Operator Tree: Group By Operator aggregations: example_max_n(VALUE._col0), example_max_n(VALUE._col1) mode: mergepartial outputColumnNames: _col0, _col1 - Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE + Statistics: Num rows: 1 Data size: 424 Basic stats: COMPLETE Column stats: NONE File Output Operator compressed: false - Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE + Statistics: Num rows: 1 Data size: 424 Basic stats: COMPLETE Column stats: NONE table: input format: org.apache.hadoop.mapred.SequenceFileInputFormat output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat http://git-wip-us.apache.org/repos/asf/hive/blob/8efe6f7f/contrib/src/test/results/clientpositive/udaf_example_min.q.out ---------------------------------------------------------------------- diff --git a/contrib/src/test/results/clientpositive/udaf_example_min.q.out b/contrib/src/test/results/clientpositive/udaf_example_min.q.out index b0ffe53..feb2add 100644 --- a/contrib/src/test/results/clientpositive/udaf_example_min.q.out +++ b/contrib/src/test/results/clientpositive/udaf_example_min.q.out @@ -38,20 +38,20 @@ STAGE PLANS: aggregations: example_min(_col0), example_min(_col1) mode: hash outputColumnNames: _col0, _col1 - Statistics: Num rows: 1 Data size: 168 Basic stats: COMPLETE Column stats: NONE + Statistics: Num rows: 1 Data size: 368 Basic stats: COMPLETE Column stats: NONE Reduce Output Operator sort order: - Statistics: Num rows: 1 Data size: 168 Basic stats: COMPLETE Column stats: NONE + Statistics: Num rows: 1 Data size: 368 Basic stats: COMPLETE Column stats: NONE value expressions: _col0 (type: string), _col1 (type: string) Reduce Operator Tree: Group By Operator aggregations: example_min(VALUE._col0), example_min(VALUE._col1) mode: mergepartial outputColumnNames: _col0, _col1 - Statistics: Num rows: 1 Data size: 168 Basic stats: COMPLETE Column stats: NONE + Statistics: Num rows: 1 Data size: 368 Basic stats: COMPLETE Column stats: NONE File Output Operator compressed: false - Statistics: Num rows: 1 Data size: 168 Basic stats: COMPLETE Column stats: NONE + Statistics: Num rows: 1 Data size: 368 Basic stats: COMPLETE Column stats: NONE table: input format: org.apache.hadoop.mapred.SequenceFileInputFormat output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat http://git-wip-us.apache.org/repos/asf/hive/blob/8efe6f7f/contrib/src/test/results/clientpositive/udaf_example_min_n.q.out ---------------------------------------------------------------------- diff --git a/contrib/src/test/results/clientpositive/udaf_example_min_n.q.out b/contrib/src/test/results/clientpositive/udaf_example_min_n.q.out index 7e7dd84..16c4684 100644 --- a/contrib/src/test/results/clientpositive/udaf_example_min_n.q.out +++ b/contrib/src/test/results/clientpositive/udaf_example_min_n.q.out @@ -33,20 +33,20 @@ STAGE PLANS: aggregations: example_min_n(_col0, 10), example_min_n(_col2, 10) mode: hash outputColumnNames: _col0, _col1 - Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE + Statistics: Num rows: 1 Data size: 424 Basic stats: COMPLETE Column stats: NONE Reduce Output Operator sort order: - Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE + Statistics: Num rows: 1 Data size: 424 Basic stats: COMPLETE Column stats: NONE value expressions: _col0 (type: struct<a:array<double>,n:int>), _col1 (type: struct<a:array<double>,n:int>) Reduce Operator Tree: Group By Operator aggregations: example_min_n(VALUE._col0), example_min_n(VALUE._col1) mode: mergepartial outputColumnNames: _col0, _col1 - Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE + Statistics: Num rows: 1 Data size: 424 Basic stats: COMPLETE Column stats: NONE File Output Operator compressed: false - Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE + Statistics: Num rows: 1 Data size: 424 Basic stats: COMPLETE Column stats: NONE table: input format: org.apache.hadoop.mapred.SequenceFileInputFormat output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat http://git-wip-us.apache.org/repos/asf/hive/blob/8efe6f7f/ql/src/java/org/apache/hadoop/hive/ql/optimizer/stats/annotation/StatsRulesProcFactory.java ---------------------------------------------------------------------- diff --git a/ql/src/java/org/apache/hadoop/hive/ql/optimizer/stats/annotation/StatsRulesProcFactory.java b/ql/src/java/org/apache/hadoop/hive/ql/optimizer/stats/annotation/StatsRulesProcFactory.java index 42cbc14..ab07fb6 100644 --- a/ql/src/java/org/apache/hadoop/hive/ql/optimizer/stats/annotation/StatsRulesProcFactory.java +++ b/ql/src/java/org/apache/hadoop/hive/ql/optimizer/stats/annotation/StatsRulesProcFactory.java @@ -1171,7 +1171,7 @@ public class StatsRulesProcFactory { ColStatistics cs = new ColStatistics(colName, colType); cs.setCountDistint(stats.getNumRows()); cs.setNumNulls(0); - cs.setAvgColLen(StatsUtils.getAvgColLenOfFixedLengthTypes(colType)); + cs.setAvgColLen(StatsUtils.getAvgColLenOf(conf, ci.getObjectInspector(), colType)); aggColStats.add(cs); } } http://git-wip-us.apache.org/repos/asf/hive/blob/8efe6f7f/ql/src/java/org/apache/hadoop/hive/ql/stats/StatsUtils.java ---------------------------------------------------------------------- diff --git a/ql/src/java/org/apache/hadoop/hive/ql/stats/StatsUtils.java b/ql/src/java/org/apache/hadoop/hive/ql/stats/StatsUtils.java index 7a15904..2a9dc11 100644 --- a/ql/src/java/org/apache/hadoop/hive/ql/stats/StatsUtils.java +++ b/ql/src/java/org/apache/hadoop/hive/ql/stats/StatsUtils.java @@ -424,7 +424,7 @@ public class StatsUtils { long numPartitions = getNDVPartitionColumn(partList.getPartitions(), ci.getInternalName()); partCS.setCountDistint(numPartitions); - partCS.setAvgColLen(StatsUtils.getAvgColLenOfVariableLengthTypes(conf, + partCS.setAvgColLen(StatsUtils.getAvgColLenOf(conf, ci.getObjectInspector(), partCS.getColumnType())); partCS.setRange(getRangePartitionColumn(partList.getPartitions(), ci.getInternalName(), ci.getType().getTypeName(), conf.getVar(ConfVars.DEFAULTPARTITIONNAME))); @@ -543,7 +543,7 @@ public class StatsUtils { || colTypeLowerCase.startsWith(serdeConstants.MAP_TYPE_NAME) || colTypeLowerCase.startsWith(serdeConstants.STRUCT_TYPE_NAME) || colTypeLowerCase.startsWith(serdeConstants.UNION_TYPE_NAME)) { - avgRowSize += getAvgColLenOfVariableLengthTypes(conf, oi, colTypeLowerCase); + avgRowSize += getAvgColLenOf(conf, oi, colTypeLowerCase); } else { avgRowSize += getAvgColLenOfFixedLengthTypes(colTypeLowerCase); } @@ -805,7 +805,7 @@ public class StatsUtils { * - column type * @return raw data size */ - public static long getAvgColLenOfVariableLengthTypes(HiveConf conf, ObjectInspector oi, + public static long getAvgColLenOf(HiveConf conf, ObjectInspector oi, String colType) { long configVarLen = HiveConf.getIntVar(conf, HiveConf.ConfVars.HIVE_STATS_MAX_VARIABLE_LENGTH); @@ -872,7 +872,7 @@ public class StatsUtils { return getSizeOfComplexTypes(conf, oi); } - return 0; + throw new IllegalArgumentException("Size requested for unknown type: " + colType + " OI: " + oi.getTypeName()); } /** @@ -895,10 +895,10 @@ public class StatsUtils { if (colTypeLowerCase.equals(serdeConstants.STRING_TYPE_NAME) || colTypeLowerCase.startsWith(serdeConstants.VARCHAR_TYPE_NAME) || colTypeLowerCase.startsWith(serdeConstants.CHAR_TYPE_NAME)) { - int avgColLen = (int) getAvgColLenOfVariableLengthTypes(conf, oi, colTypeLowerCase); + int avgColLen = (int) getAvgColLenOf(conf, oi, colTypeLowerCase); result += JavaDataModel.get().lengthForStringOfLength(avgColLen); } else if (colTypeLowerCase.equals(serdeConstants.BINARY_TYPE_NAME)) { - int avgColLen = (int) getAvgColLenOfVariableLengthTypes(conf, oi, colTypeLowerCase); + int avgColLen = (int) getAvgColLenOf(conf, oi, colTypeLowerCase); result += JavaDataModel.get().lengthForByteArrayOfSize(avgColLen); } else { result += getAvgColLenOfFixedLengthTypes(colTypeLowerCase); @@ -989,11 +989,13 @@ public class StatsUtils { if (colTypeLowerCase.equals(serdeConstants.TINYINT_TYPE_NAME) || colTypeLowerCase.equals(serdeConstants.SMALLINT_TYPE_NAME) || colTypeLowerCase.equals(serdeConstants.INT_TYPE_NAME) + || colTypeLowerCase.equals(serdeConstants.VOID_TYPE_NAME) || colTypeLowerCase.equals(serdeConstants.BOOLEAN_TYPE_NAME) || colTypeLowerCase.equals(serdeConstants.FLOAT_TYPE_NAME)) { return JavaDataModel.get().primitive1(); } else if (colTypeLowerCase.equals(serdeConstants.DOUBLE_TYPE_NAME) || colTypeLowerCase.equals(serdeConstants.BIGINT_TYPE_NAME) + || colTypeLowerCase.equals(serdeConstants.INTERVAL_YEAR_MONTH_TYPE_NAME) || colTypeLowerCase.equals("long")) { return JavaDataModel.get().primitive2(); } else if (colTypeLowerCase.equals(serdeConstants.TIMESTAMP_TYPE_NAME)) { @@ -1002,8 +1004,10 @@ public class StatsUtils { return JavaDataModel.get().lengthOfDate(); } else if (colTypeLowerCase.startsWith(serdeConstants.DECIMAL_TYPE_NAME)) { return JavaDataModel.get().lengthOfDecimal(); + } else if (colTypeLowerCase.equals(serdeConstants.INTERVAL_DAY_TIME_TYPE_NAME)) { + return JavaDataModel.JAVA32_META; } else { - return 0; + throw new IllegalArgumentException("Size requested for unknown type: " + colType); } } @@ -1225,7 +1229,7 @@ public class StatsUtils { double avgColSize = 0; long countDistincts = 0; long numNulls = 0; - ObjectInspector oi = null; + ObjectInspector oi = end.getWritableObjectInspector(); long numRows = parentStats.getNumRows(); if (end instanceof ExprNodeColumnDesc) { @@ -1244,7 +1248,6 @@ public class StatsUtils { // virtual columns colType = encd.getTypeInfo().getTypeName(); countDistincts = numRows; - oi = encd.getWritableObjectInspector(); } else { // clone the column stats and return @@ -1263,16 +1266,13 @@ public class StatsUtils { // constant projection ExprNodeConstantDesc encd = (ExprNodeConstantDesc) end; - // null projection + colName = encd.getName(); + colType = encd.getTypeString(); if (encd.getValue() == null) { - colName = encd.getName(); - colType = serdeConstants.VOID_TYPE_NAME; + // null projection numNulls = numRows; } else { - colName = encd.getName(); - colType = encd.getTypeString(); countDistincts = 1; - oi = encd.getWritableObjectInspector(); } } else if (end instanceof ExprNodeGenericFuncDesc) { @@ -1281,7 +1281,6 @@ public class StatsUtils { colName = engfd.getName(); colType = engfd.getTypeString(); countDistincts = getNDVFor(engfd, numRows, parentStats); - oi = engfd.getWritableObjectInspector(); } else if (end instanceof ExprNodeColumnListDesc) { // column list @@ -1289,7 +1288,6 @@ public class StatsUtils { colName = Joiner.on(",").join(encd.getCols()); colType = serdeConstants.LIST_TYPE_NAME; countDistincts = numRows; - oi = encd.getWritableObjectInspector(); } else if (end instanceof ExprNodeFieldDesc) { // field within complex type @@ -1297,25 +1295,12 @@ public class StatsUtils { colName = enfd.getFieldName(); colType = enfd.getTypeString(); countDistincts = numRows; - oi = enfd.getWritableObjectInspector(); } else { throw new IllegalArgumentException("not supported expr type " + end.getClass()); } colType = colType.toLowerCase(); - if (colType.equals(serdeConstants.STRING_TYPE_NAME) - || colType.equals(serdeConstants.BINARY_TYPE_NAME) - || colType.startsWith(serdeConstants.VARCHAR_TYPE_NAME) - || colType.startsWith(serdeConstants.CHAR_TYPE_NAME) - || colType.startsWith(serdeConstants.LIST_TYPE_NAME) - || colType.startsWith(serdeConstants.MAP_TYPE_NAME) - || colType.startsWith(serdeConstants.STRUCT_TYPE_NAME) - || colType.startsWith(serdeConstants.UNION_TYPE_NAME)) { - avgColSize = getAvgColLenOfVariableLengthTypes(conf, oi, colType); - } else { - avgColSize = getAvgColLenOfFixedLengthTypes(colType); - } - + avgColSize = getAvgColLenOf(conf, oi, colType); ColStatistics colStats = new ColStatistics(colName, colType); colStats.setAvgColLen(avgColSize); colStats.setCountDistint(countDistincts); @@ -1456,7 +1441,7 @@ public class StatsUtils { for (ColStatistics cs : colStats) { if (cs != null) { String colTypeLowerCase = cs.getColumnType().toLowerCase(); - long nonNullCount = numRows - cs.getNumNulls(); + long nonNullCount = cs.getNumNulls() > 0 ? numRows - cs.getNumNulls() + 1 : numRows; double sizeOf = 0; if (colTypeLowerCase.equals(serdeConstants.TINYINT_TYPE_NAME) || colTypeLowerCase.equals(serdeConstants.SMALLINT_TYPE_NAME) http://git-wip-us.apache.org/repos/asf/hive/blob/8efe6f7f/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFAverage.java ---------------------------------------------------------------------- diff --git a/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFAverage.java b/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFAverage.java index 6799978..5ad5c06 100644 --- a/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFAverage.java +++ b/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFAverage.java @@ -28,7 +28,9 @@ import org.apache.hadoop.hive.ql.exec.UDFArgumentTypeException; import org.apache.hadoop.hive.ql.metadata.HiveException; import org.apache.hadoop.hive.ql.parse.SemanticException; import org.apache.hadoop.hive.ql.plan.ptf.WindowFrameDef; +import org.apache.hadoop.hive.ql.udf.generic.GenericUDAFEvaluator.AbstractAggregationBuffer; import org.apache.hadoop.hive.ql.udf.generic.GenericUDAFEvaluator.AggregationBuffer; +import org.apache.hadoop.hive.ql.udf.generic.GenericUDAFEvaluator.AggregationType; import org.apache.hadoop.hive.ql.util.JavaDataModel; import org.apache.hadoop.hive.serde2.io.DoubleWritable; import org.apache.hadoop.hive.serde2.io.HiveDecimalWritable; @@ -358,10 +360,16 @@ public class GenericUDAFAverage extends AbstractGenericUDAFResolver { } } - private static class AverageAggregationBuffer<TYPE> implements AggregationBuffer { + @AggregationType(estimable = true) + private static class AverageAggregationBuffer<TYPE> extends AbstractAggregationBuffer { private HashSet<ObjectInspectorObject> uniqueObjects; // Unique rows. private long count; private TYPE sum; + + @Override + public int estimate() { + return 2*JavaDataModel.PRIMITIVES2; + } }; @SuppressWarnings("unchecked") http://git-wip-us.apache.org/repos/asf/hive/blob/8efe6f7f/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFMax.java ---------------------------------------------------------------------- diff --git a/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFMax.java b/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFMax.java index 43b23fa..763bfd5 100644 --- a/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFMax.java +++ b/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFMax.java @@ -31,6 +31,7 @@ import org.apache.hadoop.hive.ql.plan.ptf.BoundaryDef; import org.apache.hadoop.hive.ql.plan.ptf.WindowFrameDef; import org.apache.hadoop.hive.ql.udf.UDFType; import org.apache.hadoop.hive.ql.udf.generic.GenericUDAFEvaluator.AggregationBuffer; +import org.apache.hadoop.hive.ql.udf.generic.GenericUDAFEvaluator.AggregationType; import org.apache.hadoop.hive.ql.util.JavaDataModel; import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector; import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils; @@ -79,8 +80,13 @@ public class GenericUDAFMax extends AbstractGenericUDAFResolver { } /** class for storing the current max value */ + @AggregationType(estimable = true) static class MaxAgg extends AbstractAggregationBuffer { Object o; + @Override + public int estimate() { + return JavaDataModel.PRIMITIVES2; + } } @Override @@ -138,7 +144,7 @@ public class GenericUDAFMax extends AbstractGenericUDAFResolver { /* * Based on the Paper by Daniel Lemire: Streaming Max-Min filter using no more * than 3 comparisons per elem. - * + * * 1. His algorithm works on fixed size windows up to the current row. For row * 'i' and window 'w' it computes the min/max for window (i-w, i). 2. The core * idea is to keep a queue of (max, idx) tuples. A tuple in the queue @@ -150,7 +156,7 @@ public class GenericUDAFMax extends AbstractGenericUDAFResolver { * element at the front of the queue has reached its max range of influence; * i.e. frontTuple.idx + w > i. If yes we can remove it from the queue. - on * the ith step o/p the front of the queue as the max for the ith entry. - * + * * Here we modify the algorithm: 1. to handle window's that are of the form * (i-p, i+f), where p is numPreceding,f = numFollowing - we start outputing * rows only after receiving f rows. - the formula for 'influence range' of an @@ -192,6 +198,7 @@ public class GenericUDAFMax extends AbstractGenericUDAFResolver { + (3 * JavaDataModel.PRIMITIVES1); } + @Override protected void reset() { maxChain.clear(); super.reset(); http://git-wip-us.apache.org/repos/asf/hive/blob/8efe6f7f/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFMin.java ---------------------------------------------------------------------- diff --git a/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFMin.java b/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFMin.java index 70e0db1..132bad6 100644 --- a/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFMin.java +++ b/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFMin.java @@ -26,7 +26,9 @@ import org.apache.hadoop.hive.ql.parse.SemanticException; import org.apache.hadoop.hive.ql.plan.ptf.BoundaryDef; import org.apache.hadoop.hive.ql.plan.ptf.WindowFrameDef; import org.apache.hadoop.hive.ql.udf.UDFType; +import org.apache.hadoop.hive.ql.udf.generic.GenericUDAFEvaluator.AggregationType; import org.apache.hadoop.hive.ql.udf.generic.GenericUDAFMax.MaxStreamingFixedWindow; +import org.apache.hadoop.hive.ql.util.JavaDataModel; import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector; import org.apache.hadoop.hive.serde2.objectinspector.FullMapEqualComparer; import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils; @@ -76,8 +78,13 @@ public class GenericUDAFMin extends AbstractGenericUDAFResolver { } /** class for storing the current max value */ + @AggregationType(estimable = true) static class MinAgg extends AbstractAggregationBuffer { Object o; + @Override + public int estimate() { + return JavaDataModel.PRIMITIVES2; + } } @Override @@ -139,14 +146,17 @@ public class GenericUDAFMin extends AbstractGenericUDAFResolver { super(wrappedEval, wFrmDef); } + @Override protected ObjectInspector inputOI() { return ((GenericUDAFMinEvaluator) wrappedEval).inputOI; } + @Override protected ObjectInspector outputOI() { return ((GenericUDAFMinEvaluator) wrappedEval).outputOI; } + @Override protected boolean removeLast(Object in, Object last) { return isLess(in, last); } http://git-wip-us.apache.org/repos/asf/hive/blob/8efe6f7f/ql/src/test/queries/clientpositive/vector_coalesce.q ---------------------------------------------------------------------- diff --git a/ql/src/test/queries/clientpositive/vector_coalesce.q b/ql/src/test/queries/clientpositive/vector_coalesce.q index b1a7766..cfba7be 100644 --- a/ql/src/test/queries/clientpositive/vector_coalesce.q +++ b/ql/src/test/queries/clientpositive/vector_coalesce.q @@ -1,3 +1,4 @@ +set hive.stats.fetch.column.stats=true; set hive.explain.user=false; SET hive.vectorized.execution.enabled=true; http://git-wip-us.apache.org/repos/asf/hive/blob/8efe6f7f/ql/src/test/results/clientnegative/udf_assert_true.q.out ---------------------------------------------------------------------- diff --git a/ql/src/test/results/clientnegative/udf_assert_true.q.out b/ql/src/test/results/clientnegative/udf_assert_true.q.out index baa9074..7fc50d6 100644 --- a/ql/src/test/results/clientnegative/udf_assert_true.q.out +++ b/ql/src/test/results/clientnegative/udf_assert_true.q.out @@ -28,13 +28,13 @@ STAGE PLANS: Select Operator expressions: assert_true((_col5 > 0)) (type: void) outputColumnNames: _col0 - Statistics: Num rows: 1000 Data size: 0 Basic stats: PARTIAL Column stats: COMPLETE + Statistics: Num rows: 1000 Data size: 4000 Basic stats: COMPLETE Column stats: COMPLETE Limit Number of rows: 2 - Statistics: Num rows: 2 Data size: 0 Basic stats: PARTIAL Column stats: COMPLETE + Statistics: Num rows: 2 Data size: 8 Basic stats: COMPLETE Column stats: COMPLETE File Output Operator compressed: false - Statistics: Num rows: 2 Data size: 0 Basic stats: PARTIAL Column stats: COMPLETE + Statistics: Num rows: 2 Data size: 8 Basic stats: COMPLETE Column stats: COMPLETE table: input format: org.apache.hadoop.mapred.SequenceFileInputFormat output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat @@ -52,13 +52,13 @@ STAGE PLANS: Select Operator expressions: assert_true((_col5 > 0)) (type: void) outputColumnNames: _col0 - Statistics: Num rows: 1000 Data size: 0 Basic stats: PARTIAL Column stats: COMPLETE + Statistics: Num rows: 1000 Data size: 4000 Basic stats: COMPLETE Column stats: COMPLETE Limit Number of rows: 2 - Statistics: Num rows: 2 Data size: 0 Basic stats: PARTIAL Column stats: COMPLETE + Statistics: Num rows: 2 Data size: 8 Basic stats: COMPLETE Column stats: COMPLETE File Output Operator compressed: false - Statistics: Num rows: 2 Data size: 0 Basic stats: PARTIAL Column stats: COMPLETE + Statistics: Num rows: 2 Data size: 8 Basic stats: COMPLETE Column stats: COMPLETE table: input format: org.apache.hadoop.mapred.SequenceFileInputFormat output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat @@ -105,13 +105,13 @@ STAGE PLANS: Select Operator expressions: assert_true((_col5 < 2)) (type: void) outputColumnNames: _col0 - Statistics: Num rows: 1000 Data size: 0 Basic stats: PARTIAL Column stats: COMPLETE + Statistics: Num rows: 1000 Data size: 4000 Basic stats: COMPLETE Column stats: COMPLETE Limit Number of rows: 2 - Statistics: Num rows: 2 Data size: 0 Basic stats: PARTIAL Column stats: COMPLETE + Statistics: Num rows: 2 Data size: 8 Basic stats: COMPLETE Column stats: COMPLETE File Output Operator compressed: false - Statistics: Num rows: 2 Data size: 0 Basic stats: PARTIAL Column stats: COMPLETE + Statistics: Num rows: 2 Data size: 8 Basic stats: COMPLETE Column stats: COMPLETE table: input format: org.apache.hadoop.mapred.SequenceFileInputFormat output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat @@ -129,13 +129,13 @@ STAGE PLANS: Select Operator expressions: assert_true((_col5 < 2)) (type: void) outputColumnNames: _col0 - Statistics: Num rows: 1000 Data size: 0 Basic stats: PARTIAL Column stats: COMPLETE + Statistics: Num rows: 1000 Data size: 4000 Basic stats: COMPLETE Column stats: COMPLETE Limit Number of rows: 2 - Statistics: Num rows: 2 Data size: 0 Basic stats: PARTIAL Column stats: COMPLETE + Statistics: Num rows: 2 Data size: 8 Basic stats: COMPLETE Column stats: COMPLETE File Output Operator compressed: false - Statistics: Num rows: 2 Data size: 0 Basic stats: PARTIAL Column stats: COMPLETE + Statistics: Num rows: 2 Data size: 8 Basic stats: COMPLETE Column stats: COMPLETE table: input format: org.apache.hadoop.mapred.SequenceFileInputFormat output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat http://git-wip-us.apache.org/repos/asf/hive/blob/8efe6f7f/ql/src/test/results/clientpositive/acid_table_stats.q.out ---------------------------------------------------------------------- diff --git a/ql/src/test/results/clientpositive/acid_table_stats.q.out b/ql/src/test/results/clientpositive/acid_table_stats.q.out index 4d51511..d5c509c 100644 --- a/ql/src/test/results/clientpositive/acid_table_stats.q.out +++ b/ql/src/test/results/clientpositive/acid_table_stats.q.out @@ -539,20 +539,20 @@ STAGE PLANS: aggregations: max(key) mode: hash outputColumnNames: _col0 - Statistics: Num rows: 1 Data size: 84 Basic stats: COMPLETE Column stats: NONE + Statistics: Num rows: 1 Data size: 184 Basic stats: COMPLETE Column stats: NONE Reduce Output Operator sort order: - Statistics: Num rows: 1 Data size: 84 Basic stats: COMPLETE Column stats: NONE + Statistics: Num rows: 1 Data size: 184 Basic stats: COMPLETE Column stats: NONE value expressions: _col0 (type: string) Reduce Operator Tree: Group By Operator aggregations: max(VALUE._col0) mode: mergepartial outputColumnNames: _col0 - Statistics: Num rows: 1 Data size: 84 Basic stats: COMPLETE Column stats: NONE + Statistics: Num rows: 1 Data size: 184 Basic stats: COMPLETE Column stats: NONE File Output Operator compressed: false - Statistics: Num rows: 1 Data size: 84 Basic stats: COMPLETE Column stats: NONE + Statistics: Num rows: 1 Data size: 184 Basic stats: COMPLETE Column stats: NONE table: input format: org.apache.hadoop.mapred.SequenceFileInputFormat output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat http://git-wip-us.apache.org/repos/asf/hive/blob/8efe6f7f/ql/src/test/results/clientpositive/annotate_stats_deep_filters.q.out ---------------------------------------------------------------------- diff --git a/ql/src/test/results/clientpositive/annotate_stats_deep_filters.q.out b/ql/src/test/results/clientpositive/annotate_stats_deep_filters.q.out index b7a87fd..32644dc 100644 --- a/ql/src/test/results/clientpositive/annotate_stats_deep_filters.q.out +++ b/ql/src/test/results/clientpositive/annotate_stats_deep_filters.q.out @@ -118,12 +118,12 @@ STAGE PLANS: Map Operator Tree: TableScan alias: over1k - Statistics: Num rows: 2098 Data size: 16736 Basic stats: COMPLETE Column stats: COMPLETE + Statistics: Num rows: 2098 Data size: 16744 Basic stats: COMPLETE Column stats: COMPLETE Filter Operator predicate: (((t = 1) and (si = 2)) or ((t = 2) and (si = 3)) or ((t = 3) and (si = 4)) or ((t = 4) and (si = 5)) or ((t = 5) and (si = 6)) or ((t = 6) and (si = 7)) or ((t = 7) and (si = 8)) or ((t = 9) and (si = 10)) or ((t = 10) and (si = 11)) or ((t = 11) and (si = 12)) or ((t = 12) and (si = 13)) or ((t = 13) and (si = 14)) or ((t = 14) and (si = 15)) or ((t = 15) and (si = 16)) or ((t = 16) and (si = 17)) or ((t = 17) and (si = 18)) or ((t = 27) and (si = 28)) or ((t = 37) and (si = 38)) or ((t = 47) and (si = 48)) or ((t = 52) and (si = 53))) (type: boolean) - Statistics: Num rows: 300 Data size: 2392 Basic stats: COMPLETE Column stats: COMPLETE + Statistics: Num rows: 300 Data size: 2400 Basic stats: COMPLETE Column stats: COMPLETE Select Operator - Statistics: Num rows: 300 Data size: 2392 Basic stats: COMPLETE Column stats: COMPLETE + Statistics: Num rows: 300 Data size: 2400 Basic stats: COMPLETE Column stats: COMPLETE Group By Operator aggregations: count() mode: hash http://git-wip-us.apache.org/repos/asf/hive/blob/8efe6f7f/ql/src/test/results/clientpositive/annotate_stats_filter.q.out ---------------------------------------------------------------------- diff --git a/ql/src/test/results/clientpositive/annotate_stats_filter.q.out b/ql/src/test/results/clientpositive/annotate_stats_filter.q.out index a606e30..bd0b3bb 100644 --- a/ql/src/test/results/clientpositive/annotate_stats_filter.q.out +++ b/ql/src/test/results/clientpositive/annotate_stats_filter.q.out @@ -141,7 +141,7 @@ STAGE PLANS: Map Operator Tree: TableScan alias: loc_orc - Statistics: Num rows: 8 Data size: 804 Basic stats: COMPLETE Column stats: COMPLETE + Statistics: Num rows: 8 Data size: 816 Basic stats: COMPLETE Column stats: COMPLETE Filter Operator predicate: (state = 'OH') (type: boolean) Statistics: Num rows: 1 Data size: 102 Basic stats: COMPLETE Column stats: COMPLETE @@ -181,17 +181,17 @@ STAGE PLANS: Map Operator Tree: TableScan alias: loc_orc - Statistics: Num rows: 8 Data size: 804 Basic stats: COMPLETE Column stats: COMPLETE + Statistics: Num rows: 8 Data size: 816 Basic stats: COMPLETE Column stats: COMPLETE Filter Operator predicate: (state <> 'OH') (type: boolean) - Statistics: Num rows: 8 Data size: 804 Basic stats: COMPLETE Column stats: COMPLETE + Statistics: Num rows: 8 Data size: 816 Basic stats: COMPLETE Column stats: COMPLETE Select Operator expressions: state (type: string), locid (type: int), zip (type: bigint), year (type: int) outputColumnNames: _col0, _col1, _col2, _col3 - Statistics: Num rows: 8 Data size: 804 Basic stats: COMPLETE Column stats: COMPLETE + Statistics: Num rows: 8 Data size: 816 Basic stats: COMPLETE Column stats: COMPLETE File Output Operator compressed: false - Statistics: Num rows: 8 Data size: 804 Basic stats: COMPLETE Column stats: COMPLETE + Statistics: Num rows: 8 Data size: 816 Basic stats: COMPLETE Column stats: COMPLETE table: input format: org.apache.hadoop.mapred.SequenceFileInputFormat output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat @@ -217,17 +217,17 @@ STAGE PLANS: Map Operator Tree: TableScan alias: loc_orc - Statistics: Num rows: 8 Data size: 804 Basic stats: COMPLETE Column stats: COMPLETE + Statistics: Num rows: 8 Data size: 816 Basic stats: COMPLETE Column stats: COMPLETE Filter Operator predicate: (state <> 'OH') (type: boolean) - Statistics: Num rows: 8 Data size: 804 Basic stats: COMPLETE Column stats: COMPLETE + Statistics: Num rows: 8 Data size: 816 Basic stats: COMPLETE Column stats: COMPLETE Select Operator expressions: state (type: string), locid (type: int), zip (type: bigint), year (type: int) outputColumnNames: _col0, _col1, _col2, _col3 - Statistics: Num rows: 8 Data size: 804 Basic stats: COMPLETE Column stats: COMPLETE + Statistics: Num rows: 8 Data size: 816 Basic stats: COMPLETE Column stats: COMPLETE File Output Operator compressed: false - Statistics: Num rows: 8 Data size: 804 Basic stats: COMPLETE Column stats: COMPLETE + Statistics: Num rows: 8 Data size: 816 Basic stats: COMPLETE Column stats: COMPLETE table: input format: org.apache.hadoop.mapred.SequenceFileInputFormat output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat @@ -257,17 +257,17 @@ STAGE PLANS: Map Operator Tree: TableScan alias: loc_orc - Statistics: Num rows: 8 Data size: 804 Basic stats: COMPLETE Column stats: COMPLETE + Statistics: Num rows: 8 Data size: 816 Basic stats: COMPLETE Column stats: COMPLETE Filter Operator predicate: zip is null (type: boolean) Statistics: Num rows: 1 Data size: 102 Basic stats: COMPLETE Column stats: COMPLETE Select Operator expressions: state (type: string), locid (type: int), null (type: bigint), year (type: int) outputColumnNames: _col0, _col1, _col2, _col3 - Statistics: Num rows: 1 Data size: 94 Basic stats: COMPLETE Column stats: COMPLETE + Statistics: Num rows: 1 Data size: 102 Basic stats: COMPLETE Column stats: COMPLETE File Output Operator compressed: false - Statistics: Num rows: 1 Data size: 94 Basic stats: COMPLETE Column stats: COMPLETE + Statistics: Num rows: 1 Data size: 102 Basic stats: COMPLETE Column stats: COMPLETE table: input format: org.apache.hadoop.mapred.SequenceFileInputFormat output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat @@ -295,17 +295,17 @@ STAGE PLANS: Map Operator Tree: TableScan alias: loc_orc - Statistics: Num rows: 8 Data size: 804 Basic stats: COMPLETE Column stats: COMPLETE + Statistics: Num rows: 8 Data size: 816 Basic stats: COMPLETE Column stats: COMPLETE Filter Operator predicate: zip is null (type: boolean) Statistics: Num rows: 1 Data size: 102 Basic stats: COMPLETE Column stats: COMPLETE Select Operator expressions: state (type: string), locid (type: int), null (type: bigint), year (type: int) outputColumnNames: _col0, _col1, _col2, _col3 - Statistics: Num rows: 1 Data size: 94 Basic stats: COMPLETE Column stats: COMPLETE + Statistics: Num rows: 1 Data size: 102 Basic stats: COMPLETE Column stats: COMPLETE File Output Operator compressed: false - Statistics: Num rows: 1 Data size: 94 Basic stats: COMPLETE Column stats: COMPLETE + Statistics: Num rows: 1 Data size: 102 Basic stats: COMPLETE Column stats: COMPLETE table: input format: org.apache.hadoop.mapred.SequenceFileInputFormat output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat @@ -335,17 +335,17 @@ STAGE PLANS: Map Operator Tree: TableScan alias: loc_orc - Statistics: Num rows: 8 Data size: 804 Basic stats: COMPLETE Column stats: COMPLETE + Statistics: Num rows: 8 Data size: 816 Basic stats: COMPLETE Column stats: COMPLETE Filter Operator predicate: zip is not null (type: boolean) - Statistics: Num rows: 7 Data size: 702 Basic stats: COMPLETE Column stats: COMPLETE + Statistics: Num rows: 7 Data size: 714 Basic stats: COMPLETE Column stats: COMPLETE Select Operator expressions: state (type: string), locid (type: int), zip (type: bigint), year (type: int) outputColumnNames: _col0, _col1, _col2, _col3 - Statistics: Num rows: 7 Data size: 702 Basic stats: COMPLETE Column stats: COMPLETE + Statistics: Num rows: 7 Data size: 714 Basic stats: COMPLETE Column stats: COMPLETE File Output Operator compressed: false - Statistics: Num rows: 7 Data size: 702 Basic stats: COMPLETE Column stats: COMPLETE + Statistics: Num rows: 7 Data size: 714 Basic stats: COMPLETE Column stats: COMPLETE table: input format: org.apache.hadoop.mapred.SequenceFileInputFormat output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat @@ -373,17 +373,17 @@ STAGE PLANS: Map Operator Tree: TableScan alias: loc_orc - Statistics: Num rows: 8 Data size: 804 Basic stats: COMPLETE Column stats: COMPLETE + Statistics: Num rows: 8 Data size: 816 Basic stats: COMPLETE Column stats: COMPLETE Filter Operator predicate: zip is not null (type: boolean) - Statistics: Num rows: 7 Data size: 702 Basic stats: COMPLETE Column stats: COMPLETE + Statistics: Num rows: 7 Data size: 714 Basic stats: COMPLETE Column stats: COMPLETE Select Operator expressions: state (type: string), locid (type: int), zip (type: bigint), year (type: int) outputColumnNames: _col0, _col1, _col2, _col3 - Statistics: Num rows: 7 Data size: 702 Basic stats: COMPLETE Column stats: COMPLETE + Statistics: Num rows: 7 Data size: 714 Basic stats: COMPLETE Column stats: COMPLETE File Output Operator compressed: false - Statistics: Num rows: 7 Data size: 702 Basic stats: COMPLETE Column stats: COMPLETE + Statistics: Num rows: 7 Data size: 714 Basic stats: COMPLETE Column stats: COMPLETE table: input format: org.apache.hadoop.mapred.SequenceFileInputFormat output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat @@ -413,11 +413,11 @@ STAGE PLANS: Processor Tree: TableScan alias: loc_orc - Statistics: Num rows: 8 Data size: 804 Basic stats: COMPLETE Column stats: COMPLETE + Statistics: Num rows: 8 Data size: 816 Basic stats: COMPLETE Column stats: COMPLETE Select Operator expressions: state (type: string), locid (type: int), zip (type: bigint), year (type: int) outputColumnNames: _col0, _col1, _col2, _col3 - Statistics: Num rows: 8 Data size: 804 Basic stats: COMPLETE Column stats: COMPLETE + Statistics: Num rows: 8 Data size: 816 Basic stats: COMPLETE Column stats: COMPLETE ListSink PREHOOK: query: -- numRows: 0 rawDataSize: 0 @@ -436,7 +436,7 @@ STAGE PLANS: Map Operator Tree: TableScan alias: loc_orc - Statistics: Num rows: 8 Data size: 804 Basic stats: COMPLETE Column stats: COMPLETE + Statistics: Num rows: 8 Data size: 816 Basic stats: COMPLETE Column stats: COMPLETE Filter Operator predicate: false (type: boolean) Statistics: Num rows: 1 Data size: 102 Basic stats: COMPLETE Column stats: COMPLETE @@ -476,11 +476,11 @@ STAGE PLANS: Processor Tree: TableScan alias: loc_orc - Statistics: Num rows: 8 Data size: 804 Basic stats: COMPLETE Column stats: COMPLETE + Statistics: Num rows: 8 Data size: 816 Basic stats: COMPLETE Column stats: COMPLETE Select Operator expressions: state (type: string), locid (type: int), zip (type: bigint), year (type: int) outputColumnNames: _col0, _col1, _col2, _col3 - Statistics: Num rows: 8 Data size: 804 Basic stats: COMPLETE Column stats: COMPLETE + Statistics: Num rows: 8 Data size: 816 Basic stats: COMPLETE Column stats: COMPLETE ListSink PREHOOK: query: -- numRows: 8 rawDataSize: 804 @@ -499,17 +499,17 @@ STAGE PLANS: Map Operator Tree: TableScan alias: loc_orc - Statistics: Num rows: 8 Data size: 804 Basic stats: COMPLETE Column stats: COMPLETE + Statistics: Num rows: 8 Data size: 816 Basic stats: COMPLETE Column stats: COMPLETE Filter Operator predicate: 'foo' (type: string) - Statistics: Num rows: 8 Data size: 804 Basic stats: COMPLETE Column stats: COMPLETE + Statistics: Num rows: 8 Data size: 816 Basic stats: COMPLETE Column stats: COMPLETE Select Operator expressions: state (type: string), locid (type: int), zip (type: bigint), year (type: int) outputColumnNames: _col0, _col1, _col2, _col3 - Statistics: Num rows: 8 Data size: 804 Basic stats: COMPLETE Column stats: COMPLETE + Statistics: Num rows: 8 Data size: 816 Basic stats: COMPLETE Column stats: COMPLETE File Output Operator compressed: false - Statistics: Num rows: 8 Data size: 804 Basic stats: COMPLETE Column stats: COMPLETE + Statistics: Num rows: 8 Data size: 816 Basic stats: COMPLETE Column stats: COMPLETE table: input format: org.apache.hadoop.mapred.SequenceFileInputFormat output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat @@ -537,11 +537,11 @@ STAGE PLANS: Processor Tree: TableScan alias: loc_orc - Statistics: Num rows: 8 Data size: 804 Basic stats: COMPLETE Column stats: COMPLETE + Statistics: Num rows: 8 Data size: 816 Basic stats: COMPLETE Column stats: COMPLETE Select Operator expressions: state (type: string), locid (type: int), zip (type: bigint), year (type: int) outputColumnNames: _col0, _col1, _col2, _col3 - Statistics: Num rows: 8 Data size: 804 Basic stats: COMPLETE Column stats: COMPLETE + Statistics: Num rows: 8 Data size: 816 Basic stats: COMPLETE Column stats: COMPLETE ListSink PREHOOK: query: -- numRows: 0 rawDataSize: 0 @@ -560,7 +560,7 @@ STAGE PLANS: Map Operator Tree: TableScan alias: loc_orc - Statistics: Num rows: 8 Data size: 804 Basic stats: COMPLETE Column stats: COMPLETE + Statistics: Num rows: 8 Data size: 816 Basic stats: COMPLETE Column stats: COMPLETE Filter Operator predicate: false (type: boolean) Statistics: Num rows: 1 Data size: 102 Basic stats: COMPLETE Column stats: COMPLETE @@ -598,7 +598,7 @@ STAGE PLANS: Map Operator Tree: TableScan alias: loc_orc - Statistics: Num rows: 8 Data size: 804 Basic stats: COMPLETE Column stats: COMPLETE + Statistics: Num rows: 8 Data size: 816 Basic stats: COMPLETE Column stats: COMPLETE Filter Operator predicate: false (type: boolean) Statistics: Num rows: 1 Data size: 102 Basic stats: COMPLETE Column stats: COMPLETE @@ -636,7 +636,7 @@ STAGE PLANS: Map Operator Tree: TableScan alias: loc_orc - Statistics: Num rows: 8 Data size: 804 Basic stats: COMPLETE Column stats: COMPLETE + Statistics: Num rows: 8 Data size: 816 Basic stats: COMPLETE Column stats: COMPLETE Filter Operator predicate: false (type: boolean) Statistics: Num rows: 1 Data size: 102 Basic stats: COMPLETE Column stats: COMPLETE @@ -676,7 +676,7 @@ STAGE PLANS: Map Operator Tree: TableScan alias: loc_orc - Statistics: Num rows: 8 Data size: 804 Basic stats: COMPLETE Column stats: COMPLETE + Statistics: Num rows: 8 Data size: 816 Basic stats: COMPLETE Column stats: COMPLETE Filter Operator predicate: ((state = 'OH') or (state = 'CA')) (type: boolean) Statistics: Num rows: 2 Data size: 204 Basic stats: COMPLETE Column stats: COMPLETE @@ -716,7 +716,7 @@ STAGE PLANS: Map Operator Tree: TableScan alias: loc_orc - Statistics: Num rows: 8 Data size: 804 Basic stats: COMPLETE Column stats: COMPLETE + Statistics: Num rows: 8 Data size: 816 Basic stats: COMPLETE Column stats: COMPLETE Filter Operator predicate: false (type: boolean) Statistics: Num rows: 1 Data size: 102 Basic stats: COMPLETE Column stats: COMPLETE @@ -754,7 +754,7 @@ STAGE PLANS: Map Operator Tree: TableScan alias: loc_orc - Statistics: Num rows: 8 Data size: 804 Basic stats: COMPLETE Column stats: COMPLETE + Statistics: Num rows: 8 Data size: 816 Basic stats: COMPLETE Column stats: COMPLETE Filter Operator predicate: false (type: boolean) Statistics: Num rows: 1 Data size: 102 Basic stats: COMPLETE Column stats: COMPLETE @@ -794,7 +794,7 @@ STAGE PLANS: Map Operator Tree: TableScan alias: loc_orc - Statistics: Num rows: 8 Data size: 804 Basic stats: COMPLETE Column stats: COMPLETE + Statistics: Num rows: 8 Data size: 816 Basic stats: COMPLETE Column stats: COMPLETE Filter Operator predicate: (((year = 2001) and year is null) or (state = 'CA')) (type: boolean) Statistics: Num rows: 2 Data size: 204 Basic stats: COMPLETE Column stats: COMPLETE @@ -834,7 +834,7 @@ STAGE PLANS: Map Operator Tree: TableScan alias: loc_orc - Statistics: Num rows: 8 Data size: 804 Basic stats: COMPLETE Column stats: COMPLETE + Statistics: Num rows: 8 Data size: 816 Basic stats: COMPLETE Column stats: COMPLETE Filter Operator predicate: (((year = 2001) or year is null) and (state = 'CA')) (type: boolean) Statistics: Num rows: 1 Data size: 102 Basic stats: COMPLETE Column stats: COMPLETE @@ -876,17 +876,17 @@ STAGE PLANS: Map Operator Tree: TableScan alias: loc_orc - Statistics: Num rows: 8 Data size: 804 Basic stats: COMPLETE Column stats: COMPLETE + Statistics: Num rows: 8 Data size: 816 Basic stats: COMPLETE Column stats: COMPLETE Filter Operator predicate: (locid < 30) (type: boolean) - Statistics: Num rows: 8 Data size: 804 Basic stats: COMPLETE Column stats: COMPLETE + Statistics: Num rows: 8 Data size: 816 Basic stats: COMPLETE Column stats: COMPLETE Select Operator expressions: state (type: string), locid (type: int), zip (type: bigint), year (type: int) outputColumnNames: _col0, _col1, _col2, _col3 - Statistics: Num rows: 8 Data size: 804 Basic stats: COMPLETE Column stats: COMPLETE + Statistics: Num rows: 8 Data size: 816 Basic stats: COMPLETE Column stats: COMPLETE File Output Operator compressed: false - Statistics: Num rows: 8 Data size: 804 Basic stats: COMPLETE Column stats: COMPLETE + Statistics: Num rows: 8 Data size: 816 Basic stats: COMPLETE Column stats: COMPLETE table: input format: org.apache.hadoop.mapred.SequenceFileInputFormat output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat @@ -912,7 +912,7 @@ STAGE PLANS: Map Operator Tree: TableScan alias: loc_orc - Statistics: Num rows: 8 Data size: 804 Basic stats: COMPLETE Column stats: COMPLETE + Statistics: Num rows: 8 Data size: 816 Basic stats: COMPLETE Column stats: COMPLETE Filter Operator predicate: (locid > 30) (type: boolean) Statistics: Num rows: 1 Data size: 102 Basic stats: COMPLETE Column stats: COMPLETE @@ -948,17 +948,17 @@ STAGE PLANS: Map Operator Tree: TableScan alias: loc_orc - Statistics: Num rows: 8 Data size: 804 Basic stats: COMPLETE Column stats: COMPLETE + Statistics: Num rows: 8 Data size: 816 Basic stats: COMPLETE Column stats: COMPLETE Filter Operator predicate: (locid <= 30) (type: boolean) - Statistics: Num rows: 8 Data size: 804 Basic stats: COMPLETE Column stats: COMPLETE + Statistics: Num rows: 8 Data size: 816 Basic stats: COMPLETE Column stats: COMPLETE Select Operator expressions: state (type: string), locid (type: int), zip (type: bigint), year (type: int) outputColumnNames: _col0, _col1, _col2, _col3 - Statistics: Num rows: 8 Data size: 804 Basic stats: COMPLETE Column stats: COMPLETE + Statistics: Num rows: 8 Data size: 816 Basic stats: COMPLETE Column stats: COMPLETE File Output Operator compressed: false - Statistics: Num rows: 8 Data size: 804 Basic stats: COMPLETE Column stats: COMPLETE + Statistics: Num rows: 8 Data size: 816 Basic stats: COMPLETE Column stats: COMPLETE table: input format: org.apache.hadoop.mapred.SequenceFileInputFormat output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat @@ -984,7 +984,7 @@ STAGE PLANS: Map Operator Tree: TableScan alias: loc_orc - Statistics: Num rows: 8 Data size: 804 Basic stats: COMPLETE Column stats: COMPLETE + Statistics: Num rows: 8 Data size: 816 Basic stats: COMPLETE Column stats: COMPLETE Filter Operator predicate: (locid >= 30) (type: boolean) Statistics: Num rows: 1 Data size: 102 Basic stats: COMPLETE Column stats: COMPLETE @@ -1024,7 +1024,7 @@ STAGE PLANS: Map Operator Tree: TableScan alias: loc_orc - Statistics: Num rows: 8 Data size: 804 Basic stats: COMPLETE Column stats: COMPLETE + Statistics: Num rows: 8 Data size: 816 Basic stats: COMPLETE Column stats: COMPLETE Filter Operator predicate: (locid < 3) (type: boolean) Statistics: Num rows: 2 Data size: 204 Basic stats: COMPLETE Column stats: COMPLETE @@ -1060,7 +1060,7 @@ STAGE PLANS: Map Operator Tree: TableScan alias: loc_orc - Statistics: Num rows: 8 Data size: 804 Basic stats: COMPLETE Column stats: COMPLETE + Statistics: Num rows: 8 Data size: 816 Basic stats: COMPLETE Column stats: COMPLETE Filter Operator predicate: (locid > 3) (type: boolean) Statistics: Num rows: 2 Data size: 204 Basic stats: COMPLETE Column stats: COMPLETE @@ -1096,7 +1096,7 @@ STAGE PLANS: Map Operator Tree: TableScan alias: loc_orc - Statistics: Num rows: 8 Data size: 804 Basic stats: COMPLETE Column stats: COMPLETE + Statistics: Num rows: 8 Data size: 816 Basic stats: COMPLETE Column stats: COMPLETE Filter Operator predicate: (locid <= 3) (type: boolean) Statistics: Num rows: 2 Data size: 204 Basic stats: COMPLETE Column stats: COMPLETE @@ -1132,7 +1132,7 @@ STAGE PLANS: Map Operator Tree: TableScan alias: loc_orc - Statistics: Num rows: 8 Data size: 804 Basic stats: COMPLETE Column stats: COMPLETE + Statistics: Num rows: 8 Data size: 816 Basic stats: COMPLETE Column stats: COMPLETE Filter Operator predicate: (locid >= 3) (type: boolean) Statistics: Num rows: 2 Data size: 204 Basic stats: COMPLETE Column stats: COMPLETE http://git-wip-us.apache.org/repos/asf/hive/blob/8efe6f7f/ql/src/test/results/clientpositive/annotate_stats_groupby.q.out ---------------------------------------------------------------------- diff --git a/ql/src/test/results/clientpositive/annotate_stats_groupby.q.out b/ql/src/test/results/clientpositive/annotate_stats_groupby.q.out index 3070407..f6971a0 100644 --- a/ql/src/test/results/clientpositive/annotate_stats_groupby.q.out +++ b/ql/src/test/results/clientpositive/annotate_stats_groupby.q.out @@ -248,21 +248,21 @@ STAGE PLANS: Map Operator Tree: TableScan alias: loc_orc - Statistics: Num rows: 8 Data size: 28 Basic stats: COMPLETE Column stats: COMPLETE + Statistics: Num rows: 8 Data size: 32 Basic stats: COMPLETE Column stats: COMPLETE Select Operator expressions: year (type: int) outputColumnNames: year - Statistics: Num rows: 8 Data size: 28 Basic stats: COMPLETE Column stats: COMPLETE + Statistics: Num rows: 8 Data size: 32 Basic stats: COMPLETE Column stats: COMPLETE Group By Operator keys: year (type: int) mode: hash outputColumnNames: _col0 - Statistics: Num rows: 8 Data size: 28 Basic stats: COMPLETE Column stats: COMPLETE + Statistics: Num rows: 8 Data size: 32 Basic stats: COMPLETE Column stats: COMPLETE Reduce Output Operator key expressions: _col0 (type: int) sort order: + Map-reduce partition columns: _col0 (type: int) - Statistics: Num rows: 8 Data size: 28 Basic stats: COMPLETE Column stats: COMPLETE + Statistics: Num rows: 8 Data size: 32 Basic stats: COMPLETE Column stats: COMPLETE Reduce Operator Tree: Group By Operator keys: KEY._col0 (type: int) @@ -682,11 +682,11 @@ STAGE PLANS: Map Operator Tree: TableScan alias: loc_orc - Statistics: Num rows: 8 Data size: 28 Basic stats: COMPLETE Column stats: COMPLETE + Statistics: Num rows: 8 Data size: 32 Basic stats: COMPLETE Column stats: COMPLETE Select Operator expressions: year (type: int) outputColumnNames: year - Statistics: Num rows: 8 Data size: 28 Basic stats: COMPLETE Column stats: COMPLETE + Statistics: Num rows: 8 Data size: 32 Basic stats: COMPLETE Column stats: COMPLETE Group By Operator keys: year (type: int) mode: hash http://git-wip-us.apache.org/repos/asf/hive/blob/8efe6f7f/ql/src/test/results/clientpositive/annotate_stats_join.q.out ---------------------------------------------------------------------- diff --git a/ql/src/test/results/clientpositive/annotate_stats_join.q.out b/ql/src/test/results/clientpositive/annotate_stats_join.q.out index 4398f1b..0c21c66 100644 --- a/ql/src/test/results/clientpositive/annotate_stats_join.q.out +++ b/ql/src/test/results/clientpositive/annotate_stats_join.q.out @@ -560,19 +560,19 @@ STAGE PLANS: value expressions: _col1 (type: string) TableScan alias: l - Statistics: Num rows: 8 Data size: 804 Basic stats: COMPLETE Column stats: COMPLETE + Statistics: Num rows: 8 Data size: 816 Basic stats: COMPLETE Column stats: COMPLETE Filter Operator predicate: locid is not null (type: boolean) - Statistics: Num rows: 8 Data size: 804 Basic stats: COMPLETE Column stats: COMPLETE + Statistics: Num rows: 8 Data size: 816 Basic stats: COMPLETE Column stats: COMPLETE Select Operator expressions: state (type: string), locid (type: int), zip (type: bigint), year (type: int) outputColumnNames: _col0, _col1, _col2, _col3 - Statistics: Num rows: 8 Data size: 804 Basic stats: COMPLETE Column stats: COMPLETE + Statistics: Num rows: 8 Data size: 816 Basic stats: COMPLETE Column stats: COMPLETE Reduce Output Operator key expressions: _col1 (type: int) sort order: + Map-reduce partition columns: _col1 (type: int) - Statistics: Num rows: 8 Data size: 804 Basic stats: COMPLETE Column stats: COMPLETE + Statistics: Num rows: 8 Data size: 816 Basic stats: COMPLETE Column stats: COMPLETE value expressions: _col0 (type: string), _col2 (type: bigint), _col3 (type: int) Reduce Operator Tree: Join Operator @@ -648,19 +648,19 @@ STAGE PLANS: Statistics: Num rows: 6 Data size: 570 Basic stats: COMPLETE Column stats: COMPLETE TableScan alias: l - Statistics: Num rows: 8 Data size: 804 Basic stats: COMPLETE Column stats: COMPLETE + Statistics: Num rows: 8 Data size: 816 Basic stats: COMPLETE Column stats: COMPLETE Filter Operator predicate: (state is not null and locid is not null) (type: boolean) - Statistics: Num rows: 8 Data size: 804 Basic stats: COMPLETE Column stats: COMPLETE + Statistics: Num rows: 8 Data size: 816 Basic stats: COMPLETE Column stats: COMPLETE Select Operator expressions: state (type: string), locid (type: int), zip (type: bigint), year (type: int) outputColumnNames: _col0, _col1, _col2, _col3 - Statistics: Num rows: 8 Data size: 804 Basic stats: COMPLETE Column stats: COMPLETE + Statistics: Num rows: 8 Data size: 816 Basic stats: COMPLETE Column stats: COMPLETE Reduce Output Operator key expressions: _col0 (type: string), _col1 (type: int) sort order: ++ Map-reduce partition columns: _col0 (type: string), _col1 (type: int) - Statistics: Num rows: 8 Data size: 804 Basic stats: COMPLETE Column stats: COMPLETE + Statistics: Num rows: 8 Data size: 816 Basic stats: COMPLETE Column stats: COMPLETE value expressions: _col2 (type: bigint), _col3 (type: int) Reduce Operator Tree: Join Operator http://git-wip-us.apache.org/repos/asf/hive/blob/8efe6f7f/ql/src/test/results/clientpositive/annotate_stats_join_pkfk.q.out ---------------------------------------------------------------------- diff --git a/ql/src/test/results/clientpositive/annotate_stats_join_pkfk.q.out b/ql/src/test/results/clientpositive/annotate_stats_join_pkfk.q.out index 64a57fe..224f1ff 100644 --- a/ql/src/test/results/clientpositive/annotate_stats_join_pkfk.q.out +++ b/ql/src/test/results/clientpositive/annotate_stats_join_pkfk.q.out @@ -289,19 +289,19 @@ STAGE PLANS: Statistics: Num rows: 12 Data size: 48 Basic stats: COMPLETE Column stats: COMPLETE TableScan alias: ss - Statistics: Num rows: 1000 Data size: 3856 Basic stats: COMPLETE Column stats: COMPLETE + Statistics: Num rows: 1000 Data size: 3860 Basic stats: COMPLETE Column stats: COMPLETE Filter Operator predicate: ss_store_sk is not null (type: boolean) - Statistics: Num rows: 964 Data size: 3716 Basic stats: COMPLETE Column stats: COMPLETE + Statistics: Num rows: 964 Data size: 3720 Basic stats: COMPLETE Column stats: COMPLETE Select Operator expressions: ss_store_sk (type: int) outputColumnNames: _col0 - Statistics: Num rows: 964 Data size: 3716 Basic stats: COMPLETE Column stats: COMPLETE + Statistics: Num rows: 964 Data size: 3720 Basic stats: COMPLETE Column stats: COMPLETE Reduce Output Operator key expressions: _col0 (type: int) sort order: + Map-reduce partition columns: _col0 (type: int) - Statistics: Num rows: 964 Data size: 3716 Basic stats: COMPLETE Column stats: COMPLETE + Statistics: Num rows: 964 Data size: 3720 Basic stats: COMPLETE Column stats: COMPLETE Reduce Operator Tree: Join Operator condition map: @@ -354,19 +354,19 @@ STAGE PLANS: Statistics: Num rows: 12 Data size: 48 Basic stats: COMPLETE Column stats: COMPLETE TableScan alias: ss - Statistics: Num rows: 1000 Data size: 3856 Basic stats: COMPLETE Column stats: COMPLETE + Statistics: Num rows: 1000 Data size: 3860 Basic stats: COMPLETE Column stats: COMPLETE Filter Operator predicate: (ss_store_sk > 0) (type: boolean) - Statistics: Num rows: 1000 Data size: 3856 Basic stats: COMPLETE Column stats: COMPLETE + Statistics: Num rows: 1000 Data size: 3860 Basic stats: COMPLETE Column stats: COMPLETE Select Operator expressions: ss_store_sk (type: int) outputColumnNames: _col0 - Statistics: Num rows: 1000 Data size: 3856 Basic stats: COMPLETE Column stats: COMPLETE + Statistics: Num rows: 1000 Data size: 3860 Basic stats: COMPLETE Column stats: COMPLETE Reduce Output Operator key expressions: _col0 (type: int) sort order: + Map-reduce partition columns: _col0 (type: int) - Statistics: Num rows: 1000 Data size: 3856 Basic stats: COMPLETE Column stats: COMPLETE + Statistics: Num rows: 1000 Data size: 3860 Basic stats: COMPLETE Column stats: COMPLETE Reduce Operator Tree: Join Operator condition map: @@ -419,19 +419,19 @@ STAGE PLANS: Statistics: Num rows: 4 Data size: 16 Basic stats: COMPLETE Column stats: PARTIAL TableScan alias: ss - Statistics: Num rows: 1000 Data size: 7668 Basic stats: COMPLETE Column stats: COMPLETE + Statistics: Num rows: 1000 Data size: 7676 Basic stats: COMPLETE Column stats: COMPLETE Filter Operator predicate: ((ss_quantity > 10) and ss_store_sk is not null) (type: boolean) - Statistics: Num rows: 321 Data size: 2460 Basic stats: COMPLETE Column stats: COMPLETE + Statistics: Num rows: 321 Data size: 2468 Basic stats: COMPLETE Column stats: COMPLETE Select Operator expressions: ss_store_sk (type: int) outputColumnNames: _col0 - Statistics: Num rows: 321 Data size: 2460 Basic stats: COMPLETE Column stats: COMPLETE + Statistics: Num rows: 321 Data size: 2468 Basic stats: COMPLETE Column stats: COMPLETE Reduce Output Operator key expressions: _col0 (type: int) sort order: + Map-reduce partition columns: _col0 (type: int) - Statistics: Num rows: 321 Data size: 2460 Basic stats: COMPLETE Column stats: COMPLETE + Statistics: Num rows: 321 Data size: 2468 Basic stats: COMPLETE Column stats: COMPLETE Reduce Operator Tree: Join Operator condition map: @@ -484,19 +484,19 @@ STAGE PLANS: Statistics: Num rows: 12 Data size: 96 Basic stats: COMPLETE Column stats: COMPLETE TableScan alias: ss - Statistics: Num rows: 1000 Data size: 3856 Basic stats: COMPLETE Column stats: COMPLETE + Statistics: Num rows: 1000 Data size: 3860 Basic stats: COMPLETE Column stats: COMPLETE Filter Operator predicate: ss_store_sk is not null (type: boolean) - Statistics: Num rows: 964 Data size: 3716 Basic stats: COMPLETE Column stats: COMPLETE + Statistics: Num rows: 964 Data size: 3720 Basic stats: COMPLETE Column stats: COMPLETE Select Operator expressions: ss_store_sk (type: int) outputColumnNames: _col0 - Statistics: Num rows: 964 Data size: 3716 Basic stats: COMPLETE Column stats: COMPLETE + Statistics: Num rows: 964 Data size: 3720 Basic stats: COMPLETE Column stats: COMPLETE Reduce Output Operator key expressions: _col0 (type: int) sort order: + Map-reduce partition columns: _col0 (type: int) - Statistics: Num rows: 964 Data size: 3716 Basic stats: COMPLETE Column stats: COMPLETE + Statistics: Num rows: 964 Data size: 3720 Basic stats: COMPLETE Column stats: COMPLETE Reduce Operator Tree: Join Operator condition map: @@ -549,19 +549,19 @@ STAGE PLANS: Statistics: Num rows: 12 Data size: 48 Basic stats: COMPLETE Column stats: COMPLETE TableScan alias: ss - Statistics: Num rows: 1000 Data size: 7668 Basic stats: COMPLETE Column stats: COMPLETE + Statistics: Num rows: 1000 Data size: 7676 Basic stats: COMPLETE Column stats: COMPLETE Filter Operator predicate: ((ss_quantity > 10) and ss_store_sk is not null) (type: boolean) - Statistics: Num rows: 321 Data size: 2460 Basic stats: COMPLETE Column stats: COMPLETE + Statistics: Num rows: 321 Data size: 2468 Basic stats: COMPLETE Column stats: COMPLETE Select Operator expressions: ss_store_sk (type: int) outputColumnNames: _col0 - Statistics: Num rows: 321 Data size: 2460 Basic stats: COMPLETE Column stats: COMPLETE + Statistics: Num rows: 321 Data size: 2468 Basic stats: COMPLETE Column stats: COMPLETE Reduce Output Operator key expressions: _col0 (type: int) sort order: + Map-reduce partition columns: _col0 (type: int) - Statistics: Num rows: 321 Data size: 2460 Basic stats: COMPLETE Column stats: COMPLETE + Statistics: Num rows: 321 Data size: 2468 Basic stats: COMPLETE Column stats: COMPLETE Reduce Operator Tree: Join Operator condition map: @@ -599,19 +599,19 @@ STAGE PLANS: Map Operator Tree: TableScan alias: ss - Statistics: Num rows: 1000 Data size: 3856 Basic stats: COMPLETE Column stats: COMPLETE + Statistics: Num rows: 1000 Data size: 3860 Basic stats: COMPLETE Column stats: COMPLETE Filter Operator predicate: ss_store_sk is not null (type: boolean) - Statistics: Num rows: 964 Data size: 3716 Basic stats: COMPLETE Column stats: COMPLETE + Statistics: Num rows: 964 Data size: 3720 Basic stats: COMPLETE Column stats: COMPLETE Select Operator expressions: ss_store_sk (type: int) outputColumnNames: _col0 - Statistics: Num rows: 964 Data size: 3716 Basic stats: COMPLETE Column stats: COMPLETE + Statistics: Num rows: 964 Data size: 3720 Basic stats: COMPLETE Column stats: COMPLETE Reduce Output Operator key expressions: _col0 (type: int) sort order: + Map-reduce partition columns: _col0 (type: int) - Statistics: Num rows: 964 Data size: 3716 Basic stats: COMPLETE Column stats: COMPLETE + Statistics: Num rows: 964 Data size: 3720 Basic stats: COMPLETE Column stats: COMPLETE TableScan alias: s Statistics: Num rows: 12 Data size: 48 Basic stats: COMPLETE Column stats: COMPLETE @@ -685,7 +685,7 @@ STAGE PLANS: Map Operator Tree: TableScan alias: ss - Statistics: Num rows: 1000 Data size: 3856 Basic stats: COMPLETE Column stats: COMPLETE + Statistics: Num rows: 1000 Data size: 3860 Basic stats: COMPLETE Column stats: COMPLETE Filter Operator predicate: (ss_store_sk > 1000) (type: boolean) Statistics: Num rows: 1 Data size: 4 Basic stats: COMPLETE Column stats: COMPLETE @@ -771,19 +771,19 @@ STAGE PLANS: Map Operator Tree: TableScan alias: ss - Statistics: Num rows: 1000 Data size: 3856 Basic stats: COMPLETE Column stats: COMPLETE + Statistics: Num rows: 1000 Data size: 3860 Basic stats: COMPLETE Column stats: COMPLETE Filter Operator predicate: ss_store_sk is not null (type: boolean) - Statistics: Num rows: 964 Data size: 3716 Basic stats: COMPLETE Column stats: COMPLETE + Statistics: Num rows: 964 Data size: 3720 Basic stats: COMPLETE Column stats: COMPLETE Select Operator expressions: ss_store_sk (type: int) outputColumnNames: _col0 - Statistics: Num rows: 964 Data size: 3716 Basic stats: COMPLETE Column stats: COMPLETE + Statistics: Num rows: 964 Data size: 3720 Basic stats: COMPLETE Column stats: COMPLETE Reduce Output Operator key expressions: _col0 (type: int) sort order: + Map-reduce partition columns: _col0 (type: int) - Statistics: Num rows: 964 Data size: 3716 Basic stats: COMPLETE Column stats: COMPLETE + Statistics: Num rows: 964 Data size: 3720 Basic stats: COMPLETE Column stats: COMPLETE TableScan alias: s Statistics: Num rows: 12 Data size: 96 Basic stats: COMPLETE Column stats: COMPLETE @@ -857,19 +857,19 @@ STAGE PLANS: Map Operator Tree: TableScan alias: ss - Statistics: Num rows: 1000 Data size: 7668 Basic stats: COMPLETE Column stats: COMPLETE + Statistics: Num rows: 1000 Data size: 7676 Basic stats: COMPLETE Column stats: COMPLETE Filter Operator predicate: ((ss_quantity > 10) and ss_store_sk is not null) (type: boolean) - Statistics: Num rows: 321 Data size: 2460 Basic stats: COMPLETE Column stats: COMPLETE + Statistics: Num rows: 321 Data size: 2468 Basic stats: COMPLETE Column stats: COMPLETE Select Operator expressions: ss_store_sk (type: int) outputColumnNames: _col0 - Statistics: Num rows: 321 Data size: 2460 Basic stats: COMPLETE Column stats: COMPLETE + Statistics: Num rows: 321 Data size: 2468 Basic stats: COMPLETE Column stats: COMPLETE Reduce Output Operator key expressions: _col0 (type: int) sort order: + Map-reduce partition columns: _col0 (type: int) - Statistics: Num rows: 321 Data size: 2460 Basic stats: COMPLETE Column stats: COMPLETE + Statistics: Num rows: 321 Data size: 2468 Basic stats: COMPLETE Column stats: COMPLETE TableScan alias: s Statistics: Num rows: 12 Data size: 48 Basic stats: COMPLETE Column stats: COMPLETE @@ -944,19 +944,19 @@ STAGE PLANS: Map Operator Tree: TableScan alias: ss - Statistics: Num rows: 1000 Data size: 7656 Basic stats: COMPLETE Column stats: COMPLETE + Statistics: Num rows: 1000 Data size: 7664 Basic stats: COMPLETE Column stats: COMPLETE Filter Operator predicate: (ss_store_sk is not null and ss_addr_sk is not null) (type: boolean) - Statistics: Num rows: 916 Data size: 7012 Basic stats: COMPLETE Column stats: COMPLETE + Statistics: Num rows: 916 Data size: 7020 Basic stats: COMPLETE Column stats: COMPLETE Select Operator expressions: ss_addr_sk (type: int), ss_store_sk (type: int) outputColumnNames: _col0, _col1 - Statistics: Num rows: 916 Data size: 7012 Basic stats: COMPLETE Column stats: COMPLETE + Statistics: Num rows: 916 Data size: 7020 Basic stats: COMPLETE Column stats: COMPLETE Reduce Output Operator key expressions: _col1 (type: int) sort order: + Map-reduce partition columns: _col1 (type: int) - Statistics: Num rows: 916 Data size: 7012 Basic stats: COMPLETE Column stats: COMPLETE + Statistics: Num rows: 916 Data size: 7020 Basic stats: COMPLETE Column stats: COMPLETE value expressions: _col0 (type: int) TableScan alias: s
