-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/39836/#review105945
-----------------------------------------------------------



ql/src/java/org/apache/hadoop/hive/ql/stats/StatsUtils.java (line 192)
<https://reviews.apache.org/r/39836/#comment164644>

    Would it be better to do this only when we have complete column stats?
Incomplete or missing column stats can lead to underestimation. Overestimation is
sometimes fine (thanks to auto-reducer parallelism), but underestimation will
hurt performance.
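
    A rough sketch of the guard I have in mind. The enum, class, and
parameter names here are made up for illustration and are not the actual
StatsUtils code; the idea is just to fall back to the existing estimate
unless column stats are complete.

```java
// Illustrative only: hypothetical names, not Hive's real types or methods.
final class DataSizeEstimate {

    /** Hypothetical stand-in for the state of column statistics. */
    enum ColStatsState { COMPLETE, PARTIAL, NONE }

    /**
     * Picks the data size to report for a table scan.
     *
     * @param state            how complete the column stats are (assumed input)
     * @param rawDataSize      estimate from basic table stats
     * @param colStatsDataSize estimate derived from column stats
     */
    static long estimate(ColStatsState state, long rawDataSize, long colStatsDataSize) {
        // Trust the column-stats figure only when every column has stats;
        // partial or missing stats could produce an underestimate.
        if (state == ColStatsState.COMPLETE) {
            return colStatsDataSize;
        }
        // Otherwise keep the existing (possibly larger) estimate.
        return rawDataSize;
    }

    public static void main(String[] args) {
        System.out.println(estimate(ColStatsState.COMPLETE, 1000L, 400L));
        System.out.println(estimate(ColStatsState.PARTIAL, 1000L, 150L));
        System.out.println(estimate(ColStatsState.NONE, 1000L, 0L));
    }
}
```

    With this shape, partial stats never shrink the estimate, so the worst
case is an overestimate.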


- Prasanth_J


On Oct. 31, 2015, 10:11 p.m., Ashutosh Chauhan wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/39836/
> -----------------------------------------------------------
> 
> (Updated Oct. 31, 2015, 10:11 p.m.)
> 
> 
> Review request for hive and Prasanth_J.
> 
> 
> Bugs: HIVE-12309
>     https://issues.apache.org/jira/browse/HIVE-12309
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> TableScan should use column stats when available for better data size estimate
> 
> 
> Diffs
> -----
> 
>   ql/src/java/org/apache/hadoop/hive/ql/stats/StatsUtils.java e1f8ebc 
>   ql/src/test/results/clientpositive/annotate_stats_deep_filters.q.out fc4f294 
>   ql/src/test/results/clientpositive/annotate_stats_filter.q.out 054b573 
>   ql/src/test/results/clientpositive/annotate_stats_groupby.q.out 1b9ec68 
>   ql/src/test/results/clientpositive/annotate_stats_groupby2.q.out be3fa1d 
>   ql/src/test/results/clientpositive/annotate_stats_join.q.out bc44cc3 
>   ql/src/test/results/clientpositive/annotate_stats_join_pkfk.q.out c864c04 
>   ql/src/test/results/clientpositive/annotate_stats_limit.q.out 7300ea0 
>   ql/src/test/results/clientpositive/annotate_stats_part.q.out cf523cb 
>   ql/src/test/results/clientpositive/annotate_stats_select.q.out 877037d 
>   ql/src/test/results/clientpositive/annotate_stats_table.q.out ebc6c5b 
>   ql/src/test/results/clientpositive/annotate_stats_union.q.out e09dde3 
>   ql/src/test/results/clientpositive/cbo_rp_auto_join0.q.out d1bc6d4 
>   ql/src/test/results/clientpositive/cbo_rp_auto_join1.q.out 3b053fe 
>   ql/src/test/results/clientpositive/cbo_rp_join0.q.out a8bcc90 
>   ql/src/test/results/clientpositive/extrapolate_part_stats_full.q.out f87a539 
>   ql/src/test/results/clientpositive/extrapolate_part_stats_partial.q.out 5903cd1 
>   ql/src/test/results/clientpositive/extrapolate_part_stats_partial_ndv.q.out 2ea1e6e 
>   ql/src/test/results/clientpositive/llap/llapdecider.q.out 676a0e4 
>   ql/src/test/results/clientpositive/spark/annotate_stats_join.q.out 8955a61 
>   ql/src/test/results/clientpositive/stats_ppr_all.q.out 7627f7a 
>   ql/src/test/results/clientpositive/tez/explainuser_1.q.out ec434f0 
>   ql/src/test/results/clientpositive/tez/llapdecider.q.out 676a0e4 
> 
> Diff: https://reviews.apache.org/r/39836/diff/
> 
> 
> Testing
> -------
> 
> Existing tests
> 
> 
> Thanks,
> 
> Ashutosh Chauhan
> 
>
