> On Nov. 9, 2017, 7:51 p.m., Ashutosh Chauhan wrote:
> > ql/src/test/results/clientpositive/llap/auto_sortmerge_join_12.q.out
> > Line 160 (original), 160 (patched)
> > <https://reviews.apache.org/r/63442/diff/2/?file=1886244#file1886244line160>
> >
> >     bucket_small has no stats gathered. This should be NONE.
> 
> Zoltan Haindrich wrote:
>     `hive.stats.autogather` is enabled by default from `HiveConf`
> 
> Ashutosh Chauhan wrote:
>     Those are load statements, not inserts. We gather stats only with 
> insert statements, not with load statements.
> 
> Zoltan Haindrich wrote:
>     Sorry, you are right: basic stats are not gathered at all in this 
> case.
>     
>     But the stat state is still COMPLETE, because there is logic that scans 
> the file sizes to calculate the data size, and from there HIVE-16811 can 
> guess some row counts:
>     
>     
> https://github.com/kgyrtkirk/hive/blob/9f67a878512117eb5c251794adc1a91bae62fea7/ql/src/java/org/apache/hadoop/hive/ql/stats/StatsUtils.java#L386-L393
>     
>     First, I would like to make the calculations for standalone tables and 
> partitioned tables a bit more similar to each other.
>     
>     I've tried to come up with some definitions for NONE/PARTIAL/COMPLETE; 
> currently I would say the following:
>     
>     * NONE: not known
>         * on table: no information (afaik currently this can't happen)
>         * estimation tree: all nodes in the estimation tree were NONE
>     * PARTIAL:
>         * on table: the current information is estimated from data size
>         * estimation tree: contains at least one NONE/PARTIAL
>     * COMPLETE:
>         * on table: the current information is correct (calculated by stats tasks)
>         * estimation tree: the whole subtree has COMPLETE status
>     
>     Using these definitions, I would say that the estimation based on 
> filesystem size should be considered PARTIAL.
> 
> Ashutosh Chauhan wrote:
>     The definitions sound good. Let's use them to make sure our state 
> calculation logic is built on them.
>     Can you also add this in code comments?

I've opened HIVE-18062 to address these problems.
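
For reference, the NONE/PARTIAL/COMPLETE definitions quoted above could be 
sketched roughly as follows. This is only an illustration and not the code in 
the patch: StatStateSketch, forTable and merge are hypothetical names, and 
two points are assumptions of this sketch: an estimation tree whose nodes are 
all NONE is reported as NONE (not PARTIAL), and a node without children is 
treated as NONE.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch of the proposed definitions; not Hive's actual code. */
final class StatStateSketch {

    /** The three states discussed in the review thread. */
    enum State { NONE, PARTIAL, COMPLETE }

    /**
     * State of a single table (a leaf of the estimation tree).
     *
     * @param hasBasicStats row counts were calculated by a stats task
     * @param hasDataSize   only the filesystem size is known, so row counts
     *                      can merely be estimated from it
     */
    static State forTable(boolean hasBasicStats, boolean hasDataSize) {
        if (hasBasicStats) {
            return State.COMPLETE;   // information is correct
        }
        if (hasDataSize) {
            return State.PARTIAL;    // information is estimated from data size
        }
        return State.NONE;           // nothing is known
    }

    /** Combines the states of the children of an estimation tree node. */
    static State merge(List<State> children) {
        if (children.isEmpty()) {
            return State.NONE;       // assumption: no inputs means "not known"
        }
        boolean allNone = true;
        boolean allComplete = true;
        for (State s : children) {
            allNone &= (s == State.NONE);
            allComplete &= (s == State.COMPLETE);
        }
        if (allComplete) {
            return State.COMPLETE;   // the whole subtree is COMPLETE
        }
        if (allNone) {
            return State.NONE;       // every node in the subtree is NONE
        }
        return State.PARTIAL;        // at least one NONE/PARTIAL node
    }

    public static void main(String[] args) {
        // A table loaded with LOAD DATA: no basic stats, but file sizes exist.
        State loaded = forTable(false, true);
        // A table populated with INSERT: the stats task has run.
        State inserted = forTable(true, true);
        System.out.println(loaded + " + " + inserted + " -> "
                + merge(Arrays.asList(loaded, inserted)));
    }
}
```

With these rules, a join between a table filled by load statements and one 
filled by inserts would be reported as PARTIAL, which matches the reading that 
the filesystem-size-based estimation should not count as COMPLETE.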


- Zoltan


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/63442/#review190633
-----------------------------------------------------------


On Nov. 9, 2017, 5:39 p.m., Zoltan Haindrich wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/63442/
> -----------------------------------------------------------
> 
> (Updated Nov. 9, 2017, 5:39 p.m.)
> 
> 
> Review request for hive and Ashutosh Chauhan.
> 
> 
> Bugs: HIVE-17934
>     https://issues.apache.org/jira/browse/HIVE-17934
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> * remove the reactive stat state guessing method
> * make the guessing only work when a new object is created
> * change the way stat objects are merged
> 
> This patch will most probably break almost all qtest outputs.
> 
> 
> Diffs
> -----
> 
>   accumulo-handler/src/test/results/positive/accumulo_queries.q.out 
> b3adf4e504 
>   hbase-handler/src/test/results/positive/hbase_queries.q.out b2eda12e95 
>   hbase-handler/src/test/results/positive/hbasestats.q.out 29eefd43a9 
>   
> ql/src/java/org/apache/hadoop/hive/ql/optimizer/spark/SparkMapJoinOptimizer.java
>  7a3fae65e8 
>   
> ql/src/java/org/apache/hadoop/hive/ql/optimizer/stats/annotation/StatsRulesProcFactory.java
>  a4f60accce 
>   ql/src/java/org/apache/hadoop/hive/ql/plan/Statistics.java 8ffb4ce44b 
>   ql/src/java/org/apache/hadoop/hive/ql/stats/StatsUtils.java ce7c96c639 
>   ql/src/test/queries/clientpositive/lateral_view_onview2.q PRE-CREATION 
>   ql/src/test/queries/clientpositive/stats_empty_partition2.q PRE-CREATION 
>   ql/src/test/results/clientpositive/acid_table_stats.q.out 351ff0da0a 
>   ql/src/test/results/clientpositive/alterColumnStatsPart.q.out 858e16fe22 
>   ql/src/test/results/clientpositive/annotate_stats_part.q.out 3a94a6a4e3 
>   ql/src/test/results/clientpositive/auto_sortmerge_join_12.q.out 7875e9693a 
>   ql/src/test/results/clientpositive/cbo_const.q.out e9f885b363 
>   ql/src/test/results/clientpositive/cbo_input26.q.out 77fc194829 
>   ql/src/test/results/clientpositive/columnstats_partlvl_dp.q.out 414b715b7a 
>   ql/src/test/results/clientpositive/columnstats_quoting.q.out 683c1e274f 
>   ql/src/test/results/clientpositive/columnstats_tbllvl.q.out a2c6ead293 
>   ql/src/test/results/clientpositive/constGby.q.out c633624935 
>   ql/src/test/results/clientpositive/constant_prop_3.q.out cba4744866 
>   ql/src/test/results/clientpositive/constprog3.q.out f54168d0ee 
>   ql/src/test/results/clientpositive/correlationoptimizer10.q.out a03acd38a7 
>   ql/src/test/results/clientpositive/correlationoptimizer11.q.out cf2250790a 
>   ql/src/test/results/clientpositive/correlationoptimizer13.q.out 6d4f931213 
>   ql/src/test/results/clientpositive/correlationoptimizer14.q.out 149f33fee8 
>   ql/src/test/results/clientpositive/correlationoptimizer15.q.out 2d813b239f 
>   ql/src/test/results/clientpositive/correlationoptimizer5.q.out 68d6a54862 
>   ql/src/test/results/clientpositive/correlationoptimizer7.q.out 82fecab594 
>   ql/src/test/results/clientpositive/correlationoptimizer8.q.out f3cb988a03 
>   ql/src/test/results/clientpositive/correlationoptimizer9.q.out 5372408d2a 
>   ql/src/test/results/clientpositive/cte_mat_5.q.out 3747cec891 
>   ql/src/test/results/clientpositive/display_colstats_tbllvl.q.out 8e2e77b077 
>   ql/src/test/results/clientpositive/druid_basic2.q.out 753ccb456f 
>   ql/src/test/results/clientpositive/empty_join.q.out a4a9976a7f 
>   ql/src/test/results/clientpositive/filter_cond_pushdown_HIVE_15647.q.out 
> 779bea3a26 
>   ql/src/test/results/clientpositive/groupby_sort_6.q.out a66ec97642 
>   ql/src/test/results/clientpositive/having2.q.out 80301bfc04 
>   ql/src/test/results/clientpositive/input23.q.out 80ee81b654 
>   ql/src/test/results/clientpositive/input26.q.out 1ac082eedf 
>   ql/src/test/results/clientpositive/join_cond_pushdown_unqual1.q.out 
> 74f45e58c0 
>   ql/src/test/results/clientpositive/join_cond_pushdown_unqual2.q.out 
> 2ac67b294c 
>   ql/src/test/results/clientpositive/join_cond_pushdown_unqual3.q.out 
> b8d9b408d7 
>   ql/src/test/results/clientpositive/join_cond_pushdown_unqual4.q.out 
> e5ddc3507f 
>   ql/src/test/results/clientpositive/join_view.q.out 1d83742dd4 
>   ql/src/test/results/clientpositive/lateral_view_onview.q.out 423885e442 
>   ql/src/test/results/clientpositive/lateral_view_onview2.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/list_bucket_query_oneskew_2.q.out 
> 876434fb4e 
>   ql/src/test/results/clientpositive/llap/auto_sortmerge_join_12.q.out 
> 3acbb207a7 
>   ql/src/test/results/clientpositive/llap/dynamic_semijoin_reduction.q.out 
> 67fe41e223 
>   ql/src/test/results/clientpositive/llap/dynamic_semijoin_reduction_sw.q.out 
> 1c672ef068 
>   ql/src/test/results/clientpositive/llap/dynamic_semijoin_user_level.q.out 
> a51637a2b9 
>   
> ql/src/test/results/clientpositive/llap/dynpart_sort_optimization_acid.q.out 
> 02cadb7cff 
>   ql/src/test/results/clientpositive/llap/llap_nullscan.q.out 2a891234e5 
>   ql/src/test/results/clientpositive/llap/mapjoin_hint.q.out 505524e78c 
>   ql/src/test/results/clientpositive/llap/mapreduce1.q.out 0e94e71d27 
>   ql/src/test/results/clientpositive/llap/mapreduce2.q.out 6485f587f8 
>   ql/src/test/results/clientpositive/llap/metadataonly1.q.out e6853b23e3 
>   ql/src/test/results/clientpositive/llap/reduce_deduplicate.q.out 65b74ee319 
>   ql/src/test/results/clientpositive/llap/subquery_in.q.out c7b98d3967 
>   ql/src/test/results/clientpositive/llap/subquery_multi.q.out d1579033ac 
>   ql/src/test/results/clientpositive/llap/subquery_null_agg.q.out 78ee174935 
>   ql/src/test/results/clientpositive/llap/subquery_scalar.q.out 06a929dd0a 
>   ql/src/test/results/clientpositive/llap/subquery_select.q.out 514a7889b3 
>   ql/src/test/results/clientpositive/llap/tez_smb_empty.q.out 7a4db158c8 
>   ql/src/test/results/clientpositive/llap/vector_windowing_gby2.q.out 
> ce1881b7fb 
>   ql/src/test/results/clientpositive/llap/vector_windowing_streaming.q.out 
> 61730f59ee 
>   ql/src/test/results/clientpositive/llap/vectorization_short_regress.q.out 
> 3e246bcbe6 
>   ql/src/test/results/clientpositive/materialized_view_rewrite_ssb.q.out 
> de491989a5 
>   ql/src/test/results/clientpositive/materialized_view_rewrite_ssb_2.q.out 
> a11d66815a 
>   ql/src/test/results/clientpositive/nullgroup3.q.out fe23f39fd8 
>   ql/src/test/results/clientpositive/nullgroup5.q.out 783f6d76b6 
>   ql/src/test/results/clientpositive/partial_column_stats.q.out 44db81a443 
>   ql/src/test/results/clientpositive/perf/spark/query66.q.out 1dc0fac408 
>   ql/src/test/results/clientpositive/perf/spark/query99.q.out c0c5f136ec 
>   ql/src/test/results/clientpositive/position_alias_test_1.q.out ee81a79a0b 
>   ql/src/test/results/clientpositive/ppd_outer_join5.q.out 84c10828ce 
>   ql/src/test/results/clientpositive/ppd_repeated_alias.q.out c94002f37d 
>   ql/src/test/results/clientpositive/row__id.q.out 9aab097f21 
>   ql/src/test/results/clientpositive/semijoin4.q.out 53f6c174bd 
>   ql/src/test/results/clientpositive/spark/auto_sortmerge_join_12.q.out 
> 09caf944d2 
>   ql/src/test/results/clientpositive/spark/join_cond_pushdown_unqual1.q.out 
> dc9b61e39a 
>   ql/src/test/results/clientpositive/spark/join_cond_pushdown_unqual2.q.out 
> 82634fba44 
>   ql/src/test/results/clientpositive/spark/join_cond_pushdown_unqual3.q.out 
> d1b20006b0 
>   ql/src/test/results/clientpositive/spark/join_cond_pushdown_unqual4.q.out 
> 2bfc81d275 
>   ql/src/test/results/clientpositive/spark/join_view.q.out 61867f75f3 
>   ql/src/test/results/clientpositive/spark/optimize_nullscan.q.out d294f4910c 
>   ql/src/test/results/clientpositive/spark/ppd_outer_join5.q.out e49260aa35 
>   ql/src/test/results/clientpositive/spark/semijoin.q.out d2dac10f3f 
>   ql/src/test/results/clientpositive/spark/smb_mapjoin_7.q.out e2f68a02bc 
>   
> ql/src/test/results/clientpositive/spark/spark_dynamic_partition_pruning.q.out
>  d7b445baf8 
>   
> ql/src/test/results/clientpositive/spark/spark_vectorized_dynamic_partition_pruning.q.out
>  1a8e9ffcc5 
>   ql/src/test/results/clientpositive/spark/subquery_in.q.out fd25e36fba 
>   ql/src/test/results/clientpositive/spark/subquery_multi.q.out b91c33ee4a 
>   ql/src/test/results/clientpositive/spark/subquery_null_agg.q.out 945e2a7102 
>   ql/src/test/results/clientpositive/spark/subquery_scalar.q.out 8f3ac0d636 
>   ql/src/test/results/clientpositive/spark/subquery_select.q.out edb2b92f73 
>   ql/src/test/results/clientpositive/spark/union_remove_25.q.out f681428785 
>   ql/src/test/results/clientpositive/spark/vectorization_short_regress.q.out 
> 78740fec6f 
>   ql/src/test/results/clientpositive/stats_empty_partition2.q.out 
> PRE-CREATION 
>   ql/src/test/results/clientpositive/subquery_exists_having.q.out ef06dfe697 
>   ql/src/test/results/clientpositive/subquery_unqualcolumnrefs.q.out 
> 79b7d83619 
>   ql/src/test/results/clientpositive/temp_table_display_colstats_tbllvl.q.out 
> a202e45be9 
>   ql/src/test/results/clientpositive/union_remove_25.q.out 20ab809cb1 
>   ql/src/test/results/clientpositive/union_view.q.out 35f8a9a226 
> 
> 
> Diff: https://reviews.apache.org/r/63442/diff/2/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Zoltan Haindrich
> 
>
