Re: Review Request 57614: Auto-gather column stats - phase 2

2017-07-18 Thread Ashutosh Chauhan

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/57614/#review180561
---




ql/src/test/results/clientpositive/llap/bucket_map_join_tez1.q.out
Lines 759-760 (original), 767-769 (patched)


Turn off autogather to restore original intent of test.



ql/src/test/results/clientpositive/llap/bucket_map_join_tez1.q.out
Lines 896-897 (original), 912-913 (patched)


Turn off autogather to restore original intent of test.



ql/src/test/results/clientpositive/llap/bucket_map_join_tez1.q.out
Line 1141 (original), 1158-1159 (patched)


Turn off autogather to restore original intent of test.



ql/src/test/results/clientpositive/llap/bucket_map_join_tez1.q.out
Line 1458 (original), 1483 (patched)


Turn off autogather to restore original intent of test.



ql/src/test/results/clientpositive/llap/bucket_map_join_tez1.q.out
Line 1636 (original), 1667 (patched)


These tests check for a specific join algorithm, so changing the plan changes 
the test. We should turn off stats autogather to keep the original intent of the test.
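
As a minimal sketch of what that could look like in the test's .q file, assuming the switch meant here is the column-stats property hive.stats.column.autogather (the thread only says "autogather"; the property name and its placement at the top of the file are assumptions, not taken from the patch):

```sql
-- Assumption: hive.stats.column.autogather is the setting the review refers to.
-- Placed before the test's DDL/DML so that inserts do not collect column stats
-- and the join plans are chosen from the same statistics as before the patch.
set hive.stats.column.autogather=false;
```

The expected .q.out would then presumably need to be regenerated and compared against the original plan.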



ql/src/test/results/clientpositive/llap/bucket_map_join_tez2.q.out
Line 124 (original), 124 (patched)


These tests check for a specific join algorithm, so changing the plan changes 
the test. We should turn off stats autogather to keep the original intent of the test.



ql/src/test/results/clientpositive/llap/bucket_map_join_tez2.q.out
Line 257 (original), 264 (patched)


Turn off autogather to restore original intent of test.



ql/src/test/results/clientpositive/llap/bucket_map_join_tez2.q.out
Line 560 (original), 574-575 (patched)


Turn off autogather to restore original intent of test.



ql/src/test/results/clientpositive/llap/bucket_map_join_tez2.q.out
Line 647 (original), 662-663 (patched)


Turn off autogather to restore original intent of test.



ql/src/test/results/clientpositive/llap/dynpart_sort_optimization_acid.q.out
Line 1635 (original), 1646-1647 (patched)


No desc of these stats work?



ql/src/test/results/clientpositive/llap/hybridgrace_hashjoin_1.q.out
Lines 1006-1007 (original), 1006-1007 (patched)


Turn off stats autogather to restore the test.



ql/src/test/results/clientpositive/llap/hybridgrace_hashjoin_1.q.out
Lines 1127-1129 (original), 1133-1135 (patched)


Turn off stats autogather to restore the test.



ql/src/test/results/clientpositive/llap/multiMapJoin1.q.out
Lines 1490-1491 (original), 1490-1493 (patched)


Turn off stats autogather to restore the test.



ql/src/test/results/clientpositive/llap/tez_smb_main.q.out
Lines 587-588 (original), 587-588 (patched)


Turn off stats autogather to restore the test.



ql/src/test/results/clientpositive/llap/vector_char_simple.q.out
Line 268 (original), 269 (patched)


Let's turn off stats autogather so that the test retains its original intention.



ql/src/test/results/clientpositive/llap/vector_groupby_rollup1.q.out
Line 503 (original), 505 (patched)


set autogather=false to restore original test



ql/src/test/results/clientpositive/llap/vector_multi_insert.q.out
Line 147 (original), 178 (patched)


set autogather=false to restore original test



ql/src/test/results/clientpositive/llap/vector_udf_character_length.q.out
Line 73 (original), 76 (patched)


set autogather=false to restore original test



ql/src/test/results/clientpositive/llap/vector_udf_octet_length.q.out
Line 56 (original), 59 (patched)


set autogather=false to restore original test



ql/src/test/results/clientpositive/llap/vector_varchar_4.q.out
Line 173 (original), 165 (patched)


set autogather=false to restore original test



ql/src/test/results/clientpositive/llap/vector_varchar_simple.q.out
Line 355 (original), 355 (patched)


set autogather=false to restore original test


- Ashutosh Chauhan


On June 20, 

Re: Review Request 57614: Auto-gather column stats - phase 2

2017-06-28 Thread pengcheng xiong


> On June 27, 2017, 6:37 a.m., Ashutosh Chauhan wrote:
> > ql/src/test/results/clientpositive/llap/auto_sortmerge_join_13.q.out
> > Line 313 (original), 393 (patched)
> > 
> >
> > Auto convert to map join failed.

The plan in the new patch looks good.


> On June 27, 2017, 6:37 a.m., Ashutosh Chauhan wrote:
> > ql/src/test/results/clientpositive/vector_multi_insert.q.out
> > Line 165 (original), 198 (patched)
> > 
> >
> > vectorization turned off.

notVectorizedReason: Aggregation Function expression for GROUPBY operator: UDF 
compute_stats not supported


- pengcheng


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/57614/#review178954
---


On June 20, 2017, 10 p.m., pengcheng xiong wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/57614/
> ---
> 
> (Updated June 20, 2017, 10 p.m.)
> 
> 
> Review request for hive and Ashutosh Chauhan.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> HIVE-13567
> 
> 
> Diffs
> -
> 
>   accumulo-handler/src/test/results/positive/accumulo_queries.q.out de82857c25
>   accumulo-handler/src/test/results/positive/accumulo_single_sourced_multi_insert.q.out 6621a4e204
>   common/src/java/org/apache/hadoop/hive/common/StatsSetupConst.java 7c27d07024
>   common/src/java/org/apache/hadoop/hive/common/jsonexplain/Vertex.java 799355a971
>   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java a8bdefdad6
>   contrib/src/test/results/clientpositive/serde_typedbytes.q.out 6876ca8775
>   contrib/src/test/results/clientpositive/serde_typedbytes2.q.out 79cf8fe1e5
>   contrib/src/test/results/clientpositive/serde_typedbytes3.q.out fec58ef026
>   contrib/src/test/results/clientpositive/serde_typedbytes4.q.out 1131478a7b
>   contrib/src/test/results/clientpositive/serde_typedbytes5.q.out 8d3b95ece8
>   data/conf/hive-site.xml 62364fe4ea
>   hbase-handler/src/test/results/positive/hbase_single_sourced_multi_insert.q.out 68a417d0c1
>   itests/hive-blobstore/src/test/results/clientpositive/insert_into_dynamic_partitions.q.out e55b1c257e
>   itests/hive-blobstore/src/test/results/clientpositive/insert_into_table.q.out 663a572748
>   itests/hive-blobstore/src/test/results/clientpositive/insert_overwrite_directory.q.out 6e95fd123c
>   itests/hive-blobstore/src/test/results/clientpositive/insert_overwrite_dynamic_partitions.q.out 660cebba5f
>   itests/hive-blobstore/src/test/results/clientpositive/insert_overwrite_table.q.out 8052fd86ee
>   itests/hive-blobstore/src/test/results/clientpositive/write_final_output_blobstore.q.out 2ababb1eec
>   itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/TestMTQueries.java ad2baa2e26
>   itests/hive-unit/src/test/java/org/apache/hive/jdbc/TestJdbcWithMiniHS2.java 4a9af80fdc
>   itests/src/test/resources/testconfiguration.properties 07fd5bfe48
>   metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreUtils.java 1aaba4ca01
>   metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java e13612ee97
>   metastore/src/java/org/apache/hadoop/hive/metastore/hbase/stats/merge/ColumnStatsMergerFactory.java fe890e4e27
>   ql/src/java/org/apache/hadoop/hive/ql/DriverContext.java f43992c85d
>   ql/src/java/org/apache/hadoop/hive/ql/exec/ColumnStatsTask.java d96f432fee
>   ql/src/java/org/apache/hadoop/hive/ql/exec/MoveTask.java f329b5111b
>   ql/src/java/org/apache/hadoop/hive/ql/exec/StatsNoJobTask.java 3807f434a7
>   ql/src/java/org/apache/hadoop/hive/ql/exec/StatsTask.java c22d69bb19
>   ql/src/java/org/apache/hadoop/hive/ql/exec/TaskFactory.java d61a4607ea
>   ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java 88c73f090b
>   ql/src/java/org/apache/hadoop/hive/ql/metadata/SessionHiveMetaStoreClient.java 4642ec2faa
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMRTableScan1.java 9297a0b874
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMapRedUtils.java 88bf82
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/MemoryDecider.java 3a20cfe7ac
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/SerializeFilter.java dc433fed22
>   ql/src/java/org/apache/hadoop/hive/ql/parse/DDLSemanticAnalyzer.java e9a4ff0748
>   ql/src/java/org/apache/hadoop/hive/ql/parse/ExplainSemanticAnalyzer.java 7a0d4a752e
>   ql/src/java/org/apache/hadoop/hive/ql/parse/GenTezUtils.java ca544b4549
>   ql/src/java/org/apache/hadoop/hive/ql/parse/LoadSemanticAnalyzer.java bda94ff765
>   

Re: Review Request 57614: Auto-gather column stats - phase 2

2017-06-27 Thread Ashutosh Chauhan

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/57614/#review178954
---




ql/src/test/results/clientpositive/llap/auto_sortmerge_join_13.q.out
Line 313 (original), 393 (patched)


Auto convert to map join failed.



ql/src/test/results/clientpositive/llap/auto_sortmerge_join_13.q.out
Line 529 (original), 664 (patched)


map join conversion failed.



ql/src/test/results/clientpositive/llap/auto_sortmerge_join_6.q.out
Lines 86-87 (original), 86-88 (patched)


change in plans. Expected?



ql/src/test/results/clientpositive/llap/auto_sortmerge_join_6.q.out
Lines 213-214 (original), 232-234 (patched)


change in plans. Expected?



ql/src/test/results/clientpositive/llap/auto_sortmerge_join_6.q.out
Lines 698-699 (original), 774-776 (patched)


change in plans. Expected?



ql/src/test/results/clientpositive/vector_char_4.q.out
Lines 157-161 (original)


Turning off vectorization will be a huge perf loss.



ql/src/test/results/clientpositive/vector_multi_insert.q.out
Line 165 (original), 198 (patched)


vectorization turned off.


- Ashutosh Chauhan



Re: Review Request 57614: Auto-gather column stats - phase 2

2017-06-26 Thread pengcheng xiong


> On June 27, 2017, 2:20 a.m., Ashutosh Chauhan wrote:
> > ql/src/test/results/clientpositive/llap_acid.q.out
> > Line 94 (original), 94 (patched)
> > 
> >
> > Column stats and basic stats should be complete.

Why should it be complete? orc_llap is an ACID table.


> On June 27, 2017, 2:20 a.m., Ashutosh Chauhan wrote:
> > ql/src/test/results/clientpositive/merge3.q.out
> > Line 181 (original), 181-182 (patched)
> > 
> >
> > Plan is modified to collect stats. But no column stats desc in explain.

No, column stats are not collected for CTAS.


> On June 27, 2017, 2:20 a.m., Ashutosh Chauhan wrote:
> > ql/src/test/results/clientpositive/stats3.q.out
> > Lines 58 (patched)
> > 
> >
> > This should say invalidating stats.

Filed a JIRA for this.


- pengcheng


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/57614/#review178941
---



Re: Review Request 57614: Auto-gather column stats - phase 2

2017-06-26 Thread Ashutosh Chauhan

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/57614/#review178941
---




ql/src/test/results/clientpositive/join2.q.out
Line 132 (original), 132-133 (patched)


No column stats collection?



ql/src/test/results/clientpositive/llap_acid.q.out
Line 94 (original), 94 (patched)


Column stats and basic stats should be complete.



ql/src/test/results/clientpositive/llap_acid.q.out
Lines 162-163 (patched)


Result set changed. Correctness issue.



ql/src/test/results/clientpositive/llap_acid.q.out
Lines 273-277 (patched)


Result set changed. Correctness issue.



ql/src/test/results/clientpositive/merge3.q.out
Line 181 (original), 181-182 (patched)


Plan is modified to collect stats. But no column stats desc in explain.



ql/src/test/results/clientpositive/merge3.q.out
Lines 4817-4819 (original), 4877-4880 (patched)


Any reason for this change in plan?



ql/src/test/results/clientpositive/metadata_only_queries.q.out
Lines 186-187 (original), 186 (patched)


set autogather=false for these tests.



ql/src/test/results/clientpositive/metadata_only_queries_with_filters.q.out
Line 124 (original), 124 (patched)


set autogather=false for these tests.



ql/src/test/results/clientpositive/outer_reference_windowed.q.out
Line 132 (original), 138 (patched)


Column Stats state should be complete.



ql/src/test/results/clientpositive/ppd_join5.q.out
Line 71 (original), 70 (patched)


Join order has changed from ((a join b) join c) to ((a join c) join b). Any 
reason for that?



ql/src/test/results/clientpositive/ppd_join5.q.out
Lines 186-188 (original), 187 (patched)


Join order changed.



ql/src/test/results/clientpositive/stats3.q.out
Lines 58 (patched)


This should say invalidating stats.


- Ashutosh Chauhan



Re: Review Request 57614: Auto-gather column stats - phase 2

2017-06-26 Thread pengcheng xiong


> On June 24, 2017, 7:13 p.m., Ashutosh Chauhan wrote:
> > ql/src/test/results/clientpositive/correlationoptimizer5.q.out
> > Line 386 (original)
> > 
> >
> > No Mux or Demux operator in plan anymore? Seems like correlation 
> > optimizer is turned off. Expected?

Currently, a query with multiple FileSinkOperators is not supported.


> On June 24, 2017, 7:13 p.m., Ashutosh Chauhan wrote:
> > ql/src/test/results/clientpositive/groupby_multi_single_reducer.q.out
> > Line 479 (original)
> > 
> >
> > TopN optimization disabled.

This is due to L137 in LimitPushDownOptimization: "// Not safe to continue for 
RS-GBY-GBY-LIM kind of pipelines. See HIVE-10607 for more." I think we can 
disable autogather for this q test.


- pengcheng


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/57614/#review178843
---



Re: Review Request 57614: Auto-gather column stats - phase 2

2017-06-24 Thread pengcheng xiong


> On June 24, 2017, 7:13 p.m., Ashutosh Chauhan wrote:
> > ql/src/test/results/clientpositive/constprog_type.q.out
> > Line 70 (original), 70 (patched)
> > 
> >
> > No info about Column Stats desc (Column name, type and table name)?

We do not support stats merging for the date type yet, so there is no auto 
column stats gathering for date columns. Open a new JIRA to track this.


> On June 24, 2017, 7:13 p.m., Ashutosh Chauhan wrote:
> > ql/src/test/results/clientpositive/ctas.q.out
> > Line 101 (original), 101 (patched)
> > 
> >
> > No Column Stats Desc?

We do not support auto gather for CTAS yet.


> On June 24, 2017, 7:13 p.m., Ashutosh Chauhan wrote:
> > ql/src/test/results/clientpositive/groupby6.q.out
> > Lines 22-23 (patched)
> > 
> >
> > Why do we need two jobs in this case to compute column stats?

Because of the settings "set hive.map.aggr=false;
set hive.groupby.skewindata=true;"
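
For illustration, a sketch of the kind of script that hits this case. The two settings are the ones quoted above; the tables src_example (with a key column) and dest_example (key, count) are hypothetical and not taken from groupby6.q:

```sql
-- Settings quoted in the reply above.
set hive.map.aggr=false;
set hive.groupby.skewindata=true;

-- Hypothetical tables, for illustration only.
explain
insert overwrite table dest_example
select key, count(*) from src_example group by key;
```

With hive.groupby.skewindata=true, Hive plans a group-by as two jobs (a first pass that spreads rows across reducers for partial aggregation, then a final aggregation), which would be consistent with the column stats aggregation also needing two jobs here.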


> On June 24, 2017, 7:13 p.m., Ashutosh Chauhan wrote:
> > ql/src/test/results/clientpositive/input4.q.out
> > Line 31 (original), 31 (patched)
> > 
> >
> > No ColumnStatsDesc in explain.

This is a load command, which should not trigger column stats autogather.
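
If column stats are wanted after a load, they would presumably have to be computed explicitly. A sketch, assuming a hypothetical table input4_example and a hypothetical file path (ANALYZE ... COMPUTE STATISTICS FOR COLUMNS is standard HiveQL, but using it here is a suggestion and not part of the patch):

```sql
-- Hypothetical table and path, for illustration only.
-- LOAD only moves files into the table location, so no rows flow through
-- an operator pipeline that could gather column stats along the way.
load data local inpath '/tmp/example.txt' into table input4_example;

-- Compute column stats explicitly in a separate statement.
analyze table input4_example compute statistics for columns;
```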


- pengcheng


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/57614/#review178843
---



Re: Review Request 57614: Auto-gather column stats - phase 2

2017-06-24 Thread Ashutosh Chauhan

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/57614/#review178843
---




ql/src/test/results/clientpositive/constprog_type.q.out
Line 70 (original), 70 (patched)


No info about Column Stats desc (Column name, type and table name)?



ql/src/test/results/clientpositive/correlationoptimizer5.q.out
Line 386 (original)


No Mux or Demux operator in plan anymore? Seems like correlation optimizer 
is turned off. Expected?



ql/src/test/results/clientpositive/ctas.q.out
Line 101 (original), 101 (patched)


No Column Stats Desc?



ql/src/test/results/clientpositive/groupby1_limit.q.out
Line 43 (original)


Limit pushdown optimization got turned off. This can result in a perf loss.



ql/src/test/results/clientpositive/groupby6.q.out
Lines 22-23 (patched)


Why do we need two jobs in this case to compute column stats?



ql/src/test/results/clientpositive/groupby_multi_single_reducer.q.out
Line 479 (original)


TopN optimization disabled.



ql/src/test/results/clientpositive/infer_bucket_sort_convert_join.q.out
Lines 108-110 (original), 106-108 (patched)


Bucketing got disabled.



ql/src/test/results/clientpositive/input11_limit.q.out
Line 42 (original)


TopN optimization disabled.



ql/src/test/results/clientpositive/input14_limit.q.out
Line 56 (original)


TopN limit optimization off.



ql/src/test/results/clientpositive/input4.q.out
Line 31 (original), 31 (patched)


No ColumnStatsDesc in explain.


- Ashutosh Chauhan


On June 20, 2017, 10 p.m., pengcheng xiong wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/57614/
> ---
> 
> (Updated June 20, 2017, 10 p.m.)
> 
> 
> Review request for hive and Ashutosh Chauhan.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> HIVE-13567
> 
> 
> Diffs
> -
> 
>   accumulo-handler/src/test/results/positive/accumulo_queries.q.out 
> de82857c25 
>   
> accumulo-handler/src/test/results/positive/accumulo_single_sourced_multi_insert.q.out
>  6621a4e204 
>   common/src/java/org/apache/hadoop/hive/common/StatsSetupConst.java 
> 7c27d07024 
>   common/src/java/org/apache/hadoop/hive/common/jsonexplain/Vertex.java 
> 799355a971 
>   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java a8bdefdad6 
>   contrib/src/test/results/clientpositive/serde_typedbytes.q.out 6876ca8775 
>   contrib/src/test/results/clientpositive/serde_typedbytes2.q.out 79cf8fe1e5 
>   contrib/src/test/results/clientpositive/serde_typedbytes3.q.out fec58ef026 
>   contrib/src/test/results/clientpositive/serde_typedbytes4.q.out 1131478a7b 
>   contrib/src/test/results/clientpositive/serde_typedbytes5.q.out 8d3b95ece8 
>   data/conf/hive-site.xml 62364fe4ea 
>   
> hbase-handler/src/test/results/positive/hbase_single_sourced_multi_insert.q.out
>  68a417d0c1 
>   
> itests/hive-blobstore/src/test/results/clientpositive/insert_into_dynamic_partitions.q.out
>  e55b1c257e 
>   
> itests/hive-blobstore/src/test/results/clientpositive/insert_into_table.q.out 
> 663a572748 
>   
> itests/hive-blobstore/src/test/results/clientpositive/insert_overwrite_directory.q.out
>  6e95fd123c 
>   
> itests/hive-blobstore/src/test/results/clientpositive/insert_overwrite_dynamic_partitions.q.out
>  660cebba5f 
>   
> itests/hive-blobstore/src/test/results/clientpositive/insert_overwrite_table.q.out
>  8052fd86ee 
>   
> itests/hive-blobstore/src/test/results/clientpositive/write_final_output_blobstore.q.out
>  2ababb1eec 
>   itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/TestMTQueries.java 
> ad2baa2e26 
>   
> itests/hive-unit/src/test/java/org/apache/hive/jdbc/TestJdbcWithMiniHS2.java 
> 4a9af80fdc 
>   itests/src/test/resources/testconfiguration.properties 07fd5bfe48 
>   metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreUtils.java 
> 1aaba4ca01 
>   metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java 
> e13612ee97 
>   
> metastore/src/java/org/apache/hadoop/hive/metastore/hbase/stats/merge/ColumnStatsMergerFactory.java
>  fe890e4e27 
>   ql/src/java/org/apache/hadoop/hive/ql/DriverContext.java f43992c85d 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/ColumnStatsTask.java d96f432fee 
>   

Re: Review Request 57614: Auto-gather column stats - phase 2

2017-06-24 Thread pengcheng xiong

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/57614/#review178849
---




ql/src/test/results/clientpositive/autoColumnStats_4.q.out
Line 200 (original)


Yes, there are no auto stats for ACID tables. Right now we do not even merge stats 
for normal tables when the old stats are inaccurate. When we last discussed this, 
we assumed that the old stats are enough (i.e., we do not wipe them 
clean).


- pengcheng xiong



Re: Review Request 57614: Auto-gather column stats - phase 2

2017-06-24 Thread pengcheng xiong

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/57614/#review178848
---




ql/src/test/queries/clientpositive/smb_join_partition_key.q
Lines 1 (patched)


I think I showed you this issue a long time ago. In Derby, when it retrieves a 
partition with a decimal key, it uses partval = 100.0 rather than 100. As a 
result, the partition is not found and an exception is thrown. With MySQL, 
we do not have this problem.
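
To illustrate the mismatch described above, here is a minimal Java sketch. It is not the metastore or Derby code path; the class and method names are hypothetical, and it assumes the stored partition value and the filter value are compared as text. A value rendered as 100.0 does not match a stored 100 textually, although the two are numerically equal.

```java
import java.math.BigDecimal;

// Hypothetical illustration only; this is not Hive metastore code.
public class PartitionValueMatch {

    // Textual comparison, as a filter on a string-typed partition value would do.
    public static boolean textualMatch(String storedValue, String filterValue) {
        return storedValue.equals(filterValue);
    }

    // Numeric comparison that ignores scale, so 100 and 100.0 compare as equal.
    public static boolean numericMatch(String storedValue, String filterValue) {
        return new BigDecimal(storedValue).compareTo(new BigDecimal(filterValue)) == 0;
    }

    public static void main(String[] args) {
        System.out.println(textualMatch("100", "100.0")); // false
        System.out.println(numericMatch("100", "100.0")); // true
    }
}
```

The sketch uses compareTo rather than BigDecimal.equals because equals also compares scale and would treat 100 and 100.0 as different.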


- pengcheng xiong



Re: Review Request 57614: Auto-gather column stats - phase 2

2017-06-24 Thread Ashutosh Chauhan


> On June 22, 2017, 7:08 p.m., Ashutosh Chauhan wrote:
> > ql/src/test/queries/clientpositive/smb_join_partition_key.q
> > Lines 1 (patched)
> > 
> >
> > decimals should be supported.
> 
> pengcheng xiong wrote:
> we do not support retrieve of partitions in decimal.

Can you expand on this? We certainly support decimal as the type of a partition column.


> On June 22, 2017, 7:08 p.m., Ashutosh Chauhan wrote:
> > ql/src/test/results/clientpositive/autoColumnStats_4.q.out
> > Line 200 (original)
> > 
> >
> > Don't we store stats for acid tables?
> 
> pengcheng xiong wrote:
> true. the only way is to use analyze statement.

So, no auto-update of basic and column stats for acid table? 
We can't mark it as accurate, but we still can collect and update stats.


- Ashutosh


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/57614/#review178583
---



Re: Review Request 57614: Auto-gather column stats - phase 2

2017-06-22 Thread pengcheng xiong


> On June 22, 2017, 7:08 p.m., Ashutosh Chauhan wrote:
> > ql/src/test/results/clientpositive/columnstats_partlvl_dp.q.out
> > Line 252 (original), 264 (patched)
> > 
> >
> > Is this change expected? State of basic stats changed from Complete to 
> > Partial.

It should be complete. Please see the new patch.


- pengcheng


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/57614/#review178583
---


On June 20, 2017, 10 p.m., pengcheng xiong wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/57614/
> ---
> 
> (Updated June 20, 2017, 10 p.m.)
> 
> 
> Review request for hive and Ashutosh Chauhan.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> HIVE-13567
> 
> 
> Diffs
> -
> 
>   accumulo-handler/src/test/results/positive/accumulo_queries.q.out 
> de82857c25 
>   
> accumulo-handler/src/test/results/positive/accumulo_single_sourced_multi_insert.q.out
>  6621a4e204 
>   common/src/java/org/apache/hadoop/hive/common/StatsSetupConst.java 
> 7c27d07024 
>   common/src/java/org/apache/hadoop/hive/common/jsonexplain/Vertex.java 
> 799355a971 
>   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java a8bdefdad6 
>   contrib/src/test/results/clientpositive/serde_typedbytes.q.out 6876ca8775 
>   contrib/src/test/results/clientpositive/serde_typedbytes2.q.out 79cf8fe1e5 
>   contrib/src/test/results/clientpositive/serde_typedbytes3.q.out fec58ef026 
>   contrib/src/test/results/clientpositive/serde_typedbytes4.q.out 1131478a7b 
>   contrib/src/test/results/clientpositive/serde_typedbytes5.q.out 8d3b95ece8 
>   data/conf/hive-site.xml 62364fe4ea 
>   
> hbase-handler/src/test/results/positive/hbase_single_sourced_multi_insert.q.out
>  68a417d0c1 
>   
> itests/hive-blobstore/src/test/results/clientpositive/insert_into_dynamic_partitions.q.out
>  e55b1c257e 
>   
> itests/hive-blobstore/src/test/results/clientpositive/insert_into_table.q.out 
> 663a572748 
>   
> itests/hive-blobstore/src/test/results/clientpositive/insert_overwrite_directory.q.out
>  6e95fd123c 
>   
> itests/hive-blobstore/src/test/results/clientpositive/insert_overwrite_dynamic_partitions.q.out
>  660cebba5f 
>   
> itests/hive-blobstore/src/test/results/clientpositive/insert_overwrite_table.q.out
>  8052fd86ee 
>   
> itests/hive-blobstore/src/test/results/clientpositive/write_final_output_blobstore.q.out
>  2ababb1eec 
>   itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/TestMTQueries.java 
> ad2baa2e26 
>   
> itests/hive-unit/src/test/java/org/apache/hive/jdbc/TestJdbcWithMiniHS2.java 
> 4a9af80fdc 
>   itests/src/test/resources/testconfiguration.properties 07fd5bfe48 
>   metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreUtils.java 
> 1aaba4ca01 
>   metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java 
> e13612ee97 
>   
> metastore/src/java/org/apache/hadoop/hive/metastore/hbase/stats/merge/ColumnStatsMergerFactory.java
>  fe890e4e27 
>   ql/src/java/org/apache/hadoop/hive/ql/DriverContext.java f43992c85d 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/ColumnStatsTask.java d96f432fee 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/MoveTask.java f329b5111b 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/StatsNoJobTask.java 3807f434a7 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/StatsTask.java c22d69bb19 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/TaskFactory.java d61a4607ea 
>   ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java 88c73f090b 
>   
> ql/src/java/org/apache/hadoop/hive/ql/metadata/SessionHiveMetaStoreClient.java
>  4642ec2faa 
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMRTableScan1.java 
> 9297a0b874 
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMapRedUtils.java 
> 88bf82 
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/MemoryDecider.java 
> 3a20cfe7ac 
>   
> ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/SerializeFilter.java 
> dc433fed22 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/DDLSemanticAnalyzer.java 
> e9a4ff0748 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/ExplainSemanticAnalyzer.java 
> 7a0d4a752e 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/GenTezUtils.java ca544b4549 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/LoadSemanticAnalyzer.java 
> bda94ff765 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/ProcessAnalyzeTable.java 
> b6d7ee8a92 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 
> 9e84a29470 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/TaskCompiler.java 08a8f00e06 
>   
> ql/src/java/org/apache/hadoop/hive/ql/parse/spark/SparkProcessAnalyzeTable.java
>  

Re: Review Request 57614: Auto-gather column stats - phase 2

2017-06-22 Thread pengcheng xiong


> On June 22, 2017, 7:08 p.m., Ashutosh Chauhan wrote:
> > accumulo-handler/src/test/results/positive/accumulo_queries.q.out
> > Lines 63 (patched)
> > 
> >
> > No basic stats work?

True. There should be no stats task and no column stats task for non-native tables.


> On June 22, 2017, 7:08 p.m., Ashutosh Chauhan wrote:
> > accumulo-handler/src/test/results/positive/accumulo_queries.q.out
> > Lines 559 (patched)
> > 
> >
> > There should be a basic stats work also, no? Since column stats task 
> > also collects basic stats.

We do not auto-gather stats for non-native tables, e.g., Accumulo or HBase. 
Please see the new patch.


> On June 22, 2017, 7:08 p.m., Ashutosh Chauhan wrote:
> > metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreUtils.java
> > Lines 1939-1946 (patched)
> > 
> >
> > Change this to assert, instead?

sure.


> On June 22, 2017, 7:08 p.m., Ashutosh Chauhan wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/exec/ColumnStatsTask.java
> > Lines 399-401 (patched)
> > 
> >
> > This doesn't look efficient. Retrieving all partition objects on client 
> > just to determine whether stats merging is needed. 
> > This logic should execute on metastore side.

It retrieves only a specific partition, one at a time. However, 
I think your comment is valid.


> On June 22, 2017, 7:08 p.m., Ashutosh Chauhan wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/exec/ColumnStatsTask.java
> > Lines 480-492 (patched)
> > 
> >
> > Goal of merging two tasks was to minimize metastore calls which won't 
> > happen as it's done right now.
> > 
> > Further, this is confusing. Creating and executing a task within 
> > another task.

We can do the refactoring later.


> On June 22, 2017, 7:08 p.m., Ashutosh Chauhan wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/plan/StatsNoJobWork.java
> > Line 32 (original), 31 (patched)
> > 
> >
> > Any reason to remove @Explain annotation?

Because otherwise it will show a duplicate "Stats-Aggr Operator".


> On June 22, 2017, 7:08 p.m., Ashutosh Chauhan wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/plan/StatsWork.java
> > Line 32 (original), 32 (patched)
> > 
> >
> > Any reason to remove @Explain annotation?

Yes, otherwise it will show a duplicate Stats-Aggr Operator.


> On June 22, 2017, 7:08 p.m., Ashutosh Chauhan wrote:
> > ql/src/test/queries/clientnegative/stats_aggregator_error_1.q
> > Lines 13 (patched)
> > 
> >
> > Any reason for this?

There was a bug and I fixed it.


> On June 22, 2017, 7:08 p.m., Ashutosh Chauhan wrote:
> > ql/src/test/queries/clientpositive/combine1.q
> > Lines 10 (patched)
> > 
> >
> > Any reason for this?

Due to the compression. This is a corner case.


> On June 22, 2017, 7:08 p.m., Ashutosh Chauhan wrote:
> > ql/src/test/queries/clientpositive/exec_parallel_column_stats.q
> > Lines 3-5 (original), 3-5 (patched)
> > 
> >
> > Any reason for this?

We cannot compute basic stats for src in q tests.


> On June 22, 2017, 7:08 p.m., Ashutosh Chauhan wrote:
> > ql/src/test/queries/clientpositive/orc_wide_table.q
> > Lines 2 (patched)
> > 
> >
> > Any reason for this?

A limitation of HMS with too many columns.


> On June 22, 2017, 7:08 p.m., Ashutosh Chauhan wrote:
> > ql/src/test/queries/clientpositive/smb_join_partition_key.q
> > Lines 1 (patched)
> > 
> >
> > decimals should be supported.

We do not support retrieval of partitions with decimal values.


> On June 22, 2017, 7:08 p.m., Ashutosh Chauhan wrote:
> > ql/src/test/queries/clientpositive/udf_round_2.q
> > Lines 2 (patched)
> > 
> >
> > Any reason for this?
> 
> pengcheng xiong wrote:
> We can not store NaN for column stats in metastore.

We cannot store NaN in column stats.


> On June 22, 2017, 7:08 p.m., Ashutosh Chauhan wrote:
> > ql/src/test/results/clientpositive/autoColumnStats_4.q.out
> > Line 200 (original)
> > 
> >
> > Don't we store stats for acid tables?

true. the only way is to use 

Re: Review Request 57614: Auto-gather column stats - phase 2

2017-06-22 Thread pengcheng xiong


> On June 22, 2017, 7:08 p.m., Ashutosh Chauhan wrote:
> > ql/src/test/queries/clientpositive/udf_round_2.q
> > Lines 2 (patched)
> > 
> >
> > Any reason for this?

We cannot store NaN for column stats in the metastore.


> On June 22, 2017, 7:08 p.m., Ashutosh Chauhan wrote:
> > ql/src/test/results/clientpositive/columnstats_partlvl.q.out
> > Line 311 (original), 320-321 (patched)
> > 
> >
> > Surprised this didn't happen as part of HIVE-15903 but is happening 
> > now. Expected?

Simply because we do not support basic stats collection in MR. HIVE-15903 only 
supports Tez.


- pengcheng


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/57614/#review178583
---



Re: Review Request 57614: Auto-gather column stats - phase 2

2017-06-22 Thread Ashutosh Chauhan

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/57614/#review178583
---




accumulo-handler/src/test/results/positive/accumulo_queries.q.out
Lines 63 (patched)


No basic stats work?



accumulo-handler/src/test/results/positive/accumulo_queries.q.out
Lines 559 (patched)


There should be a basic stats work also, no? Since column stats task also 
collects basic stats.



metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreUtils.java
Lines 1939-1946 (patched)


Change this to assert, instead?



ql/src/java/org/apache/hadoop/hive/ql/exec/ColumnStatsTask.java
Lines 399-401 (patched)


This doesn't look efficient: it retrieves all partition objects on the client 
just to determine whether stats merging is needed. 
This logic should execute on the metastore side.



ql/src/java/org/apache/hadoop/hive/ql/exec/ColumnStatsTask.java
Lines 480-492 (patched)


The goal of merging the two tasks was to minimize metastore calls, which won't 
happen the way it's done right now.

Further, creating and executing a task within another task is confusing.



ql/src/java/org/apache/hadoop/hive/ql/plan/StatsNoJobWork.java
Line 32 (original), 31 (patched)


Any reason to remove @Explain annotation?



ql/src/java/org/apache/hadoop/hive/ql/plan/StatsWork.java
Line 32 (original), 32 (patched)


Any reason to remove @Explain annotation?



ql/src/test/queries/clientnegative/stats_aggregator_error_1.q
Lines 13 (patched)


Any reason for this?



ql/src/test/queries/clientpositive/combine1.q
Lines 10 (patched)


Any reason for this?



ql/src/test/queries/clientpositive/exec_parallel_column_stats.q
Lines 3-5 (original), 3-5 (patched)


Any reason for this?



ql/src/test/queries/clientpositive/orc_wide_table.q
Lines 2 (patched)


Any reason for this?



ql/src/test/queries/clientpositive/smb_join_partition_key.q
Lines 1 (patched)


Decimals should be supported.



ql/src/test/queries/clientpositive/udf_round_2.q
Lines 2 (patched)


Any reason for this?



ql/src/test/results/clientpositive/autoColumnStats_4.q.out
Line 200 (original)


Don't we store stats for ACID tables?



ql/src/test/results/clientpositive/columnstats_partlvl.q.out
Line 311 (original), 320-321 (patched)


Surprised this didn't happen as part of HIVE-15903 but is happening now. 
Expected?



ql/src/test/results/clientpositive/columnstats_partlvl_dp.q.out
Line 252 (original), 264 (patched)


Is this change expected? The state of basic stats changed from Complete to 
Partial.



ql/src/test/results/clientpositive/columnstats_tbllvl.q.out
Line 109 (original), 110-111 (patched)


Another one: HIVE-15903 should have made this happen.


- Ashutosh Chauhan


On June 20, 2017, 10 p.m., pengcheng xiong wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/57614/
> ---
> 
> (Updated June 20, 2017, 10 p.m.)
> 
> 
> Review request for hive and Ashutosh Chauhan.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> HIVE-13567

Re: Review Request 57614: Auto-gather column stats - phase 2

2017-06-20 Thread pengcheng xiong

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/57614/#review178430
---




ql/src/test/results/clientpositive/autoColumnStats_3.q.out
Line 39 (original), 41 (patched)


Note that we only compute stats for key, but both key and value are marked 
accurate, although the stats for value are empty (inherited from create table).



ql/src/test/results/clientpositive/autoColumnStats_3.q.out
Line 210 (original), 212 (patched)


Due to newly created partition.



ql/src/test/results/clientpositive/autoColumnStats_5.q.out
Line 414 (original), 408 (patched)


part=2 is an empty partition



ql/src/test/results/clientpositive/autoColumnStats_5.q.out
Line 606 (original), 597 (patched)


part=1 already contains data. The new columns c and d should not be merged, 
as their stats are inaccurate.


- pengcheng xiong


On June 20, 2017, 10 p.m., pengcheng xiong wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/57614/
> ---
> 
> (Updated June 20, 2017, 10 p.m.)
> 
> 
> Review request for hive and Ashutosh Chauhan.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> HIVE-13567

Re: Review Request 57614: Auto-gather column stats - phase 2

2017-06-20 Thread pengcheng xiong

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/57614/
---

(Updated June 20, 2017, 10 p.m.)


Review request for hive and Ashutosh Chauhan.


Changes
---

This patch also includes (1) HIVE-16495, "ColumnStats merge should consider the 
accuracy of the current stats", and (2) HIVE-16827, "Merge stats task and column 
stats task into a single task". After the change, collecting column stats 
automatically collects basic stats as well, for all execution engines.
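
To illustrate the HIVE-16495 part, here is a minimal Java sketch of an accuracy-aware merge. It is not the code in this patch: StatsMergeSketch, ColStats and merge are hypothetical names, the two fields stand in for the real metastore column statistics, and the policy of skipping the merge when the stored stats are flagged inaccurate is an assumption based on the description above.

```java
import java.util.Optional;

// Hypothetical sketch: none of these types or methods are Hive APIs.
final class StatsMergeSketch {

    /** Simplified stats for one column, plus a flag saying whether they can be trusted. */
    static final class ColStats {
        final long numNulls;
        final long maxValue;
        final boolean accurate;

        ColStats(long numNulls, long maxValue, boolean accurate) {
            this.numNulls = numNulls;
            this.maxValue = maxValue;
            this.accurate = accurate;
        }
    }

    /**
     * Merges stats for newly written rows into the stored stats.
     * Assumed policy: if either side is flagged inaccurate, the result
     * cannot be trusted, so no merged stats are produced.
     */
    static Optional<ColStats> merge(ColStats current, ColStats incoming) {
        if (!current.accurate || !incoming.accurate) {
            return Optional.empty();
        }
        return Optional.of(new ColStats(
                current.numNulls + incoming.numNulls,
                Math.max(current.maxValue, incoming.maxValue),
                true));
    }

    public static void main(String[] args) {
        ColStats stored = new ColStats(2, 100, true);
        ColStats stale = new ColStats(0, 0, false);
        ColStats fresh = new ColStats(1, 250, true);

        System.out.println("accurate + fresh merged: " + merge(stored, fresh).isPresent());
        System.out.println("stale + fresh merged: " + merge(stale, fresh).isPresent());
    }
}
```

Returning an empty Optional rather than the incoming stats keeps the caller from recording numbers that only describe the newly written rows as if they covered the whole partition.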


Repository: hive-git


Description
---

HIVE-13567


Diffs (updated)
-

  accumulo-handler/src/test/results/positive/accumulo_queries.q.out de82857c25
  accumulo-handler/src/test/results/positive/accumulo_single_sourced_multi_insert.q.out 6621a4e204
  common/src/java/org/apache/hadoop/hive/common/StatsSetupConst.java 7c27d07024
  common/src/java/org/apache/hadoop/hive/common/jsonexplain/Vertex.java 799355a971
  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java a8bdefdad6
  contrib/src/test/results/clientpositive/serde_typedbytes.q.out 6876ca8775
  contrib/src/test/results/clientpositive/serde_typedbytes2.q.out 79cf8fe1e5
  contrib/src/test/results/clientpositive/serde_typedbytes3.q.out fec58ef026
  contrib/src/test/results/clientpositive/serde_typedbytes4.q.out 1131478a7b
  contrib/src/test/results/clientpositive/serde_typedbytes5.q.out 8d3b95ece8
  data/conf/hive-site.xml 62364fe4ea
  hbase-handler/src/test/results/positive/hbase_single_sourced_multi_insert.q.out 68a417d0c1
  itests/hive-blobstore/src/test/results/clientpositive/insert_into_dynamic_partitions.q.out e55b1c257e
  itests/hive-blobstore/src/test/results/clientpositive/insert_into_table.q.out 663a572748
  itests/hive-blobstore/src/test/results/clientpositive/insert_overwrite_directory.q.out 6e95fd123c
  itests/hive-blobstore/src/test/results/clientpositive/insert_overwrite_dynamic_partitions.q.out 660cebba5f
  itests/hive-blobstore/src/test/results/clientpositive/insert_overwrite_table.q.out 8052fd86ee
  itests/hive-blobstore/src/test/results/clientpositive/write_final_output_blobstore.q.out 2ababb1eec
  itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/TestMTQueries.java ad2baa2e26
  itests/hive-unit/src/test/java/org/apache/hive/jdbc/TestJdbcWithMiniHS2.java 4a9af80fdc
  itests/src/test/resources/testconfiguration.properties 07fd5bfe48
  metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreUtils.java 1aaba4ca01
  metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java e13612ee97
  metastore/src/java/org/apache/hadoop/hive/metastore/hbase/stats/merge/ColumnStatsMergerFactory.java fe890e4e27
  ql/src/java/org/apache/hadoop/hive/ql/DriverContext.java f43992c85d
  ql/src/java/org/apache/hadoop/hive/ql/exec/ColumnStatsTask.java d96f432fee
  ql/src/java/org/apache/hadoop/hive/ql/exec/MoveTask.java f329b5111b
  ql/src/java/org/apache/hadoop/hive/ql/exec/StatsNoJobTask.java 3807f434a7
  ql/src/java/org/apache/hadoop/hive/ql/exec/StatsTask.java c22d69bb19
  ql/src/java/org/apache/hadoop/hive/ql/exec/TaskFactory.java d61a4607ea
  ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java 88c73f090b
  ql/src/java/org/apache/hadoop/hive/ql/metadata/SessionHiveMetaStoreClient.java 4642ec2faa
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMRTableScan1.java 9297a0b874
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMapRedUtils.java 88bf82
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/MemoryDecider.java 3a20cfe7ac
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/SerializeFilter.java dc433fed22
  ql/src/java/org/apache/hadoop/hive/ql/parse/DDLSemanticAnalyzer.java e9a4ff0748
  ql/src/java/org/apache/hadoop/hive/ql/parse/ExplainSemanticAnalyzer.java 7a0d4a752e
  ql/src/java/org/apache/hadoop/hive/ql/parse/GenTezUtils.java ca544b4549
  ql/src/java/org/apache/hadoop/hive/ql/parse/LoadSemanticAnalyzer.java bda94ff765
  ql/src/java/org/apache/hadoop/hive/ql/parse/ProcessAnalyzeTable.java b6d7ee8a92
  ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 9e84a29470
  ql/src/java/org/apache/hadoop/hive/ql/parse/TaskCompiler.java 08a8f00e06
  ql/src/java/org/apache/hadoop/hive/ql/parse/spark/SparkProcessAnalyzeTable.java 52af3af2ea
  ql/src/java/org/apache/hadoop/hive/ql/plan/ColumnStatsDesc.java 97f323f4b7
  ql/src/java/org/apache/hadoop/hive/ql/plan/ColumnStatsWork.java 76811b1a93
  ql/src/java/org/apache/hadoop/hive/ql/plan/StatsNoJobWork.java 77c04f6c6e
  ql/src/java/org/apache/hadoop/hive/ql/plan/StatsWork.java a5050c5368
  ql/src/test/org/apache/hadoop/hive/ql/TestTxnCommands.java 7c66955e14
  ql/src/test/org/apache/hadoop/hive/ql/TestTxnCommands2.java 5786c4f659
  ql/src/test/queries/clientnegative/stats_aggregator_error_1.q 1b2872d3d7