[jira] [Commented] (HIVE-13602) TPCH q16 return wrong result when CBO is on
[ https://issues.apache.org/jira/browse/HIVE-13602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15284045#comment-15284045 ]

Nemon Lou commented on HIVE-13602:

Thanks [~pxiong]. It would be nice to provide a patch for branch-1 too, if there will be a branch-1 release in the future.

> TPCH q16 return wrong result when CBO is on
>
> Key: HIVE-13602
> URL: https://issues.apache.org/jira/browse/HIVE-13602
> Project: Hive
> Issue Type: Bug
> Components: CBO, Logical Optimizer
> Affects Versions: 2.0.0, 1.2.2
> Reporter: Nemon Lou
> Assignee: Pengcheng Xiong
> Attachments: HIVE-13602.01.patch, HIVE-13602.03.patch, HIVE-13602.04.patch, HIVE-13602.05.patch, HIVE-13602.final.patch, calcite_cbo_bad.out, calcite_cbo_good.out, explain_cbo_bad_part1.out, explain_cbo_bad_part2.out, explain_cbo_bad_part3.out, explain_cbo_good(rewrite)_part1.out, explain_cbo_good(rewrite)_part2.out, explain_cbo_good(rewrite)_part3.out
>
> Running tpch with factor 2,
> q16 returns 1,160 rows when CBO is on,
> while it returns 24,581 rows when CBO is off.
> See attachments for details.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13602) TPCH q16 return wrong result when CBO is on
[ https://issues.apache.org/jira/browse/HIVE-13602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15282881#comment-15282881 ]

Pengcheng Xiong commented on HIVE-13602:

Also, thanks to [~nemon] for discovering this serious bug and continuing the investigation!
[jira] [Commented] (HIVE-13602) TPCH q16 return wrong result when CBO is on
[ https://issues.apache.org/jira/browse/HIVE-13602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15282880#comment-15282880 ]

Pengcheng Xiong commented on HIVE-13602:

Reran all the failed tests; none of them fails. Pushed to master. Thanks [~ashutoshc] and [~aihuaxu] for the review!
[jira] [Commented] (HIVE-13602) TPCH q16 return wrong result when CBO is on
[ https://issues.apache.org/jira/browse/HIVE-13602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15282660#comment-15282660 ] Hive QA commented on HIVE-13602: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12803373/HIVE-13602.05.patch {color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 52 failed/errored test(s), 9915 tests executed *Failed tests:* {noformat} TestHWISessionManager - did not produce a TEST-*.xml file TestMiniLlapCliDriver - did not produce a TEST-*.xml file TestMiniTezCliDriver-join1.q-mapjoin_decimal.q-union5.q-and-12-more - did not produce a TEST-*.xml file TestMiniTezCliDriver-order_null.q-vector_acid3.q-orc_merge10.q-and-12-more - did not produce a TEST-*.xml file TestMiniTezCliDriver-vector_distinct_2.q-tez_joins_explain.q-cte_mat_1.q-and-12-more - did not produce a TEST-*.xml file TestSparkCliDriver-auto_join_reordering_values.q-ptf_seqfile.q-auto_join18.q-and-12-more - did not produce a TEST-*.xml file TestSparkCliDriver-bucketsortoptimize_insert_7.q-smb_mapjoin_15.q-mapreduce1.q-and-12-more - did not produce a TEST-*.xml file TestSparkCliDriver-groupby2_noskew_multi_distinct.q-vectorization_10.q-list_bucket_dml_2.q-and-12-more - did not produce a TEST-*.xml file TestSparkCliDriver-join4.q-groupby_cube1.q-auto_join20.q-and-12-more - did not produce a TEST-*.xml file TestSparkCliDriver-join_cond_pushdown_3.q-groupby7.q-auto_join17.q-and-12-more - did not produce a TEST-*.xml file TestSparkCliDriver-join_vc.q-input1_limit.q-join16.q-and-12-more - did not produce a TEST-*.xml file org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ivyDownload org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_index_bitmap3 org.apache.hadoop.hive.cli.TestMiniTezCliDriver.org.apache.hadoop.hive.cli.TestMiniTezCliDriver org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_cbo_join 
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_explainuser_4 org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_filter_join_breaktask2 org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_mapreduce2 org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_metadata_only_queries org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_orc_merge11 org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_orc_merge_diff_fs org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_schema_evol_orc_acidvec_mapwork_table org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_schema_evol_text_vecrow_mapwork_part_all_primitive org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_unionDistinct_2 org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_update_after_multiple_inserts org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_bucket org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_decimal_udf2 org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_partitioned_date_time org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vectorization_5 org.apache.hadoop.hive.llap.daemon.impl.TestTaskExecutorService.testPreemptionQueueComparator org.apache.hadoop.hive.llap.tez.TestConverters.testFragmentSpecToTaskSpec org.apache.hadoop.hive.llap.tezplugins.TestLlapTaskCommunicator.testFinishableStateUpdateFailure org.apache.hadoop.hive.metastore.TestFilterHooks.org.apache.hadoop.hive.metastore.TestFilterHooks org.apache.hadoop.hive.metastore.TestHiveMetaStorePartitionSpecs.org.apache.hadoop.hive.metastore.TestHiveMetaStorePartitionSpecs org.apache.hadoop.hive.metastore.TestHiveMetaStoreStatsMerge.testStatsMerge org.apache.hadoop.hive.metastore.TestMetaStoreEndFunctionListener.testEndFunctionListener org.apache.hadoop.hive.metastore.TestMetaStoreEventListenerOnlyOnCommit.testEventStatus 
org.apache.hadoop.hive.metastore.TestMetaStoreMetrics.org.apache.hadoop.hive.metastore.TestMetaStoreMetrics org.apache.hadoop.hive.metastore.TestRetryingHMSHandler.testRetryingHMSHandler org.apache.hadoop.hive.metastore.hbase.TestHBaseSchemaTool.oneMondoTest org.apache.hadoop.hive.ql.exec.tez.TestDynamicPartitionPruner.testSingleSourceMultipleFiltersOrdering1 org.apache.hadoop.hive.ql.lockmgr.TestDbTxnManager2.testShowLocksFilterOptions org.apache.hadoop.hive.ql.security.TestMetastoreAuthorizationProvider.testSimplePrivileges org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationReads.testReadDbFailure org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationReads.testReadDbSuccess org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationReads.testReadTableFailure org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationReads.testReadTableSuccess
[jira] [Commented] (HIVE-13602) TPCH q16 return wrong result when CBO is on
[ https://issues.apache.org/jira/browse/HIVE-13602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15279583#comment-15279583 ]

Pengcheng Xiong commented on HIVE-13602:

Attached a new patch to address the virtual columns.
[jira] [Commented] (HIVE-13602) TPCH q16 return wrong result when CBO is on
[ https://issues.apache.org/jira/browse/HIVE-13602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15277859#comment-15277859 ] Hive QA commented on HIVE-13602: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12802903/HIVE-13602.04.patch {color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 53 failed/errored test(s), 9945 tests executed *Failed tests:* {noformat} TestCliDriver-partition_timestamp.q-ppd_random.q-vector_outer_join5.q-and-12-more - did not produce a TEST-*.xml file TestCliDriver-ppd_join4.q-union27.q-show_indexes_edge_cases.q-and-12-more - did not produce a TEST-*.xml file TestCliDriver-ptf_general_queries.q-unionDistinct_1.q-groupby1_noskew.q-and-12-more - did not produce a TEST-*.xml file TestHWISessionManager - did not produce a TEST-*.xml file TestMiniLlapCliDriver - did not produce a TEST-*.xml file TestMiniTezCliDriver-enforce_order.q-vector_partition_diff_num_cols.q-unionDistinct_1.q-and-12-more - did not produce a TEST-*.xml file TestMiniTezCliDriver-join1.q-schema_evol_orc_nonvec_mapwork_part.q-mapjoin_decimal.q-and-12-more - did not produce a TEST-*.xml file TestSparkCliDriver-bucketsortoptimize_insert_7.q-smb_mapjoin_15.q-mapreduce1.q-and-12-more - did not produce a TEST-*.xml file TestSparkCliDriver-skewjoinopt3.q-union27.q-multigroupby_singlemr.q-and-12-more - did not produce a TEST-*.xml file org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_groupby org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_groupby2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cbo_rp_annotate_stats_groupby org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_dynpart_sort_optimization_acid org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby_cube1 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby_cube_multi_gby 
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby_duplicate_key org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby_grouping_id1 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby_grouping_id2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby_grouping_id3 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby_grouping_sets2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby_grouping_sets3 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby_grouping_sets5 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby_grouping_sets6 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby_grouping_window org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby_rollup1 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_infer_bucket_sort_grouping_operators org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_grouping_sets org.apache.hadoop.hive.cli.TestHBaseCliDriver.testCliDriver_external_table_ppd org.apache.hadoop.hive.cli.TestHBaseCliDriver.testCliDriver_hbase_binary_storage_queries org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_index_bitmap3 org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_bucket_map_join_tez1 org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_dynamic_partition_pruning org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_dynamic_partition_pruning_2 org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_dynpart_sort_optimization org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_metadataonly1 org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_grouping_sets org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_join_part_col_char org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vectorization_part org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vectorized_dynamic_partition_pruning 
org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query18 org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query22 org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query67 org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query70 org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query80 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_groupby_cube1 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_groupby_grouping_id2 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_groupby_rollup1 org.apache.hadoop.hive.llap.tez.TestConverters.testFragmentSpecToTaskSpec org.apache.hadoop.hive.llap.tezplugins.TestLlapTaskCommunicator.testFinishableStateUpdateFailure org.apache.hadoop.hive.llap.tezplugins.TestLlapTaskSchedulerService.testDelayedLocalityNodeCommErrorImmediateAllocation org.apache.hadoop.hive.metastore.TestHiveMetaStoreGetMetaConf.testGetMetaConfDefault
[jira] [Commented] (HIVE-13602) TPCH q16 return wrong result when CBO is on
[ https://issues.apache.org/jira/browse/HIVE-13602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15275236#comment-15275236 ] Hive QA commented on HIVE-13602: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12802463/HIVE-13602.03.patch {color:red}ERROR:{color} -1 due to build exiting with an error Test results: http://ec2-54-177-240-2.us-west-1.compute.amazonaws.com/job/PreCommit-HIVE-MASTER-Build/203/testReport Console output: http://ec2-54-177-240-2.us-west-1.compute.amazonaws.com/job/PreCommit-HIVE-MASTER-Build/203/console Test logs: http://ec2-50-18-27-0.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-MASTER-Build-203/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Tests exited with: NonZeroExitCodeException Command 'bash /data/hive-ptest/working/scratch/source-prep.sh' failed with exit status 1 and output '+ [[ -n /usr/java/jdk1.7.0_45-cloudera ]] + export JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera + JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera + export PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/lib64/qt-3.3/bin:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin + PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/lib64/qt-3.3/bin:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin + export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m ' + ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m ' + export 'M2_OPTS=-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost -Dhttp.proxyPort=3128' + M2_OPTS='-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost -Dhttp.proxyPort=3128' + cd /data/hive-ptest/working/ + tee 
/data/hive-ptest/logs/PreCommit-HIVE-MASTER-Build-203/source-prep.txt + [[ false == \t\r\u\e ]] + mkdir -p maven ivy + [[ git = \s\v\n ]] + [[ git = \g\i\t ]] + [[ -z master ]] + [[ -d apache-github-source-source ]] + [[ ! -d apache-github-source-source/.git ]] + [[ ! -d apache-github-source-source ]] + cd apache-github-source-source + git fetch origin + git reset --hard HEAD HEAD is now at 3f3aa2a HIVE-12827: Vectorization: VectorCopyRow/VectorAssignRow/VectorDeserializeRow assign needs explicit isNull[offset] modification (errata.txt) + git clean -f -d Removing ql/src/test/queries/clientpositive/multi_insert_with_join2.q Removing ql/src/test/results/clientpositive/multi_insert_with_join2.q.out + git checkout master Already on 'master' + git reset --hard origin/master HEAD is now at 3f3aa2a HIVE-12827: Vectorization: VectorCopyRow/VectorAssignRow/VectorDeserializeRow assign needs explicit isNull[offset] modification (errata.txt) + git merge --ff-only origin/master Already up-to-date. + git gc + patchCommandPath=/data/hive-ptest/working/scratch/smart-apply-patch.sh + patchFilePath=/data/hive-ptest/working/scratch/build.patch + [[ -f /data/hive-ptest/working/scratch/build.patch ]] + chmod +x /data/hive-ptest/working/scratch/smart-apply-patch.sh + /data/hive-ptest/working/scratch/smart-apply-patch.sh /data/hive-ptest/working/scratch/build.patch The patch does not appear to apply with p0, p1, or p2 ' {noformat} This message is automatically generated. 
ATTACHMENT ID: 12802463 - PreCommit-HIVE-MASTER-Build
[jira] [Commented] (HIVE-13602) TPCH q16 return wrong result when CBO is on
[ https://issues.apache.org/jira/browse/HIVE-13602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15272740#comment-15272740 ]

Pengcheng Xiong commented on HIVE-13602:

[~aihuaxu], I submitted a new patch just now and you can see the improvement. I think we still need to wait for another round of QA for the new patch. Thanks!
[jira] [Commented] (HIVE-13602) TPCH q16 return wrong result when CBO is on
[ https://issues.apache.org/jira/browse/HIVE-13602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15272728#comment-15272728 ]

Aihua Xu commented on HIVE-13602:

The patch makes sense to me. +1. I am not sure whether the failed tests are related, since the patch could change the optimization in existing tests.
[jira] [Commented] (HIVE-13602) TPCH q16 return wrong result when CBO is on
[ https://issues.apache.org/jira/browse/HIVE-13602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15272701#comment-15272701 ]

Pengcheng Xiong commented on HIVE-13602:

[~ashutoshc] and [~aihuaxu], could you please review? Thanks.
[jira] [Commented] (HIVE-13602) TPCH q16 return wrong result when CBO is on
[ https://issues.apache.org/jira/browse/HIVE-13602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15268689#comment-15268689 ]

Aihua Xu commented on HIVE-13602:

[~pxiong] Can you create an RB so it will be easier to review? Thanks.
[jira] [Commented] (HIVE-13602) TPCH q16 return wrong result when CBO is on
[ https://issues.apache.org/jira/browse/HIVE-13602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15257398#comment-15257398 ]

Nemon Lou commented on HIVE-13602:

It's 24581 on my computer. I must have checked the wrong stages in the MapReduce job UI. After {{set hive.optimize.constant.propagation=false;}} the result is right:

INFO : Table tpch_flat_orc_2.q16_cbo_debug2 stats: [numFiles=1, numRows=24581, totalSize=803640, rawDataSize=786232]

> TPCH q16 return wrong result when CBO is on
>
> Key: HIVE-13602
> URL: https://issues.apache.org/jira/browse/HIVE-13602
> Project: Hive
> Issue Type: Bug
> Components: CBO, Logical Optimizer
> Affects Versions: 1.3.0, 2.0.0, 1.2.2
> Reporter: Nemon Lou
> Assignee: Pengcheng Xiong
> Attachments: calcite_cbo_bad.out, calcite_cbo_good.out, explain_cbo_bad_part1.out, explain_cbo_bad_part2.out, explain_cbo_bad_part3.out, explain_cbo_good(rewrite)_part1.out, explain_cbo_good(rewrite)_part2.out, explain_cbo_good(rewrite)_part3.out
>
> Running tpch with factor 2,
> q16 returns 1,160 rows when CBO is on,
> while returns 59,616 rows when CBO is off.
> See attachment for detail .
[jira] [Commented] (HIVE-13602) TPCH q16 return wrong result when CBO is on
[ https://issues.apache.org/jira/browse/HIVE-13602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15257234#comment-15257234 ]

Pengcheng Xiong commented on HIVE-13602:

[~nemon], I tried the following query (I think it should be the same as yours)
{code}
select p_brand, p_type, p_size, count(distinct ps_suppkey) as supplier_cnt
from partsupp, part
where p_partkey = ps_partkey
  and p_brand <> 'Brand#34'
  and p_type not like 'ECONOMY BRUSHED%'
  and p_size in (22, 14, 27, 49, 21, 33, 35, 28)
  and partsupp.ps_suppkey not in (
    select s_suppkey from supplier where s_comment like '%Customer%Complaints%'
  )
group by p_brand, p_type, p_size
order by supplier_cnt desc, p_brand, p_type, p_size;
{code}
on Postgres. It returns *24585* rows...
[jira] [Commented] (HIVE-13602) TPCH q16 return wrong result when CBO is on
[ https://issues.apache.org/jira/browse/HIVE-13602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15257042#comment-15257042 ]

Pengcheng Xiong commented on HIVE-13602:

[~ashutoshc], with CBO enabled, {{set hive.optimize.constant.propagation=false;}} gives the correct result.
[jira] [Commented] (HIVE-13602) TPCH q16 return wrong result when CBO is on
[ https://issues.apache.org/jira/browse/HIVE-13602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15256543#comment-15256543 ]

Ashutosh Chauhan commented on HIVE-13602:

[~nemon] If HIVE-11104 is the culprit, then turning off constant propagation should yield correct results. Did you try turning it off with {{set hive.optimize.constant.propagation=false;}} and running the query? Does that give correct results?
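The check suggested above can be sketched as a short Hive session. This is a minimal sketch under stated assumptions, not something posted in the thread: the table name `q16_cp_off` is hypothetical, the query body is the q16 text quoted elsewhere in this thread with the final ORDER BY dropped (ordering does not affect the row count), and `hive.cbo.enable` is the standard Hive switch for CBO.

```sql
-- Keep CBO on, but disable constant propagation
-- (the optimization suspected via HIVE-11104).
set hive.cbo.enable=true;
set hive.optimize.constant.propagation=false;

-- Hypothetical scratch table holding the q16 result under these settings.
create table q16_cp_off as
select p_brand, p_type, p_size, count(distinct ps_suppkey) as supplier_cnt
from partsupp, part
where p_partkey = ps_partkey
  and p_brand <> 'Brand#34'
  and p_type not like 'ECONOMY BRUSHED%'
  and p_size in (22, 14, 27, 49, 21, 33, 35, 28)
  and partsupp.ps_suppkey not in (
    select s_suppkey from supplier where s_comment like '%Customer%Complaints%'
  )
group by p_brand, p_type, p_size;

-- Compare this count with the count obtained when
-- hive.optimize.constant.propagation is left at its default.
select count(*) from q16_cp_off;
```

If the two counts differ, that points at constant propagation rather than at CBO planning as such.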
[jira] [Commented] (HIVE-13602) TPCH q16 return wrong result when CBO is on
[ https://issues.apache.org/jira/browse/HIVE-13602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15256275#comment-15256275 ]

Nemon Lou commented on HIVE-13602:

The result is right when using Hive 1.2.1. After some digging, I finally found HIVE-11104, which causes the missing column, but I don't know how it affects this query. [~pxiong] It is easy to reproduce this bug with the latest branch-1. TPCH tool: https://github.com/hortonworks/hive-testbench
[jira] [Commented] (HIVE-13602) TPCH q16 return wrong result when CBO is on
[ https://issues.apache.org/jira/browse/HIVE-13602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15255976#comment-15255976 ]

Nemon Lou commented on HIVE-13602:

I just copied it from the result of "explain rewrite q16".
{noformat}
explain rewrite
select p_brand, p_type, p_size, count(distinct ps_suppkey) as supplier_cnt
from partsupp, part
where p_partkey = ps_partkey
  and p_brand <> 'Brand#34'
  and p_type not like 'ECONOMY BRUSHED%'
  and p_size in (22, 14, 27, 49, 21, 33, 35, 28)
  and partsupp.ps_suppkey not in (
    select s_suppkey from supplier where s_comment like '%Customer%Complaints%'
  )
group by p_brand, p_type, p_size
order by supplier_cnt desc, p_brand, p_type, p_size;
{noformat}
[jira] [Commented] (HIVE-13602) TPCH q16 return wrong result when CBO is on
[ https://issues.apache.org/jira/browse/HIVE-13602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15255964#comment-15255964 ]

Nemon Lou commented on HIVE-13602:

From the Calcite rel plan, we can see that the p_brand column is declared to be fetched, yet the value of the p_brand column is zero in the final result.
{noformat}
0: jdbc:hive2://189.39.151.141:21066/> select * from q16_cbo_bad limit 1;
+----------------------+---------------------+---------------------+---------------------------+--+
| q16_cbo_bad.p_brand  | q16_cbo_bad.p_type  | q16_cbo_bad.p_size  | q16_cbo_bad.supplier_cnt  |
+----------------------+---------------------+---------------------+---------------------------+--+
| 0                    | MEDIUM BRUSHED TIN  | 21                  | 298                       |
+----------------------+---------------------+---------------------+---------------------------+--+
1 row selected (0.139 seconds)
0: jdbc:hive2://189.39.151.141:21066/> select * from q16_cbo_rewrite limit 1;
+--------------------------+-------------------------+-------------------------+-------------------------------+--+
| q16_cbo_rewrite.p_brand  | q16_cbo_rewrite.p_type  | q16_cbo_rewrite.p_size  | q16_cbo_rewrite.supplier_cnt  |
+--------------------------+-------------------------+-------------------------+-------------------------------+--+
| Brand#31                 | PROMO ANODIZED NICKEL   | 14                      | 40                            |
+--------------------------+-------------------------+-------------------------+-------------------------------+--+
1 row selected (0.961 seconds)
{noformat}
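One possible reading of why a zeroed p_brand changes the row count rather than just one column (an interpretation, not something stated in the thread): if p_brand is replaced by a constant before the GROUP BY, all brands fall into the same group, leaving one row per (p_type, p_size) pair. The numbers are consistent with this. TPC-H defines 150 part types, the filter `p_type not like 'ECONOMY BRUSHED%'` removes 5 of them, and 145 types * 8 sizes = 1,160, which is the row count reported with CBO on. The sketch below shows the effect on a hypothetical four-row table `toy_part`; the table, its rows, and the aliases are made up for illustration.

```sql
-- Hypothetical toy table; names and values are for illustration only.
create table toy_part (p_brand string, p_type string, p_size int);
insert into table toy_part values
  ('Brand#31', 'PROMO ANODIZED NICKEL', 14),
  ('Brand#32', 'PROMO ANODIZED NICKEL', 14),
  ('Brand#31', 'MEDIUM BRUSHED TIN', 21),
  ('Brand#33', 'MEDIUM BRUSHED TIN', 21);

-- Intended grouping: four distinct (p_brand, p_type, p_size) groups.
select count(*) from (
  select p_brand, p_type, p_size
  from toy_part
  group by p_brand, p_type, p_size
) correct_groups;

-- p_brand folded to a constant: only two groups remain,
-- one per (p_type, p_size) pair.
select count(*) from (
  select brand_const, p_type, p_size
  from (select 0 as brand_const, p_type, p_size from toy_part) folded
  group by brand_const, p_type, p_size
) collapsed_groups;
```

The first count is 4 and the second is 2, mirroring how a constant in place of p_brand would shrink the q16 result instead of merely blanking a column.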
[jira] [Commented] (HIVE-13602) TPCH q16 return wrong result when CBO is on
[ https://issues.apache.org/jira/browse/HIVE-13602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15255953#comment-15255953 ]

Pengcheng Xiong commented on HIVE-13602:
----------------------------------------

What is
{code}
1 = 1
{code}
in the where condition?
[jira] [Commented] (HIVE-13602) TPCH q16 return wrong result when CBO is on
[ https://issues.apache.org/jira/browse/HIVE-13602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15255950#comment-15255950 ]

Pengcheng Xiong commented on HIVE-13602:
----------------------------------------

[~nemon], if p_brand is missing, I would expect one column to be missing from the result, rather than a different number of rows?
[jira] [Commented] (HIVE-13602) TPCH q16 return wrong result when CBO is on
[ https://issues.apache.org/jira/browse/HIVE-13602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15255937#comment-15255937 ]

Nemon Lou commented on HIVE-13602:
----------------------------------

The join order seems to be the same for both queries. The difference is that the bad plan selects only three columns from table part at stage-1 (p_brand is missing), while the good one selects four.
[jira] [Commented] (HIVE-13602) TPCH q16 return wrong result when CBO is on
[ https://issues.apache.org/jira/browse/HIVE-13602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15255921#comment-15255921 ]

Pengcheng Xiong commented on HIVE-13602:
----------------------------------------

[~nemon], could you also post the explain plan for the rewrite query? As far as I can see from the attachment, the original plan when CBO is on is
{code}
(join part with supplier cnt=0) join (left outer join partsupp with supplier).
{code}
I would like to see whether the rewrite query follows the same plan.

> TPCH q16 return wrong result when CBO is on
> ---
>
> Key: HIVE-13602
> URL: https://issues.apache.org/jira/browse/HIVE-13602
> Project: Hive
> Issue Type: Bug
> Components: CBO, Logical Optimizer
> Affects Versions: 1.2.1
> Reporter: Nemon Lou
> Assignee: Pengcheng Xiong
> Attachments: explain_cbo_bad_part1.out, explain_cbo_bad_part2.out,
> explain_cbo_bad_part3.out
>
> Running tpch with factor 2,
> q16 returns 1,160 rows when CBO is on,
> while returns 59,616 rows when CBO is off.
> See attachment for detail .

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
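The plan shape described above, a join guarded by "cnt=0" combined with a left outer join against supplier, resembles a common way of expressing NOT IN as joins: one branch checks that the subquery produces no NULL keys, and the other keeps only the rows with no match. The following is a minimal sketch in Python with SQLite, assuming invented toy rows and non-null keys. It is a hand-written simplification for illustration, not the plan Hive actually generates, and the alias names `nullcheck` and `sq` are hypothetical.

```python
import sqlite3

# Toy stand-ins for the TPC-H tables; all rows are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE partsupp (ps_partkey INTEGER, ps_suppkey INTEGER);
    CREATE TABLE supplier (s_suppkey INTEGER, s_comment TEXT);
    INSERT INTO partsupp VALUES (1, 10), (1, 11), (2, 10), (2, 12);
    INSERT INTO supplier VALUES
        (10, 'ok'), (11, 'Customer Complaints'), (12, 'ok');
""")

# The predicate as written in q16.
not_in = """
    SELECT ps_partkey, ps_suppkey
    FROM partsupp
    WHERE ps_suppkey NOT IN (
        SELECT s_suppkey FROM supplier
        WHERE s_comment LIKE '%Customer%Complaints%')
    ORDER BY ps_partkey, ps_suppkey
"""

# Join-based form: a NULL-count guard (cnt = 0) plus a left outer join
# whose unmatched rows are the ones to keep.
rewritten = """
    SELECT ps.ps_partkey, ps.ps_suppkey
    FROM partsupp ps
    JOIN (SELECT count(*) AS cnt FROM supplier
          WHERE s_comment LIKE '%Customer%Complaints%'
            AND s_suppkey IS NULL) nullcheck
      ON nullcheck.cnt = 0
    LEFT OUTER JOIN (SELECT s_suppkey FROM supplier
                     WHERE s_comment LIKE '%Customer%Complaints%') sq
      ON ps.ps_suppkey = sq.s_suppkey
    WHERE sq.s_suppkey IS NULL
    ORDER BY ps.ps_partkey, ps.ps_suppkey
"""

rows_not_in = conn.execute(not_in).fetchall()
rows_rewritten = conn.execute(rewritten).fetchall()
print(rows_not_in)
print(rows_rewritten)
```

On this toy data both forms return the same rows, which is the property a correct optimizer rewrite has to preserve.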
[jira] [Commented] (HIVE-13602) TPCH q16 return wrong result when CBO is on
[ https://issues.apache.org/jira/browse/HIVE-13602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15255900#comment-15255900 ]

Pengcheng Xiong commented on HIVE-13602:
----------------------------------------

[~nemon], do you have table and column stats for all the tables? Could you attach the explain results for CBO on and off? Thanks.

> TPCH q16 return wrong result when CBO is on
> ---
>
> Key: HIVE-13602
> URL: https://issues.apache.org/jira/browse/HIVE-13602
> Project: Hive
> Issue Type: Bug
> Components: CBO, Logical Optimizer
> Affects Versions: 1.2.1
> Reporter: Nemon Lou
> Assignee: Pengcheng Xiong
>
> Running tpch with factor 2,
> q16 returns 1,160 rows when CBO is on,
> while returns 59,616 rows when CBO is off.
> See attachment for detail .

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13602) TPCH q16 return wrong result when CBO is on
[ https://issues.apache.org/jira/browse/HIVE-13602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15255897#comment-15255897 ]

Nemon Lou commented on HIVE-13602:
----------------------------------

I used the MR engine. The attachment is bigger than 50K and has been rejected by a firewall or something else.
[jira] [Commented] (HIVE-13602) TPCH q16 return wrong result when CBO is on
[ https://issues.apache.org/jira/browse/HIVE-13602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15255892#comment-15255892 ]

Pengcheng Xiong commented on HIVE-13602:
----------------------------------------

cc'ing [~ashutoshc] and [~hagleitn]. [~nemon], could you be more specific about the execution engine that you used? MR or Tez?

> TPCH q16 return wrong result when CBO is on
> ---
>
> Key: HIVE-13602
> URL: https://issues.apache.org/jira/browse/HIVE-13602
> Project: Hive
> Issue Type: Bug
> Components: CBO, Logical Optimizer
> Affects Versions: 1.2.1
> Reporter: Nemon Lou
>
> Running tpch with factor 2,
> q16 returns 1,160 rows when CBO is on,
> while returns 59,616 rows when CBO is off.
> See attachment for detail .

--
This message was sent by Atlassian JIRA (v6.3.4#6332)