[ https://issues.apache.org/jira/browse/HIVE-4051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13721584#comment-13721584 ]
Hive QA commented on HIVE-4051:
-------------------------------

{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12594511/HIVE-4051.D11805.2.patch

{color:red}ERROR:{color} -1 due to 59 failed/errored test(s), 2653 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ppr_pushdown2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_select_unquote_or
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_index_bitmap_rc
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_insert_into2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_index_auto_partitioned
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_index_compact_2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_select_unquote_and
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_stats2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_partition_wise_fileformat11
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_explain_dependency2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_view
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_sample10
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_partition_special_char
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_load_dyn_part12
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_index_bitmap_auto_partitioned
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_load_dyn_part3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_index_compact
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_inputddl7
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_archive_multi
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_mi
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_combine2_hadoop20
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_merge_2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_merge_dynamic_partition3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_load_dyn_part4
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_columnstats_partlvl
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_index_bitmap
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_stats4
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_merge3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_partition_decode_name
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_index_auto_mult_tables
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_partition_wise_fileformat12
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ql_rewrite_gbtoidx
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_stats_noscan_2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_load_dyn_part10
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_stats12
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ppd2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby_sort_6
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_index_auto_unused
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_partition_coltype
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_index_stale_partitioned
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucket_groupby
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_explain_dependency
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_concatenate_inherit_table_location
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_index_auth
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_partition_wise_fileformat5
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input24
org.apache.hcatalog.api.TestHCatClient.testGetPartitionsWithPartialSpec
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_load_dyn_part8
org.apache.hcatalog.api.TestHCatClient.testPartitionsHCatClientImpl
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_explain_logical
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_rename_partition_location
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_select_unquote_not
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_load_dyn_part9
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_updateAccessTime
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_stats7
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_index_auto_mult_tables_compact
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_database_drop
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ppd_constant_where
org.apache.hcatalog.api.TestHCatClient.testDropPartitionsWithPartialSpec
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/206/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/206/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.CleanupPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests failed with: TestsFailedException: 59 tests failed
{noformat}

This message is automatically generated.

> Hive's metastore suffers from 1+N queries when querying partitions & is slow
> ----------------------------------------------------------------------------
>
>                 Key: HIVE-4051
>                 URL: https://issues.apache.org/jira/browse/HIVE-4051
>             Project: Hive
>          Issue Type: Bug
>          Components: Clients, Metastore
>         Environment: RHEL 6.3 / EC2 C1.XL
>            Reporter: Gopal V
>            Assignee: Sergey Shelukhin
>         Attachments: HIVE-4051.D11805.1.patch, HIVE-4051.D11805.2.patch
>
>
> Hive's query client takes a long time to initialize & start planning queries
> because of delays in creating all the MTable/MPartition objects.
> For a hive db with 1800 partitions, the metastore took 6-7 seconds to
> initialize - firing approximately 5900 queries to the mysql database.
> Several of those queries fetch exactly one row to create a single object on
> the client.
> The following 12 queries were repeated for each partition, generating a storm
> of SQL queries
> {code}
> 4 Query SELECT
> `A0`.`SD_ID`,`B0`.`INPUT_FORMAT`,`B0`.`IS_COMPRESSED`,`B0`.`IS_STOREDASSUBDIRECTORIES`,`B0`.`LOCATION`,`B0`.`NUM_BUCKETS`,`B0`.`OUTPUT_FORMAT`,`B0`.`SD_ID`
> FROM `PARTITIONS` `A0` LEFT OUTER JOIN `SDS` `B0` ON `A0`.`SD_ID` =
> `B0`.`SD_ID` WHERE `A0`.`PART_ID` = 3945
> 4 Query SELECT `A0`.`CD_ID`,`B0`.`CD_ID` FROM `SDS` `A0` LEFT OUTER JOIN
> `CDS` `B0` ON `A0`.`CD_ID` = `B0`.`CD_ID` WHERE `A0`.`SD_ID` =4871
> 4 Query SELECT COUNT(*) FROM `COLUMNS_V2` THIS WHERE THIS.`CD_ID`=1546
> AND THIS.`INTEGER_IDX`>=0
> 4 Query SELECT
> `A0`.`COMMENT`,`A0`.`COLUMN_NAME`,`A0`.`TYPE_NAME`,`A0`.`INTEGER_IDX` AS
> NUCORDER0 FROM `COLUMNS_V2` `A0` WHERE `A0`.`CD_ID` = 1546 AND
> `A0`.`INTEGER_IDX` >= 0 ORDER BY NUCORDER0
> 4 Query SELECT `A0`.`SERDE_ID`,`B0`.`NAME`,`B0`.`SLIB`,`B0`.`SERDE_ID`
> FROM `SDS` `A0` LEFT OUTER JOIN `SERDES` `B0` ON `A0`.`SERDE_ID` =
> `B0`.`SERDE_ID` WHERE `A0`.`SD_ID` =4871
> 4 Query SELECT COUNT(*) FROM `SORT_COLS` THIS WHERE THIS.`SD_ID`=4871 AND
> THIS.`INTEGER_IDX`>=0
> 4 Query SELECT `A0`.`COLUMN_NAME`,`A0`.`ORDER`,`A0`.`INTEGER_IDX` AS
> NUCORDER0 FROM `SORT_COLS` `A0` WHERE `A0`.`SD_ID` =4871 AND
> `A0`.`INTEGER_IDX` >= 0 ORDER BY NUCORDER0
> 4 Query SELECT COUNT(*) FROM `SKEWED_VALUES` THIS WHERE
> THIS.`SD_ID_OID`=4871 AND THIS.`INTEGER_IDX`>=0
> 4 Query SELECT 'org.apache.hadoop.hive.metastore.model.MStringList' AS
> NUCLEUS_TYPE,`A1`.`STRING_LIST_ID`,`A0`.`INTEGER_IDX` AS NUCORDER0 FROM
> `SKEWED_VALUES` `A0` INNER JOIN `SKEWED_STRING_LIST` `A1` ON
> `A0`.`STRING_LIST_ID_EID` = `A1`.`STRING_LIST_ID` WHERE `A0`.`SD_ID_OID`
> =4871 AND `A0`.`INTEGER_IDX` >= 0 ORDER BY NUCORDER0
> 4 Query SELECT COUNT(*) FROM `SKEWED_COL_VALUE_LOC_MAP` WHERE `SD_ID`
> =4871 AND `STRING_LIST_ID_KID` IS NOT NULL
> 4 Query SELECT 'org.apache.hadoop.hive.metastore.model.MStringList' AS
> NUCLEUS_TYPE,`A0`.`STRING_LIST_ID` FROM `SKEWED_STRING_LIST` `A0` INNER JOIN
> `SKEWED_COL_VALUE_LOC_MAP` `B0` ON `A0`.`STRING_LIST_ID` =
> `B0`.`STRING_LIST_ID_KID` WHERE `B0`.`SD_ID` =4871
> 4 Query SELECT `A0`.`STRING_LIST_ID_KID`,`A0`.`LOCATION` FROM
> `SKEWED_COL_VALUE_LOC_MAP` `A0` WHERE `A0`.`SD_ID` =4871 AND NOT
> (`A0`.`STRING_LIST_ID_KID` IS NULL)
> {code}
> This data is not detached or cached, so this operation is performed during
> every query plan for the partitions, even in the same hive client.
> The queries are automatically generated by JDO/DataNucleus which makes it
> nearly impossible to rewrite it into a single denormalized join operation &
> process it locally.
> Attempts to optimize this with JDO fetch-groups did not bear fruit in
> improving the query count.
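
For readers unfamiliar with the 1+N pattern the issue describes, here is a minimal sketch contrasting per-partition lookups with a single join. It uses Python's sqlite3 and a toy two-table schema. The table and column names are borrowed from the queries quoted in the issue, but the schema itself, the CountingConn helper and both loader functions are hypothetical illustrations; none of this is Hive metastore or DataNucleus code.

```python
import sqlite3


def build_db(n_partitions):
    """Create a toy in-memory schema (an assumption, loosely modelled on PARTITIONS/SDS)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE SDS (SD_ID INTEGER PRIMARY KEY, LOCATION TEXT)")
    conn.execute("CREATE TABLE PARTITIONS (PART_ID INTEGER PRIMARY KEY, SD_ID INTEGER)")
    for i in range(n_partitions):
        conn.execute("INSERT INTO SDS VALUES (?, ?)", (i, f"/warehouse/t/p={i}"))
        conn.execute("INSERT INTO PARTITIONS VALUES (?, ?)", (i, i))
    return conn


class CountingConn:
    """Hypothetical wrapper that counts how many SELECT statements are issued."""

    def __init__(self, conn):
        self.conn = conn
        self.queries = 0

    def select(self, sql, params=()):
        self.queries += 1
        return self.conn.execute(sql, params).fetchall()


def load_one_plus_n(db):
    """One query for the partition ids, then one more query per partition."""
    part_ids = [row[0] for row in db.select("SELECT PART_ID FROM PARTITIONS ORDER BY PART_ID")]
    locations = {}
    for part_id in part_ids:
        rows = db.select(
            "SELECT B0.LOCATION FROM PARTITIONS A0 "
            "LEFT OUTER JOIN SDS B0 ON A0.SD_ID = B0.SD_ID "
            "WHERE A0.PART_ID = ?",
            (part_id,),
        )
        locations[part_id] = rows[0][0]
    return locations


def load_single_join(db):
    """Fetch the same data for every partition with one joined query."""
    rows = db.select(
        "SELECT A0.PART_ID, B0.LOCATION FROM PARTITIONS A0 "
        "LEFT OUTER JOIN SDS B0 ON A0.SD_ID = B0.SD_ID "
        "ORDER BY A0.PART_ID"
    )
    return dict(rows)


if __name__ == "__main__":
    conn = build_db(50)
    slow, fast = CountingConn(conn), CountingConn(conn)
    assert load_one_plus_n(slow) == load_single_join(fast)
    print("1+N queries issued:", slow.queries)
    print("single-join queries issued:", fast.queries)
```

Both loaders return the same mapping; only the number of round trips differs, and the first grows linearly with the partition count. The issue reports 12 such per-partition queries, so the real effect is correspondingly larger than in this one-lookup sketch.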