[jira] [Commented] (HIVE-19605) TAB_COL_STATS table has no index on db/table name
[ https://issues.apache.org/jira/browse/HIVE-19605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16503468#comment-16503468 ]

Hive QA commented on HIVE-19605:

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12926443/HIVE-19605.02.patch

ERROR: -1 due to no test(s) being added or modified.

SUCCESS: +1 due to 14467 tests passed

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/11559/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/11559/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-11559/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12926443 - PreCommit-HIVE-Build

> TAB_COL_STATS table has no index on db/table name
> -------------------------------------------------
>
>                 Key: HIVE-19605
>                 URL: https://issues.apache.org/jira/browse/HIVE-19605
>             Project: Hive
>          Issue Type: Bug
>          Components: Metastore
>            Reporter: Todd Lipcon
>            Assignee: Vihang Karajgaonkar
>            Priority: Major
>         Attachments: HIVE-19605.01.patch, HIVE-19605.02.patch
>
>
> The TAB_COL_STATS table is missing an index on (CAT_NAME, DB_NAME,
> TABLE_NAME). The getTableColumnStatistics call queries based on this tuple.
> This makes those queries take a significant amount of time in large
> metastores since they do a full table scan.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
[jira] [Commented] (HIVE-19605) TAB_COL_STATS table has no index on db/table name
[ https://issues.apache.org/jira/browse/HIVE-19605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16503404#comment-16503404 ]

Hive QA commented on HIVE-19605:

| (x) *-1 overall* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || Prechecks ||
| +1 | @author | 0m 1s | The patch does not contain any @author tags. |
|| || || || master Compile Tests ||
| +1 | mvninstall | 6m 52s | master passed |
| +1 | javadoc | 0m 48s | master passed |
|| || || || Patch Compile Tests ||
| +1 | mvninstall | 0m 43s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | javadoc | 0m 49s | the patch passed |
|| || || || Other Tests ||
| -1 | asflicense | 0m 11s | The patch generated 2 ASF License warnings. |
| | | 9m 38s | |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests | asflicense javac javadoc |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-11559/dev-support/hive-personality.sh |
| git revision | master / 0992d82 |
| Default Java | 1.8.0_111 |
| asflicense | http://104.198.109.242/logs//PreCommit-HIVE-Build-11559/yetus/patch-asflicense-problems.txt |
| modules | C: standalone-metastore U: standalone-metastore |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-11559/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |

This message was automatically generated.
[jira] [Commented] (HIVE-19605) TAB_COL_STATS table has no index on db/table name
[ https://issues.apache.org/jira/browse/HIVE-19605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16500789#comment-16500789 ]

Vihang Karajgaonkar commented on HIVE-19605:

Tests work for me locally. Resubmitting the patch.
[jira] [Commented] (HIVE-19605) TAB_COL_STATS table has no index on db/table name
[ https://issues.apache.org/jira/browse/HIVE-19605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16498963#comment-16498963 ]

Hive QA commented on HIVE-19605:

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12925786/HIVE-19605.01.patch

ERROR: -1 due to no test(s) being added or modified.

ERROR: -1 due to 1 failed/errored test(s), 14443 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.metastore.client.TestRuntimeStats.testCleanup[Embedded] (batchId=211)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/11434/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/11434/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-11434/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12925786 - PreCommit-HIVE-Build
[jira] [Commented] (HIVE-19605) TAB_COL_STATS table has no index on db/table name
[ https://issues.apache.org/jira/browse/HIVE-19605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16498946#comment-16498946 ]

Hive QA commented on HIVE-19605:

| (/) *+1 overall* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
|| || || || master Compile Tests ||
| +1 | mvninstall | 6m 57s | master passed |
| +1 | javadoc | 0m 49s | master passed |
|| || || || Patch Compile Tests ||
| +1 | mvninstall | 0m 45s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | javadoc | 0m 50s | the patch passed |
|| || || || Other Tests ||
| +1 | asflicense | 0m 11s | The patch does not generate ASF License warnings. |
| | | 9m 48s | |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests | asflicense javac javadoc |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-11434/dev-support/hive-personality.sh |
| git revision | master / 4463c2b |
| Default Java | 1.8.0_111 |
| modules | C: standalone-metastore U: standalone-metastore |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-11434/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |

This message was automatically generated.
[jira] [Commented] (HIVE-19605) TAB_COL_STATS table has no index on db/table name
[ https://issues.apache.org/jira/browse/HIVE-19605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16497085#comment-16497085 ]

Yongzhi Chen commented on HIVE-19605:

The patch LGTM +1
[jira] [Commented] (HIVE-19605) TAB_COL_STATS table has no index on db/table name
[ https://issues.apache.org/jira/browse/HIVE-19605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16496046#comment-16496046 ]

Vihang Karajgaonkar commented on HIVE-19605:

Hi [~ngangam] Can you please take a look?
[jira] [Commented] (HIVE-19605) TAB_COL_STATS table has no index on db/table name
[ https://issues.apache.org/jira/browse/HIVE-19605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16495532#comment-16495532 ]

Todd Lipcon commented on HIVE-19605:

Ah, yea, you're right, thanks.
[jira] [Commented] (HIVE-19605) TAB_COL_STATS table has no index on db/table name
[ https://issues.apache.org/jira/browse/HIVE-19605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16495414#comment-16495414 ]

Vihang Karajgaonkar commented on HIVE-19605:

Hi [~tlipcon] I was looking at this and I noticed that {{getTableColumnStatistics}} in fact fetches information from {{TAB_COL_STATS}} using (CAT_NAME, DB_NAME, TBL_NAME, COL_NAME). So the index should also include COL_NAME unless you think otherwise.
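To illustrate the point about including the column name, here is a minimal sketch that uses Python's sqlite3 module as a stand-in for the metastore's backing database (the measurements in this thread were taken on MySQL, whose planner output differs). The reduced schema, the index name TAB_COL_STATS_IDX, and the sample key values are assumptions made for the example and are not taken from the patch; the column spellings TABLE_NAME and COLUMN_NAME follow the generated query quoted elsewhere in this thread. It suggests that one index on all four columns can serve both the four-column lookup and the three-column prefix lookup.

```python
import sqlite3

# Stand-in schema: only the key columns named in the thread plus one
# statistic column. This is NOT the real metastore DDL.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE TAB_COL_STATS ("
    " CS_ID INTEGER PRIMARY KEY,"
    " CAT_NAME TEXT, DB_NAME TEXT, TABLE_NAME TEXT, COLUMN_NAME TEXT,"
    " NUM_NULLS INTEGER)"
)

# Lookup by the full four-column tuple described in the comment.
FULL_LOOKUP = (
    "SELECT NUM_NULLS FROM TAB_COL_STATS"
    " WHERE CAT_NAME = ? AND DB_NAME = ? AND TABLE_NAME = ? AND COLUMN_NAME = ?"
)
# Lookup by the three-column tuple from the issue description.
PREFIX_LOOKUP = (
    "SELECT NUM_NULLS FROM TAB_COL_STATS"
    " WHERE CAT_NAME = ? AND DB_NAME = ? AND TABLE_NAME = ?"
)
FULL_ARGS = ("hive", "default", "t1", "c1")  # made-up key values
PREFIX_ARGS = FULL_ARGS[:3]


def plan(sql, args):
    """Return the 'detail' text of SQLite's EXPLAIN QUERY PLAN rows."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, args).fetchall()
    return " | ".join(row[3] for row in rows)


plan_before = plan(FULL_LOOKUP, FULL_ARGS)

# Hypothetical index name; the column order follows the tuple in the comment.
conn.execute(
    "CREATE INDEX TAB_COL_STATS_IDX"
    " ON TAB_COL_STATS (CAT_NAME, DB_NAME, TABLE_NAME, COLUMN_NAME)"
)

plan_full = plan(FULL_LOOKUP, FULL_ARGS)
plan_prefix = plan(PREFIX_LOOKUP, PREFIX_ARGS)

print("before index: ", plan_before)
print("4-column key: ", plan_full)
print("3-column key: ", plan_prefix)
```

Because the three-column tuple is a leftmost prefix of the four-column index, adding the column name as the last index column should not take the index away from callers that filter only on catalog, database, and table.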
[jira] [Commented] (HIVE-19605) TAB_COL_STATS table has no index on db/table name
[ https://issues.apache.org/jira/browse/HIVE-19605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16489368#comment-16489368 ]

Vihang Karajgaonkar commented on HIVE-19605:

We need to have the schema scripts for the new branches before I can submit a patch for this.
[jira] [Commented] (HIVE-19605) TAB_COL_STATS table has no index on db/table name
[ https://issues.apache.org/jira/browse/HIVE-19605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16484687#comment-16484687 ]

Todd Lipcon commented on HIVE-19605:

Seems the slow query on get_table is due to the "initialization" queries in ensureDbInit(). This is worked around by HIVE-19310. However, the API calls that actually are meant to fetch column stats are still slow and this should be fixed for their sake.
[jira] [Commented] (HIVE-19605) TAB_COL_STATS table has no index on db/table name
[ https://issues.apache.org/jira/browse/HIVE-19605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16484648#comment-16484648 ]

Todd Lipcon commented on HIVE-19605:

Upon further inspection it seems the above query is likely generated during the initialization of ObjectStore, not directly within the get_table call. So, any call can end up making this query and generating a big outlier.
[jira] [Commented] (HIVE-19605) TAB_COL_STATS table has no index on db/table name
[ https://issues.apache.org/jira/browse/HIVE-19605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16484623#comment-16484623 ]

Todd Lipcon commented on HIVE-19605:

It seems like this table can also be queried from a get_table call. Oddly, the query being generated is:

SELECT 'org.apache.hadoop.hive.metastore.model.MTableColumnStatistics' AS NUCLEUS_TYPE,
  `A0`.`AVG_COL_LEN`,`A0`.`COLUMN_NAME`,`A0`.`COLUMN_TYPE`,`A0`.`DB_NAME`,
  `A0`.`BIG_DECIMAL_HIGH_VALUE`,`A0`.`BIG_DECIMAL_LOW_VALUE`,
  `A0`.`DOUBLE_HIGH_VALUE`,`A0`.`DOUBLE_LOW_VALUE`,`A0`.`LAST_ANALYZED`,
  `A0`.`LONG_HIGH_VALUE`,`A0`.`LONG_LOW_VALUE`,`A0`.`MAX_COL_LEN`,
  `A0`.`NUM_DISTINCTS`,`A0`.`NUM_FALSES`,`A0`.`NUM_NULLS`,`A0`.`NUM_TRUES`,
  `A0`.`TABLE_NAME`,`A0`.`CS_ID`
FROM `TAB_COL_STATS` `A0`
WHERE `A0`.`DB_NAME` = '';

(note the empty db_name). Given the lack of an index, this takes 450ms on the HMS instance I am testing (if the MySQL query cache is disabled).
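The effect described in this comment can be reproduced in miniature. The sketch below is an illustration only: it uses Python's sqlite3 module rather than MySQL, a reduced hypothetical schema with a few of the columns from the generated query, and made-up rows. It runs the same DB_NAME = '' predicate and asks SQLite for its query plan, which reports a scan of the whole table even though no row matches.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Reduced, hypothetical schema: a handful of columns from the generated
# query, with no secondary index (mirroring the situation in the report).
conn.execute(
    "CREATE TABLE TAB_COL_STATS ("
    " CS_ID INTEGER PRIMARY KEY, DB_NAME TEXT, TABLE_NAME TEXT,"
    " COLUMN_NAME TEXT, NUM_NULLS INTEGER)"
)

# Made-up rows; none of them has an empty DB_NAME.
conn.executemany(
    "INSERT INTO TAB_COL_STATS (DB_NAME, TABLE_NAME, COLUMN_NAME, NUM_NULLS)"
    " VALUES (?, ?, ?, 0)",
    [("db%d" % (i % 10), "t%d" % i, "c") for i in range(1000)],
)

query = "SELECT CS_ID FROM TAB_COL_STATS WHERE DB_NAME = ''"

# The query returns nothing...
rows = conn.execute(query).fetchall()

# ...but the plan's 'detail' column shows how SQLite intends to find that out.
detail = [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + query)]

print("matching rows:", len(rows))
print("plan:", detail)
```

The cost of such a query grows with the size of the table rather than with the number of matching rows, which is consistent with the outlier latency reported above for a large metastore.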