[jira] [Commented] (HIVE-19432) HIVE-7575: GetTablesOperation is too slow if the hive has too many databases and tables
[ https://issues.apache.org/jira/browse/HIVE-19432?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16656012#comment-16656012 ]

Sergey Shelukhin commented on HIVE-19432:
-----------------------------------------

Hmm... [~Rajkumar Singh] [~ashutoshc] what kind of testing was done for this?
I'm looking at a log where a very large number of DBs is present. With network/etc. overhead for each individual call per database, instead of using the pattern, it takes 10+ minutes to fetch all tables. It seems like a single patterned call, as was made before this patch, would have been faster.

> HIVE-7575: GetTablesOperation is too slow if the hive has too many databases and tables
> ---------------------------------------------------------------------------------------
>
> Key: HIVE-19432
> URL: https://issues.apache.org/jira/browse/HIVE-19432
> Project: Hive
> Issue Type: Improvement
> Components: Hive, HiveServer2
> Affects Versions: 2.2.0
> Reporter: Rajkumar Singh
> Assignee: Rajkumar Singh
> Priority: Major
> Labels: pull-request-available
> Fix For: 4.0.0
> Attachments: HIVE-19432.01.patch, HIVE-19432.01.patch, HIVE-19432.patch
>
> GetTablesOperation is too slow because it does not check authorization for databases and tries to pull all the tables from all the databases using getTableMeta. For an operation like the following:
> {code}
> con.getMetaData().getTables("", "", "%", new String[] { "TABLE", "VIEW" });
> {code}
> it builds the getTableMeta call with the wildcard *:
> {code}
> metastore.HiveMetaStore: 8: get_table_metas : db=* tbl=*
> {code}

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
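The concern raised in the comment above is that one metastore call per database pays the fixed network overhead once per database, while a single patterned call pays it once. A back-of-the-envelope sketch of that arithmetic follows. The class name, the database count, and the per-call overhead are all assumptions chosen for illustration; none of them come from the log mentioned in the comment, and the model ignores the time spent actually reading table rows.

```java
import java.time.Duration;

/** Rough cost model only; every name and number here is hypothetical. */
final class GetTablesCostSketch {

    /** One metastore call using wildcard patterns, e.g. db=* tbl=*. */
    static int patternedCalls() {
        return 1;
    }

    /** One call to list the databases, then one table-metadata call per database. */
    static int perDatabaseCalls(int databaseCount) {
        return 1 + databaseCount;
    }

    /** Fixed per-call overhead only (network, RPC setup); row transfer time is ignored. */
    static Duration fixedOverhead(int calls, Duration perCallOverhead) {
        return perCallOverhead.multipliedBy(calls);
    }

    public static void main(String[] args) {
        int databases = 20_000;                    // assumed, not taken from any log
        Duration perCall = Duration.ofMillis(30);  // assumed round-trip overhead

        System.out.println("patterned:    "
                + fixedOverhead(patternedCalls(), perCall));
        System.out.println("per-database: "
                + fixedOverhead(perDatabaseCalls(databases), perCall));
    }
}
```

With these assumed numbers the per-database path spends 20,001 times 30 ms, about ten minutes, on overhead alone, while the patterned path spends 30 ms. Whether the patterned call is faster overall still depends on how much work the metastore does to serve it, which this sketch does not model.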
[jira] [Commented] (HIVE-19432) HIVE-7575: GetTablesOperation is too slow if the hive has too many databases and tables
[ https://issues.apache.org/jira/browse/HIVE-19432?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16498443#comment-16498443 ]

Hive QA commented on HIVE-19432:
--------------------------------

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12925652/HIVE-19432.01.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:green}SUCCESS:{color} +1 due to 14443 tests passed

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/11418/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/11418/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-11418/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12925652 - PreCommit-HIVE-Build
[jira] [Commented] (HIVE-19432) HIVE-7575: GetTablesOperation is too slow if the hive has too many databases and tables
[ https://issues.apache.org/jira/browse/HIVE-19432?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16498392#comment-16498392 ]

Hive QA commented on HIVE-19432:
--------------------------------

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 12s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 22s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 11s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 36s{color} | {color:blue} service in master has 49 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 15s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 21s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 21s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 21s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 10s{color} | {color:red} service: The patch generated 3 new + 10 unchanged - 4 fixed = 13 total (was 14) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 42s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 15s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 12s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 11m 1s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests | asflicense javac javadoc findbugs checkstyle compile |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-11418/dev-support/hive-personality.sh |
| git revision | master / 48d1a6a |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-11418/yetus/diff-checkstyle-service.txt |
| modules | C: service U: service |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-11418/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |

This message was automatically generated.
[jira] [Commented] (HIVE-19432) HIVE-7575: GetTablesOperation is too slow if the hive has too many databases and tables
[ https://issues.apache.org/jira/browse/HIVE-19432?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16495626#comment-16495626 ]

Ashutosh Chauhan commented on HIVE-19432:
-----------------------------------------

+1
[jira] [Commented] (HIVE-19432) HIVE-7575: GetTablesOperation is too slow if the hive has too many databases and tables
[ https://issues.apache.org/jira/browse/HIVE-19432?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16492187#comment-16492187 ]

Vineet Garg commented on HIVE-19432:
------------------------------------

[~Rajkumar Singh] Can you re-upload the patch and re-run the tests? I don't think any of the test failures are related. But since Hive now has a policy that requires a +1 from the testing infra, you need to re-upload the patch to get a clean run.
[jira] [Commented] (HIVE-19432) HIVE-7575: GetTablesOperation is too slow if the hive has too many databases and tables
[ https://issues.apache.org/jira/browse/HIVE-19432?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16465139#comment-16465139 ]

Hive QA commented on HIVE-19432:
--------------------------------

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12922159/HIVE-19432.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 37 failed/errored test(s), 14322 tests executed
*Failed tests:*
{noformat}
TestDbNotificationListener - did not produce a TEST-*.xml file (likely timed out) (batchId=247)
TestHCatHiveCompatibility - did not produce a TEST-*.xml file (likely timed out) (batchId=247)
TestNonCatCallsWithCatalog - did not produce a TEST-*.xml file (likely timed out) (batchId=217)
TestSequenceFileReadWrite - did not produce a TEST-*.xml file (likely timed out) (batchId=247)
TestTxnExIm - did not produce a TEST-*.xml file (likely timed out) (batchId=286)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[input37] (batchId=78)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[bucket_map_join_tez1] (batchId=175)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[bucket_map_join_tez2] (batchId=156)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[default_constraint] (batchId=163)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[explainuser_4] (batchId=164)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[mergejoin] (batchId=169)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[sysdb] (batchId=163)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[tez_vector_dynpart_hashjoin_1] (batchId=172)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[union_fast_stats] (batchId=167)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[union_stats] (batchId=159)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[bucketizedhiveinputformat] (batchId=183)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_5] (batchId=105)
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[udf_reflect_neg] (batchId=96)
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[udf_test_error] (batchId=96)
org.apache.hadoop.hive.ql.TestAcidOnTez.testCtasTezUnion (batchId=228)
org.apache.hadoop.hive.ql.TestAcidOnTez.testNonStandardConversion01 (batchId=228)
org.apache.hadoop.hive.ql.TestMTQueries.testMTQueries1 (batchId=232)
org.apache.hadoop.hive.ql.parse.TestCopyUtils.testPrivilegedDistCpWithSameUserAsCurrentDoesNotTryToImpersonate (batchId=231)
org.apache.hadoop.hive.ql.parse.TestReplicationOnHDFSEncryptedZones.targetAndSourceHaveDifferentEncryptionZoneKeys (batchId=231)
org.apache.hive.beeline.TestBeeLineWithArgs.testQueryProgressParallel (batchId=235)
org.apache.hive.jdbc.TestSSL.testSSLFetchHttp (batchId=239)
org.apache.hive.jdbc.TestTriggersWorkloadManager.testMultipleTriggers2 (batchId=241)
org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerCustomCreatedFiles (batchId=241)
org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerCustomNonExistent (batchId=241)
org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerCustomReadOps (batchId=241)
org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerHighBytesRead (batchId=241)
org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerHighBytesWrite (batchId=241)
org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerHighShuffleBytes (batchId=241)
org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerSlowQueryElapsedTime (batchId=241)
org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerSlowQueryExecutionTime (batchId=241)
org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerVertexRawInputSplitsNoKill (batchId=241)
org.apache.hive.spark.client.rpc.TestRpc.testServerPort (batchId=304)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/10728/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/10728/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-10728/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 37 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12922159 - PreCommit-HIVE-Build
[jira] [Commented] (HIVE-19432) HIVE-7575: GetTablesOperation is too slow if the hive has too many databases and tables
[ https://issues.apache.org/jira/browse/HIVE-19432?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16465111#comment-16465111 ]

Hive QA commented on HIVE-19432:
--------------------------------

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s{color} | {color:blue} Findbugs executables are not available. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 8m 45s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 26s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 13s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 18s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 27s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 24s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 24s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 13s{color} | {color:red} service: The patch generated 3 new + 10 unchanged - 4 fixed = 13 total (was 14) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 19s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 14s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 11m 33s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests | asflicense javac javadoc findbugs checkstyle compile |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-10728/dev-support/hive-personality.sh |
| git revision | master / f8a6c74 |
| Default Java | 1.8.0_111 |
| checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-10728/yetus/diff-checkstyle-service.txt |
| modules | C: service U: service |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-10728/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |

This message was automatically generated.
[jira] [Commented] (HIVE-19432) HIVE-7575: GetTablesOperation is too slow if the hive has too many databases and tables
[ https://issues.apache.org/jira/browse/HIVE-19432?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16464931#comment-16464931 ]

ASF GitHub Bot commented on HIVE-19432:
---------------------------------------

GitHub user rajkrrsingh opened a pull request:

    https://github.com/apache/hive/pull/341

    HIVE-19432 : GetTablesOperation is too slow if the hive has too many …

    GetTablesOperation is too slow because it does not check authorization for databases and tries to pull all the tables from all the databases using getTableMeta. For an operation like the following:
    ```
    con.getMetaData().getTables("", "", "%", new String[] { "TABLE", "VIEW" });
    ```
    it builds the getTableMeta call with the wildcard *:
    ```
    metastore.HiveMetaStore: 8: get_table_metas : db=* tbl=*
    ```

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/rajkrrsingh/hive HIVE-19432

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/hive/pull/341.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #341

commit ee684cedcea545a17658be204b3b07e2fdbc56ed
Author: Rajkumar singh
Date: 2018-05-05T23:11:04Z

    HIVE-19432 : GetTablesOperation is too slow if the hive has too many databases and tables
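The description above shows a JDBC call with an empty schema pattern and a `%` table pattern ending up as `db=* tbl=*` in the metastore log. A minimal sketch of that kind of translation is given below. It is not Hive's actual conversion code: the class and method names are hypothetical, and it deliberately ignores the JDBC single-character wildcard `_` and escaped wildcards such as `\%`.

```java
/** Simplified illustration only; NOT Hive's real pattern conversion. */
final class JdbcPatternSketch {

    /**
     * Maps a JDBC LIKE-style pattern to a '*'-style pattern.
     * A null or empty pattern is treated as "match everything", which is
     * consistent with the db=* seen in the log line quoted above.
     */
    static String toMetastorePattern(String jdbcPattern) {
        if (jdbcPattern == null || jdbcPattern.isEmpty()) {
            return "*";
        }
        // '%' (any sequence of characters in JDBC) becomes '*'.
        return jdbcPattern.replace("%", "*");
    }

    public static void main(String[] args) {
        // Mirrors getTables("", "", "%", ...) from the description.
        System.out.println("db=" + toMetastorePattern("")
                + " tbl=" + toMetastorePattern("%"));
        // prints "db=* tbl=*"
    }
}
```

Under these assumptions, a narrower call such as a schema pattern of `sales%` would translate to `sales*`, so the metastore would only need to consider matching databases.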