[jira] [Commented] (HIVE-16979) Cache UGI for metastore
[ https://issues.apache.org/jira/browse/HIVE-16979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16513938#comment-16513938 ] Hive QA commented on HIVE-16979: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12927734/HIVE-16979.4.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:green}SUCCESS:{color} +1 due to 14531 tests passed Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/11803/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/11803/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-11803/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.YetusPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase {noformat} This message is automatically generated. ATTACHMENT ID: 12927734 - PreCommit-HIVE-Build > Cache UGI for metastore > --- > > Key: HIVE-16979 > URL: https://issues.apache.org/jira/browse/HIVE-16979 > Project: Hive > Issue Type: Improvement >Reporter: Tao Li >Assignee: Daniel Dai >Priority: Major > Attachments: HIVE-16979.1.patch, HIVE-16979.2.patch, > HIVE-16979.3.patch, HIVE-16979.4.patch > > > FileSystem.closeAllForUGI is called per request against metastore to dispose > UGI, which involves talking to HDFS name node and is time consuming. So the > perf improvement would be caching and reusing the UGI. > Per FileSystem.closeAllForUG call could take up to 20 ms as E2E latency > against HDFS. Usually a Hive query could result in several calls against > metastore, so we can save up to 50-100 ms per hive query. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-16979) Cache UGI for metastore
[ https://issues.apache.org/jira/browse/HIVE-16979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16513873#comment-16513873 ] Hive QA commented on HIVE-16979: | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | || || || || {color:brown} master Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 36s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 41s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 15s{color} | {color:green} master passed {color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 3m 6s{color} | {color:blue} standalone-metastore in master has 227 extant Findbugs warnings. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 55s{color} | {color:green} master passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 48s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 42s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 42s{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 17s{color} | {color:red} standalone-metastore: The patch generated 1 new + 32 unchanged - 3 fixed = 33 total (was 35) {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 6s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 55s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 13s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 17m 59s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Optional Tests | asflicense javac javadoc findbugs checkstyle compile | | uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux | | Build tool | maven | | Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-11803/dev-support/hive-personality.sh | | git revision | master / 477f541 | | Default Java | 1.8.0_111 | | findbugs | v3.0.0 | | checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-11803/yetus/diff-checkstyle-standalone-metastore.txt | | modules | C: standalone-metastore U: standalone-metastore | | Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-11803/yetus.txt | | Powered by | Apache Yetushttp://yetus.apache.org | This message was automatically generated. > Cache UGI for metastore > --- > > Key: HIVE-16979 > URL: https://issues.apache.org/jira/browse/HIVE-16979 > Project: Hive > Issue Type: Improvement >Reporter: Tao Li >Assignee: Daniel Dai >Priority: Major > Attachments: HIVE-16979.1.patch, HIVE-16979.2.patch, > HIVE-16979.3.patch, HIVE-16979.4.patch > > > FileSystem.closeAllForUGI is called per request against metastore to dispose > UGI, which involves talking to HDFS name node and is time consuming. So the > perf improvement would be caching and reusing the UGI. > Per FileSystem.closeAllForUG call could take up to 20 ms as E2E latency > against HDFS. Usually a Hive query could result in several calls against > metastore, so we can save up to 50-100 ms per hive query. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-16979) Cache UGI for metastore
[ https://issues.apache.org/jira/browse/HIVE-16979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16511753#comment-16511753 ] Daniel Dai commented on HIVE-16979: --- Haven't been tested, the new patch use a different cache strategy. UGI object will not be evicted if it is still in use. Another change is UGI object will not be shared across sessions in case UserGroupInformation and FileSystem is not thread safe. Finished session will release UGI object in the cache. UGICache will keep the idle object around until eviction. > Cache UGI for metastore > --- > > Key: HIVE-16979 > URL: https://issues.apache.org/jira/browse/HIVE-16979 > Project: Hive > Issue Type: Improvement >Reporter: Tao Li >Assignee: Daniel Dai >Priority: Major > Attachments: HIVE-16979.1.patch, HIVE-16979.2.patch, > HIVE-16979.3.patch, HIVE-16979.4.patch > > > FileSystem.closeAllForUGI is called per request against metastore to dispose > UGI, which involves talking to HDFS name node and is time consuming. So the > perf improvement would be caching and reusing the UGI. > Per FileSystem.closeAllForUG call could take up to 20 ms as E2E latency > against HDFS. Usually a Hive query could result in several calls against > metastore, so we can save up to 50-100 ms per hive query. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-16979) Cache UGI for metastore
[ https://issues.apache.org/jira/browse/HIVE-16979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16161807#comment-16161807 ] Tao Li commented on HIVE-16979: --- [~gopalv] Thanks for your previous comments. Regarding the liveness issue you mentioned, I don't think it's a concern, given that we keep the cached UGI alive for 24 hours which should be long enough for the queries to complete. Regarding the code path, all the services (including HS2 and metastore) should benefit from the perf gain as long as they involve TUGIAssumingProcessor. Do you have any other comments? > Cache UGI for metastore > --- > > Key: HIVE-16979 > URL: https://issues.apache.org/jira/browse/HIVE-16979 > Project: Hive > Issue Type: Improvement >Reporter: Tao Li >Assignee: Tao Li > Attachments: HIVE-16979.1.patch, HIVE-16979.2.patch, > HIVE-16979.3.patch > > > FileSystem.closeAllForUGI is called per request against metastore to dispose > UGI, which involves talking to HDFS name node and is time consuming. So the > perf improvement would be caching and reusing the UGI. > Per FileSystem.closeAllForUG call could take up to 20 ms as E2E latency > against HDFS. Usually a Hive query could result in several calls against > metastore, so we can save up to 50-100 ms per hive query. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-16979) Cache UGI for metastore
[ https://issues.apache.org/jira/browse/HIVE-16979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16111242#comment-16111242 ] Tao Li commented on HIVE-16979: --- Good point. It is used by ThriftBinaryCLIService as well as you mentioned, when SASL is used with Kerberos for auth. The original motivation was to improve the latency for remote metastore. Now looks like this change can improve latency for the HS2 as well (but I did not verify the improvement). > Cache UGI for metastore > --- > > Key: HIVE-16979 > URL: https://issues.apache.org/jira/browse/HIVE-16979 > Project: Hive > Issue Type: Improvement >Reporter: Tao Li >Assignee: Tao Li > Attachments: HIVE-16979.1.patch, HIVE-16979.2.patch, > HIVE-16979.3.patch > > > FileSystem.closeAllForUGI is called per request against metastore to dispose > UGI, which involves talking to HDFS name node and is time consuming. So the > perf improvement would be caching and reusing the UGI. > Per FileSystem.closeAllForUG call could take up to 20 ms as E2E latency > against HDFS. Usually a Hive query could result in several calls against > metastore, so we can save up to 50-100 ms per hive query. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-16979) Cache UGI for metastore
[ https://issues.apache.org/jira/browse/HIVE-16979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16109964#comment-16109964 ] Gopal V commented on HIVE-16979: > A metastore call normally takes less than a second. Oh, I thought that TUGIAssumingProcessor is used by ThriftBinaryCLIService as well. ThriftBinaryCLIService::run() -> KerberosSaslHelper$CLIServiceProcessorFactory::getProcessor() -> HadoopThriftAuthBridge::wrapNonAssumingProcessor() -> new TUGIAssumingProcessor() > Cache UGI for metastore > --- > > Key: HIVE-16979 > URL: https://issues.apache.org/jira/browse/HIVE-16979 > Project: Hive > Issue Type: Improvement >Reporter: Tao Li >Assignee: Tao Li > Attachments: HIVE-16979.1.patch, HIVE-16979.2.patch, > HIVE-16979.3.patch > > > FileSystem.closeAllForUGI is called per request against metastore to dispose > UGI, which involves talking to HDFS name node and is time consuming. So the > perf improvement would be caching and reusing the UGI. > Per FileSystem.closeAllForUG call could take up to 20 ms as E2E latency > against HDFS. Usually a Hive query could result in several calls against > metastore, so we can save up to 50-100 ms per hive query. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-16979) Cache UGI for metastore
[ https://issues.apache.org/jira/browse/HIVE-16979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16109720#comment-16109720 ] Tao Li commented on HIVE-16979: --- [~gopalv] The original code is creating and closing a UGI instance per metastore request, and our change is caching the UGI 24 hours after last access. A metastore call normally takes less than a second. So 24 hours is long enough to make sure we will not fail any ongoing metastore call. Does it answer your question? > Cache UGI for metastore > --- > > Key: HIVE-16979 > URL: https://issues.apache.org/jira/browse/HIVE-16979 > Project: Hive > Issue Type: Improvement >Reporter: Tao Li >Assignee: Tao Li > Attachments: HIVE-16979.1.patch, HIVE-16979.2.patch, > HIVE-16979.3.patch > > > FileSystem.closeAllForUGI is called per request against metastore to dispose > UGI, which involves talking to HDFS name node and is time consuming. So the > perf improvement would be caching and reusing the UGI. > Per FileSystem.closeAllForUG call could take up to 20 ms as E2E latency > against HDFS. Usually a Hive query could result in several calls against > metastore, so we can save up to 50-100 ms per hive query. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-16979) Cache UGI for metastore
[ https://issues.apache.org/jira/browse/HIVE-16979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16109684#comment-16109684 ] Gopal V commented on HIVE-16979: [~taoli-hwx]: does this fail queries which take > 24hours? Is there something we can do to mark "liveness" from the query progress loop to make sure the FileSystem.closeAllForUgi() -> deleteOnExit doesn't cleanup any directory currently being written to inside the cluster? > Cache UGI for metastore > --- > > Key: HIVE-16979 > URL: https://issues.apache.org/jira/browse/HIVE-16979 > Project: Hive > Issue Type: Improvement >Reporter: Tao Li >Assignee: Tao Li > Attachments: HIVE-16979.1.patch, HIVE-16979.2.patch, > HIVE-16979.3.patch > > > FileSystem.closeAllForUGI is called per request against metastore to dispose > UGI, which involves talking to HDFS name node and is time consuming. So the > perf improvement would be caching and reusing the UGI. > Per FileSystem.closeAllForUG call could take up to 20 ms as E2E latency > against HDFS. Usually a Hive query could result in several calls against > metastore, so we can save up to 50-100 ms per hive query. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-16979) Cache UGI for metastore
[ https://issues.apache.org/jira/browse/HIVE-16979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16107734#comment-16107734 ] Tao Li commented on HIVE-16979: --- All test failures are due to flaky tests tracked in HIVE-15058. [~thejas] Can you please take a look at this change? > Cache UGI for metastore > --- > > Key: HIVE-16979 > URL: https://issues.apache.org/jira/browse/HIVE-16979 > Project: Hive > Issue Type: Improvement >Reporter: Tao Li >Assignee: Tao Li > Attachments: HIVE-16979.1.patch, HIVE-16979.2.patch, > HIVE-16979.3.patch > > > FileSystem.closeAllForUGI is called per request against metastore to dispose > UGI, which involves talking to HDFS name node and is time consuming. So the > perf improvement would be caching and reusing the UGI. > Per FileSystem.closeAllForUG call could take up to 20 ms as E2E latency > against HDFS. Usually a Hive query could result in several calls against > metastore, so we can save up to 50-100 ms per hive query. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-16979) Cache UGI for metastore
[ https://issues.apache.org/jira/browse/HIVE-16979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16090567#comment-16090567 ] Tao Li commented on HIVE-16979: --- [~daijy] Can you please take a look at this optimization? > Cache UGI for metastore > --- > > Key: HIVE-16979 > URL: https://issues.apache.org/jira/browse/HIVE-16979 > Project: Hive > Issue Type: Improvement >Reporter: Tao Li >Assignee: Tao Li > Attachments: HIVE-16979.1.patch, HIVE-16979.2.patch, > HIVE-16979.3.patch > > > FileSystem.closeAllForUGI is called per request against metastore to dispose > UGI, which involves talking to HDFS name node and is time consuming. So the > perf improvement would be caching and reusing the UGI. > Per FileSystem.closeAllForUG call could take up to 20 ms as E2E latency > against HDFS. Usually a Hive query could result in several calls against > metastore, so we can save up to 50-100 ms per hive query. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-16979) Cache UGI for metastore
[ https://issues.apache.org/jira/browse/HIVE-16979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16084860#comment-16084860 ] Tao Li commented on HIVE-16979: --- [~gopalv] Can you please take a look at the patch? Thanks! > Cache UGI for metastore > --- > > Key: HIVE-16979 > URL: https://issues.apache.org/jira/browse/HIVE-16979 > Project: Hive > Issue Type: Improvement >Reporter: Tao Li >Assignee: Tao Li > Attachments: HIVE-16979.1.patch, HIVE-16979.2.patch, > HIVE-16979.3.patch > > > FileSystem.closeAllForUGI is called per request against metastore to dispose > UGI, which involves talking to HDFS name node and is time consuming. So the > perf improvement would be caching and reusing the UGI. > Per FileSystem.closeAllForUG call could take up to 20 ms as E2E latency > against HDFS. Usually a Hive query could result in several calls against > metastore, so we can save up to 50-100 ms per hive query. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-16979) Cache UGI for metastore
[ https://issues.apache.org/jira/browse/HIVE-16979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16083470#comment-16083470 ] Hive QA commented on HIVE-16979: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12876714/HIVE-16979.3.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:red}ERROR:{color} -1 due to 6 failed/errored test(s), 10839 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[llap_smb] (batchId=143) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_if_expr] (batchId=145) org.apache.hive.hcatalog.api.TestHCatClient.testPartitionRegistrationWithCustomSchema (batchId=177) org.apache.hive.hcatalog.api.TestHCatClient.testPartitionSpecRegistrationWithCustomSchema (batchId=177) org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation (batchId=177) org.apache.hive.jdbc.TestJdbcWithMiniHS2.testHttpRetryOnServerIdleTimeout (batchId=226) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/5972/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/5972/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-5972/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 6 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12876714 - PreCommit-HIVE-Build > Cache UGI for metastore > --- > > Key: HIVE-16979 > URL: https://issues.apache.org/jira/browse/HIVE-16979 > Project: Hive > Issue Type: Improvement >Reporter: Tao Li >Assignee: Tao Li > Attachments: HIVE-16979.1.patch, HIVE-16979.2.patch, > HIVE-16979.3.patch > > > FileSystem.closeAllForUGI is called per request against metastore to dispose > UGI, which involves talking to HDFS name node and is time consuming. So the > perf improvement would be caching and reusing the UGI. > Per FileSystem.closeAllForUG call could take up to 20 ms as E2E latency > against HDFS. Usually a Hive query could result in several calls against > metastore, so we can save up to 50-100 ms per hive query. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-16979) Cache UGI for metastore
[ https://issues.apache.org/jira/browse/HIVE-16979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16068871#comment-16068871 ] Hive QA commented on HIVE-16979: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12874981/HIVE-16979.2.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:red}ERROR:{color} -1 due to 8 failed/errored test(s), 10831 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[tez_smb_main] (batchId=150) org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] (batchId=233) org.apache.hadoop.hive.ql.parse.TestReplicationScenariosAcrossInstances.testBootstrapFunctionReplication (batchId=217) org.apache.hadoop.hive.ql.parse.TestReplicationScenariosAcrossInstances.testCreateFunctionIncrementalReplication (batchId=217) org.apache.hadoop.hive.ql.parse.TestReplicationScenariosAcrossInstances.testCreateFunctionWithFunctionBinaryJarsOnHDFS (batchId=217) org.apache.hive.hcatalog.api.TestHCatClient.testPartitionRegistrationWithCustomSchema (batchId=178) org.apache.hive.hcatalog.api.TestHCatClient.testPartitionSpecRegistrationWithCustomSchema (batchId=178) org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation (batchId=178) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/5832/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/5832/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-5832/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 8 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12874981 - PreCommit-HIVE-Build > Cache UGI for metastore > --- > > Key: HIVE-16979 > URL: https://issues.apache.org/jira/browse/HIVE-16979 > Project: Hive > Issue Type: Improvement >Reporter: Tao Li >Assignee: Tao Li > Attachments: HIVE-16979.1.patch, HIVE-16979.2.patch > > > FileSystem.closeAllForUGI is called per request against metastore to dispose > UGI, which involves talking to HDFS name node and is time consuming. So the > perf improvement would be caching and reusing the UGI. > Per FileSystem.closeAllForUG call could take up to 20 ms as E2E latency > against HDFS. Usually a Hive query could result in several calls against > metastore, so we can save up to 50-100 ms per hive query. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-16979) Cache UGI for metastore
[ https://issues.apache.org/jira/browse/HIVE-16979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16067453#comment-16067453 ] Tao Li commented on HIVE-16979: --- [~gopalv] So the fix is to remove the cache size limit and also switch to last access time for expiration. The purpose is to make sure when we are evicting a cache entry, it's not being used for the last 10 min for example. So there will be no read/write conflicts for the UGI. > Cache UGI for metastore > --- > > Key: HIVE-16979 > URL: https://issues.apache.org/jira/browse/HIVE-16979 > Project: Hive > Issue Type: Improvement >Reporter: Tao Li >Assignee: Tao Li > Attachments: HIVE-16979.1.patch, HIVE-16979.2.patch > > > FileSystem.closeAllForUGI is called per request against metastore to dispose > UGI, which involves talking to HDFS name node and is time consuming. So the > perf improvement would be caching and reusing the UGI. > Per FileSystem.closeAllForUG call could take up to 20 ms as E2E latency > against HDFS. Usually a Hive query could result in several calls against > metastore, so we can save up to 50-100 ms per hive query. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-16979) Cache UGI for metastore
[ https://issues.apache.org/jira/browse/HIVE-16979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16067283#comment-16067283 ] Tao Li commented on HIVE-16979: --- [~gopalv] Good point. Will try to figure out how to avoid that. > Cache UGI for metastore > --- > > Key: HIVE-16979 > URL: https://issues.apache.org/jira/browse/HIVE-16979 > Project: Hive > Issue Type: Improvement >Reporter: Tao Li >Assignee: Tao Li > Attachments: HIVE-16979.1.patch > > > FileSystem.closeAllForUGI is called per request against metastore to dispose > UGI, which involves talking to HDFS name node and is time consuming. So the > perf improvement would be caching and reusing the UGI. > Per FileSystem.closeAllForUG call could take up to 20 ms as E2E latency > against HDFS. Usually a Hive query could result in several calls against > metastore, so we can save up to 50-100 ms per hive query. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-16979) Cache UGI for metastore
[ https://issues.apache.org/jira/browse/HIVE-16979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16065927#comment-16065927 ] Gopal V commented on HIVE-16979: [~taoli-hwx]: Does Guava's cache track the life time of a cache object? Will this code cause a race condition if the cache expires while a query is using this UGI? > Cache UGI for metastore > --- > > Key: HIVE-16979 > URL: https://issues.apache.org/jira/browse/HIVE-16979 > Project: Hive > Issue Type: Improvement >Reporter: Tao Li >Assignee: Tao Li > Attachments: HIVE-16979.1.patch > > > FileSystem.closeAllForUGI is called per request against metastore to dispose > UGI, which involves talking to HDFS name node and is time consuming. So the > perf improvement would be caching and reusing the UGI. > Per FileSystem.closeAllForUG call could take up to 20 ms as E2E latency > against HDFS. Usually a Hive query could result in several calls against > metastore, so we can save up to 50-100 ms per hive query. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-16979) Cache UGI for metastore
[ https://issues.apache.org/jira/browse/HIVE-16979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16065693#comment-16065693 ] Tao Li commented on HIVE-16979: --- cc [~thejas] > Cache UGI for metastore > --- > > Key: HIVE-16979 > URL: https://issues.apache.org/jira/browse/HIVE-16979 > Project: Hive > Issue Type: Improvement >Reporter: Tao Li >Assignee: Tao Li > Attachments: HIVE-16979.1.patch > > > FileSystem.closeAllForUGI is called per request against metastore to dispose > UGI, which involves talking to HDFS name node and is time consuming. So the > perf improvement would be caching and reusing the UGI. > Per FileSystem.closeAllForUG call could take up to 20 ms as E2E latency > against HDFS. Usually a Hive query could result in several calls against > metastore, so we can save up to 50-100 ms per hive query. -- This message was sent by Atlassian JIRA (v6.4.14#64029)