Re: FYI: MetaStore running out of threads
> In hive, FileUtils.checkFileAccessWithImpersonation can be fixed to use
> create UGI once to reduce the impact (suspecting this will have 50%
> impact).

I looked closely at the implementation of FileUtils.checkFileAccessWithImpersonation. It doesn't make two connections, so the 50% estimate may not apply here.

On Thu, Sep 1, 2022 at 4:48 AM Rajesh Balamohan wrote:

> W.r.t to connection reuse issues, LLAP had a similar issue (not in HMS)
> https://issues.apache.org/jira/browse/HIVE-16020. It was making a
> connection in every task and UGI had to be persisted in the QueryInfo level
> to reduce the impact.
>
> In hive, FileUtils.checkFileAccessWithImpersonation can be fixed to use
> create UGI once to reduce the impact (suspecting this will have 50%
> impact).
>
> https://github.com/apache/hive/blob/master/common/src/java/org/apache/hadoop/hive/common/FileUtils.java#L418
>
> https://github.com/apache/hive/blob/d06957f254e026e719f30027d161264be43386b0/common/src/java/org/apache/hadoop/hive/common/FileUtils.java#L461
>
> May have to explore whether a local cache with expiry in FileUtils can
> help reduce the impact further.
>
> ~Rajesh.B
Re: FYI: MetaStore running out of threads
W.r.t. connection reuse issues, LLAP had a similar issue (not in HMS): https://issues.apache.org/jira/browse/HIVE-16020. It was making a connection in every task, and the UGI had to be persisted at the QueryInfo level to reduce the impact.

In Hive, FileUtils.checkFileAccessWithImpersonation can be fixed to create the UGI once to reduce the impact (I suspect this would reduce it by roughly 50%).

https://github.com/apache/hive/blob/master/common/src/java/org/apache/hadoop/hive/common/FileUtils.java#L418

https://github.com/apache/hive/blob/d06957f254e026e719f30027d161264be43386b0/common/src/java/org/apache/hadoop/hive/common/FileUtils.java#L461

We may have to explore whether a local cache with expiry in FileUtils can help reduce the impact further.

~Rajesh.B

On Thu, Sep 1, 2022 at 1:24 AM Owen O'Malley wrote:

> We're using HMS with Storage-Based Authorization and have been having
> trouble with the HMS running out of threads. Looking at the jstack & code,
> it appears to that the problem is that RPC's ConnectionId is using UGI's
> equal/hash, which uses the Subject's Object equals/hash. Proxy user UGI's
> always create a new Subject and thus are always unique.
>
> This leads to the HMS creating too many threads. I've created a jira in
> Hadoop. https://issues.apache.org/jira/browse/HADOOP-18434
>
> Thanks,
> Owen
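
For what it's worth, the "local cache with expiry" idea from Rajesh's message could look roughly like the sketch below. This is only an illustration, not Hive code: `ExpiringCache`, its `get` method, the TTL value and the injected clock are all hypothetical names chosen for the example, and a plain `Object` stands in for the cached per-user UGI.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.Function;
import java.util.function.LongSupplier;

// Hypothetical per-key cache with time-based expiry (not part of Hive).
final class ExpiringCache<K, V> {
    private static final class Entry<T> {
        final T value;
        final long expiresAtMillis;

        Entry(T value, long expiresAtMillis) {
            this.value = value;
            this.expiresAtMillis = expiresAtMillis;
        }
    }

    private final ConcurrentHashMap<K, Entry<V>> entries = new ConcurrentHashMap<>();
    private final long ttlMillis;
    private final LongSupplier clock; // injected so expiry can be tested without sleeping

    ExpiringCache(long ttlMillis, LongSupplier clock) {
        this.ttlMillis = ttlMillis;
        this.clock = clock;
    }

    // Returns the cached value for the key, calling the loader only when the
    // entry is missing or its time-to-live has elapsed.
    V get(K key, Function<K, V> loader) {
        long now = clock.getAsLong();
        Entry<V> entry = entries.compute(key, (k, old) ->
                (old == null || old.expiresAtMillis <= now)
                        ? new Entry<>(loader.apply(k), now + ttlMillis)
                        : old);
        return entry.value;
    }
}

class ExpiringCacheDemo {
    public static void main(String[] args) {
        AtomicLong now = new AtomicLong(0); // fake clock, in milliseconds
        ExpiringCache<String, Object> cache = new ExpiringCache<>(1_000, now::get);

        Object first = cache.get("alice", user -> new Object());
        Object second = cache.get("alice", user -> new Object());
        System.out.println(first == second); // prints true: same object reused

        now.set(1_000); // advance the fake clock to the expiry time
        Object third = cache.get("alice", user -> new Object());
        System.out.println(first == third); // prints false: entry was reloaded
    }
}
```

Reusing one object per user within the TTL is the point here, since the same object compares equal to itself. A real version would also need to decide when to release any resources tied to an evicted UGI, which this sketch leaves out.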
FYI: MetaStore running out of threads
We're using HMS with Storage-Based Authorization and have been having trouble with the HMS running out of threads. Looking at the jstack and the code, it appears that the problem is that RPC's ConnectionId uses UGI's equals/hash, which in turn uses the Subject's Object (identity-based) equals/hash. Proxy-user UGIs always create a new Subject and thus are always unique.

This leads to the HMS creating too many threads. I've filed a Jira in Hadoop: https://issues.apache.org/jira/browse/HADOOP-18434

Thanks,
Owen
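
To make the mechanism above concrete, here is a minimal self-contained model of it. It does not use the Hadoop classes: `ProxyUserModel`, `ConnectionKeyModel` and `ConnectionReuseDemo` are hypothetical stand-ins, and the address string is made up. The only assumption carried over from the message is that the user object compares by identity of a freshly created subject and that the connection key includes the user.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical stand-in for a proxy-user UGI. Equality is by identity of the
// wrapped subject object, which is the behavior the message describes.
final class ProxyUserModel {
    private final String userName;               // deliberately not used in equals/hashCode
    private final Object subject = new Object(); // fresh per instance, like a new Subject

    ProxyUserModel(String userName) { this.userName = userName; }

    String userName() { return userName; }

    @Override public boolean equals(Object o) {
        return o instanceof ProxyUserModel && ((ProxyUserModel) o).subject == subject;
    }

    @Override public int hashCode() { return System.identityHashCode(subject); }
}

// Hypothetical stand-in for the RPC connection key: remote address plus user.
final class ConnectionKeyModel {
    private final String address;
    private final ProxyUserModel user;

    ConnectionKeyModel(String address, ProxyUserModel user) {
        this.address = address;
        this.user = user;
    }

    @Override public boolean equals(Object o) {
        if (!(o instanceof ConnectionKeyModel)) return false;
        ConnectionKeyModel other = (ConnectionKeyModel) o;
        return address.equals(other.address) && user.equals(other.user);
    }

    @Override public int hashCode() { return Objects.hash(address, user); }
}

class ConnectionReuseDemo {
    // Simulates `calls` requests for the same user and address and returns how
    // many distinct entries the connection table ends up holding.
    static int connectionsCreated(int calls, boolean reuseUserObject) {
        Map<ConnectionKeyModel, Integer> connections = new HashMap<>();
        ProxyUserModel shared = new ProxyUserModel("alice");
        for (int i = 0; i < calls; i++) {
            ProxyUserModel user = reuseUserObject ? shared : new ProxyUserModel("alice");
            connections.putIfAbsent(new ConnectionKeyModel("namenode:8020", user), i);
        }
        return connections.size();
    }

    public static void main(String[] args) {
        System.out.println(connectionsCreated(100, false)); // prints 100
        System.out.println(connectionsCreated(100, true));  // prints 1
    }
}
```

In this model, a new user object per call means no key ever matches an existing entry, so the table grows by one per call even though the user name and address never change. Reusing the same user object collapses it to a single entry.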