[ https://issues.apache.org/jira/browse/HIVE-21971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Rajesh Balamohan updated HIVE-21971:
------------------------------------
    Description:
https://issues.apache.org/jira/browse/HIVE-10329 helped in moving away from hadoop's ReflectionUtils constructor cache issue (https://issues.apache.org/jira/browse/HADOOP-10513).

However, there are corner cases where hadoop's {{ReflectionUtils}} is still in use, and this causes a gradual build-up of memory in HS2.

I have observed this in Hive 2.3, but the codepath in master for this has not changed much.

The easiest way to repro is to add a temp function which extends {{GenericUDF}}. In {{FunctionRegistry::cloneGenericUDF}}, this ends up using {{org.apache.hadoop.util.ReflectionUtils.newInstance}}, which in turn lands in the CONSTRUCTOR_CACHE of ReflectionUtils.

{noformat}
CREATE TEMPORARY FUNCTION dummy AS 'com.hive.test.DummyGenericUDF' USING JAR 'file:///home/test/udf/dummy.jar';

select dummy();

	at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:107)
	at org.apache.hadoop.hive.ql.exec.FunctionRegistry.cloneGenericUDF(FunctionRegistry.java:1353)
	at org.apache.hadoop.hive.ql.exec.FunctionInfo.getGenericUDF(FunctionInfo.java:122)
	at org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory$DefaultExprProcessor.getXpathOrFuncExprNodeDesc(TypeCheckProcFactory.java:983)
	at org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory$DefaultExprProcessor.process(TypeCheckProcFactory.java:1359)
	at org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:90)
	at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatchAndReturn(DefaultGraphWalker.java:105)
	at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:89)
	at org.apache.hadoop.hive.ql.lib.ExpressionWalker.walk(ExpressionWalker.java:76)
	at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:120)
{noformat}

Note: Reflection based invocation of hadoop's {{ReflectionUtils::clear}} was removed in 2.x.
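The retention mechanism described above can be sketched without Hive or Hadoop on the classpath. This is a minimal illustration under stated assumptions, not the actual Hadoop code: {{CACHE}} stands in for the static, strongly keyed CONSTRUCTOR_CACHE, a throwaway {{URLClassLoader}} stands in for the per-session loader created for the UDF JAR, and a dynamic proxy class stands in for the UDF class (only because it can be defined in that loader without a JAR on disk). The class and method names are hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.lang.reflect.Constructor;
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.net.URL;
import java.net.URLClassLoader;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class ConstructorCacheLeakSketch {

    // Stand-in for a static constructor cache: strong Class keys, never evicted.
    public static final Map<Class<?>, Constructor<?>> CACHE = new ConcurrentHashMap<>();

    // Looks up a constructor once per class and keeps it (and the class) forever.
    public static Constructor<?> cachedConstructor(Class<?> clazz) {
        return CACHE.computeIfAbsent(clazz, c -> {
            try {
                // Proxy classes expose one public constructor taking an InvocationHandler;
                // a real UDF class would use its no-arg constructor instead.
                return c.getConstructor(InvocationHandler.class);
            } catch (NoSuchMethodException e) {
                throw new IllegalStateException(e);
            }
        });
    }

    // Simulates one session: create a loader, define a class in it, hit the cache, close the loader.
    public static Class<?> simulateSession() {
        ClassLoader parent = ConstructorCacheLeakSketch.class.getClassLoader();
        try (URLClassLoader sessionLoader = new URLClassLoader(new URL[0], parent)) {
            InvocationHandler noop = (proxy, method, args) -> null;
            // The proxy class is defined by sessionLoader, like a UDF class loaded from a session JAR.
            Object udf = Proxy.newProxyInstance(sessionLoader, new Class<?>[] {Runnable.class}, noop);
            Class<?> udfClass = udf.getClass();
            cachedConstructor(udfClass);
            return udfClass;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Counts the distinct class loaders that remain strongly reachable through the cache keys.
    public static long retainedLoaders() {
        return CACHE.keySet().stream().map(Class::getClassLoader).distinct().count();
    }

    public static void main(String[] args) {
        for (int i = 0; i < 3; i++) {
            simulateSession();
        }
        System.out.println("cached classes: " + CACHE.size());
        System.out.println("class loaders still reachable from the cache: " + retainedLoaders());
    }
}
```

Each simulated session closes its loader, yet the cache still holds the session's class, and a class keeps a strong reference to its defining loader. In a long-running process this is how closed sessions can keep accumulating memory unless the cache is cleared or the class is instantiated through a path that does not populate it.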
> HS2 leaks classloader due to `ReflectionUtils::CONSTRUCTOR_CACHE` with temporary functions + GenericUDF
> -----------------------------------------------------------------------------------------------------
>
>                 Key: HIVE-21971
>                 URL: https://issues.apache.org/jira/browse/HIVE-21971
>             Project: Hive
>          Issue Type: Bug
>          Components: HiveServer2
>    Affects Versions: 2.3.4
>            Reporter: Rajesh Balamohan
>            Priority: Critical
>
> https://issues.apache.org/jira/browse/HIVE-10329 helped in moving away from
> hadoop's ReflectionUtils constructor cache issue
> (https://issues.apache.org/jira/browse/HADOOP-10513).
> However, there are corner cases where hadoop's {{ReflectionUtils}} is still in use,
> and this causes a gradual build-up of memory in HS2.
> I have observed this in Hive 2.3, but the codepath in master for this has not
> changed much.
> The easiest way to repro is to add a temp function which extends
> {{GenericUDF}}. In {{FunctionRegistry::cloneGenericUDF}}, this ends up
> using {{org.apache.hadoop.util.ReflectionUtils.newInstance}}, which in
> turn lands in the CONSTRUCTOR_CACHE of ReflectionUtils.
> {noformat}
> CREATE TEMPORARY FUNCTION dummy AS 'com.hive.test.DummyGenericUDF' USING JAR 'file:///home/test/udf/dummy.jar';
> select dummy();
> 	at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:107)
> 	at org.apache.hadoop.hive.ql.exec.FunctionRegistry.cloneGenericUDF(FunctionRegistry.java:1353)
> 	at org.apache.hadoop.hive.ql.exec.FunctionInfo.getGenericUDF(FunctionInfo.java:122)
> 	at org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory$DefaultExprProcessor.getXpathOrFuncExprNodeDesc(TypeCheckProcFactory.java:983)
> 	at org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory$DefaultExprProcessor.process(TypeCheckProcFactory.java:1359)
> 	at org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:90)
> 	at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatchAndReturn(DefaultGraphWalker.java:105)
> 	at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:89)
> 	at org.apache.hadoop.hive.ql.lib.ExpressionWalker.walk(ExpressionWalker.java:76)
> 	at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:120)
> {noformat}
> Note: Reflection based invocation of hadoop's {{ReflectionUtils::clear}} was
> removed in 2.x.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)