[ https://issues.apache.org/jira/browse/PIG-5338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16446630#comment-16446630 ]
Koji Noguchi commented on PIG-5338:
-----------------------------------

Thanks Greg, Adam.

bq. although we'll also need to run (Scripting) e2e tests for verification.

Good idea. Blindly running e2e with the patch, getting two failures.
Scripting.Scripting_5 and Scripting.Scripting_9

Pasting the error message.
{noformat}
2018-04-20 18:38:51,316 [main] ERROR org.apache.pig.tools.pigstats.PigStats - ERROR 0: org.apache.pig.backend.executionengine.ExecException: ERROR 2997: Unable to recreate exception from backed error: Error: org.apache.pig.backend.executionengine.ExecException: ERROR 0: Exception while executing (Name: c: New For Each(false,false,false)[bag] - scope-21 Operator Key: scope-21): org.apache.pig.backend.executionengine.ExecException: ERROR 2078: Caught error from UDF: org.apache.pig.scripting.jython.JythonFunction [Error executing function]
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:315)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNextTuple(POForEach.java:260)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.runPipeline(PigGenericMapBase.java:280)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:275)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:65)
    at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:146)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:175)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1949)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:169)
Caused by: org.apache.pig.backend.executionengine.ExecException: ERROR 2078: Caught error from UDF: org.apache.pig.scripting.jython.JythonFunction [Error executing function]
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNext(POUserFunc.java:358)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNextTuple(POUserFunc.java:369)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.getNext(PhysicalOperator.java:359)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.processPlan(POForEach.java:408)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNextTuple(POForEach.java:325)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:305)
    ... 12 more
Caused by: java.io.IOException: Error executing function
    at org.apache.pig.scripting.jython.JythonFunction.exec(JythonFunction.java:122)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNext(POUserFunc.java:330)
    ... 17 more
Caused by: com.google.inject.ConfigurationException: Guice configuration errors:

1) Unable to method intercept: org.apache.pig.scripting.jython.JythonBag
  while locating org.apache.pig.scripting.jython.JythonBag

1 error
    at com.google.inject.internal.InjectorImpl.getProvider(InjectorImpl.java:1004)
    at com.google.inject.internal.InjectorImpl.getProvider(InjectorImpl.java:961)
    at com.google.inject.internal.InjectorImpl.getInstance(InjectorImpl.java:1013)
    at org.apache.pig.scripting.jython.JythonUtils.pigToPython(JythonUtils.java:133)
    at org.apache.pig.scripting.jython.JythonUtils.pigTupleToPyTuple(JythonUtils.java:153)
    at org.apache.pig.scripting.jython.JythonFunction.exec(JythonFunction.java:116)
    ... 18 more
Caused by: java.lang.IllegalArgumentException: Cannot subclass final class class org.apache.pig.scripting.jython.JythonBag
    at com.google.inject.internal.cglib.proxy.$Enhancer.generateClass(Enhancer.java:446)
    at com.google.inject.internal.cglib.core.$DefaultGeneratorStrategy.generate(DefaultGeneratorStrategy.java:25)
    at com.google.inject.internal.cglib.core.$AbstractClassGenerator.create(AbstractClassGenerator.java:216)
    at com.google.inject.internal.cglib.proxy.$Enhancer.createHelper(Enhancer.java:377)
    at com.google.inject.internal.cglib.proxy.$Enhancer.createClass(Enhancer.java:317)
    at com.google.inject.internal.ProxyFactory$ProxyConstructor._init_(ProxyFactory.java:246)
    at com.google.inject.internal.ProxyFactory.create(ProxyFactory.java:172)
    at com.google.inject.internal.ConstructorInjectorStore.createConstructor(ConstructorInjectorStore.java:89)
    at com.google.inject.internal.ConstructorInjectorStore.access$000(ConstructorInjectorStore.java:28)
    at com.google.inject.internal.ConstructorInjectorStore$1.create(ConstructorInjectorStore.java:36)
    at com.google.inject.internal.ConstructorInjectorStore$1.create(ConstructorInjectorStore.java:32)
    at com.google.inject.internal.FailableCache$1.apply(FailableCache.java:39)
    at com.google.inject.internal.util.$MapMaker$StrategyImpl.compute(MapMaker.java:549)
    at com.google.inject.internal.util.$MapMaker$StrategyImpl.compute(MapMaker.java:419)
    at com.google.inject.internal.util.$CustomConcurrentHashMap$ComputingImpl.get(CustomConcurrentHashMap.java:2041)
    at com.google.inject.internal.FailableCache.get(FailableCache.java:50)
    at com.google.inject.internal.ConstructorInjectorStore.get(ConstructorInjectorStore.java:49)
    at com.google.inject.internal.ConstructorBindingImpl.initialize(ConstructorBindingImpl.java:125)
    at com.google.inject.internal.InjectorImpl.initializeJitBinding(InjectorImpl.java:521)
    at com.google.inject.internal.InjectorImpl.createJustInTimeBinding(InjectorImpl.java:847)
    at com.google.inject.internal.InjectorImpl.createJustInTimeBindingRecursive(InjectorImpl.java:772)
    at com.google.inject.internal.InjectorImpl.getJustInTimeBinding(InjectorImpl.java:256)
    at com.google.inject.internal.InjectorImpl.getBindingOrThrow(InjectorImpl.java:205)
    at com.google.inject.internal.InjectorImpl.getInternalFactory(InjectorImpl.java:853)
    at com.google.inject.internal.InjectorImpl.getProviderOrThrow(InjectorImpl.java:967)
    at com.google.inject.internal.InjectorImpl.getProvider(InjectorImpl.java:1000)
    ... 23 more
{noformat}

> Prevent deep copy of DataBag into Jython List
> ---------------------------------------------
>
>                 Key: PIG-5338
>                 URL: https://issues.apache.org/jira/browse/PIG-5338
>             Project: Pig
>          Issue Type: Improvement
>            Reporter: Greg Phillips
>            Assignee: Greg Phillips
>            Priority: Major
>         Attachments: PIG-5338.patch
>
>
> Pig Python UDFs currently perform deep copies on Bags converting them into
> Jython PyLists. This can cause Jython UDFs to run out of memory and fail. A
> Jython DataBag which extends PyList could allow for iterative access to
> DataBag elements, while only performing a deep copy when necessary.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)