[ https://issues.apache.org/jira/browse/PIG-5338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16463168#comment-16463168 ]
Rohini Palaniswamy commented on PIG-5338: ----------------------------------------- A really good idea. Patch looks good. Couple of comments JythonBag.java: 1) Can you rename variables getIndex and getIterator to currIndex and iterator. 2) In list___contains__(PyObject o) method, JythonUtils.pythonToPig(o) can be done once outside the loop. 3) getIterator.hasNext() is redundant in get() as we already check size() before. If the size and index calculation actually does not match and hasNext is false, then something is not right and better to get an error in that case. JythonBagInterceptor: 1) If JythonBag does not have the method, you are performing a deep copy. But shouldn't the invocation be on the PyList after that? Why call methods on the PyList only the next time? How does it work. {code} if(!jythonBagMethods.contains(methodInvocation.getMethod())){ 48 pyList = ((JythonBag)methodInvocation.getThis()).toPyList(); 49 } 50 return methodInvocation.proceed(); {code} TestScriptUDF.java: 1) Can you use assertEquals instead of assertTrue and also hardcode value of wordCount instead of computing it every time. {code} Assert.assertTrue(((DataBag) t.get(0)).size() == wordCount); {code} > Prevent deep copy of DataBag into Jython List > --------------------------------------------- > > Key: PIG-5338 > URL: https://issues.apache.org/jira/browse/PIG-5338 > Project: Pig > Issue Type: Improvement > Reporter: Greg Phillips > Assignee: Greg Phillips > Priority: Major > Attachments: PIG-5338.001.patch, PIG-5338.patch > > > Pig Python UDFs currently perform deep copies on Bags converting them into > Jython PyLists. This can cause Jython UDFs to run out of memory and fail. A > Jython DataBag which extends PyList could allow for iterative access to > DataBag elements, while only performing a deep copy when necessary. -- This message was sent by Atlassian JIRA (v7.6.3#76005)