[ https://issues.apache.org/jira/browse/PIG-4913?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15764377#comment-15764377 ]
Adam Szita commented on PIG-4913:
---------------------------------

I propose (see [^PIG-4913.patch]) that we cache the script content and reuse it whenever required, instead of reading it each time with *ScriptEngine#getScriptAsStream*. As far as I can see, a similar idea already existed but was not fully implemented, through *JythonScriptEngine$Interpreter#filesLoaded*, so I basically extended this notion.

Let me know what you think, [~rohini]. If you agree with the approach, we could look at the above-mentioned issue here next.

> Reduce jython function initiation during compilation
> ----------------------------------------------------
>
>                 Key: PIG-4913
>                 URL: https://issues.apache.org/jira/browse/PIG-4913
>             Project: Pig
>          Issue Type: Improvement
>            Reporter: Rohini Palaniswamy
>            Assignee: Adam Szita
>         Attachments: PIG-4913.patch
>
>
> While investigating PIG-4908, saw that ScriptEngine.getScriptAsStream was
> invoked way too many times during compilation phase for a simple script.
> {code:title=sleep.py}
> #!/usr/bin/python
> import time;
> @outputSchema("sltime:int")
> def sleep(num):
>     if num == 1:
>         print "Sleeping for %d minutes" % num;
>         time.sleep(num * 60);
>     return num;
> {code}
> {code:title=sleep.pig}
> register 'sleep.py' using jython;
> A = LOAD '/tmp/sleepdata' as (f1:int);
> B = FOREACH A generate $0, sleep($0);
> STORE B into '/tmp/tezout';
> {code}

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
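
To illustrate the caching idea proposed in the comment above, here is a minimal sketch in Python. It is not the code from PIG-4913.patch (Pig's script engine is written in Java and the patch is not reproduced here). The names `ScriptCache`, `get_script_as_stream`, and `disk_reads` are hypothetical and exist only for this example. The sketch assumes script files do not change while the cache is alive.

```python
import io


class ScriptCache:
    """Hypothetical cache: read each script file once, then serve it from memory."""

    def __init__(self):
        self._content = {}   # maps script path -> raw bytes of the script
        self.disk_reads = 0  # counts real file reads, for illustration only

    def get_script_as_stream(self, path):
        """Return a fresh readable stream over the script's content."""
        data = self._content.get(path)
        if data is None:
            # First request for this path: read from disk and remember the bytes.
            with open(path, "rb") as f:
                data = f.read()
            self._content[path] = data
            self.disk_reads += 1
        # Each caller gets an independent stream positioned at the start.
        return io.BytesIO(data)
```

Caching the bytes rather than an open stream lets every caller read from the beginning without affecting other callers, so repeated lookups during compilation would touch the file system only once per script.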