Hi Subir,

Thanks for asking. I have since found the cause of the issue and filed a JIRA: https://issues.apache.org/jira/browse/PIG-2745. Please see the JIRA for details.

Cheolsoo

On Sat, Jun 9, 2012 at 5:42 AM, Subir S <[email protected]> wrote:

> Can you please share a snippet showing how you are using these UDFs?
>
> On Fri, Jun 8, 2012 at 6:01 AM, Cheolsoo Park <[email protected]> wrote:
>
> > Hello,
> >
> > I checked out branch-0.10, and I am trying to run the e2e RubyUDFs tests
> > in MR mode, but I am getting the following error:
> >
> > > java.lang.IllegalStateException: *Could not initialize interpreter (from
> > > file system or classpath) with
> > > /home/cheolsoo/pig-0.10/test/e2e/pig/testdist/libexec/ruby/scriptingudfs.rb*
> > >     at org.apache.pig.scripting.ScriptEngine.getScriptAsStream(ScriptEngine.java:145)
> > >     at org.apache.pig.scripting.jruby.JrubyScriptEngine$RubyFunctions.getFromCache(JrubyScriptEngine.java:104)
> > >     at org.apache.pig.scripting.jruby.JrubyScriptEngine$RubyFunctions.getFunctions(JrubyScriptEngine.java:120)
> > >     at org.apache.pig.scripting.jruby.JrubyEvalFunc.initialize(JrubyEvalFunc.java:87)
> > >     at org.apache.pig.scripting.jruby.JrubyEvalFunc.exec(JrubyEvalFunc.java:103)
> > >     at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNext(POUserFunc.java:216)
> > >     at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNext(POUserFunc.java:263)
> > >     at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.getNext(PhysicalOperator.java:328)
> >
> > Looking at the source code (ScriptEngine.java), I found that
> > scriptingudfs.rb should be found via the classpath:
> >
> > > if (file.exists()) {
> > >     try {
> > >         is = new FileInputStream(file);
> > >     } catch (FileNotFoundException e) {
> > >         throw new IllegalStateException("could not find existing file "+scriptPath, e);
> > >     }
> > > } else {
> > >     if (file.isAbsolute()) {
> > >         *is = ScriptEngine.class.getResourceAsStream(scriptPath);*
> > >     } else {
> > >         is = ScriptEngine.class.getResourceAsStream("/" + scriptPath);
> > >     }
> > > }
> >
> > I then looked at the Job jar generated by Pig and found that
> > scriptingudfs.rb does indeed exist in that jar:
> >
> > > cheolsoo@localhost:~/workspace/pig-cheolsoo $ jar tvf Job9203441412304345930.jar | grep scriptingudfs.rb
> > > 2491 Thu Jun 07 14:42:44 PDT 2012 */home/cheolsoo/pig-0.10/test/e2e/pig/testdist/scriptingudfs.rb*
> >
> > Since scriptingudfs.rb is inside the Job jar, I would expect
> > getResourceAsStream() to find it, but apparently it does not.
> >
> > I am wondering if anyone has been able to run these tests in MR mode and
> > could give me some pointers. Any help would be appreciated!
> >
> > Thanks,
> > Cheolsoo
> >
> > p.s. The test works fine in local mode, which is not surprising since
> > scriptingudfs.rb would be found via the file system. I also see a similar
> > issue with the e2e Jython tests, where Jython scripts are not found, with
> > the following error:
> >
> > > 2012-06-05 22:44:19,491 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Failed!
> > > 2012-06-05 22:44:19,513 [main] ERROR org.apache.pig.tools.grunt.GruntParser - ERROR 2997: Unable to recreate exception from backed error: java.io.IOException: Deserialization error: could not instantiate 'org.apache.pig.scripting.jython.JythonFunction' with arguments '[/home/cheolsoo/pig-0.10/test/e2e/pig/testdist/libexec/python/scriptingudf.py, square]'
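
For anyone who wants to poke at this outside Pig, here is a minimal sketch of the lookup order quoted above (file system first, then classpath). The class name ScriptLookupSketch and the describe() helper are hypothetical and not part of Pig. The sketch only mirrors the quoted branch structure and is not meant to show the root cause. One documented detail worth keeping in mind: Class.getResourceAsStream() strips the leading "/" from an absolute name before delegating to the class loader, so an absolute script path is looked up as a classpath entry without that leading slash.

```java
import java.io.File;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical stand-alone sketch; it is not Pig code.
final class ScriptLookupSketch {

    // Reports where a script path would be resolved, using the same order as
    // the quoted snippet: the file system first, then the classpath.
    static String describe(String scriptPath) {
        File file = new File(scriptPath);
        if (file.exists()) {
            return "file system";
        }
        // Same branch as the quoted code: an absolute path is passed through
        // unchanged, a relative one is prefixed with "/" so that it is resolved
        // from the classpath root rather than relative to this class's package.
        String resource = file.isAbsolute() ? scriptPath : "/" + scriptPath;
        try (InputStream is = ScriptLookupSketch.class.getResourceAsStream(resource)) {
            return is != null ? "classpath" : "not found";
        } catch (IOException e) {
            // Only close() can throw here; treat it as a failed lookup.
            return "not found";
        }
    }

    public static void main(String[] args) {
        // Pass one or more script paths to see how each would be resolved.
        for (String path : args) {
            System.out.println(path + " -> " + describe(path));
        }
    }
}
```

Running it on a task node's classpath with the script path from the stack trace as the argument should show whether the classpath branch can see the entry at all.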
