Can you please share a snippet showing how you are using these UDFs?

On Fri, Jun 8, 2012 at 6:01 AM, Cheolsoo Park <[email protected]> wrote:

> Hello,
>
> I checked out branch-0.10, and I am trying to run the e2e RubyUDFs tests in
> MR mode, but I am getting the following error:
>
> > java.lang.IllegalStateException: *Could not initialize interpreter (from file system or classpath) with /home/cheolsoo/pig-0.10/test/e2e/pig/testdist/libexec/ruby/scriptingudfs.rb*
> >         at org.apache.pig.scripting.ScriptEngine.getScriptAsStream(ScriptEngine.java:145)
> >         at org.apache.pig.scripting.jruby.JrubyScriptEngine$RubyFunctions.getFromCache(JrubyScriptEngine.java:104)
> >         at org.apache.pig.scripting.jruby.JrubyScriptEngine$RubyFunctions.getFunctions(JrubyScriptEngine.java:120)
> >         at org.apache.pig.scripting.jruby.JrubyEvalFunc.initialize(JrubyEvalFunc.java:87)
> >         at org.apache.pig.scripting.jruby.JrubyEvalFunc.exec(JrubyEvalFunc.java:103)
> >         at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNext(POUserFunc.java:216)
> >         at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNext(POUserFunc.java:263)
> >         at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.getNext(PhysicalOperator.java:328)
>
>
> Looking at the source code (ScriptEngine.java), it appears that
> scriptingudfs.rb should be loaded from the classpath when it does not exist
> on the file system:
>
> >         if (file.exists()) {
> >             try {
> >                 is = new FileInputStream(file);
> >             } catch (FileNotFoundException e) {
> >                 throw new IllegalStateException("could not find existing file "+scriptPath, e);
> >             }
> >         } else {
> >             if (file.isAbsolute()) {
> >                 *is = ScriptEngine.class.getResourceAsStream(scriptPath);*
> >             } else {
> >                 is = ScriptEngine.class.getResourceAsStream("/" + scriptPath);
> >             }
> >         }
>
>
> Now I looked at the Job jar generated by Pig and found that
> scriptingudfs.rb indeed exists in that jar:
>
> > cheolsoo@localhost:~/workspace/pig-cheolsoo $jar tvf Job9203441412304345930.jar | grep scriptingudfs.rb
> >   2491 Thu Jun 07 14:42:44 PDT 2012 */home/cheolsoo/pig-0.10/test/e2e/pig/testdist/scriptingudfs.rb*
>
>
> Since scriptingudfs.rb is inside the Job jar, I imagine that
> getResourceAsStream() should be able to find it, but apparently it doesn't.
>
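One thing that may be worth checking in the situation described above is whether the name handed to getResourceAsStream() matches the jar entry name exactly. Class.getResourceAsStream() is documented to drop a leading "/" before delegating to the class loader, so an entry stored with a leading "/" might not match. The sketch below is only an illustration under that assumption, not a diagnosis of the Pig failure: the class name ResourceLookupSketch, its helper methods, and the path /home/user/scripts/udfs.rb are all hypothetical.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.charset.StandardCharsets;
import java.util.jar.JarEntry;
import java.util.jar.JarOutputStream;

// Hypothetical helper class; nothing here is part of Pig.
class ResourceLookupSketch {

    // Writes a temporary jar that contains a single entry with the given name.
    static File writeJar(String entryName) throws IOException {
        File jar = File.createTempFile("lookup-sketch", ".jar");
        jar.deleteOnExit();
        try (JarOutputStream out = new JarOutputStream(new FileOutputStream(jar))) {
            out.putNextEntry(new JarEntry(entryName));
            out.write("# placeholder script\n".getBytes(StandardCharsets.UTF_8));
            out.closeEntry();
        }
        return jar;
    }

    // Mirrors the documented rule of Class.getResourceAsStream():
    // a leading "/" is removed before the class loader is asked.
    static String toLoaderName(String name) {
        return name.startsWith("/") ? name.substring(1) : name;
    }

    // Returns true if a class loader that sees only this jar resolves the name.
    static boolean canLoad(File jar, String classResourceName) throws IOException {
        URL[] urls = { jar.toURI().toURL() };
        try (URLClassLoader loader = new URLClassLoader(urls, null);
             InputStream is = loader.getResourceAsStream(toLoaderName(classResourceName))) {
            return is != null;
        }
    }

    public static void main(String[] args) throws IOException {
        String requested = "/home/user/scripts/udfs.rb"; // made-up absolute script path
        // Entry stored without a leading slash.
        System.out.println(canLoad(writeJar("home/user/scripts/udfs.rb"), requested));
        // Entry stored with a leading slash, as the jar listing above seems to show.
        System.out.println(canLoad(writeJar("/home/user/scripts/udfs.rb"), requested));
    }
}
```

Comparing the two results against the real job jar could show whether the entry name is a factor. The directory part of the entry is also worth comparing with the path in the exception, since the two differ in the output quoted above.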
> I am wondering whether anyone has been able to run these tests in MR mode
> and could give me some pointers. Any help would be appreciated!
>
> Thanks,
> Cheolsoo
>
> P.S. The test works fine in local mode, which is not surprising
> since scriptingudfs.rb would be found via the file system. I also see a
> similar issue with the e2e Jython tests, where Jython scripts are not found
> and the following error is reported:
>
> > 2012-06-05 22:44:19,491 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Failed!
> > 2012-06-05 22:44:19,513 [main] ERROR org.apache.pig.tools.grunt.GruntParser - ERROR 2997: Unable to recreate exception from backed error: java.io.IOException: Deserialization error: could not instantiate 'org.apache.pig.scripting.jython.JythonFunction' with arguments '[/home/cheolsoo/pig-0.10/test/e2e/pig/testdist/libexec/python/scriptingudf.py, square]'
>
