[ https://issues.apache.org/jira/browse/PIG-2927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13479567#comment-13479567 ]
Cheolsoo Park commented on PIG-2927:
------------------------------------

Although I am no Ruby expert, I think that Jonathan's patch works well. Here is my test.

1) Installed a non-trivial rubygem library (rubygem-json) on the client only and confirmed that it is not installed on any datanode in the cluster.
{code}
/usr/lib/ruby/gems/1.8/gems/json-1.4.6/
{code}

2) Wrote a Ruby UDF that parses a JSON string:
{code}
require 'rubygems'
require 'pigudf'
require 'json'

class Myudfs < PigUdf
    outputSchema "result:chararray"
    def parseJson input
        result = JSON.parse(input)
    end
end
{code}

3) Wrote a short Pig script that loads a JSON string and calls my Ruby UDF:
{code}
register 'test.rb' using jruby as myfuncs;
a = load 'json.txt' using PigStorage() as (i:chararray);
b = foreach a generate myfuncs.parseJson(i);
dump b;
{code}

4) Got the expected result as follows:
{code:title=input}
{"id":1,"nested":{"value1":"first1","next":{"complex_record":{"id":2,"nested":{"value1":"second1","next":null,"value2":"second2"}}},"value2":"first2"}}
{code}
{code:title=result}
([id#1,nested#{value1=first1, value2=first2, next={complex_record={id=2, nested={value1=second1, value2=second2, next=null}}}}])
{code}

Without Jonathan's patch, I get the following error in the front-end, as expected:
{code}
LoadError: no such file to load -- json
  require at org/jruby/RubyKernel.java:1042
  require at file:/home/cheolsoo/pig-ruby/build/ivy/lib/Pig/jruby-complete-1.6.7.jar!/META-INF/jruby.home/lib/ruby/site_ruby/1.8/rubygems/custom_require.rb:36
   (root) at test.rb:3
2012-10-18 17:09:24,323 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 2999: Unexpected internal error. (LoadError) no such file to load -- json
{code}

I also ran the "Scripting" e2e test cases with the patch on a Hadoop-1.0.x cluster, and they all passed. So it seems good to commit to me.

Btw, I wanted to write an e2e test case using rubygem-json, but I realized that rubygem-json is under GPL and can't be included in Pig.
We should either find another rubygem library that is under the Apache licence or make the test configurable so that it will run only if rubygem-json is installed.

Thanks!

> SHIP and use JRuby gems in JRuby UDFs
> -------------------------------------
>
>                 Key: PIG-2927
>                 URL: https://issues.apache.org/jira/browse/PIG-2927
>             Project: Pig
>          Issue Type: New Feature
>          Components: parser
>    Affects Versions: 0.11
>         Environment: JRuby UDFs
>            Reporter: Russell Jurney
>            Assignee: Jonathan Coveney
>            Priority: Minor
>             Fix For: 0.11
>
>         Attachments: PIG-2927-0.patch, PIG-2927-1.patch, PIG-2927-2.patch, PIG-2927-3.patch
>
>
> It would be great to use JRuby gems in JRuby UDFs without installing them on all machines on the cluster. Some way to SHIP them automatically with the job would be great.
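
As a rough illustration of the second option raised in the comment (running the test only when the gem is installed), a guard along the following lines might be used. This is only a sketch: the helper name `gem_available?` is hypothetical and is not part of Pig or its e2e test harness, and how such a check would be wired into the harness is left open.

```ruby
require 'rubygems' # needed on Ruby 1.8-era runtimes; harmless on newer ones

# Hypothetical helper (not part of Pig): returns true if the named
# library can be loaded on this machine, false otherwise.
def gem_available?(name)
  require name
  true
rescue LoadError
  false
end

# Assumed usage: decide whether the JSON-based UDF test should run.
if gem_available?('json')
  puts 'json is installed: the JSON UDF test can run'
else
  puts 'json is not installed: skipping the JSON UDF test'
end
```

The check relies on the same `LoadError` that appears in the front-end error above when the gem is missing, so it is carried out on the client where the script is registered.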