Re: Using a UDF written in Python

2010-12-29 Thread Jonathan Coveney
Ok, I guess I'm just not used to these sorts of situations where the dependencies get so hairy. 1) For a simple UDF, what dependencies are these that need to be included? 2) Is there a semi-easy way to clean this up? Thanks for your patience. I really am new to the whole dependencies game 2010/12

Re: Using a UDF written in Python

2010-12-29 Thread Dmitriy Ryaboy
All the dependencies have to be on the classpath, including the dependencies' dependencies... D On Wed, Dec 29, 2010 at 3:12 PM, Jonathan Coveney wrote: > Also, just in general, does EVERY UDF we want to load have to be added to > the classpath when you call pig? And just the .jar/.py file, or

Re: Using a UDF written in Python

2010-12-29 Thread Jonathan Coveney
Also, just in general, does EVERY UDF we want to load have to be added to the classpath when you call pig? And just the .jar/.py file, or more than that? 2010/12/29 Jonathan Coveney > Haha gotcha, I am not the greatest at all this package management. I think > we are getting close though... I ad

Re: Using a UDF written in Python

2010-12-29 Thread Jonathan Coveney
Haha gotcha, I am not the greatest at all this package management. I think we are getting close though... I added jython.jar, as well as my test.py file, and here is what I got when I ran it *sys-package-mgr*: processing new jar, '/home/jcoveney/pig-0.8.0/pig.jar' *sys-package-mgr*: processing new

Re: Using a UDF written in Python

2010-12-29 Thread [email protected]
I think you took Dmitriy a bit too literally ;) You need to put the actual filenames of the jars into PIG_CLASSPATH. If /home/jcoveney/usefulpig/conf:/home/jcoveney/jython is the directory that contains jython.jar (used purely as an example, I'm not certain what the actual jar name is) then your

Re: Using a UDF written in Python

2010-12-29 Thread Jonathan Coveney
Wait, ignore that error, that was the wrong one. This is it: ERROR org.apache.pig.tools.grunt.Grunt - ERROR 2998: Unhandled internal error. org/python/util/PythonInterpreter (I had set the classpath incorrectly, to *.* not ***) 2010/12/29 Jonathan Coveney > echo $PIG_CLASSPATH > /home/jcoven

Re: Using a UDF written in Python

2010-12-29 Thread Jonathan Coveney
echo $PIG_CLASSPATH /home/jcoveney/usefulpig/conf:/home/jcoveney/jython/*** same error 2010-12-29 16:59:29,862 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 2998: Unhandled internal error. Could not initialize class org.apache.pig.scripting.jython.JythonScriptEngine$Interpreter :S I rea

Re: Using a UDF written in Python

2010-12-29 Thread Dmitriy Ryaboy
You need to set the classpath to include the literal jar strings, not just the directory that contains them. Try, /home/jcoveney/usefulpig/conf:/home/jcoveney/jython/*** D On Wed, Dec 29, 2010 at 11:32 AM, Jonathan Coveney wrote: > Ok, strangely enough, it won't run locally either... it sees the
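
The advice above (name each jar explicitly instead of pointing at the directory that holds it) can be sketched as a small helper. This is only an illustration: `build_pig_classpath` is a hypothetical name, the two paths are the ones quoted in the thread, and it assumes every `.jar` in the directory should end up on the classpath.

```python
import glob
import os


def build_pig_classpath(conf_dir, jar_dir):
    """Return a classpath string that lists each jar in jar_dir by filename.

    Hypothetical helper: the conf directory is kept as a plain entry,
    and every *.jar found in jar_dir is appended as its own entry.
    """
    jars = sorted(glob.glob(os.path.join(jar_dir, "*.jar")))
    return os.pathsep.join([conf_dir] + jars)


if __name__ == "__main__":
    # Paths as quoted in the thread; adjust them to your own layout.
    print(build_pig_classpath("/home/jcoveney/usefulpig/conf",
                              "/home/jcoveney/jython"))
```

The printed value is what you would export as PIG_CLASSPATH before starting pig. Non-jar files in the directory are skipped on purpose.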

Re: Using a UDF written in Python

2010-12-29 Thread Jonathan Coveney
Ok, strangely enough, it won't run locally either... it sees the file, but it's giving me an interpreter not found error, so it must be something else. PIG_CLASSPATH is equal to /home/jcoveney/usefulpig/conf:/home/jcoveney/jython and here is my test script register '/home/jcoveney/udfs/pytest.py'

Re: Using a UDF written in Python

2010-12-29 Thread Jonathan Coveney
Ah, that might be it... my computer has it and I have it on my path, however, I do not know if the cluster has it... definitely something to look into. thanks. 2010/12/29 [email protected] > try adding the full path to the jar via PIG_CLASSPATH like so: > > export PIG_CLASSPATH=/path/to/jython.

Re: Using a UDF written in Python

2010-12-29 Thread [email protected]
Try adding the full path to the jar via PIG_CLASSPATH like so: export PIG_CLASSPATH=/path/to/jython.jar then run pig. Also, I assume you're doing your testing on a local machine? If it's on a cluster, you need to make sure jython is on all the worker nodes and classpath is set up properly on all of

Re: Using a UDF written in Python

2010-12-29 Thread Jonathan Coveney
I do have Jython installed and on PATH, but maybe I didn't include it in the right way? Where does it need to be? 2010/12/29 [email protected] > Do you have Jython on your classpath? Currently Jython isn't distributed in > the 0.8.0 release tarball. > > On Mon, Dec 27, 2010 at 7:18 PM, Jonathan

Re: Using a UDF written in Python

2010-12-29 Thread [email protected]
Do you have Jython on your classpath? Currently Jython isn't distributed in the 0.8.0 release tarball. On Mon, Dec 27, 2010 at 7:18 PM, Jonathan Coveney wrote: > Oh and just to be sure, I have tried > @outputSchema("word:chararray") > @outputSchema("x:{t:(word:chararray)}") > as well (the former

Re: Using a UDF written in Python

2010-12-27 Thread Jonathan Coveney
Oh and just to be sure, I have tried @outputSchema("word:chararray") @outputSchema("x:{t:(word:chararray)}") as well (the former of which seems to be the "right" one, whenever I can figure out what is wrong) I've tested my code separately in python and it is fine... 2010/12/28 Jonathan Coveney

Re: Using a UDF written in Python

2010-12-27 Thread Jonathan Coveney
Aniket, I appreciate you taking a look at this. In general, I found the documentation around outputSchema pretty confusing... for example, in this example @outputSchema("x:{t:(word:chararray)}") def helloworld(): return ('Hello, World') Then, in the sample script below that, you have @outputS

Re: Using a UDF written in Python

2010-12-27 Thread Aniket Mokashi
I think the decorator used here is incorrect. In general, "output:chararray" needs to be schema-string-compatible. Also, you are using "outputSchemaFunction", which is used in case you want to write a udf that has output schema dependent on input schema (e.g. square) and this should have a function with
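
For a UDF whose output type is fixed, a minimal sketch using the plain `outputSchema` decorator might look like the following. It assumes that Pig's Jython engine supplies `outputSchema` when the file is registered; the fallback definition is a hypothetical stand-in, not part of Pig, so that the module can also be imported and tested under ordinary Python.

```python
# Assumption: Pig's Jython script engine provides `outputSchema` when this
# file is registered. The fallback below is a hypothetical stand-in that
# only records the schema string, so the module also loads in plain Python.
try:
    outputSchema
except NameError:
    def outputSchema(schema):
        def wrap(func):
            func.outputSchema = schema
            return func
        return wrap


@outputSchema("word:chararray")
def helloworld():
    # A fixed-type result, so no schema function is needed.
    return "Hello, World"
```

Registering it would follow the form quoted elsewhere in the thread, e.g. register '/my/udf/location/udf.py' using jython as myfunc; and then calling myfunc.helloworld() from the script.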

Using a UDF written in Python

2010-12-27 Thread Jonathan Coveney
so I have module.py, and I want to be able to use it in a pig script. It has no special imports or anything. I do have @outputSchemaFunction("output:chararray) In my pig script, I have this register '/my/udf/location/udf.py' using jython as myfunc; is there any reason why this wouldn't work? her