Oh and just to be sure, I have tried
@outputSchema("word:chararray")
@outputSchema("x:{t:(word:chararray)}")
as well (the former of which seems to be the "right" one, whenever I can
figure out what is wrong).

I've tested my code separately in Python and it is fine...

2010/12/28 Jonathan Coveney <[email protected]>

> Aniket, I appreciate you taking a look at this. In general, I found the
> documentation around outputSchema pretty confusing... for example, in this
> example
>
> @outputSchema("x:{t:(word:chararray)}")
> def helloworld():
>     return ('Hello, World')
>
>
> Then, in the sample script below that, you have
>
> @outputSchema("t:(numformat:chararray)")
> def commaFormat(num):
>     return '{:,}'.format(num)
>
> In this case, you have lost the x:{} (which makes more sense to me).
>
> Perhaps this is because the latter function is meant to operate on an input
> and return a type (t), whereas the hello world function should be able to
> stand alone, and thus, has to return a bag? Not sure...
>
> Besides that, though, I changed my code per your suggestion and tried
>
> @outputSchema("t:(word:chararray)")
>
> and still got the error.
>
> As a note, do I need to import anything in the python script for
> outputSchema to work, or should it be fine since pig is grabbing it?
>
> Once again, I really appreciate your help in the matter. I feel having
> people who weren't intimately related to the project have a go at it is how
> you make it ultimately more usable and useful...but you have to answer some
> annoying questions on the way :P
>
> Thanks again.
>
> 2010/12/28 Aniket Mokashi <[email protected]>
>
>> I think decorator used here is incorrect.
>> In general, "output:chararray" needs to be schema-string-compatible. Also,
>> you are using "outputSchemaFunction", which is used in case you want to
>> write a udf that has output schema dependent on input schema (eg - square)
>> and this should have a function with decorator "schemaFunction" (named
>> "output" in your case). I think using "outputSchema" decorator would fix
>> the problem here.
>>
>> More details can be found at-
>> http://wiki.apache.org/pig/UDFsUsingScriptingLanguages
>>
>> Thanks,
>> Aniket
>>
>> On Mon, December 27, 2010 4:30 pm, Jonathan Coveney wrote:
>> > so I have module.py, and I want to be able to use it in a pig script. It
>> > has no special imports or anything. I do have
>> > @outputSchemaFunction("output:chararray)
>> >
>> >
>> > In my pig script, I have this
>> >
>> >
>> > register '/my/udf/location/udf.py' using jython as myfunc;
>> >
>> > is there any reason why this wouldn't work? here is the error I get:
>> >
>> > 2010-12-27 16:29:41,288 [main] ERROR org.apache.pig.tools.grunt.Grunt -
>> > ERROR 2998: Unhandled internal error. org/python/util/PythonInterpreter
>> >
>> >
>> > Not the most instructive error, but is there anything more I need to be
>> > doing to be able to use a python UDF?
>> >
>> > As an aside, are simply python UDF's as efficient as Java ones? I like
>> > Python a lot and love the idea of being able to UDF in it, but can use
>> > java if necessary.
>> >
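
For reference, here is a minimal sketch of what the UDF file discussed in this thread might look like when it uses the plain outputSchema decorator with the "word:chararray" schema string. It assumes that Pig supplies outputSchema itself when the file is registered with "using jython", which is what the thread suggests but does not confirm. The try/except fallback is a hypothetical stand-in, not part of Pig, added only so the same file can also be run under ordinary Python for testing.

```python
# udf.py: sketch of a Jython UDF file for Pig.
# Assumption: Pig defines `outputSchema` when it loads this file through
# "register ... using jython". The fallback below is a hypothetical
# stand-in (not Pig's implementation) used only outside of Pig.
try:
    outputSchema  # already defined if Pig loaded this file
except NameError:
    def outputSchema(schema):
        """Stand-in decorator: attach the schema string to the function."""
        def wrap(func):
            func.outputSchema = schema
            return func
        return wrap


@outputSchema("word:chararray")
def helloworld():
    # Returns a plain string, matching the chararray schema declared above.
    return 'Hello, World'
```

With the file registered as in the original message (register '/my/udf/location/udf.py' using jython as myfunc;), the function would presumably be referenced as myfunc.helloworld() in the Pig script. This sketch only covers the decorator question and does not address the ERROR 2998 itself.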
