Re: Accessing tuple field names from within a python udf

Martin Goodson Fri, 16 Nov 2012 02:22:01 -0800

Unfortunately I've realised that boundscript.describe doesn't return a
string. It returns void but prints to stdout. This means I have to go
through a rather painful process of calling a separate python process that
calls boundscript.describe and then capture the stdout of that process in
order to obtain the schema. I don't know why it doesn't return a string.
Maybe there is an easier way I am missing here. If people have any ideas
for a more elegant solution, I would be happy to develop it and
contribute the code.
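
For illustration, here is a minimal sketch of the capture-and-parse step
described above. It assumes the helper process prints a schema in the usual
describe layout, "alias: {field: type, ...}", and that the outer script runs
under plain Python 3. The function names (capture_stdout, top_level_fields,
field_positions) are hypothetical and not part of Pig, and the child command
in the demo is only a stand-in for whatever process actually calls
boundscript.describe.

```python
import subprocess
import sys


def capture_stdout(cmd):
    """Run cmd in a child process and return whatever it printed to stdout."""
    return subprocess.check_output(cmd).decode("utf-8")


def top_level_fields(described):
    """Return the top-level field names from 'alias: {a: int, b: (...)}'."""
    # Keep only the text between the outermost braces.
    body = described[described.index("{") + 1:described.rindex("}")]
    fields, depth, current = [], 0, ""
    for ch in body:
        if ch in "({[":
            depth += 1
        elif ch in ")}]":
            depth -= 1
        # Only commas outside nested tuples/bags/maps separate fields.
        if ch == "," and depth == 0:
            fields.append(current)
            current = ""
        else:
            current += ch
    if current.strip():
        fields.append(current)
    # The field name is everything before the first colon.
    return [f.split(":", 1)[0].strip() for f in fields]


def field_positions(described):
    """Map each top-level field name to its column number."""
    return dict((name, i) for i, name in enumerate(top_level_fields(described)))


if __name__ == "__main__":
    # Stand-in child process (assumption): it just prints a schema line.
    fake = "print('users: {name: chararray,age: int}')"
    schema_text = capture_stdout([sys.executable, "-c", fake])
    print(field_positions(schema_text))
```

The resulting name-to-position mapping (or just the ordered list of names)
could then be passed into the udf call as a parameter, so the udf looks up
columns by name instead of hard-coding numbers.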


Martin







On 15 November 2012 20:20, Jonathan Coveney <[email protected]> wrote:

> Martin,
>
> That is a reasonable workaround. Even in Java UDFs, you can't directly
> access fields by name. Tuples are indexed only by number. Using the Schema
> is how I would do it.
>
>
> 2012/11/14 Martin Goodson <[email protected]>
>
> > Sorry to reply to my own post, but I've found a workaround that I
> > thought I should put here:
> >
> > 1. Use embedded Pig.
> > 2. Access the schema with boundscript.describe().
> > 3. Pass the schema as a parameter into the udf call.
> >
> > Thanks
> > Martin
> >
> >
> >
> >
> > On 14 November 2012 16:17, Martin Goodson <[email protected]>
> > wrote:
> >
> > > I normally deal with very large tuples with many fields. It's a pain to
> > > deal with these in Python UDFs since I can't figure out a way to input
> > > schemas into the UDF. I have to hard-code the column numbers in the
> > > UDFs, which is a maintenance nightmare.
> > >
> > > It seems that Java UDFs receive the full tuple in their exec methods so
> > > that the correct fields can be identified, whereas Python UDFs only
> > > receive list objects (with field names stripped). Is there any way to
> > > get the behaviour of Python UDFs to conform to the Java behaviour?
> > >
> > >
> > > Thanks for any ideas
> > > Martin
> > >
> > >
> >
>
