I'm using the CDH3 beta release of Cloudera's Distribution of Hadoop, which includes Pig 0.5.0. Is there really no way in Pig 0.5.0 to get the aliases of the tuple fields inside an evaluation function? In my opinion this is a very important feature!
Alex

________________________________
From: Alan Gates <[email protected]>
To: [email protected]
Sent: Monday, May 24, 2010, 18:26:22
Subject: Re: Aliases in EvalFunction

UDFs are not serialized from the frontend to the backend, so simply saving the schema into a variable will not work. What version of Pig are you using? In 0.6 and later the UDFContext class exists to allow UDFs to carry information like this from the frontend to the backend.

Alan.

On May 22, 2010, at 9:10 AM, Alexander Schätzle wrote:

> Hi all,
>
> in the exec(Tuple input) method of an EvalFunc I get the tuple to be
> processed by the eval function.
> Is there a reliable way to get the aliases of the fields in the input
> tuple?
> My eval function needs to know the names of the field aliases in order to
> know how to evaluate the tuple.
>
> My current solution is to save the Schema that I get in the
> outputSchema(Schema input) method in a static variable of the eval
> function class, so that I can access the schema information in the exec method:
>
>     private static Schema inputSchema;
>
>     public Schema outputSchema(Schema input) {
>         inputSchema = input;
>         return input;
>     }
>
>     public Tuple exec(Tuple input) throws IOException {
>         Set<String> aliases = inputSchema.getAliases();
>         ...
>         return input;
>     }
>
> Local tests worked, but I'm not sure whether this is guaranteed to work.
>
> Example:
>
>     Describe A;
>     A: {varName1: chararray, varName2: chararray}
>
>     B = Foreach A Generate MyUDF(*);
>
> In MyUDF I have to know the names of the first and second fields (varName1 and
> varName2) because the processing depends on the variable names.
> Any suggestions?
>
> Thanks in advance,
> Alex
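
The point about carrying information from the frontend to the backend can be illustrated with a small sketch that uses only the Java standard library and makes no calls into Pig. Everything named here is hypothetical: the class AliasCarrier, the key myudf.input.aliases, and the methods frontend and backend. A java.util.Properties object stands in, as an assumption, for whatever container UDFContext offers in Pig 0.6 and later. The idea shown is that the alias names have to travel as serialized data, because a static field set in one JVM is not visible in another.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.StringWriter;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Properties;

// Hypothetical stand-in, not Pig API: shows alias names travelling as
// serialized properties instead of living in a static field.
public class AliasCarrier {
    // Hypothetical property key chosen for this sketch.
    static final String KEY = "myudf.input.aliases";

    // "Frontend" side: the schema is known here, so record the alias
    // names and serialize them into text that can be shipped elsewhere.
    static String frontend(List<String> aliases) throws IOException {
        Properties props = new Properties();
        props.setProperty(KEY, String.join(",", aliases));
        StringWriter out = new StringWriter();
        props.store(out, null);
        return out.toString();
    }

    // "Backend" side: a different JVM sees only the shipped text,
    // never the static state of the frontend process.
    static List<String> backend(String shipped) throws IOException {
        Properties props = new Properties();
        props.load(new StringReader(shipped));
        String joined = props.getProperty(KEY, "");
        if (joined.isEmpty()) {
            return new ArrayList<>();
        }
        return new ArrayList<>(Arrays.asList(joined.split(",")));
    }

    public static void main(String[] args) throws IOException {
        String shipped = frontend(Arrays.asList("varName1", "varName2"));
        System.out.println(backend(shipped)); // prints [varName1, varName2]
    }
}
```

In a real UDF the frontend half would presumably run where the input schema is available (for example in outputSchema) and the backend half in exec, with the actual transport provided by Pig rather than by a string passed between two methods.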
