The error message is misleading. The user expected 'day' to be the alias used 
for the UDF and not an alias in the schema.

-----Original Message-----
From: Jonathan Coveney [mailto:[email protected]] 
Sent: Tuesday, February 01, 2011 6:22 AM
To: [email protected]
Subject: Re: UDF with parameterized constructor in DEFINE statement

The error, at least going by what you posted, is different from what you 
think. The problem is simply that the column "day" doesn't exist. You can see 
in the output that the columns are {ex_time: chararray,scBytes: long,fSize: 
long}. If you want it to be called day, you can name it as such with an 
"as day" (for example, day(ts) as day in the GENERATE), or you can change the 
schema, or you can just group by ex_time. In general, a parsing error like this 
comes before any errors with the UDF itself, as Pig will try to parse the whole 
script and THEN make the job.


-----Original Message-----
From: Charles Gonçalves <[email protected]>
Date: Tue, 1 Feb 2011 12:12:30
To: <[email protected]>
Reply-To: [email protected]
Subject: UDF with parameterized constructor in DEFINE statement

Hi Guys,

I have a UDF to which I want to pass a long holding a timestamp and get back 
a date formatted with the SimpleDateFormat class.
I will pass the format string for the sdf object to the UDF constructor, and 
optionally the timezone if needed.

So I made a class to do that, but when I use it in my script I get the error:

ERROR 1000: Error during parsing. Invalid alias: day in {ex_time:
chararray,scBytes: long,fSize: long}
Caused by: org.apache.pig.impl.logicalLayer.parser.ParseException: Invalid
alias: day in {ex_time: chararray,scBytes: long,fSize: long}..

What is the best way to parameterize a Java UDF?
What am I doing wrong?

Thanks!

The script:

REGISTER MscPigUtils.jar
DEFINE EdgeLoader msc.pig.EdgeLoader();
DEFINE day msc.pig.ExtractTime('dd');
raw = LOAD
'/home/charles/workspace-j2ee/ReportService/src/test/resources/logsSamples/wpc_justAbril.log.gz'
using EdgeLoader;
B = FOREACH raw GENERATE day(ts), scBytes, fSize;
C = GROUP B BY day;
clients_stats = FOREACH C {
    complete_views = FILTER B BY scBytes >= fSize;
    GENERATE FLATTEN(group), COUNT(B), COUNT(complete_views), SUM(B.scBytes);
}
STORE clients_stats into 'dateTransferday';

The Class:

package msc.pig;

import java.io.IOException;
import java.text.DateFormat;
import java.text.SimpleDateFormat;
import java.util.Calendar;
import java.util.TimeZone;

import msc.misc.TimeUtils;

import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;
import org.apache.log4j.Logger;
import org.apache.pig.EvalFunc;
import org.apache.pig.data.DataType;
import org.apache.pig.data.Tuple;
import org.apache.pig.impl.logicalLayer.schema.Schema;
import org.apache.pig.impl.logicalLayer.schema.Schema.FieldSchema;

public class ExtractTime extends EvalFunc<String> {
    private static final Logger logger = Logger.getLogger(ExtractTime.class);
    private static DateFormat utc_df;
    private static Calendar utc_cal;

    public ExtractTime(String format) {
        utc_df = new SimpleDateFormat(format);
        utc_df.setTimeZone(TimeZone.getTimeZone("UTC"));
        utc_cal = Calendar.getInstance();
        utc_cal.setTimeZone(TimeZone.getTimeZone("UTC"));
    }

    public ExtractTime(String format, String tz) {
        utc_df = new SimpleDateFormat(format);
        utc_df.setTimeZone(TimeZone.getTimeZone(tz));
        utc_cal = Calendar.getInstance();
        utc_cal.setTimeZone(TimeZone.getTimeZone(tz));
    }

    @Override
    public String exec(Tuple input) throws IOException {
        if (input == null || input.size() == 0) {
            return null;
        }
        try {
            Object object = input.get(0);
            if (object == null) {
                return null;
            }
            Long ts = ((Long) object);
            utc_cal.setTimeInMillis(ts * 1000);
            return utc_df.format(utc_cal.getTime());
        } catch (Exception e) {
            logger.error("Error Parsing date !!", e);
            return null;
        }
    }

    @Override
    public Schema outputSchema(Schema input) {
        return new Schema(new Schema.FieldSchema("ex_time", DataType.CHARARRAY));
    }
}
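
The formatting step inside exec can also be tried outside Pig. What follows is only a minimal standalone sketch under some assumptions, not the UDF itself: the class name TimestampFormatter and its format method are hypothetical, the input is assumed to be seconds since the epoch (as the multiplication by 1000 in exec suggests), and Locale.ROOT is my own choice. It keeps the formatter in an instance field rather than in static fields, so two instances built with different patterns do not overwrite each other.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical standalone helper; not part of Pig or of MscPigUtils.jar.
class TimestampFormatter {
    // Instance field: each formatter keeps its own pattern and time zone.
    private final SimpleDateFormat df;

    // Defaults to UTC, like the one-argument constructor in the class above.
    TimestampFormatter(String pattern) {
        this(pattern, "UTC");
    }

    TimestampFormatter(String pattern, String tz) {
        df = new SimpleDateFormat(pattern, Locale.ROOT);
        df.setTimeZone(TimeZone.getTimeZone(tz));
    }

    // Assumes the timestamp is in seconds since the epoch.
    String format(long epochSeconds) {
        return df.format(new Date(epochSeconds * 1000L));
    }

    public static void main(String[] args) {
        long ts = 1296518400L; // 2011-02-01 00:00:00 UTC
        System.out.println(new TimestampFormatter("dd").format(ts));
        System.out.println(new TimestampFormatter("yyyy-MM-dd").format(ts));
        System.out.println(new TimestampFormatter("dd", "GMT-03:00").format(ts));
    }
}
```

Running main prints the day of month in UTC, the full date in UTC, and the day of month three hours west of UTC, which makes it easy to check a pattern such as 'dd' before wiring it into a DEFINE.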




--
*Charles Ferreira Gonçalves *
http://homepages.dcc.ufmg.br/~charles/
UFMG - ICEx - Dcc
Cel.: 55 31 87741485
Tel.:  55 31 34741485
Lab.: 55 31 34095840
