Hi Guys,

I have a UDF to which I want to pass a long holding a timestamp
and get back a date formatted with the SimpleDateFormat class.
I pass the format string for the SimpleDateFormat object to the UDF
constructor, and optionally the time zone if needed.
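
To make the intent concrete, here is a minimal standalone sketch of the conversion I have in mind, outside of Pig. The class name TimestampFormatSketch and the format helper are made up for illustration and are not in my jar. It assumes the timestamp is in seconds since the epoch, and it pins the locale to Locale.ROOT only so that the output does not depend on the machine.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

public class TimestampFormatSketch {

    // Hypothetical helper: formats an epoch timestamp (assumed to be in
    // seconds) with the given SimpleDateFormat pattern and time zone ID.
    static String format(long epochSeconds, String pattern, String tz) {
        SimpleDateFormat sdf = new SimpleDateFormat(pattern, Locale.ROOT);
        sdf.setTimeZone(TimeZone.getTimeZone(tz));
        // java.util.Date expects milliseconds, so scale the seconds up.
        return sdf.format(new Date(epochSeconds * 1000L));
    }

    public static void main(String[] args) {
        System.out.println(format(0L, "dd", "UTC"));             // prints 01
        System.out.println(format(86400L, "yyyy-MM-dd", "UTC")); // prints 1970-01-02
    }
}
```

In the UDF the pattern and the time zone would come from the constructor arguments given in the DEFINE statement, and the timestamp from the input tuple.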

So I wrote a class to do that, but when I use it in my script I get the error:

ERROR 1000: Error during parsing. Invalid alias: day in {ex_time:
chararray,scBytes: long,fSize: long}
Caused by: org.apache.pig.impl.logicalLayer.parser.ParseException: Invalid
alias: day in {ex_time: chararray,scBytes: long,fSize: long}..

What is the best way to parameterize a Java UDF?
What am I doing wrong?

Thanks!

The script:

REGISTER MscPigUtils.jar
DEFINE EdgeLoader msc.pig.EdgeLoader();
DEFINE day msc.pig.ExtractTime('dd');
raw = LOAD
'/home/charles/workspace-j2ee/ReportService/src/test/resources/logsSamples/wpc_justAbril.log.gz'
using EdgeLoader;
B = FOREACH raw GENERATE day(ts), scBytes, fSize ;
C = GROUP B BY day;
clients_stats = FOREACH C {
    complete_views = FILTER B BY scBytes >= fSize;
    GENERATE FLATTEN(group), COUNT(B), COUNT(complete_views), SUM(B.scBytes);
}
STORE clients_stats into 'dateTransferday';

The class:

package msc.pig;

import java.io.IOException;
import java.text.DateFormat;
import java.text.SimpleDateFormat;
import java.util.Calendar;
import java.util.TimeZone;

import msc.misc.TimeUtils;

import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;
import org.apache.log4j.Logger;
import org.apache.pig.EvalFunc;
import org.apache.pig.data.DataType;
import org.apache.pig.data.Tuple;
import org.apache.pig.impl.logicalLayer.schema.Schema;
import org.apache.pig.impl.logicalLayer.schema.Schema.FieldSchema;

public class ExtractTime extends EvalFunc<String> {
    private static final Logger logger = Logger.getLogger(ExtractTime.class);

    // Instance fields rather than static ones, so that two DEFINEs with
    // different formats or time zones do not overwrite each other's state.
    private final DateFormat df;
    private final Calendar cal;

    public ExtractTime(String format) {
        this(format, "UTC");
    }

    public ExtractTime(String format, String tz) {
        df = new SimpleDateFormat(format);
        df.setTimeZone(TimeZone.getTimeZone(tz));
        cal = Calendar.getInstance(TimeZone.getTimeZone(tz));
    }

    @Override
    public String exec(Tuple input) throws IOException {
        if (input == null || input.size() == 0) {
            return null;
        }
        try {
            Object object = input.get(0);
            if (object == null) {
                return null;
            }
            // The timestamp is multiplied by 1000 because it is taken to be
            // in seconds, while Calendar works in milliseconds.
            long ts = (Long) object;
            cal.setTimeInMillis(ts * 1000);
            return df.format(cal.getTime());
        } catch (Exception e) {
            logger.error("Error Parsing date !!", e);
            return null;
        }
    }

    @Override
    public Schema outputSchema(Schema input) {
        return new Schema(new Schema.FieldSchema("ex_time", DataType.CHARARRAY));
    }
}




-- 
*Charles Ferreira Gonçalves *
http://homepages.dcc.ufmg.br/~charles/
UFMG - ICEx - Dcc
Cel.: 55 31 87741485
Tel.:  55 31 34741485
Lab.: 55 31 34095840
