Dear Serega, I am now using log4j to debug my UDF, and here is what I found. For some reason, in mapreduce mode my exec function does not get called: the log message in the constructor gets printed to the console, but the log message in the exec function does not. In local mode, all of the log messages can be seen on the console.
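For context, here is roughly how the class around those two log calls is put together. This is only a sketch: the two method bodies are taken from my code, but the imports, the class declaration extending EvalFunc<Tuple>, the log4j Logger setup, and the return value are reconstructions of the surrounding wiring rather than a verbatim copy.

import java.io.IOException;

import org.apache.log4j.Logger;
import org.apache.pig.EvalFunc;
import org.apache.pig.data.Tuple;

public class CustomFilter extends EvalFunc<Tuple> {

    // log4j logger used by the messages below (setup assumed here;
    // EvalFunc also has an inherited 'log' field that could be used instead)
    private static final Logger log = Logger.getLogger(CustomFilter.class);

    public CustomFilter() {
        // this message does show up on the console, even in mapreduce mode
        log.info("Hello World!");
    }

    @Override
    public Tuple exec(Tuple input) throws IOException {
        // this message shows up on the console only in local mode
        log.info("Into the exec function");
        // rest of the code (the actual filtering logic is omitted;
        // returning the input is just a placeholder)
        return input;
    }
}

The fragments in question: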
public Tuple exec(Tuple input) throws IOException {
    log.info("Into the exec function");
    // rest of the code
}

public CustomFilter() {
    log.info("Hello World!");
}

> From: serega.shey...@gmail.com
> Date: Wed, 6 Nov 2013 21:19:10 +0400
> Subject: Re: More on issue with local vs mapreduce mode
> To: user@pig.apache.org
>
> You get 4 empty tuples.
> Maybe your UDF parser.customFilter(key,'AAAAA') works differently? Maybe
> you are using the old version?
> You can add a print statement to the UDF and see what it accepts and what
> it produces.
>
>
> 2013/11/6 Sameer Tilak <ssti...@live.com>
>
> > Dear Serega,
> >
> > When I run the script in local mode, I get the correct o/p stored in the
> > AU/part-m-000 file. However, when I run it in mapreduce mode (with i/p
> > and o/p from HDFS), the file /scratch/AU/part-m-000 is of size 4 and
> > there is nothing in it.
> >
> > I am not sure whether the AU relation somehow does not get realized
> > correctly or whether the problem happens during the store stage.
> >
> > Here are some of the console messages:
> >
> > HadoopVersion  PigVersion  UserId   StartedAt            FinishedAt           Features
> > 1.0.3          0.11.1      p529444  2013-11-06 07:14:15  2013-11-06 07:14:40  UNKNOWN
> >
> > Success!
> >
> > Job Stats (time in seconds):
> > JobId                  Maps  Reduces  MaxMapTime  MinMapTIme  AvgMapTime  MedianMapTime  MaxReduceTime  MinReduceTime  AvgReduceTime  MedianReducetime  Alias  Feature               Outputs
> > job_201311011343_0042  1     0        6           6           6           6              0              0              0              0                 A,AU   MULTI_QUERY,MAP_ONLY  /scratch/A,/scratch/AU,
> >
> > Input(s):
> > Successfully read 4 records (311082 bytes) from: "/scratch/file.seq"
> >
> > Output(s):
> > Successfully stored 4 records (231 bytes) in: "/scratch/A"
> > Successfully stored 4 records (3 bytes) in: "/scratch/AU"
> >
> > Counters:
> > Total records written : 8
> > Total bytes written : 234
> > Spillable Memory Manager spill count : 0
> > Total bags proactively spilled: 0
> > Total records proactively spilled: 0
> >
> > Job DAG:
> > job_201311011343_0042
> >
> >
> > 2013-11-06 07:14:40,352 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Success!
> >
> > Here is the o/p of the following command:
> >
> > hadoop --config $HADOOP_CONF_DIR fs -ls /scratch/AU/part-m-00000
> >
> > -rw-r--r--   1 username groupname   4 2013-11-06 07:14 /scratch/AU/part-m-00000
> >
> > I am not sure whether DUMP will give me the correct result, but when I
> > replaced STORE with DUMP AU in mapreduce mode, I get AU as:
> > ()
> > ()
> > ()
> > ()
> >
> > > From: serega.shey...@gmail.com
> > > Date: Wed, 6 Nov 2013 11:19:03 +0400
> > > Subject: Re: More on issue with local vs mapreduce mode
> > > To: user@pig.apache.org
> > >
> > > "The same script does not work in the mapreduce mode."
> > > What does that mean?
> > >
> > >
> > > 2013/11/6 Sameer Tilak <ssti...@live.com>
> > >
> > > > Hello,
> > > >
> > > > My script works perfectly in local mode. The same script does not work
> > > > in mapreduce mode. For local mode, the o/p is saved in the current
> > > > directory, whereas for mapreduce mode I use the /scratch directory on HDFS.
> > > >
> > > > Local mode:
> > > >
> > > > A = LOAD 'file.seq' USING SequenceFileLoader AS (key: chararray, value: chararray);
> > > > DESCRIBE A;
> > > > STORE A into 'A';
> > > >
> > > > AU = FOREACH A GENERATE FLATTEN(parser.customFilter(key,'AAAAA'));
> > > > STORE AU into 'AU';
> > > >
> > > >
> > > > Mapreduce mode:
> > > >
> > > > A = LOAD '/scratch/file.seq' USING SequenceFileLoader AS (key: chararray, value: chararray);
> > > > DESCRIBE A;
> > > > STORE A into '/scratch/A';
> > > >
> > > > AU = FOREACH A GENERATE FLATTEN(parser.customFilter(key,'AAAAA'));
> > > > STORE AU into '/scratch/AU';
> > > >
> > > > Can someone please point me to tools that I can use to debug the script in
> > > > mapreduce mode? Also, any thoughts on why this might be happening would be
> > > > great!