Show the code of your func. How did you define the input and output schema?

On 07.11.2013 at 2:09, "Sameer Tilak" <ssti...@live.com> wrote:
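Serega's advice in the thread below (add a print statement to the UDF and see what it accepts and what it produces) is easiest to follow if the filtering logic is pulled out into a plain static method that can be exercised off-cluster, without Pig or MapReduce. The thread never shows what parser.customFilter actually does, so the sketch below is purely hypothetical: it assumes the filter keeps a key only when the key contains the given pattern.

```java
// Hypothetical core of a customFilter-style UDF, extracted so it can be
// tested without Pig or a Hadoop cluster. The real filter logic is not
// shown in the thread; "keep the key if it contains the pattern" is an
// assumption made for illustration only.
public class CustomFilterCore {

    // Returns the key when it matches, or null when it does not.
    // Inside an EvalFunc, returning null here would surface as an
    // empty result for that record.
    public static String filterKey(String key, String pattern) {
        if (key == null || pattern == null) {
            return null;
        }
        return key.contains(pattern) ? key : null;
    }

    public static void main(String[] args) {
        // Probe the logic directly, mirroring the "print what it accepts
        // and what it produces" debugging suggestion from the thread.
        System.out.println(filterKey("xxAAAAAyy", "AAAAA")); // matching key
        System.out.println(filterKey("xxBBByy", "AAAAA"));   // non-matching key
    }
}
```

Once the core logic is verified this way, the EvalFunc wrapper only has to convert tuples to and from this method, which narrows the mapreduce-mode mystery down to the wrapper and the schema.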
> Dear Serega,
>
> I am now using log4j for debugging my UDF. Here is what I found out. For
> some reason, in mapreduce mode my exec function does not get called. The
> log message in the constructor gets printed onto the console; however, the
> log message in the exec function does not get printed to the console. In
> local mode all the log messages can be seen on the console.
>
> public Tuple exec(Tuple input) throws IOException {
>     log.info("Into the exec function");
>     // rest of the code
> }
>
> public CustomFilter() {
>     log.info("Hello World!");
> }
>
> > From: serega.shey...@gmail.com
> > Date: Wed, 6 Nov 2013 21:19:10 +0400
> > Subject: Re: More on issue with local vs mapreduce mode
> > To: user@pig.apache.org
> >
> > You get 4 empty tuples.
> > Maybe your UDF parser.customFilter(key,'AAAAA') works differently? Maybe
> > you use an old version?
> > You can add a print statement to the UDF and see what it accepts and
> > what it produces.
> >
> > 2013/11/6 Sameer Tilak <ssti...@live.com>
> >
> > > Dear Serega,
> > >
> > > When I run the script in local mode, I get the correct o/p stored in the
> > > AU/part-m-000 file. However, when I run it in mapreduce mode (with i/p
> > > and o/p from HDFS), the file /scratch/AU/part-m-000 is of size 4 and
> > > there is nothing in it.
> > >
> > > I am not sure whether the AU relation somehow does not get realized
> > > correctly or whether the problem happens during the storing stage.
> > >
> > > Here are some of the console messages:
> > >
> > > HadoopVersion  PigVersion  UserId   StartedAt            FinishedAt           Features
> > > 1.0.3          0.11.1      p529444  2013-11-06 07:14:15  2013-11-06 07:14:40  UNKNOWN
> > >
> > > Success!
> > >
> > > Job Stats (time in seconds):
> > > JobId  Maps  Reduces  MaxMapTime  MinMapTime  AvgMapTime  MedianMapTime  MaxReduceTime  MinReduceTime  AvgReduceTime  MedianReduceTime  Alias  Feature  Outputs
> > > job_201311011343_0042  1  0  6  6  6  6  0  0  0  0  A,AU  MULTI_QUERY,MAP_ONLY  /scratch/A,/scratch/AU,
> > >
> > > Input(s):
> > > Successfully read 4 records (311082 bytes) from: "/scratch/file.seq"
> > >
> > > Output(s):
> > > Successfully stored 4 records (231 bytes) in: "/scratch/A"
> > > Successfully stored 4 records (3 bytes) in: "/scratch/AU"
> > >
> > > Counters:
> > > Total records written : 8
> > > Total bytes written : 234
> > > Spillable Memory Manager spill count : 0
> > > Total bags proactively spilled: 0
> > > Total records proactively spilled: 0
> > >
> > > Job DAG:
> > > job_201311011343_0042
> > >
> > > 2013-11-06 07:14:40,352 [main] INFO
> > > org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Success!
> > >
> > > Here is the o/p of the following command:
> > >
> > > hadoop --config $HADOOP_CONF_DIR fs -ls /scratch/AU/part-m-00000
> > >
> > > -rw-r--r-- 1 username groupname 4 2013-11-06 07:14 /scratch/AU/part-m-00000
> > >
> > > I am not sure whether DUMP will give me the correct result, but when I
> > > replaced STORE with DUMP AU in mapreduce mode, I get AU as:
> > > ()
> > > ()
> > > ()
> > > ()
> > >
> > > > From: serega.shey...@gmail.com
> > > > Date: Wed, 6 Nov 2013 11:19:03 +0400
> > > > Subject: Re: More on issue with local vs mapreduce mode
> > > > To: user@pig.apache.org
> > > >
> > > > "The same script does not work in the mapreduce mode."
> > > > What does it mean?
> > > >
> > > > 2013/11/6 Sameer Tilak <ssti...@live.com>
> > > >
> > > > > Hello,
> > > > >
> > > > > My script works perfectly in local mode. The same script does not
> > > > > work in mapreduce mode. For local mode, the o/p is saved in the
> > > > > current directory, whereas for mapreduce mode I use the /scratch
> > > > > directory on HDFS.
> > > > >
> > > > > Local mode:
> > > > >
> > > > > A = LOAD 'file.seq' USING SequenceFileLoader AS (key: chararray, value: chararray);
> > > > > DESCRIBE A;
> > > > > STORE A into 'A';
> > > > >
> > > > > AU = FOREACH A GENERATE FLATTEN(parser.customFilter(key,'AAAAA'));
> > > > > STORE AU into 'AU';
> > > > >
> > > > > Mapreduce mode:
> > > > >
> > > > > A = LOAD '/scratch/file.seq' USING SequenceFileLoader AS (key: chararray, value: chararray);
> > > > > DESCRIBE A;
> > > > > STORE A into '/scratch/A';
> > > > >
> > > > > AU = FOREACH A GENERATE FLATTEN(parser.customFilter(key,'AAAAA'));
> > > > > STORE AU into '/scratch/AU';
> > > > >
> > > > > Can someone please point me to tools that I can use to debug the
> > > > > script in mapreduce mode? Also, any thoughts on why this might be
> > > > > happening would be great!
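One detail in the thread is worth noting: the constructor's log line reaching the console in mapreduce mode does not prove exec runs on the client, because Pig also instantiates UDFs on the front end while building the plan; exec runs inside map tasks, whose log4j output goes to the task logs on the cluster rather than the client console. Moreover, the 4-byte part file together with the DUMP output of four empty tuples is consistent with exec running but producing empty results: with the default PigStorage serialization, an empty tuple is written as an empty, newline-terminated line, so four empty records come to exactly four bytes. A minimal sketch of that arithmetic, assuming one '\n' record separator per record as PigStorage writes by default:

```java
// Sketch: four empty tuples stored via a PigStorage-style writer.
// Each empty tuple serializes to an empty line terminated by '\n',
// so the part file contains only the record separators.
public class EmptyTupleSize {

    // Serialize n empty records as newline-terminated empty lines.
    public static String serializeEmptyRecords(int n) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < n; i++) {
            out.append('\n'); // empty tuple -> empty line
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String part = serializeEmptyRecords(4);
        // Matches the observed /scratch/AU/part-m-00000 size of 4 bytes.
        System.out.println("bytes: " + part.getBytes().length); // prints "bytes: 4"
    }
}
```

If this reading is right, the place to look is not whether exec is called but why parser.customFilter returns empty/null output on the cluster, e.g. a stale UDF jar or a schema mismatch, which is exactly what the questions at the top of the thread are probing.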