Dear Serega,

When I run the script in local mode, I get the correct output stored in the AU/part-m-00000 file. However, when I run it in mapreduce mode (with input and output on HDFS), the file /scratch/AU/part-m-00000 is only 4 bytes and contains nothing.
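To see what those 4 bytes actually contain, I believe something along these lines would print them one byte at a time (I have not tried it yet):

hadoop --config $HADOOP_CONF_DIR fs -cat /scratch/AU/part-m-00000 | od -c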
I am not sure whether the AU relation somehow does not get realized correctly or whether the problem happens during the storing stage. Here are some of the console messages:

HadoopVersion  PigVersion  UserId   StartedAt            FinishedAt           Features
1.0.3          0.11.1      p529444  2013-11-06 07:14:15  2013-11-06 07:14:40  UNKNOWN

Success!

Job Stats (time in seconds):
JobId  Maps  Reduces  MaxMapTime  MinMapTIme  AvgMapTime  MedianMapTime  MaxReduceTime  MinReduceTime  AvgReduceTime  MedianReducetime  Alias  Feature  Outputs
job_201311011343_0042  1  0  6  6  6  6  0  0  0  0  A,AU  MULTI_QUERY,MAP_ONLY  /scratch/A,/scratch/AU,

Input(s):
Successfully read 4 records (311082 bytes) from: "/scratch/file.seq"

Output(s):
Successfully stored 4 records (231 bytes) in: "/scratch/A"
Successfully stored 4 records (3 bytes) in: "/scratch/AU"

Counters:
Total records written : 8
Total bytes written : 234
Spillable Memory Manager spill count : 0
Total bags proactively spilled: 0
Total records proactively spilled: 0

Job DAG:
job_201311011343_0042

2013-11-06 07:14:40,352 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Success!

Here is the output of the following command:

hadoop --config $HADOOP_CONF_DIR fs -ls /scratch/AU/part-m-00000
-rw-r--r--   1 username groupname          4 2013-11-06 07:14 /scratch/AU/part-m-00000

I am not sure whether DUMP gives the correct result, but when I replaced the STORE with DUMP AU in mapreduce mode, I get AU as:

()
()
()
()

> From: serega.shey...@gmail.com
> Date: Wed, 6 Nov 2013 11:19:03 +0400
> Subject: Re: More on issue with local vs mapreduce mode
> To: user@pig.apache.org
>
> "The same script does not work in the mapreduce mode."
> What does it mean?
>
>
> 2013/11/6 Sameer Tilak <ssti...@live.com>
>
> > Hello,
> >
> > My script in local mode works perfectly. The same script does not work
> > in mapreduce mode. In local mode the output is saved in the current
> > directory, whereas in mapreduce mode I use the /scratch directory on HDFS.
> >
> > Local mode:
> >
> > A = LOAD 'file.seq' USING SequenceFileLoader AS (key: chararray, value: chararray);
> > DESCRIBE A;
> > STORE A INTO 'A';
> >
> > AU = FOREACH A GENERATE FLATTEN(parser.customFilter(key, 'AAAAA'));
> > STORE AU INTO 'AU';
> >
> > Mapreduce mode:
> >
> > A = LOAD '/scratch/file.seq' USING SequenceFileLoader AS (key: chararray, value: chararray);
> > DESCRIBE A;
> > STORE A INTO '/scratch/A';
> >
> > AU = FOREACH A GENERATE FLATTEN(parser.customFilter(key, 'AAAAA'));
> > STORE AU INTO '/scratch/AU';
> >
> > Can someone please point me to tools that I can use to debug the script in
> > mapreduce mode? Also, any thoughts on why this might be happening would be
> > great!
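P.S. To check whether parser.customFilter itself returns empty tuples in mapreduce mode, or whether the STORE step is at fault, I am thinking of trying a variant along these lines (just a sketch, not yet tested; AUCHK is a name I made up):

A = LOAD '/scratch/file.seq' USING SequenceFileLoader AS (key: chararray, value: chararray);
-- keep the raw key next to the UDF result so an empty result from customFilter is easy to spot
AUCHK = FOREACH A GENERATE key, parser.customFilter(key, 'AAAAA');
DUMP AUCHK;

and also comparing the output of EXPLAIN AU; and ILLUSTRATE AU; between local and mapreduce mode.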