Dear Serega, I am now using log4j to debug my UDF, and here is what I found. For some reason, in mapreduce mode my exec function does not get called: the log message in the constructor gets printed to the console, but the log message in the exec function does not. In local mode, all of the log messages can be seen on the console.
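For context, here is roughly how the class around those two log calls is put together. This is only a sketch: the two method bodies are taken from my code, but the imports, the class declaration extending EvalFunc<Tuple>, the log4j Logger setup, and the return value are reconstructions of the surrounding wiring rather than a verbatim copy.

import java.io.IOException;

import org.apache.log4j.Logger;
import org.apache.pig.EvalFunc;
import org.apache.pig.data.Tuple;

public class CustomFilter extends EvalFunc<Tuple> {

    // log4j logger used by the messages below (setup assumed here;
    // EvalFunc also has an inherited 'log' field that could be used instead)
    private static final Logger log = Logger.getLogger(CustomFilter.class);

    public CustomFilter() {
        // this message does show up on the console, even in mapreduce mode
        log.info("Hello World!");
    }

    @Override
    public Tuple exec(Tuple input) throws IOException {
        // this message shows up on the console only in local mode
        log.info("Into the exec function");
        // rest of the code (the actual filtering logic is omitted;
        // returning the input is just a placeholder)
        return input;
    }
}

The fragments in question: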
public Tuple exec(Tuple input) throws IOException {
    log.info("Into the exec function");
    // rest of the code
}

public CustomFilter() {
    log.info("Hello World!");
}

> From: serega.shey...@gmail.com
> Date: Wed, 6 Nov 2013 21:19:10 +0400
> Subject: Re: More on issue with local vs mapreduce mode
> To: user@pig.apache.org
>
> You get 4 empty tuples.
> Maybe your UDF parser.customFilter(key,'AAAAA') works differently? Maybe
> you are using the old version?
> You can add a print statement to the UDF and see what it accepts and what
> it produces.
>
>
> 2013/11/6 Sameer Tilak <ssti...@live.com>
>
> > Dear Serega,
> >
> > When I run the script in local mode, I get the correct o/p stored in the
> > AU/part-m-000 file. However, when I run it in mapreduce mode (with i/p
> > and o/p from HDFS), the file /scratch/AU/part-m-000 is of size 4 and
> > there is nothing in it.
> >
> > I am not sure whether the AU relation somehow does not get realized
> > correctly or whether the problem happens during the store stage.
> >
> > Here are some of the console messages:
> >
> > HadoopVersion  PigVersion  UserId   StartedAt            FinishedAt           Features
> > 1.0.3          0.11.1      p529444  2013-11-06 07:14:15  2013-11-06 07:14:40  UNKNOWN
> >
> > Success!
> >
> > Job Stats (time in seconds):
> > JobId                  Maps  Reduces  MaxMapTime  MinMapTIme  AvgMapTime  MedianMapTime  MaxReduceTime  MinReduceTime  AvgReduceTime  MedianReducetime  Alias  Feature               Outputs
> > job_201311011343_0042  1     0        6           6           6           6              0              0              0              0                 A,AU   MULTI_QUERY,MAP_ONLY  /scratch/A,/scratch/AU,
> >
> > Input(s):
> > Successfully read 4 records (311082 bytes) from: "/scratch/file.seq"
> >
> > Output(s):
> > Successfully stored 4 records (231 bytes) in: "/scratch/A"
> > Successfully stored 4 records (3 bytes) in: "/scratch/AU"
> >
> > Counters:
> > Total records written : 8
> > Total bytes written : 234
> > Spillable Memory Manager spill count : 0
> > Total bags proactively spilled: 0
> > Total records proactively spilled: 0
> >
> > Job DAG:
> > job_201311011343_0042
> >
> >
> > 2013-11-06 07:14:40,352 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Success!
> >
> > Here is the o/p of the following command:
> >
> > hadoop --config $HADOOP_CONF_DIR fs -ls /scratch/AU/part-m-00000
> >
> > -rw-r--r--   1 username groupname   4 2013-11-06 07:14 /scratch/AU/part-m-00000
> >
> > I am not sure whether DUMP will give me the correct result, but when I
> > replaced STORE with DUMP AU in mapreduce mode, I get AU as:
> > ()
> > ()
> > ()
> > ()
> >
> > > From: serega.shey...@gmail.com
> > > Date: Wed, 6 Nov 2013 11:19:03 +0400
> > > Subject: Re: More on issue with local vs mapreduce mode
> > > To: user@pig.apache.org
> > >
> > > "The same script does not work in the mapreduce mode."
> > > What does that mean?
> > >
> > >
> > > 2013/11/6 Sameer Tilak <ssti...@live.com>
> > >
> > > > Hello,
> > > >
> > > > My script works perfectly in local mode. The same script does not work
> > > > in mapreduce mode. For local mode, the o/p is saved in the current
> > > > directory, whereas for mapreduce mode I use the /scratch directory on HDFS.
> > > >
> > > > Local mode:
> > > >
> > > > A = LOAD 'file.seq' USING SequenceFileLoader AS (key: chararray, value: chararray);
> > > > DESCRIBE A;
> > > > STORE A into 'A';
> > > >
> > > > AU = FOREACH A GENERATE FLATTEN(parser.customFilter(key,'AAAAA'));
> > > > STORE AU into 'AU';
> > > >
> > > >
> > > > Mapreduce mode:
> > > >
> > > > A = LOAD '/scratch/file.seq' USING SequenceFileLoader AS (key: chararray, value: chararray);
> > > > DESCRIBE A;
> > > > STORE A into '/scratch/A';
> > > >
> > > > AU = FOREACH A GENERATE FLATTEN(parser.customFilter(key,'AAAAA'));
> > > > STORE AU into '/scratch/AU';
> > > >
> > > > Can someone please point me to tools that I can use to debug the script in
> > > > mapreduce mode? Also, any thoughts on why this might be happening would be
> > > > great!