Thanks. I changed two things and it is working now:

1. I now run Pig as "pig -x local ..". Previously I ran Pig (in local mode)
without explicitly specifying "-x local".

2. I set the PIG_INSTALL environment variable to point to the right version
of the Pig installation. It was pointing to an old version.
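
A minimal sketch of the two changes, assuming PIG_INSTALL is used to put
Pig's bin directory on PATH (an assumption; how the variable is consumed is
not shown in this thread). The install path /opt/pig-0.11.0, the script name
useragents.pig, and the log file name are hypothetical placeholders. The
sketch only prints the command instead of running it.

```shell
# Hypothetical install location: point this at the Pig release you intend to use.
export PIG_INSTALL=/opt/pig-0.11.0
# Assumption: Pig is found via PATH, so the new install must come first.
export PATH="$PIG_INSTALL/bin:$PATH"

# Make local mode explicit with -x local; -param fills in $LOGS in the script.
# useragents.pig and access.log are placeholder names.
PIG_CMD="pig -x local -param LOGS=access.log useragents.pig"

# Show which pig binary would be picked up, if any is installed.
if command -v pig >/dev/null 2>&1; then
  echo "pig resolves to: $(command -v pig)"
fi
echo "command to run: $PIG_CMD"
```

Checking which binary "pig" resolves to is a quick way to confirm that an
old installation is no longer the one being picked up.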

Can anyone explain the mechanism behind either of these changes?

BR


On Sun, Apr 14, 2013 at 7:53 AM, Ruslan Al-Fakikh <[email protected]> wrote:

> Hi Lei,
>
> It seems there is something wrong with creating a sampler. The ORDER
> command is not trivial: it works by creating a sampler, and I guess
> something went wrong with it:
> Input path does not exist:
> file:/home/dliu/ApacheLogAnalysisWithPig/pigsample_259943398_1365820592017
> I suppose pigsample is not a name that you used in your script, so maybe
> Pig failed to create a sample file. Try running the job on HDFS and we'll
> see what happens. I see that you are using the local filesystem: file:/....
>
> Best Regards
>
>
> On Sat, Apr 13, 2013 at 1:56 PM, Lei Liu <[email protected]> wrote:
>
> > I am sure it's not that. The ORDER command fails the whole thing. If I
> > remove the ORDER command, the same script runs just fine except the
> > result is not in order.
> >
> >
> > On Sat, Apr 13, 2013 at 4:54 PM, Prasanth J <[email protected]> wrote:
> >
> > > From the error logs, it seems the input file doesn't exist or is not
> > > accessible.
> > >
> > > > Caused by: org.apache.hadoop.mapreduce.lib.input.InvalidInputException:
> > > > Input path does not exist:
> > > > file:/home/dliu/ApacheLogAnalysisWithPig/pigsample_259943398_1365820592017
> > >
> > > Can you please check whether the input path in $LOGS is correct?
> > >
> > > Thanks
> > > -- Prasanth
> > >
> > > On Apr 12, 2013, at 11:02 PM, Lei Liu <[email protected]> wrote:
> > >
> > > > Hi, I am using Pig to analyze the percentage of each user agent in an
> > > > Apache log. The following program fails because of the ORDER command at
> > > > the very end (the result variable is correct and can be dumped
> > > > correctly). I am relatively new to Pig and could not figure it out, so
> > > > I need your help. The program and error message follow. Thanks!
> > > >
> > > > logs = LOAD '$LOGS' USING ApacheCombinedLogLoader AS (remoteHost, hyphen,
> > > >     user, time, method, uri, protocol, statusCode, responseSize, referer,
> > > >     userAgent);
> > > >
> > > > uarows = FOREACH logs GENERATE userAgent;
> > > > total = FOREACH (GROUP uarows ALL) GENERATE COUNT(uarows) AS count;
> > > > dump total;
> > > >
> > > > gpuarows = GROUP uarows BY userAgent;
> > > > result = FOREACH gpuarows {
> > > >       subtotal = COUNT(uarows);
> > > >       GENERATE flatten(group) AS ua, subtotal AS SUB_TOTAL,
> > > >           100*(double)subtotal/(double)total.count AS percentage;
> > > >       };
> > > > orderresult = ORDER result BY SUB_TOTAL DESC;
> > > > dump orderresult;
> > > >
> > > > -- what's weird is that 'dump result' works just fine, so it's the
> > > > -- ORDER line that causes trouble
> > > >
> > > > Errors:
> > > > 2013-04-13 10:36:32,409 [Thread-48] INFO org.apache.hadoop.mapred.MapTask - record buffer = 262144/327680
> > > > 2013-04-13 10:36:32,437 [Thread-48] WARN org.apache.hadoop.mapred.LocalJobRunner - job_local_0005
> > > > java.lang.RuntimeException: org.apache.hadoop.mapreduce.lib.input.InvalidInputException: Input path does not exist: file:/home/dliu/ApacheLogAnalysisWithPig/pigsample_259943398_1365820592017
> > > >    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.partitioners.WeightedRangePartitioner.setConf(WeightedRangePartitioner.java:157)
> > > >    at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:62)
> > > >    at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
> > > >    at org.apache.hadoop.mapred.MapTask$NewOutputCollector.<init>(MapTask.java:677)
> > > >    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:756)
> > > >    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
> > > >    at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:212)
> > > > Caused by: org.apache.hadoop.mapreduce.lib.input.InvalidInputException: Input path does not exist: file:/home/dliu/ApacheLogAnalysisWithPig/pigsample_259943398_1365820592017
> > > >    at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.listStatus(FileInputFormat.java:235)
> > > >    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigFileInputFormat.listStatus(PigFileInputFormat.java:37)
> > > >    at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.getSplits(FileInputFormat.java:252)
> > > >    at org.apache.pig.impl.io.ReadToEndLoader.init(ReadToEndLoader.java:177)
> > > >    at org.apache.pig.impl.io.ReadToEndLoader.<init>(ReadToEndLoader.java:124)
> > > >    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.partitioners.WeightedRangePartitioner.setConf(WeightedRangePartitioner.java:131)
> > > >    ... 6 more
> > > > 2013-04-13 10:36:32,525 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - HadoopJobId: job_local_0005
> > > > 2013-04-13 10:36:32,526 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Processing aliases orderresult
> > > > 2013-04-13 10:36:32,526 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - detailed locations: M: orderresult[19,14] C:  R:
> > > > 2013-04-13 10:36:37,536 [main] WARN org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Ooops! Some job has failed! Specify -stop_on_failure if you want Pig to stop immediately on failure.
> > > > 2013-04-13 10:36:37,536 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - job job_local_0005 has failed! Stop running all dependent jobs
> > > > 2013-04-13 10:36:37,536 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 100% complete
> > > > 2013-04-13 10:36:37,537 [main] ERROR org.apache.pig.tools.pigstats.PigStatsUtil - 1 map reduce job(s) failed!
> > > > 2013-04-13 10:36:37,538 [main] INFO org.apache.pig.tools.pigstats.SimplePigStats - Script Statistics:
> > > >
> > > > HadoopVersion    PigVersion    UserId    StartedAt    FinishedAt    Features
> > > > 1.0.4    0.11.0    dliu    2013-04-13 10:35:50    2013-04-13 10:36:37    GROUP_BY,ORDER_BY
> > > >
> > > > Some jobs have failed! Stop running all dependent jobs
> > > >
> > > > Job Stats (time in seconds):
> > > > JobId    Maps    Reduces    MaxMapTime    MinMapTIme    AvgMapTime    MedianMapTime    MaxReduceTime    MinReduceTime    AvgReduceTime    MedianReducetime    Alias    Feature    Outputs
> > > > job_local_0002    1    1    n/a    n/a    n/a    n/a    n/a    n/a    1-18,logs,total,uarows    MULTI_QUERY,COMBINER
> > > > job_local_0003    1    1    n/a    n/a    n/a    n/a    n/a    n/a    gpuarows,result    GROUP_BY,COMBINER
> > > > job_local_0004    1    1    n/a    n/a    n/a    n/a    n/a    n/a    orderresult    SAMPLER
> > > >
> > > > Failed Jobs:
> > > > JobId    Alias    Feature    Message    Outputs
> > > > job_local_0005    orderresult    ORDER_BY    Message: Job failed! Error - NA    file:/tmp/temp-1225021115/tmp-62411972,
> > > >
> > > > Input(s):
> > > > Successfully read 0 records from:
> > > > "file:///home/dliu/ApacheLogAnalysisWithPig/access.log"
> > > >
> > > > Output(s):
> > > > Failed to produce result in "file:/tmp/temp-1225021115/tmp-62411972"
> > > >
> > > > Counters:
> > > > Total records written : 0
> > > > Total bytes written : 0
> > > > Spillable Memory Manager spill count : 0
> > > > Total bags proactively spilled: 0
> > > > Total records proactively spilled: 0
> > > >
> > > > Job DAG:
> > > > job_local_0002    ->    job_local_0003,
> > > > job_local_0003    ->    job_local_0004,
> > > > job_local_0004    ->    job_local_0005,
> > > > job_local_0005
> > > >
> > > >
> > > > 2013-04-13 10:36:37,539 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Some jobs have failed! Stop running all dependent jobs
> > > > 2013-04-13 10:36:37,541 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1066: Unable to open iterator for alias orderresult
> > > > Details at logfile: /home/dliu/ApacheLogAnalysisWithPig/pig_1365820535568.log
> > >
> > >
> >
>
