From the error logs, it seems like the input file doesn't exist or is not accessible.
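Before rerunning, it may be worth verifying programmatically that whatever $LOGS expands to actually exists and is readable. A minimal sketch (the helper name and example path are mine, purely for illustration):

```python
import os

def check_input(path):
    """Return 'OK' if the path exists and is readable, else 'MISSING'.

    Pig's LOAD fails with InvalidInputException when the input
    path does not resolve to a readable file or directory.
    """
    if os.path.exists(path) and os.access(path, os.R_OK):
        return "OK"
    return "MISSING"

# Illustrative path from the failing run -- substitute your own $LOGS value.
print(check_input("/home/dliu/ApacheLogAnalysisWithPig/access.log"))
```

If this prints MISSING, fix the path (or the -param value) before digging into the Pig job itself.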
> Caused by: org.apache.hadoop.mapreduce.lib.input.InvalidInputException:
> Input path does not exist:
> file:/home/dliu/ApacheLogAnalysisWithPig/pigsample_259943398_1365820592017

Can you please check if the input path in $LOGS is proper?

Thanks
-- Prasanth

On Apr 12, 2013, at 11:02 PM, Lei Liu <[email protected]> wrote:

> Hi, I am using Pig to analyze the percentage of each UserAgent in an
> Apache log. The following program fails because of the ORDER command at
> the very end (the result variable is correct and can be dumped out
> correctly). I am relatively new to Pig and could not figure this out, so
> I need your help. The program and error messages follow. Thanks!
>
> logs = LOAD '$LOGS' USING ApacheCombinedLogLoader AS (remoteHost, hyphen,
>     user, time, method, uri, protocol, statusCode, responseSize, referer,
>     userAgent);
>
> uarows = FOREACH logs GENERATE userAgent;
> total = FOREACH (GROUP uarows ALL) GENERATE COUNT(uarows) AS count;
> dump total;
>
> gpuarows = GROUP uarows BY userAgent;
> result = FOREACH gpuarows {
>     subtotal = COUNT(uarows);
>     GENERATE FLATTEN(group) AS ua, subtotal AS SUB_TOTAL,
>         100 * (double)subtotal / (double)total.count AS percentage;
> };
> orderresult = ORDER result BY SUB_TOTAL DESC;
> dump orderresult;
>
> What's weird is that 'dump result' works just fine, so it's the ORDER
> line that makes trouble.
>
> Errors:
>
> 2013-04-13 10:36:32,409 [Thread-48] INFO org.apache.hadoop.mapred.MapTask
>     - record buffer = 262144/327680
> 2013-04-13 10:36:32,437 [Thread-48] WARN
>     org.apache.hadoop.mapred.LocalJobRunner - job_local_0005
> java.lang.RuntimeException:
> org.apache.hadoop.mapreduce.lib.input.InvalidInputException: Input path
> does not exist:
> file:/home/dliu/ApacheLogAnalysisWithPig/pigsample_259943398_1365820592017
>     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.partitioners.WeightedRangePartitioner.setConf(WeightedRangePartitioner.java:157)
>     at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:62)
>     at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
>     at org.apache.hadoop.mapred.MapTask$NewOutputCollector.<init>(MapTask.java:677)
>     at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:756)
>     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
>     at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:212)
> Caused by: org.apache.hadoop.mapreduce.lib.input.InvalidInputException:
> Input path does not exist:
> file:/home/dliu/ApacheLogAnalysisWithPig/pigsample_259943398_1365820592017
>     at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.listStatus(FileInputFormat.java:235)
>     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigFileInputFormat.listStatus(PigFileInputFormat.java:37)
>     at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.getSplits(FileInputFormat.java:252)
>     at org.apache.pig.impl.io.ReadToEndLoader.init(ReadToEndLoader.java:177)
>     at org.apache.pig.impl.io.ReadToEndLoader.<init>(ReadToEndLoader.java:124)
>     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.partitioners.WeightedRangePartitioner.setConf(WeightedRangePartitioner.java:131)
>     ... 6 more
> 2013-04-13 10:36:32,525 [main] INFO
>     org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
>     - HadoopJobId: job_local_0005
> 2013-04-13 10:36:32,526 [main] INFO
>     org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
>     - Processing aliases orderresult
> 2013-04-13 10:36:32,526 [main] INFO
>     org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
>     - detailed locations: M: orderresult[19,14] C: R:
> 2013-04-13 10:36:37,536 [main] WARN
>     org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
>     - Ooops! Some job has failed! Specify -stop_on_failure if you want Pig
>     to stop immediately on failure.
> 2013-04-13 10:36:37,536 [main] INFO
>     org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
>     - job job_local_0005 has failed! Stop running all dependent jobs
> 2013-04-13 10:36:37,536 [main] INFO
>     org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
>     - 100% complete
> 2013-04-13 10:36:37,537 [main] ERROR
>     org.apache.pig.tools.pigstats.PigStatsUtil - 1 map reduce job(s) failed!
> 2013-04-13 10:36:37,538 [main] INFO
>     org.apache.pig.tools.pigstats.SimplePigStats - Script Statistics:
>
> HadoopVersion  PigVersion  UserId  StartedAt            FinishedAt           Features
> 1.0.4          0.11.0      dliu    2013-04-13 10:35:50  2013-04-13 10:36:37  GROUP_BY,ORDER_BY
>
> Some jobs have failed! Stop running all dependent jobs
>
> Job Stats (time in seconds):
> JobId           Maps  Reduces  MaxMapTime  MinMapTime  AvgMapTime  MedianMapTime  MaxReduceTime  MinReduceTime  AvgReduceTime  MedianReduceTime  Alias  Feature  Outputs
> job_local_0002  1  1  n/a  n/a  n/a  n/a  n/a  n/a  1-18,logs,total,uarows  MULTI_QUERY,COMBINER
> job_local_0003  1  1  n/a  n/a  n/a  n/a  n/a  n/a  gpuarows,result  GROUP_BY,COMBINER
> job_local_0004  1  1  n/a  n/a  n/a  n/a  n/a  n/a  orderresult  SAMPLER
>
> Failed Jobs:
> JobId           Alias        Feature   Message               Outputs
> job_local_0005  orderresult  ORDER_BY  Message: Job failed!  Error - NA  file:/tmp/temp-1225021115/tmp-62411972,
>
> Input(s):
> Successfully read 0 records from:
> "file:///home/dliu/ApacheLogAnalysisWithPig/access.log"
>
> Output(s):
> Failed to produce result in "file:/tmp/temp-1225021115/tmp-62411972"
>
> Counters:
> Total records written : 0
> Total bytes written : 0
> Spillable Memory Manager spill count : 0
> Total bags proactively spilled: 0
> Total records proactively spilled: 0
>
> Job DAG:
> job_local_0002 -> job_local_0003,
> job_local_0003 -> job_local_0004,
> job_local_0004 -> job_local_0005,
> job_local_0005
>
> 2013-04-13 10:36:37,539 [main] INFO
>     org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
>     - Some jobs have failed! Stop running all dependent jobs
> 2013-04-13 10:36:37,541 [main] ERROR org.apache.pig.tools.grunt.Grunt -
>     ERROR 1066: Unable to open iterator for alias orderresult
> Details at logfile:
> /home/dliu/ApacheLogAnalysisWithPig/pig_1365820535568.log
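Incidentally, the grouped-percentage logic itself is easy to sanity-check outside Pig. A small Python sketch (the function name and sample data are mine, just for illustration) that mirrors the GROUP BY userAgent / COUNT / percentage computation in the script:

```python
from collections import Counter

def ua_percentages(user_agents):
    """Count each user agent and express it as a percentage of the total,
    sorted by count descending -- the same shape as the Pig result:
    (ua, SUB_TOTAL, percentage)."""
    counts = Counter(user_agents)
    total = sum(counts.values())
    return [(ua, n, 100.0 * n / total)
            for ua, n in counts.most_common()]

# Toy data standing in for the userAgent column extracted from the log.
rows = ua_percentages(["curl", "curl", "Mozilla", "curl"])
```

Comparing a run like this against `dump result` on a small input can confirm the aggregation is right, which narrows the failure down to the ORDER stage rather than the computation.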
