Which Hadoop distro are you using? I've heard Hortonworks has a Windows-compatible Hadoop.

On Wed, Aug 28, 2013 at 2:36 PM, Darpan R <[email protected]> wrote:
> Hi folks,
> I am facing a weird issue.
> I am running Pig 0.11 on a Windows 7 64-bit machine with the latest version
> of Cygwin.
>
> I have a weblog that I want to order by userName, so that all the activities
> for the same user are together to feed the next stage of processing.
>
> I start a command prompt -> cygwin.bat -> on the Cygwin console go to
> D:/ -> pig, and type the following script in the Grunt shell (local mode).
> (Note: I've set PIG_HOME and PIG_CLASSPATH correctly.)
>
> The script is:
> USERACTIVITIES = LOAD '/D:/path/of/logs/useractivities'
>     USING org.apache.pig.piggybank.storage.CSVExcelStorage(',')
>     AS (datetimeUnProcessed:chararray, username:chararray, request:chararray);
> USERACTIVITIES_ORDERED = ORDER USERACTIVITIES by username;
> STORE USERACTIVITIES_ORDERED INTO '/D:/readyfornextinput/useractivities'
>     USING org.apache.pig.piggybank.storage.CSVExcelStorage(',');
>
> When I run illustrate USERACTIVITIES_ORDERED, it goes smoothly.
> But when I do a store/dump, I hit a weird issue.
>
> It fails with:
> java.lang.RuntimeException:
> org.apache.hadoop.mapreduce.lib.input.InvalidInputException: Input path
> does not exist: file:/D:/pigsample_1749383998_1377684507424
>
> When I searched for this pigsample_number file, I found it in:
> D:/tmp/<username>/mapred/local/localRunner
>
> I am not sure how this is happening.
> I am not sure whether it is a Windows/Cygwin-related issue or whether
> someone has seen this on Linux as well.
>
> For reference, you can find the stacktrace attached here:
> 2013-08-28 15:38:28,863 [Thread-46] WARN org.apache.hadoop.mapred.LocalJobRunner - job_local_0004
> java.lang.RuntimeException: org.apache.hadoop.mapreduce.lib.input.InvalidInputException: Input path does not exist: file:/D:/pigsample_1749383998_1377684507424
>     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.partitioners.WeightedRangePartitioner.setConf(WeightedRangePartitioner.java:157)
>     at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:62)
>     at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
>     at org.apache.hadoop.mapred.MapTask$NewOutputCollector.<init>(MapTask.java:677)
>     at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:756)
>     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
>     at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:214)
> Caused by: org.apache.hadoop.mapreduce.lib.input.InvalidInputException: Input path does not exist: file:/D:/pigsample_1288777582_1377684802262
>     at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.listStatus(FileInputFormat.java:235)
>     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigFileInputFormat.listStatus(PigFileInputFormat.java:37)
>     at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.getSplits(FileInputFormat.java:252)
>     at org.apache.pig.impl.io.ReadToEndLoader.init(ReadToEndLoader.java:190)
>     at org.apache.pig.impl.io.ReadToEndLoader.<init>(ReadToEndLoader.java:126)
>     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.partitioners.WeightedRangePartitioner.setConf(WeightedRangePartitioner.java:131)
>     ... 6 more
>
> Any help on this will be useful.
>
> Regards,
> Darpan
