I think it's Pig-related, because if I run hadoop fs -ls on the HAR file path with input globbing, it works fine.

On Tue, Sep 25, 2012 at 7:45 PM, Cheolsoo Park <[email protected]> wrote:

> Sounds like I was wrong. ;-)
>
> You might get a better answer from hadoop user group since this is more
> related to HarFileSystem than Pig I think.
>
> Thanks,
> Cheolsoo
>
> On Tue, Sep 25, 2012 at 6:20 PM, Mohnish Kodnani <[email protected]> wrote:
>
> > Hi Chelsoo,
> > thanks for replying. On the same system the following works :
> >
> > x = load 'har:///a/b/b/22.har/00/*,har:///a/b/c/d/23.har/00/*' using PigStorage('\t');
> >
> > Two separate file paths with har protocol work.
> >
> > A single path works but if I do the following I get an error.
> > x = LOAD 'har:///a/b/c/{d.har,e.har}/z/ab/*' using PigStorage('\t');
> >
> > Thanks
> > Mohnish
> >
> > On Tue, Sep 25, 2012 at 6:09 PM, Cheolsoo Park <[email protected]> wrote:
> >
> > > Hi Mohnish,
> > >
> > > I am not very familiar with har files, so I might be wrong here.
> > >
> > > Looking at the call stack, the exception is thrown from initialize(URI
> > > name, Configuration conf) in HarFileSystem.java. In the source code, the
> > > comment of this method says the following:
> > >
> > > > Initialize a Har filesystem per har archive. The
> > > > archive home directory is the top level directory
> > > > in the filesystem that contains the HAR archive.
> > >
> > > This sounds to me that HarFileSystem expects a single path.
> > >
> > > > This gives error due to the curly braces being encoded to %7B and %7D.
> > >
> > > The encoded curly braces should be fine though. In fact, if they're not
> > > encoded, that's a problem because then a URISyntaxException will be thrown
> > > by Java URI class.
> > >
> > > Hope that this helps,
> > > Cheolsoo
> > >
> > > On Tue, Sep 25, 2012 at 12:43 PM, Mohnish Kodnani <[email protected]> wrote:
> > >
> > > > Hi,
> > > > I am trying to give multiple paths to a pig script using path globbing in
> > > > HAR file format and it does not seem to work. I wanted to know if this is
> > > > expected or a bug / feature request.
> > > >
> > > > Command :
> > > > x = LOAD 'har:///a/b/c/{d.har,e.har}/z/ab/*' using PigStorage('\t');
> > > >
> > > > This gives error due to the curly braces being encoded to %7B and %7D.
> > > > I am trying this on Pig 0.8.0
> > > >
> > > > ERROR 2017: Internal error creating job configuration.
> > > >
> > > > org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1066: Unable to open iterator for alias blah
> > > >     at org.apache.pig.PigServer.openIterator(PigServer.java:765)
> > > >     at org.apache.pig.tools.grunt.GruntParser.processDump(GruntParser.java:615)
> > > >     at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:303)
> > > >     at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:168)
> > > >     at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:144)
> > > >     at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:76)
> > > >     at org.apache.pig.Main.run(Main.java:455)
> > > >     at org.apache.pig.Main.main(Main.java:107)
> > > > Caused by: org.apache.pig.PigException: ERROR 1002: Unable to store alias blah
> > > >     at org.apache.pig.PigServer.storeEx(PigServer.java:889)
> > > >     at org.apache.pig.PigServer.store(PigServer.java:827)
> > > >     at org.apache.pig.PigServer.openIterator(PigServer.java:739)
> > > >     ... 7 more
> > > > Caused by: org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobCreationException: ERROR 2017: Internal error creating job configuration.
> > > >     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler.getJob(JobControlCompiler.java:679)
> > > >     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler.compile(JobControlCompiler.java:256)
> > > >     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher.launchPig(MapReduceLauncher.java:147)
> > > >     at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.execute(HExecutionEngine.java:382)
> > > >     at org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:1209)
> > > >     at org.apache.pig.PigServer.storeEx(PigServer.java:885)
> > > >     ... 9 more
> > > > Caused by: java.io.IOException: Invalid path for the Har Filesystem. har:///user/cronusapp/cassini_downsample_logs/prod/2012/09/%7B22.har,23.har%7D/00/*
> > > >     at org.apache.hadoop.fs.HarFileSystem.initialize(HarFileSystem.java:100)
> > > >     at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1563)
> > > >     at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:225)
> > > >     at org.apache.hadoop.fs.Path.getFileSystem(Path.java:183)
> > > >     at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.setInputPaths(FileInputFormat.java:348)
> > > >     at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.setInputPaths(FileInputFormat.java:317)
> > > >     at org.apache.pig.builtin.PigStorage.setLocation(PigStorage.java:219)
> > > >     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler.getJob(JobControlCompiler.java:369)
> > > >     ... 14 more
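
Since the thread reports that a comma-separated list of har:// paths loads fine while a brace glob across archive names fails, one possible workaround is to expand the braces before the path reaches Pig. The following is only a rough sketch under that assumption, not something confirmed in this thread: `expand_braces` and `to_pig_location` are hypothetical helper names, and the expansion handles only simple, non-nested `{a,b}` groups.

```python
import re


def expand_braces(path):
    """Expand simple, non-nested {a,b,...} groups into a list of paths.

    Hypothetical helper: nested brace groups are NOT supported.
    """
    match = re.search(r"\{([^{}]*)\}", path)
    if match is None:
        # No brace group left, so the path is already concrete.
        return [path]
    head, tail = path[:match.start()], path[match.end():]
    expanded = []
    for alternative in match.group(1).split(","):
        # Recurse so that any later brace groups in the tail are expanded too.
        expanded.extend(expand_braces(head + alternative + tail))
    return expanded


def to_pig_location(path):
    """Join the expanded paths with commas, the form reported to work in LOAD."""
    return ",".join(expand_braces(path))


if __name__ == "__main__":
    print(to_pig_location("har:///a/b/c/{d.har,e.har}/z/ab/*"))
```

The resulting string could then be passed to the script, for example as a Pig parameter, so that each archive appears as its own har:// path and the trailing `*` wildcard is left untouched.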
