Hi Mohnish, I am not very familiar with har files, so I might be wrong here.
Looking at the call stack, the exception is thrown from initialize(URI name, Configuration conf) in HarFileSystem.java. In the source code, the comment on this method says the following:

> Initialize a Har filesystem per har archive. The
> archive home directory is the top level directory
> in the filesystem that contains the HAR archive.

This sounds to me like HarFileSystem expects a single path (one archive), not a glob that spans several archives.

> This gives error due to the curly braces being encoded to %7B and %7D.

The encoded curly braces should be fine though. In fact, if they were not encoded, that would be a problem, because the Java URI class would then throw a URISyntaxException.

Hope that this helps,
Cheolsoo

On Tue, Sep 25, 2012 at 12:43 PM, Mohnish Kodnani <[email protected]> wrote:

> Hi,
> I am trying to give multiple paths to a pig script using path globbing in
> HAR file format and it does not seem to work. I wanted to know if this is
> expected or a bug / feature request.
>
> Command :
> x = LOAD 'har:///a/b/c/{d.har,e.har}/z/ab/*' using PigStorage('\t');
>
> This gives error due to the curly braces being encoded to %7B and %7D.
> I am trying this on Pig 0.8.0
>
> ERROR 2017: Internal error creating job configuration.
>
> org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1066: Unable to open iterator for alias blah
>     at org.apache.pig.PigServer.openIterator(PigServer.java:765)
>     at org.apache.pig.tools.grunt.GruntParser.processDump(GruntParser.java:615)
>     at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:303)
>     at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:168)
>     at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:144)
>     at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:76)
>     at org.apache.pig.Main.run(Main.java:455)
>     at org.apache.pig.Main.main(Main.java:107)
> Caused by: org.apache.pig.PigException: ERROR 1002: Unable to store alias blah
>     at org.apache.pig.PigServer.storeEx(PigServer.java:889)
>     at org.apache.pig.PigServer.store(PigServer.java:827)
>     at org.apache.pig.PigServer.openIterator(PigServer.java:739)
>     ... 7 more
> Caused by: org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobCreationException: ERROR 2017: Internal error creating job configuration.
>     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler.getJob(JobControlCompiler.java:679)
>     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler.compile(JobControlCompiler.java:256)
>     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher.launchPig(MapReduceLauncher.java:147)
>     at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.execute(HExecutionEngine.java:382)
>     at org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:1209)
>     at org.apache.pig.PigServer.storeEx(PigServer.java:885)
>     ... 9 more
> Caused by: java.io.IOException: Invalid path for the Har Filesystem. har:///user/cronusapp/cassini_downsample_logs/prod/2012/09/%7B22.har,23.har%7D/00/*
>     at org.apache.hadoop.fs.HarFileSystem.initialize(HarFileSystem.java:100)
>     at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1563)
>     at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:225)
>     at org.apache.hadoop.fs.Path.getFileSystem(Path.java:183)
>     at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.setInputPaths(FileInputFormat.java:348)
>     at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.setInputPaths(FileInputFormat.java:317)
>     at org.apache.pig.builtin.PigStorage.setLocation(PigStorage.java:219)
>     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler.getJob(JobControlCompiler.java:369)
>     ... 14 more
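
For reference, the point in the reply about java.net.URI and curly braces can be checked with a small standalone sketch. It does not touch Hadoop or Pig at all; the class name HarUriDemo and its two helper methods are made up for illustration, and the path is the one from the LOAD statement in the original message. It only shows that the JDK rejects raw braces when parsing a URI string and percent-encodes them when a URI is built from components, which is consistent with the %7B and %7D seen in the error message.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical demo class, not part of Hadoop or Pig.
public class HarUriDemo {

    /** Returns true if java.net.URI parses the string as given, with no extra encoding. */
    static boolean parsesAsIs(String s) {
        try {
            new URI(s);
            return true;
        } catch (URISyntaxException e) {
            // Raw '{' and '}' are illegal in a URI path, so parsing fails here.
            return false;
        }
    }

    /** Builds a URI from scheme and path; this constructor percent-encodes illegal characters. */
    static URI fromComponents(String scheme, String path) {
        try {
            return new URI(scheme, null, path, null);
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        String glob = "/a/b/c/{d.har,e.har}/z/ab/*";

        // Single-string constructor: the raw braces are rejected.
        System.out.println("raw string parses: " + parsesAsIs("har://" + glob));

        // Component constructor: the braces are encoded in the raw path,
        // while getPath() still returns the decoded form.
        URI u = fromComponents("har", glob);
        System.out.println("raw path:     " + u.getRawPath());
        System.out.println("decoded path: " + u.getPath());
    }
}
```

If this holds, the encoding by itself is harmless, because the decoded path still contains the original braces. The failure in the trace would then come from HarFileSystem being asked to initialize itself on a path that does not name a single archive.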
