[ 
https://issues.apache.org/jira/browse/PIG-1652?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thejas M Nair resolved PIG-1652.
--------------------------------

    Resolution: Duplicate

Marking as duplicate of PIG-1649 because the code path to consolidate input 
files in FRJoin also has the same issue. 


> TestSortedTableUnion and TestSortedTableUnionMergeJoin fail on trunk due to 
> estimateNumberOfReducers bug
> --------------------------------------------------------------------------------------------------------
>
>                 Key: PIG-1652
>                 URL: https://issues.apache.org/jira/browse/PIG-1652
>             Project: Pig
>          Issue Type: Bug
>          Components: impl
>    Affects Versions: 0.8.0
>            Reporter: Daniel Dai
>            Assignee: Thejas M Nair
>             Fix For: 0.8.0
>
>
> TestSortedTableUnion and TestSortedTableUnionMergeJoin fail on trunk due to 
> the input size estimation. Here is the stack trace from 
> TestSortedTableUnionMergeJoin:
> org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1002: Unable to 
> store alias records3
>         at org.apache.pig.PigServer.storeEx(PigServer.java:877)
>         at org.apache.pig.PigServer.store(PigServer.java:815)
>         at org.apache.pig.PigServer.openIterator(PigServer.java:727)
>         at 
> org.apache.hadoop.zebra.pig.TestSortedTableUnionMergeJoin.testStorer(TestSortedTableUnionMergeJoin.java:203)
> Caused by: org.apache.pig.backend.executionengine.ExecException: ERROR 2043: 
> Unexpected error during execution.
>         at 
> org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.execute(HExecutionEngine.java:326)
>         at 
> org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:1197)
>         at org.apache.pig.PigServer.storeEx(PigServer.java:873)
> Caused by: java.lang.IllegalArgumentException: java.net.URISyntaxException: 
> Illegal character in scheme name at index 69: 
> org.apache.hadoop.zebra.pig.TestSortedTableUnionMergeJoin.testStorer1,file:
>         at org.apache.hadoop.fs.Path.initialize(Path.java:140)
>         at org.apache.hadoop.fs.Path.<init>(Path.java:126)
>         at org.apache.hadoop.fs.Path.<init>(Path.java:50)
>         at org.apache.hadoop.fs.FileSystem.globPathsLevel(FileSystem.java:963)
>         at org.apache.hadoop.fs.FileSystem.globPathsLevel(FileSystem.java:966)
>         at org.apache.hadoop.fs.FileSystem.globPathsLevel(FileSystem.java:966)
>         at org.apache.hadoop.fs.FileSystem.globPathsLevel(FileSystem.java:966)
>         at org.apache.hadoop.fs.FileSystem.globPathsLevel(FileSystem.java:966)
>         at org.apache.hadoop.fs.FileSystem.globPathsLevel(FileSystem.java:966)
>         at org.apache.hadoop.fs.FileSystem.globPathsLevel(FileSystem.java:966)
>         at org.apache.hadoop.fs.FileSystem.globPathsLevel(FileSystem.java:966)
>         at org.apache.hadoop.fs.FileSystem.globPathsLevel(FileSystem.java:966)
>         at org.apache.hadoop.fs.FileSystem.globPathsLevel(FileSystem.java:966)
>         at org.apache.hadoop.fs.FileSystem.globPathsLevel(FileSystem.java:966)
>         at org.apache.hadoop.fs.FileSystem.globPathsLevel(FileSystem.java:966)
>         at org.apache.hadoop.fs.FileSystem.globPathsLevel(FileSystem.java:966)
>         at 
> org.apache.hadoop.fs.FileSystem.globStatusInternal(FileSystem.java:902)
>         at org.apache.hadoop.fs.FileSystem.globStatus(FileSystem.java:866)
>         at org.apache.hadoop.fs.FileSystem.globStatus(FileSystem.java:844)
>         at 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler.getTotalInputFileSize(JobControlCompiler.java:715)
>         at 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler.estimateNumberOfReducers(JobControlCompiler.java:688)
>         at 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.SampleOptimizer.visitMROp(SampleOptimizer.java:140)
>         at 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceOper.visit(MapReduceOper.java:246)
>         at 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceOper.visit(MapReduceOper.java:41)
>         at 
> org.apache.pig.impl.plan.DepthFirstWalker.depthFirst(DepthFirstWalker.java:69)
>         at 
> org.apache.pig.impl.plan.DepthFirstWalker.depthFirst(DepthFirstWalker.java:71)
>         at 
> org.apache.pig.impl.plan.DepthFirstWalker.walk(DepthFirstWalker.java:52)
>         at org.apache.pig.impl.plan.PlanVisitor.visit(PlanVisitor.java:51)
>         at 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.SampleOptimizer.visit(SampleOptimizer.java:69)
>         at 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher.compile(MapReduceLauncher.java:491)
>         at 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher.launchPig(MapReduceLauncher.java:116)
>         at 
> org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.execute(HExecutionEngine.java:301)
> Caused by: java.net.URISyntaxException: Illegal character in scheme name at 
> index 69: 
> org.apache.hadoop.zebra.pig.TestSortedTableUnionMergeJoin.testStorer1,file:
>         at java.net.URI$Parser.fail(URI.java:2809)
>         at java.net.URI$Parser.checkChars(URI.java:2982)
>         at java.net.URI$Parser.parse(URI.java:3009)
>         at java.net.URI.<init>(URI.java:736)
>         at org.apache.hadoop.fs.Path.initialize(Path.java:137)
> The cause is that globStatus is called on a location string that is 
> actually a comma-separated list of URLs. Here is the string received in 
> JobControlCompiler.getTotalInputFileSize:
> file:///homes/jianyong/pig2/build/contrib/zebra/test/data/org.apache.hadoop.zebra.pig.TestSortedTableUnion.testStorer1,file:///homes/jianyong/pig2/build/contrib/zebra/test/data/org.apache.hadoop.zebra.pig.TestSortedTableUnion.testStorer2
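
To illustrate the failure mode described above, here is a minimal sketch (not the actual Pig fix) that splits a comma-separated location string into individual locations before each one is parsed as a URI. The class name InputPathSplitter, the helpers splitLocations and parsesAsUri, and the /tmp/data paths are hypothetical and used only for illustration. The sketch uses java.net.URI alone so it runs without Hadoop; a naive split on commas would also break glob patterns that use comma alternation such as {a,b}, so real code would need to handle that case.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.ArrayList;
import java.util.List;

public class InputPathSplitter {

    /** Splits a comma-separated location string into its non-empty parts. */
    static List<String> splitLocations(String commaSeparated) {
        List<String> result = new ArrayList<>();
        for (String part : commaSeparated.split(",")) {
            String trimmed = part.trim();
            if (!trimmed.isEmpty()) {
                result.add(trimmed);
            }
        }
        return result;
    }

    /** Returns true if the string can be parsed by java.net.URI. */
    static boolean parsesAsUri(String candidate) {
        try {
            new URI(candidate);
            return true;
        } catch (URISyntaxException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Hypothetical locations standing in for the two table paths.
        String location = "file:///tmp/data/table1,file:///tmp/data/table2";

        // A fragment that straddles the comma is not a valid URI:
        // the comma ends up inside what the parser treats as the scheme.
        System.out.println(parsesAsUri("table1,file:"));

        // Each location is handled on its own after splitting.
        for (String single : splitLocations(location)) {
            System.out.println(single + " parses: " + parsesAsUri(single));
        }
    }
}
```

Handling each location separately means no path component ever contains the ",file:" sequence that the URI parser rejects in the stack trace.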

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
