[jira] Updated: (PIG-1649) FRJoin fails to compute number of input files for replicated input

2010-09-30 Thread Thejas M Nair (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-1649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thejas M Nair updated PIG-1649:
---

Status: Resolved  (was: Patch Available)
Resolution: Fixed

Unit tests passed. PIG-1649.5.patch committed to trunk and the 0.8 branch.


 FRJoin fails to compute number of input files for replicated input
 --

 Key: PIG-1649
 URL: https://issues.apache.org/jira/browse/PIG-1649
 Project: Pig
  Issue Type: Bug
Affects Versions: 0.8.0
Reporter: Thejas M Nair
Assignee: Thejas M Nair
 Fix For: 0.8.0

 Attachments: PIG-1649.1.patch, PIG-1649.2.patch, PIG-1649.3.patch, 
 PIG-1649.4.patch, PIG-1649.5.patch


 In FRJoin, if the input path contains curly braces, Pig fails to compute the
 number of input files and logs the following exception:
 10/09/27 14:31:13 WARN mapReduceLayer.MRCompiler: failed to get number of input files
 java.net.URISyntaxException: Illegal character in path at index 12: /user/tejas/{std*txt}
     at java.net.URI$Parser.fail(URI.java:2809)
     at java.net.URI$Parser.checkChars(URI.java:2982)
     at java.net.URI$Parser.parseHierarchical(URI.java:3066)
     at java.net.URI$Parser.parse(URI.java:3024)
     at java.net.URI.<init>(URI.java:578)
     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler.hasTooManyInputFiles(MRCompiler.java:1283)
     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler.visitFRJoin(MRCompiler.java:1203)
     at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POFRJoin.visit(POFRJoin.java:188)
     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler.compile(MRCompiler.java:475)
     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler.compile(MRCompiler.java:454)
     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler.compile(MRCompiler.java:336)
     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher.compile(MapReduceLauncher.java:468)
     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher.launchPig(MapReduceLauncher.java:116)
     at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.execute(HExecutionEngine.java:301)
     at org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:1197)
     at org.apache.pig.PigServer.storeEx(PigServer.java:873)
     at org.apache.pig.PigServer.store(PigServer.java:815)
     at org.apache.pig.PigServer.openIterator(PigServer.java:727)
     at org.apache.pig.tools.grunt.GruntParser.processDump(GruntParser.java:612)
     at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:301)
     at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:165)
     at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:141)
     at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:76)
     at org.apache.pig.Main.run(Main.java:453)
     at org.apache.pig.Main.main(Main.java:107)
 This does not cause the query to fail. But since the number of input files
 is not calculated, the optimizations added in PIG-1458 to reduce load on the
 name node are not used.
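
 The failure can be reproduced outside Pig with java.net.URI alone. The
 following is a minimal sketch, not code from the patch; the class and method
 names (GlobUriDemo, uriErrorIndex) are made up for illustration. It shows
 that a glob path containing '{' is rejected by the URI constructor, while a
 plain path parses.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical demo class, not part of Pig.
class GlobUriDemo {
    /**
     * Returns the character index reported by URISyntaxException,
     * or -1 if the string parses as a URI.
     */
    static int uriErrorIndex(String location) {
        try {
            new URI(location);
            return -1;
        } catch (URISyntaxException e) {
            return e.getIndex();
        }
    }

    public static void main(String[] args) {
        // The glob path from the log above: '{' is not a legal URI path character.
        System.out.println(uriErrorIndex("/user/tejas/{std*txt}"));
        // A plain path without glob braces parses without error.
        System.out.println(uriErrorIndex("/user/tejas/std.txt"));
    }
}
```

 The index reported for the first path matches the position of '{' given in
 the logged exception.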

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (PIG-1649) FRJoin fails to compute number of input files for replicated input

2010-09-29 Thread Thejas M Nair (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-1649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thejas M Nair updated PIG-1649:
---

  Status: Patch Available  (was: Open)
Hadoop Flags: [Reviewed]

 FRJoin fails to compute number of input files for replicated input
 --

 Key: PIG-1649
 URL: https://issues.apache.org/jira/browse/PIG-1649
 Project: Pig
  Issue Type: Bug
Affects Versions: 0.8.0
Reporter: Thejas M Nair
Assignee: Thejas M Nair
 Fix For: 0.8.0

 Attachments: PIG-1649.1.patch, PIG-1649.2.patch, PIG-1649.3.patch, 
 PIG-1649.4.patch





[jira] Updated: (PIG-1649) FRJoin fails to compute number of input files for replicated input

2010-09-28 Thread Thejas M Nair (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-1649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thejas M Nair updated PIG-1649:
---

Attachment: PIG-1649.1.patch

Patch passes unit tests and test-patch.


 FRJoin fails to compute number of input files for replicated input
 --

 Key: PIG-1649
 URL: https://issues.apache.org/jira/browse/PIG-1649
 Project: Pig
  Issue Type: Bug
Affects Versions: 0.8.0
Reporter: Thejas M Nair
Assignee: Thejas M Nair
 Fix For: 0.8.0

 Attachments: PIG-1649.1.patch





[jira] Updated: (PIG-1649) FRJoin fails to compute number of input files for replicated input

2010-09-28 Thread Thejas M Nair (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-1649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thejas M Nair updated PIG-1649:
---

Status: Patch Available  (was: Open)

 FRJoin fails to compute number of input files for replicated input
 --

 Key: PIG-1649
 URL: https://issues.apache.org/jira/browse/PIG-1649
 Project: Pig
  Issue Type: Bug
Affects Versions: 0.8.0
Reporter: Thejas M Nair
Assignee: Thejas M Nair
 Fix For: 0.8.0

 Attachments: PIG-1649.1.patch





[jira] Updated: (PIG-1649) FRJoin fails to compute number of input files for replicated input

2010-09-28 Thread Thejas M Nair (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-1649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thejas M Nair updated PIG-1649:
---

Attachment: PIG-1649.2.patch

PIG-1649.2.patch
Addressing review comments from Richard:
- Richard pointed out that the hdfs Path class constructor can fail on a valid
URI, such as the format used for JDBC. So this patch checks whether the input
location URI has an hdfs scheme before using the hdfs Path constructor.
- The code here can run into the same problem as the one in PIG-1652. The patch
also includes changes to handle comma-separated file names.
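
The two checks above might look roughly like the sketch below. This is an
illustration under stated assumptions, not the code in the patch: the class
and method names (InputLocationSketch, splitLocations, looksLikeHdfs) are
hypothetical, and treating a location with no scheme as an HDFS path is an
assumption. Commas inside a {...} glob group are kept, since they separate
glob alternatives rather than file names.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Pig or Hadoop.
class InputLocationSketch {
    /** Splits on commas that are not inside a {...} glob group. */
    static List<String> splitLocations(String locations) {
        List<String> parts = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        int depth = 0; // nesting level of curly braces
        for (char c : locations.toCharArray()) {
            if (c == '{') {
                depth++;
            } else if (c == '}' && depth > 0) {
                depth--;
            }
            if (c == ',' && depth == 0) {
                parts.add(current.toString());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        parts.add(current.toString());
        return parts;
    }

    /**
     * Assumption: a location counts as HDFS if it has no scheme at all
     * or starts with "hdfs:". Null is treated as not HDFS.
     */
    static boolean looksLikeHdfs(String location) {
        if (location == null) {
            return false;
        }
        int colon = location.indexOf(':');
        int slash = location.indexOf('/');
        // A scheme is a non-empty prefix before ':' that comes ahead of any '/'.
        boolean hasScheme = colon > 0 && (slash < 0 || colon < slash);
        return !hasScheme || location.regionMatches(true, 0, "hdfs:", 0, 5);
    }
}
```

Only locations for which looksLikeHdfs returns true would then be handed to
the Path constructor; anything else, such as a JDBC-style location, would be
skipped when counting input files.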

A better long-term solution would be to add support in LoadFunc or related
interfaces for checking the input size and for checking whether parts of the
file should be consolidated.



 FRJoin fails to compute number of input files for replicated input
 --

 Key: PIG-1649
 URL: https://issues.apache.org/jira/browse/PIG-1649
 Project: Pig
  Issue Type: Bug
Affects Versions: 0.8.0
Reporter: Thejas M Nair
Assignee: Thejas M Nair
 Fix For: 0.8.0

 Attachments: PIG-1649.1.patch, PIG-1649.2.patch





[jira] Updated: (PIG-1649) FRJoin fails to compute number of input files for replicated input

2010-09-28 Thread Thejas M Nair (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-1649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thejas M Nair updated PIG-1649:
---

Status: Open  (was: Patch Available)

The patch also includes changes to fix the issue in PIG-1652, since the FRJoin
code path faces a similar issue.



 FRJoin fails to compute number of input files for replicated input
 --

 Key: PIG-1649
 URL: https://issues.apache.org/jira/browse/PIG-1649
 Project: Pig
  Issue Type: Bug
Affects Versions: 0.8.0
Reporter: Thejas M Nair
Assignee: Thejas M Nair
 Fix For: 0.8.0

 Attachments: PIG-1649.1.patch, PIG-1649.2.patch





[jira] Updated: (PIG-1649) FRJoin fails to compute number of input files for replicated input

2010-09-28 Thread Thejas M Nair (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-1649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thejas M Nair updated PIG-1649:
---

Attachment: PIG-1649.4.patch

New patch addressing comments from Richard:
- UriUtil.isHDFSFile(String uri) now returns false if uri is null.
- Modified a test in TestFRJoin2 to use a comma-separated file name.

 FRJoin fails to compute number of input files for replicated input
 --

 Key: PIG-1649
 URL: https://issues.apache.org/jira/browse/PIG-1649
 Project: Pig
  Issue Type: Bug
Affects Versions: 0.8.0
Reporter: Thejas M Nair
Assignee: Thejas M Nair
 Fix For: 0.8.0

 Attachments: PIG-1649.1.patch, PIG-1649.2.patch, PIG-1649.3.patch, 
 PIG-1649.4.patch

