[jira] Updated: (PIG-1391) pig unit tests leave behind files in temp directory because MiniCluster files don't get deleted

2010-04-29 Thread Thejas M Nair (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-1391?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thejas M Nair updated PIG-1391:
---

Attachment: PIG-1391.06.2.patch

New patch for 0.6 branch. Made a change to build.xml - A new dir will be 
created within then temp dir and this dir will be used as the temp dir for 
tests. After the tests are done, this dir will be deleted, so that no temp 
files are left behind.


 pig unit tests leave behind files in temp directory because MiniCluster files 
 don't get deleted
 ---

 Key: PIG-1391
 URL: https://issues.apache.org/jira/browse/PIG-1391
 Project: Pig
  Issue Type: Bug
Affects Versions: 0.7.0
Reporter: Thejas M Nair
Assignee: Thejas M Nair
 Fix For: 0.7.0

 Attachments: minicluster.patch, PIG-1391.06.2.patch, PIG-1391.06.patch


 Pig unit test runs leave behind files in temp dir (/tmp) and there are too 
 many files in the directory over time.
 Most of the files are left behind by MiniCluster . It closes/shutsdown 
 MiniDFSCluster, MiniMRCluster and the FileSystem that it has created when the 
 constructor is called, only in finalize(). And java does not guarantee that 
 finalize() will be called. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (PIG-1378) har url not usable in Pig scripts

2010-04-29 Thread Pradeep Kamath (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-1378?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pradeep Kamath updated PIG-1378:


Status: Patch Available  (was: Open)

 har url not usable in Pig scripts
 -

 Key: PIG-1378
 URL: https://issues.apache.org/jira/browse/PIG-1378
 Project: Pig
  Issue Type: Bug
  Components: impl
Affects Versions: 0.7.0
Reporter: Viraj Bhat
Assignee: Pradeep Kamath
 Fix For: 0.8.0

 Attachments: PIG-1378-2.patch, PIG-1378-3.patch, PIG-1378.patch


 I am trying to use har (Hadoop Archives) in my Pig script.
 I can use them through the HDFS shell
 {noformat}
 $hadoop fs -ls 'har:///user/viraj/project/subproject/files/size/data'
 Found 1 items
 -rw---   5 viraj users1537234 2010-04-14 09:49 
 user/viraj/project/subproject/files/size/data/part-1
 {noformat}
 Using similar URL's in grunt yields
 {noformat}
 grunt a = load 'har:///user/viraj/project/subproject/files/size/data'; 
 grunt dump a;
 {noformat}
 {noformat}
 2010-04-14 22:08:48,814 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 
 2998: Unhandled internal error. 
 org.apache.pig.impl.logicalLayer.FrontendException: ERROR 0: Incompatible 
 file URI scheme: har : hdfs
 2010-04-14 22:08:48,814 [main] WARN  org.apache.pig.tools.grunt.Grunt - There 
 is no log file to write to.
 2010-04-14 22:08:48,814 [main] ERROR org.apache.pig.tools.grunt.Grunt - 
 java.lang.Error: org.apache.pig.impl.logicalLayer.FrontendException: ERROR 0: 
 Incompatible file URI scheme: har : hdfs
 at 
 org.apache.pig.impl.logicalLayer.parser.QueryParser.LoadClause(QueryParser.java:1483)
 at 
 org.apache.pig.impl.logicalLayer.parser.QueryParser.BaseExpr(QueryParser.java:1245)
 at 
 org.apache.pig.impl.logicalLayer.parser.QueryParser.Expr(QueryParser.java:911)
 at 
 org.apache.pig.impl.logicalLayer.parser.QueryParser.Parse(QueryParser.java:700)
 at 
 org.apache.pig.impl.logicalLayer.LogicalPlanBuilder.parse(LogicalPlanBuilder.java:63)
 at org.apache.pig.PigServer$Graph.parseQuery(PigServer.java:1164)
 at org.apache.pig.PigServer$Graph.registerQuery(PigServer.java:1114)
 at org.apache.pig.PigServer.registerQuery(PigServer.java:425)
 at 
 org.apache.pig.tools.grunt.GruntParser.processPig(GruntParser.java:737)
 at 
 org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:324)
 at 
 org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:162)
 at 
 org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:138)
 at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:75)
 at org.apache.pig.Main.main(Main.java:357)
 Caused by: org.apache.pig.impl.logicalLayer.FrontendException: ERROR 0: 
 Incompatible file URI scheme: har : hdfs
 at org.apache.pig.LoadFunc.getAbsolutePath(LoadFunc.java:249)
 at org.apache.pig.LoadFunc.relativeToAbsolutePath(LoadFunc.java:62)
 at 
 org.apache.pig.impl.logicalLayer.parser.QueryParser.LoadClause(QueryParser.java:1472)
 ... 13 more
 {noformat}
 According to Jira http://issues.apache.org/jira/browse/PIG-1234 I try the 
 following as stated in the original description
 {noformat}
 grunt a = load 
 'har://namenode-location/user/viraj/project/subproject/files/size/data'; 
 grunt dump a;
 {noformat}
 {noformat}
 Caused by: org.apache.pig.backend.executionengine.ExecException: ERROR 2118: 
 Unable to create input splits for: 
 har://namenode-location/user/viraj/project/subproject/files/size/data'; 
 ... 8 more
 Caused by: java.io.IOException: No FileSystem for scheme: namenode-location
 at .apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1375)
 at .apache.hadoop.fs.FileSystem.access(200(FileSystem.java:66)
 at .apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1390)
 at .apache.hadoop.fs.FileSystem.get(FileSystem.java:196)
 at .apache.hadoop.fs.HarFileSystem.initialize(HarFileSystem.java:104)
 at .apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1378)
 at .apache.hadoop.fs.FileSystem.get(FileSystem.java:193)
 at .apache.hadoop.fs.Path.getFileSystem(Path.java:175)
 at 
 .apache.hadoop.mapreduce.lib.input.FileInputFormat.listStatus(FileInputFormat.java:208)
 at 
 .apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigTextInputFormat.listStatus(PigTextInputFormat.java:36)
 at 
 .apache.hadoop.mapreduce.lib.input.FileInputFormat.getSplits(FileInputFormat.java:246)
 at 
 .apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getSplits(PigInputFormat.java:245)
 {noformat}
 Viraj

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment 

[jira] Updated: (PIG-1378) har url not usable in Pig scripts

2010-04-29 Thread Pradeep Kamath (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-1378?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pradeep Kamath updated PIG-1378:


Attachment: PIG-1378-3.patch

Looks like the golden file change from the last patch was not correct - updated 
patch with just that change attached - all unit tests ran successfully locally 
with this new patch - patch is ready for review

 har url not usable in Pig scripts
 -

 Key: PIG-1378
 URL: https://issues.apache.org/jira/browse/PIG-1378
 Project: Pig
  Issue Type: Bug
  Components: impl
Affects Versions: 0.7.0
Reporter: Viraj Bhat
Assignee: Pradeep Kamath
 Fix For: 0.8.0

 Attachments: PIG-1378-2.patch, PIG-1378-3.patch, PIG-1378.patch


 I am trying to use har (Hadoop Archives) in my Pig script.
 I can use them through the HDFS shell
 {noformat}
 $hadoop fs -ls 'har:///user/viraj/project/subproject/files/size/data'
 Found 1 items
 -rw---   5 viraj users1537234 2010-04-14 09:49 
 user/viraj/project/subproject/files/size/data/part-1
 {noformat}
 Using similar URL's in grunt yields
 {noformat}
 grunt a = load 'har:///user/viraj/project/subproject/files/size/data'; 
 grunt dump a;
 {noformat}
 {noformat}
 2010-04-14 22:08:48,814 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 
 2998: Unhandled internal error. 
 org.apache.pig.impl.logicalLayer.FrontendException: ERROR 0: Incompatible 
 file URI scheme: har : hdfs
 2010-04-14 22:08:48,814 [main] WARN  org.apache.pig.tools.grunt.Grunt - There 
 is no log file to write to.
 2010-04-14 22:08:48,814 [main] ERROR org.apache.pig.tools.grunt.Grunt - 
 java.lang.Error: org.apache.pig.impl.logicalLayer.FrontendException: ERROR 0: 
 Incompatible file URI scheme: har : hdfs
 at 
 org.apache.pig.impl.logicalLayer.parser.QueryParser.LoadClause(QueryParser.java:1483)
 at 
 org.apache.pig.impl.logicalLayer.parser.QueryParser.BaseExpr(QueryParser.java:1245)
 at 
 org.apache.pig.impl.logicalLayer.parser.QueryParser.Expr(QueryParser.java:911)
 at 
 org.apache.pig.impl.logicalLayer.parser.QueryParser.Parse(QueryParser.java:700)
 at 
 org.apache.pig.impl.logicalLayer.LogicalPlanBuilder.parse(LogicalPlanBuilder.java:63)
 at org.apache.pig.PigServer$Graph.parseQuery(PigServer.java:1164)
 at org.apache.pig.PigServer$Graph.registerQuery(PigServer.java:1114)
 at org.apache.pig.PigServer.registerQuery(PigServer.java:425)
 at 
 org.apache.pig.tools.grunt.GruntParser.processPig(GruntParser.java:737)
 at 
 org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:324)
 at 
 org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:162)
 at 
 org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:138)
 at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:75)
 at org.apache.pig.Main.main(Main.java:357)
 Caused by: org.apache.pig.impl.logicalLayer.FrontendException: ERROR 0: 
 Incompatible file URI scheme: har : hdfs
 at org.apache.pig.LoadFunc.getAbsolutePath(LoadFunc.java:249)
 at org.apache.pig.LoadFunc.relativeToAbsolutePath(LoadFunc.java:62)
 at 
 org.apache.pig.impl.logicalLayer.parser.QueryParser.LoadClause(QueryParser.java:1472)
 ... 13 more
 {noformat}
 According to Jira http://issues.apache.org/jira/browse/PIG-1234 I try the 
 following as stated in the original description
 {noformat}
 grunt a = load 
 'har://namenode-location/user/viraj/project/subproject/files/size/data'; 
 grunt dump a;
 {noformat}
 {noformat}
 Caused by: org.apache.pig.backend.executionengine.ExecException: ERROR 2118: 
 Unable to create input splits for: 
 har://namenode-location/user/viraj/project/subproject/files/size/data'; 
 ... 8 more
 Caused by: java.io.IOException: No FileSystem for scheme: namenode-location
 at .apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1375)
 at .apache.hadoop.fs.FileSystem.access(200(FileSystem.java:66)
 at .apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1390)
 at .apache.hadoop.fs.FileSystem.get(FileSystem.java:196)
 at .apache.hadoop.fs.HarFileSystem.initialize(HarFileSystem.java:104)
 at .apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1378)
 at .apache.hadoop.fs.FileSystem.get(FileSystem.java:193)
 at .apache.hadoop.fs.Path.getFileSystem(Path.java:175)
 at 
 .apache.hadoop.mapreduce.lib.input.FileInputFormat.listStatus(FileInputFormat.java:208)
 at 
 .apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigTextInputFormat.listStatus(PigTextInputFormat.java:36)
 at 
 .apache.hadoop.mapreduce.lib.input.FileInputFormat.getSplits(FileInputFormat.java:246)
 at 
 

[jira] Updated: (PIG-1378) har url not usable in Pig scripts

2010-04-29 Thread Pradeep Kamath (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-1378?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pradeep Kamath updated PIG-1378:


Status: Open  (was: Patch Available)

 har url not usable in Pig scripts
 -

 Key: PIG-1378
 URL: https://issues.apache.org/jira/browse/PIG-1378
 Project: Pig
  Issue Type: Bug
  Components: impl
Affects Versions: 0.7.0
Reporter: Viraj Bhat
Assignee: Pradeep Kamath
 Fix For: 0.8.0

 Attachments: PIG-1378-2.patch, PIG-1378-3.patch, PIG-1378.patch


 I am trying to use har (Hadoop Archives) in my Pig script.
 I can use them through the HDFS shell
 {noformat}
 $hadoop fs -ls 'har:///user/viraj/project/subproject/files/size/data'
 Found 1 items
 -rw---   5 viraj users1537234 2010-04-14 09:49 
 user/viraj/project/subproject/files/size/data/part-1
 {noformat}
 Using similar URL's in grunt yields
 {noformat}
 grunt a = load 'har:///user/viraj/project/subproject/files/size/data'; 
 grunt dump a;
 {noformat}
 {noformat}
 2010-04-14 22:08:48,814 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 
 2998: Unhandled internal error. 
 org.apache.pig.impl.logicalLayer.FrontendException: ERROR 0: Incompatible 
 file URI scheme: har : hdfs
 2010-04-14 22:08:48,814 [main] WARN  org.apache.pig.tools.grunt.Grunt - There 
 is no log file to write to.
 2010-04-14 22:08:48,814 [main] ERROR org.apache.pig.tools.grunt.Grunt - 
 java.lang.Error: org.apache.pig.impl.logicalLayer.FrontendException: ERROR 0: 
 Incompatible file URI scheme: har : hdfs
 at 
 org.apache.pig.impl.logicalLayer.parser.QueryParser.LoadClause(QueryParser.java:1483)
 at 
 org.apache.pig.impl.logicalLayer.parser.QueryParser.BaseExpr(QueryParser.java:1245)
 at 
 org.apache.pig.impl.logicalLayer.parser.QueryParser.Expr(QueryParser.java:911)
 at 
 org.apache.pig.impl.logicalLayer.parser.QueryParser.Parse(QueryParser.java:700)
 at 
 org.apache.pig.impl.logicalLayer.LogicalPlanBuilder.parse(LogicalPlanBuilder.java:63)
 at org.apache.pig.PigServer$Graph.parseQuery(PigServer.java:1164)
 at org.apache.pig.PigServer$Graph.registerQuery(PigServer.java:1114)
 at org.apache.pig.PigServer.registerQuery(PigServer.java:425)
 at 
 org.apache.pig.tools.grunt.GruntParser.processPig(GruntParser.java:737)
 at 
 org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:324)
 at 
 org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:162)
 at 
 org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:138)
 at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:75)
 at org.apache.pig.Main.main(Main.java:357)
 Caused by: org.apache.pig.impl.logicalLayer.FrontendException: ERROR 0: 
 Incompatible file URI scheme: har : hdfs
 at org.apache.pig.LoadFunc.getAbsolutePath(LoadFunc.java:249)
 at org.apache.pig.LoadFunc.relativeToAbsolutePath(LoadFunc.java:62)
 at 
 org.apache.pig.impl.logicalLayer.parser.QueryParser.LoadClause(QueryParser.java:1472)
 ... 13 more
 {noformat}
 According to Jira http://issues.apache.org/jira/browse/PIG-1234 I try the 
 following as stated in the original description
 {noformat}
 grunt a = load 
 'har://namenode-location/user/viraj/project/subproject/files/size/data'; 
 grunt dump a;
 {noformat}
 {noformat}
 Caused by: org.apache.pig.backend.executionengine.ExecException: ERROR 2118: 
 Unable to create input splits for: 
 har://namenode-location/user/viraj/project/subproject/files/size/data'; 
 ... 8 more
 Caused by: java.io.IOException: No FileSystem for scheme: namenode-location
 at .apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1375)
 at .apache.hadoop.fs.FileSystem.access(200(FileSystem.java:66)
 at .apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1390)
 at .apache.hadoop.fs.FileSystem.get(FileSystem.java:196)
 at .apache.hadoop.fs.HarFileSystem.initialize(HarFileSystem.java:104)
 at .apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1378)
 at .apache.hadoop.fs.FileSystem.get(FileSystem.java:193)
 at .apache.hadoop.fs.Path.getFileSystem(Path.java:175)
 at 
 .apache.hadoop.mapreduce.lib.input.FileInputFormat.listStatus(FileInputFormat.java:208)
 at 
 .apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigTextInputFormat.listStatus(PigTextInputFormat.java:36)
 at 
 .apache.hadoop.mapreduce.lib.input.FileInputFormat.getSplits(FileInputFormat.java:246)
 at 
 .apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getSplits(PigInputFormat.java:245)
 {noformat}
 Viraj

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment 

[jira] Updated: (PIG-1396) eclipse-files target in build.xml fails to generate necessary classes in src-gen

2010-04-29 Thread Thejas M Nair (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-1396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thejas M Nair updated PIG-1396:
---

Status: Patch Available  (was: Open)

 eclipse-files target in build.xml fails to generate necessary classes in 
 src-gen
 

 Key: PIG-1396
 URL: https://issues.apache.org/jira/browse/PIG-1396
 Project: Pig
  Issue Type: Bug
  Components: build
Affects Versions: 0.8.0
Reporter: V.V.Chaitanya Krishna
Assignee: V.V.Chaitanya Krishna
 Fix For: 0.8.0

 Attachments: PIG-1396.patch


 With PIG-1390 checked in, ant eclipse-files does only the ivy-download part 
 but does not compile and generate the necessary classes in src-gen.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (PIG-1378) har url not usable in Pig scripts

2010-04-29 Thread Pradeep Kamath (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-1378?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pradeep Kamath updated PIG-1378:


Status: Open  (was: Patch Available)

 har url not usable in Pig scripts
 -

 Key: PIG-1378
 URL: https://issues.apache.org/jira/browse/PIG-1378
 Project: Pig
  Issue Type: Bug
  Components: impl
Affects Versions: 0.7.0
Reporter: Viraj Bhat
Assignee: Pradeep Kamath
 Fix For: 0.8.0

 Attachments: PIG-1378-2.patch, PIG-1378-3.patch, PIG-1378-4.patch, 
 PIG-1378.patch


 I am trying to use har (Hadoop Archives) in my Pig script.
 I can use them through the HDFS shell
 {noformat}
 $hadoop fs -ls 'har:///user/viraj/project/subproject/files/size/data'
 Found 1 items
 -rw---   5 viraj users1537234 2010-04-14 09:49 
 user/viraj/project/subproject/files/size/data/part-1
 {noformat}
 Using similar URL's in grunt yields
 {noformat}
 grunt a = load 'har:///user/viraj/project/subproject/files/size/data'; 
 grunt dump a;
 {noformat}
 {noformat}
 2010-04-14 22:08:48,814 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 
 2998: Unhandled internal error. 
 org.apache.pig.impl.logicalLayer.FrontendException: ERROR 0: Incompatible 
 file URI scheme: har : hdfs
 2010-04-14 22:08:48,814 [main] WARN  org.apache.pig.tools.grunt.Grunt - There 
 is no log file to write to.
 2010-04-14 22:08:48,814 [main] ERROR org.apache.pig.tools.grunt.Grunt - 
 java.lang.Error: org.apache.pig.impl.logicalLayer.FrontendException: ERROR 0: 
 Incompatible file URI scheme: har : hdfs
 at 
 org.apache.pig.impl.logicalLayer.parser.QueryParser.LoadClause(QueryParser.java:1483)
 at 
 org.apache.pig.impl.logicalLayer.parser.QueryParser.BaseExpr(QueryParser.java:1245)
 at 
 org.apache.pig.impl.logicalLayer.parser.QueryParser.Expr(QueryParser.java:911)
 at 
 org.apache.pig.impl.logicalLayer.parser.QueryParser.Parse(QueryParser.java:700)
 at 
 org.apache.pig.impl.logicalLayer.LogicalPlanBuilder.parse(LogicalPlanBuilder.java:63)
 at org.apache.pig.PigServer$Graph.parseQuery(PigServer.java:1164)
 at org.apache.pig.PigServer$Graph.registerQuery(PigServer.java:1114)
 at org.apache.pig.PigServer.registerQuery(PigServer.java:425)
 at 
 org.apache.pig.tools.grunt.GruntParser.processPig(GruntParser.java:737)
 at 
 org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:324)
 at 
 org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:162)
 at 
 org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:138)
 at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:75)
 at org.apache.pig.Main.main(Main.java:357)
 Caused by: org.apache.pig.impl.logicalLayer.FrontendException: ERROR 0: 
 Incompatible file URI scheme: har : hdfs
 at org.apache.pig.LoadFunc.getAbsolutePath(LoadFunc.java:249)
 at org.apache.pig.LoadFunc.relativeToAbsolutePath(LoadFunc.java:62)
 at 
 org.apache.pig.impl.logicalLayer.parser.QueryParser.LoadClause(QueryParser.java:1472)
 ... 13 more
 {noformat}
 According to Jira http://issues.apache.org/jira/browse/PIG-1234 I try the 
 following as stated in the original description
 {noformat}
 grunt a = load 
 'har://namenode-location/user/viraj/project/subproject/files/size/data'; 
 grunt dump a;
 {noformat}
 {noformat}
 Caused by: org.apache.pig.backend.executionengine.ExecException: ERROR 2118: 
 Unable to create input splits for: 
 har://namenode-location/user/viraj/project/subproject/files/size/data'; 
 ... 8 more
 Caused by: java.io.IOException: No FileSystem for scheme: namenode-location
 at .apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1375)
 at .apache.hadoop.fs.FileSystem.access(200(FileSystem.java:66)
 at .apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1390)
 at .apache.hadoop.fs.FileSystem.get(FileSystem.java:196)
 at .apache.hadoop.fs.HarFileSystem.initialize(HarFileSystem.java:104)
 at .apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1378)
 at .apache.hadoop.fs.FileSystem.get(FileSystem.java:193)
 at .apache.hadoop.fs.Path.getFileSystem(Path.java:175)
 at 
 .apache.hadoop.mapreduce.lib.input.FileInputFormat.listStatus(FileInputFormat.java:208)
 at 
 .apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigTextInputFormat.listStatus(PigTextInputFormat.java:36)
 at 
 .apache.hadoop.mapreduce.lib.input.FileInputFormat.getSplits(FileInputFormat.java:246)
 at 
 .apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getSplits(PigInputFormat.java:245)
 {noformat}
 Viraj

-- 
This message is automatically generated by JIRA.
-
You can reply to this 

[jira] Updated: (PIG-1378) har url not usable in Pig scripts

2010-04-29 Thread Pradeep Kamath (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-1378?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pradeep Kamath updated PIG-1378:


Attachment: PIG-1378-4.patch

Realized that a stray change in TestMRCompiler got into my previous patch - 
attaching new patch with just that change removed.

 har url not usable in Pig scripts
 -

 Key: PIG-1378
 URL: https://issues.apache.org/jira/browse/PIG-1378
 Project: Pig
  Issue Type: Bug
  Components: impl
Affects Versions: 0.7.0
Reporter: Viraj Bhat
Assignee: Pradeep Kamath
 Fix For: 0.8.0

 Attachments: PIG-1378-2.patch, PIG-1378-3.patch, PIG-1378-4.patch, 
 PIG-1378.patch


 I am trying to use har (Hadoop Archives) in my Pig script.
 I can use them through the HDFS shell
 {noformat}
 $hadoop fs -ls 'har:///user/viraj/project/subproject/files/size/data'
 Found 1 items
 -rw---   5 viraj users1537234 2010-04-14 09:49 
 user/viraj/project/subproject/files/size/data/part-1
 {noformat}
 Using similar URL's in grunt yields
 {noformat}
 grunt a = load 'har:///user/viraj/project/subproject/files/size/data'; 
 grunt dump a;
 {noformat}
 {noformat}
 2010-04-14 22:08:48,814 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 
 2998: Unhandled internal error. 
 org.apache.pig.impl.logicalLayer.FrontendException: ERROR 0: Incompatible 
 file URI scheme: har : hdfs
 2010-04-14 22:08:48,814 [main] WARN  org.apache.pig.tools.grunt.Grunt - There 
 is no log file to write to.
 2010-04-14 22:08:48,814 [main] ERROR org.apache.pig.tools.grunt.Grunt - 
 java.lang.Error: org.apache.pig.impl.logicalLayer.FrontendException: ERROR 0: 
 Incompatible file URI scheme: har : hdfs
 at 
 org.apache.pig.impl.logicalLayer.parser.QueryParser.LoadClause(QueryParser.java:1483)
 at 
 org.apache.pig.impl.logicalLayer.parser.QueryParser.BaseExpr(QueryParser.java:1245)
 at 
 org.apache.pig.impl.logicalLayer.parser.QueryParser.Expr(QueryParser.java:911)
 at 
 org.apache.pig.impl.logicalLayer.parser.QueryParser.Parse(QueryParser.java:700)
 at 
 org.apache.pig.impl.logicalLayer.LogicalPlanBuilder.parse(LogicalPlanBuilder.java:63)
 at org.apache.pig.PigServer$Graph.parseQuery(PigServer.java:1164)
 at org.apache.pig.PigServer$Graph.registerQuery(PigServer.java:1114)
 at org.apache.pig.PigServer.registerQuery(PigServer.java:425)
 at 
 org.apache.pig.tools.grunt.GruntParser.processPig(GruntParser.java:737)
 at 
 org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:324)
 at 
 org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:162)
 at 
 org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:138)
 at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:75)
 at org.apache.pig.Main.main(Main.java:357)
 Caused by: org.apache.pig.impl.logicalLayer.FrontendException: ERROR 0: 
 Incompatible file URI scheme: har : hdfs
 at org.apache.pig.LoadFunc.getAbsolutePath(LoadFunc.java:249)
 at org.apache.pig.LoadFunc.relativeToAbsolutePath(LoadFunc.java:62)
 at 
 org.apache.pig.impl.logicalLayer.parser.QueryParser.LoadClause(QueryParser.java:1472)
 ... 13 more
 {noformat}
 According to Jira http://issues.apache.org/jira/browse/PIG-1234 I try the 
 following as stated in the original description
 {noformat}
 grunt a = load 
 'har://namenode-location/user/viraj/project/subproject/files/size/data'; 
 grunt dump a;
 {noformat}
 {noformat}
 Caused by: org.apache.pig.backend.executionengine.ExecException: ERROR 2118: 
 Unable to create input splits for: 
 har://namenode-location/user/viraj/project/subproject/files/size/data'; 
 ... 8 more
 Caused by: java.io.IOException: No FileSystem for scheme: namenode-location
 at .apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1375)
 at .apache.hadoop.fs.FileSystem.access(200(FileSystem.java:66)
 at .apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1390)
 at .apache.hadoop.fs.FileSystem.get(FileSystem.java:196)
 at .apache.hadoop.fs.HarFileSystem.initialize(HarFileSystem.java:104)
 at .apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1378)
 at .apache.hadoop.fs.FileSystem.get(FileSystem.java:193)
 at .apache.hadoop.fs.Path.getFileSystem(Path.java:175)
 at 
 .apache.hadoop.mapreduce.lib.input.FileInputFormat.listStatus(FileInputFormat.java:208)
 at 
 .apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigTextInputFormat.listStatus(PigTextInputFormat.java:36)
 at 
 .apache.hadoop.mapreduce.lib.input.FileInputFormat.getSplits(FileInputFormat.java:246)
 at 
 

[jira] Commented: (PIG-1378) har url not usable in Pig scripts

2010-04-29 Thread Richard Ding (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-1378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12862405#action_12862405
 ] 

Richard Ding commented on PIG-1378:
---

+1

 har url not usable in Pig scripts
 -

 Key: PIG-1378
 URL: https://issues.apache.org/jira/browse/PIG-1378
 Project: Pig
  Issue Type: Bug
  Components: impl
Affects Versions: 0.7.0
Reporter: Viraj Bhat
Assignee: Pradeep Kamath
 Fix For: 0.8.0

 Attachments: PIG-1378-2.patch, PIG-1378-3.patch, PIG-1378-4.patch, 
 PIG-1378.patch


 I am trying to use har (Hadoop Archives) in my Pig script.
 I can use them through the HDFS shell
 {noformat}
 $hadoop fs -ls 'har:///user/viraj/project/subproject/files/size/data'
 Found 1 items
 -rw---   5 viraj users1537234 2010-04-14 09:49 
 user/viraj/project/subproject/files/size/data/part-1
 {noformat}
 Using similar URL's in grunt yields
 {noformat}
 grunt a = load 'har:///user/viraj/project/subproject/files/size/data'; 
 grunt dump a;
 {noformat}
 {noformat}
 2010-04-14 22:08:48,814 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 
 2998: Unhandled internal error. 
 org.apache.pig.impl.logicalLayer.FrontendException: ERROR 0: Incompatible 
 file URI scheme: har : hdfs
 2010-04-14 22:08:48,814 [main] WARN  org.apache.pig.tools.grunt.Grunt - There 
 is no log file to write to.
 2010-04-14 22:08:48,814 [main] ERROR org.apache.pig.tools.grunt.Grunt - 
 java.lang.Error: org.apache.pig.impl.logicalLayer.FrontendException: ERROR 0: 
 Incompatible file URI scheme: har : hdfs
 at 
 org.apache.pig.impl.logicalLayer.parser.QueryParser.LoadClause(QueryParser.java:1483)
 at 
 org.apache.pig.impl.logicalLayer.parser.QueryParser.BaseExpr(QueryParser.java:1245)
 at 
 org.apache.pig.impl.logicalLayer.parser.QueryParser.Expr(QueryParser.java:911)
 at 
 org.apache.pig.impl.logicalLayer.parser.QueryParser.Parse(QueryParser.java:700)
 at 
 org.apache.pig.impl.logicalLayer.LogicalPlanBuilder.parse(LogicalPlanBuilder.java:63)
 at org.apache.pig.PigServer$Graph.parseQuery(PigServer.java:1164)
 at org.apache.pig.PigServer$Graph.registerQuery(PigServer.java:1114)
 at org.apache.pig.PigServer.registerQuery(PigServer.java:425)
 at 
 org.apache.pig.tools.grunt.GruntParser.processPig(GruntParser.java:737)
 at 
 org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:324)
 at 
 org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:162)
 at 
 org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:138)
 at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:75)
 at org.apache.pig.Main.main(Main.java:357)
 Caused by: org.apache.pig.impl.logicalLayer.FrontendException: ERROR 0: 
 Incompatible file URI scheme: har : hdfs
 at org.apache.pig.LoadFunc.getAbsolutePath(LoadFunc.java:249)
 at org.apache.pig.LoadFunc.relativeToAbsolutePath(LoadFunc.java:62)
 at 
 org.apache.pig.impl.logicalLayer.parser.QueryParser.LoadClause(QueryParser.java:1472)
 ... 13 more
 {noformat}
 According to Jira http://issues.apache.org/jira/browse/PIG-1234 I try the 
 following as stated in the original description
 {noformat}
 grunt a = load 
 'har://namenode-location/user/viraj/project/subproject/files/size/data'; 
 grunt dump a;
 {noformat}
 {noformat}
 Caused by: org.apache.pig.backend.executionengine.ExecException: ERROR 2118: 
 Unable to create input splits for: 
 har://namenode-location/user/viraj/project/subproject/files/size/data'; 
 ... 8 more
 Caused by: java.io.IOException: No FileSystem for scheme: namenode-location
 at .apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1375)
 at .apache.hadoop.fs.FileSystem.access(200(FileSystem.java:66)
 at .apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1390)
 at .apache.hadoop.fs.FileSystem.get(FileSystem.java:196)
 at .apache.hadoop.fs.HarFileSystem.initialize(HarFileSystem.java:104)
 at .apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1378)
 at .apache.hadoop.fs.FileSystem.get(FileSystem.java:193)
 at .apache.hadoop.fs.Path.getFileSystem(Path.java:175)
 at 
 .apache.hadoop.mapreduce.lib.input.FileInputFormat.listStatus(FileInputFormat.java:208)
 at 
 .apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigTextInputFormat.listStatus(PigTextInputFormat.java:36)
 at 
 .apache.hadoop.mapreduce.lib.input.FileInputFormat.getSplits(FileInputFormat.java:246)
 at 
 .apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getSplits(PigInputFormat.java:245)
 {noformat}
 Viraj

-- 
This message is automatically generated by JIRA.
-
You can 

[jira] Commented: (PIG-1378) har url not usable in Pig scripts

2010-04-29 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-1378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12862417#action_12862417
 ] 

Hadoop QA commented on PIG-1378:


-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12443202/PIG-1378-3.patch
  against trunk revision 938733.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 64 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed core unit tests.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/308/testReport/
Findbugs warnings: 
http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/308/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: 
http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/308/console

This message is automatically generated.

 har url not usable in Pig scripts
 -

 Key: PIG-1378
 URL: https://issues.apache.org/jira/browse/PIG-1378
 Project: Pig
  Issue Type: Bug
  Components: impl
Affects Versions: 0.7.0
Reporter: Viraj Bhat
Assignee: Pradeep Kamath
 Fix For: 0.8.0

 Attachments: PIG-1378-2.patch, PIG-1378-3.patch, PIG-1378-4.patch, 
 PIG-1378.patch


 I am trying to use har (Hadoop Archives) in my Pig script.
 I can use them through the HDFS shell
 {noformat}
 $hadoop fs -ls 'har:///user/viraj/project/subproject/files/size/data'
 Found 1 items
 -rw---   5 viraj users1537234 2010-04-14 09:49 
 user/viraj/project/subproject/files/size/data/part-1
 {noformat}
 Using similar URL's in grunt yields
 {noformat}
 grunt a = load 'har:///user/viraj/project/subproject/files/size/data'; 
 grunt dump a;
 {noformat}
 {noformat}
 2010-04-14 22:08:48,814 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 
 2998: Unhandled internal error. 
 org.apache.pig.impl.logicalLayer.FrontendException: ERROR 0: Incompatible 
 file URI scheme: har : hdfs
 2010-04-14 22:08:48,814 [main] WARN  org.apache.pig.tools.grunt.Grunt - There 
 is no log file to write to.
 2010-04-14 22:08:48,814 [main] ERROR org.apache.pig.tools.grunt.Grunt - 
 java.lang.Error: org.apache.pig.impl.logicalLayer.FrontendException: ERROR 0: 
 Incompatible file URI scheme: har : hdfs
 at 
 org.apache.pig.impl.logicalLayer.parser.QueryParser.LoadClause(QueryParser.java:1483)
 at 
 org.apache.pig.impl.logicalLayer.parser.QueryParser.BaseExpr(QueryParser.java:1245)
 at 
 org.apache.pig.impl.logicalLayer.parser.QueryParser.Expr(QueryParser.java:911)
 at 
 org.apache.pig.impl.logicalLayer.parser.QueryParser.Parse(QueryParser.java:700)
 at 
 org.apache.pig.impl.logicalLayer.LogicalPlanBuilder.parse(LogicalPlanBuilder.java:63)
 at org.apache.pig.PigServer$Graph.parseQuery(PigServer.java:1164)
 at org.apache.pig.PigServer$Graph.registerQuery(PigServer.java:1114)
 at org.apache.pig.PigServer.registerQuery(PigServer.java:425)
 at 
 org.apache.pig.tools.grunt.GruntParser.processPig(GruntParser.java:737)
 at 
 org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:324)
 at 
 org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:162)
 at 
 org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:138)
 at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:75)
 at org.apache.pig.Main.main(Main.java:357)
 Caused by: org.apache.pig.impl.logicalLayer.FrontendException: ERROR 0: 
 Incompatible file URI scheme: har : hdfs
 at org.apache.pig.LoadFunc.getAbsolutePath(LoadFunc.java:249)
 at org.apache.pig.LoadFunc.relativeToAbsolutePath(LoadFunc.java:62)
 at 
 org.apache.pig.impl.logicalLayer.parser.QueryParser.LoadClause(QueryParser.java:1472)
 ... 13 more
 {noformat}
 According to Jira http://issues.apache.org/jira/browse/PIG-1234 I try the 
 following as stated in the original description
 {noformat}
 grunt a = load 
 'har://namenode-location/user/viraj/project/subproject/files/size/data'; 
 grunt dump a;
 {noformat}
 {noformat}
 Caused by: org.apache.pig.backend.executionengine.ExecException: ERROR 2118: 
 Unable to create input splits for: 
 har://namenode-location/user/viraj/project/subproject/files/size/data'; 
 ... 8 more
 

[jira] Commented: (PIG-1396) eclipse-files target in build.xml fails to generate necessary classes in src-gen

2010-04-29 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-1396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12862418#action_12862418
 ] 

Hadoop QA commented on PIG-1396:


-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12443165/PIG-1396.patch
  against trunk revision 938733.

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified 
tests.
Please justify why no tests are needed for this patch.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed core unit tests.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/304/testReport/
Findbugs warnings: 
http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/304/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: 
http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/304/console

This message is automatically generated.

 eclipse-files target in build.xml fails to generate necessary classes in 
 src-gen
 

 Key: PIG-1396
 URL: https://issues.apache.org/jira/browse/PIG-1396
 Project: Pig
  Issue Type: Bug
  Components: build
Affects Versions: 0.8.0
Reporter: V.V.Chaitanya Krishna
Assignee: V.V.Chaitanya Krishna
 Fix For: 0.8.0

 Attachments: PIG-1396.patch


 With PIG-1390 checked in, ant eclipse-files does only the ivy-download part 
 but does not compile and generate the necessary classes in src-gen.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (PIG-740) Incorrect line number is generated when a string with double quotes is used instead of single quotes and is passed to UDF

2010-04-29 Thread Pradeep Kamath (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pradeep Kamath updated PIG-740:
---

Attachment: PIG-740.patch

GruntParser was not handling double quotes within foreach blocks correctly by 
incorrectly treating them the same way as single quote for the starting double 
quote and not handling the end double quote - the patch addresses the bug by 
treating double quotes correctly.

 Incorrect line number is generated when a string  with double quotes is used 
 instead of single quotes and is passed to UDF
 --

 Key: PIG-740
 URL: https://issues.apache.org/jira/browse/PIG-740
 Project: Pig
  Issue Type: Bug
  Components: grunt
Affects Versions: 0.3.0
Reporter: Viraj Bhat
Priority: Minor
 Attachments: PIG-740.patch


 Consider the Pig script with the error that a String with double quotes 
 {code}www\\.{code} is used instead of a single quote {code}'www\\.'{code} 
 in the UDF string.REPLACEALL()
 {code}
 register string-2.0.jar;
 A = load 'inputdata' using PigStorage() as ( curr_searchQuery );
 B = foreach A {
 domain = string.REPLACEALL(curr_searchQuery,^www\\.,'');
 generate
 domain;
 };
 dump B;
 {code}
 I get the following error message where Line 11 points to the end of file. 
 The error message should point to Line 5.
 ===
 2009-03-31 01:33:38,403 [main] INFO  
 org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting 
 to hadoop file system at: hdfs://localhost:9000
 2009-03-31 01:33:39,168 [main] INFO  
 org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting 
 to map-reduce job tracker at: localhost:9001
 2009-03-31 01:33:39,589 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 
 1000: Error during parsing. Lexical error at line 11, column 0.  Encountered: 
 EOF after : 
 Details at logfile: /home/viraj/pig-svn/trunk/pig_1238463218046.log
 ===
 The log file contains the following contents
 ===
 ERROR 1000: Error during parsing. Lexical error at line 11, column 0.  
 Encountered: EOF after : 
 org.apache.pig.tools.pigscript.parser.TokenMgrError: Lexical error at line 
 11, column 0.  Encountered: EOF after : 
 at 
 org.apache.pig.tools.pigscript.parser.PigScriptParserTokenManager.getNextToken(PigScriptParserTokenManager.java:2739)
 at 
 org.apache.pig.tools.pigscript.parser.PigScriptParser.jj_ntk(PigScriptParser.java:778)
 at 
 org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:89)
 at 
 org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:99)
 at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:88)
 at org.apache.pig.Main.main(Main.java:352)
 ===

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Created: (PIG-1397) GruntParser should invoke executeBatch() first in processFsCommand()

2010-04-29 Thread Pradeep Kamath (JIRA)
GruntParser should invoke executeBatch() first in processFsCommand()


 Key: PIG-1397
 URL: https://issues.apache.org/jira/browse/PIG-1397
 Project: Pig
  Issue Type: Bug
Affects Versions: 0.6.0, 0.7.0
Reporter: Pradeep Kamath
 Fix For: 0.8.0


If a script has multiple stores which can be optimized using multiquery 
optimization and if the script also has some file system state modifiying 
commands like cp, mv, rm then currently Gruntparser executes the plan until 
the filesystem command so that multiquery optimization will work on the file 
system after it has been modified (for example some portion of the multi query 
optimized script might be depending on the cp/mv or rm command to have run 
first). This is not done for fs ... commands - GruntParser should do the same 
even for fs .. commands in processFsCommand()

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (PIG-1394) POCombinerPackage hold too much memory for InternalCachedBag

2010-04-29 Thread Pradeep Kamath (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-1394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12862430#action_12862430
 ] 

Pradeep Kamath commented on PIG-1394:
-

+1

 POCombinerPackage hold too much memory for InternalCachedBag
 

 Key: PIG-1394
 URL: https://issues.apache.org/jira/browse/PIG-1394
 Project: Pig
  Issue Type: Bug
  Components: impl
Affects Versions: 0.7.0
Reporter: Daniel Dai
Assignee: Daniel Dai
 Fix For: 0.7.0

 Attachments: PIG-1394-1.patch, PIG-1394-2.patch


 In POCombinerPackage, we create bunch of InternalCachedBag, the number of 
 which is the number of algebraic UDFs we use. However, when we create 
 InternalCachedBag, we use the default construct which assume we only create 1 
 InternalCachedBag in the system. It turns out we reserve way to much memory 
 to InternalCachedBag.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



pig 0.6

2010-04-29 Thread Dmitriy Ryaboy
Is anyone running unpatched  0.6 anywhere? I am in the process of
putting together a jar for us, and getting worried about all the
PruneColumns optimization fixes that came after the 0.6 release.

-D


[jira] Created: (PIG-1398) Marking Pig interfaces for org.apache.pig.data package

2010-04-29 Thread Alan Gates (JIRA)
Marking Pig interfaces for org.apache.pig.data package
--

 Key: PIG-1398
 URL: https://issues.apache.org/jira/browse/PIG-1398
 Project: Pig
  Issue Type: Bug
  Components: documentation
Affects Versions: 0.8.0
Reporter: Alan Gates
Assignee: Alan Gates
Priority: Minor
 Fix For: 0.8.0


Marking Pig interfaces for stability and audience, as well as javadoc cleanup, 
for the data package.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (PIG-1398) Marking Pig interfaces for org.apache.pig.data package

2010-04-29 Thread Alan Gates (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-1398?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Gates updated PIG-1398:


Parent: PIG-1311
Issue Type: Sub-task  (was: Bug)

 Marking Pig interfaces for org.apache.pig.data package
 --

 Key: PIG-1398
 URL: https://issues.apache.org/jira/browse/PIG-1398
 Project: Pig
  Issue Type: Sub-task
  Components: documentation
Affects Versions: 0.8.0
Reporter: Alan Gates
Assignee: Alan Gates
Priority: Minor
 Fix For: 0.8.0


 Marking Pig interfaces for stability and audience, as well as javadoc 
 cleanup, for the data package.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (PIG-1398) Marking Pig interfaces for org.apache.pig.data package

2010-04-29 Thread Alan Gates (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-1398?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Gates updated PIG-1398:


Attachment: PIG-1398.patch

 Marking Pig interfaces for org.apache.pig.data package
 --

 Key: PIG-1398
 URL: https://issues.apache.org/jira/browse/PIG-1398
 Project: Pig
  Issue Type: Sub-task
  Components: documentation
Affects Versions: 0.8.0
Reporter: Alan Gates
Assignee: Alan Gates
Priority: Minor
 Fix For: 0.8.0

 Attachments: PIG-1398.patch


 Marking Pig interfaces for stability and audience, as well as javadoc 
 cleanup, for the data package.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (PIG-62) Need to add pig script and input dirs (in clear text format) to jobconf

2010-04-29 Thread Olga Natkovich (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-62?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Olga Natkovich updated PIG-62:
--

Fix Version/s: 0.8.0

Additional information including this one will be added to job.xml

 Need to add pig script and input dirs (in clear text format) to jobconf
 ---

 Key: PIG-62
 URL: https://issues.apache.org/jira/browse/PIG-62
 Project: Pig
  Issue Type: Improvement
Reporter: Runping Qi
Assignee: Richard Ding
 Fix For: 0.8.0

 Attachments: PIG-62-v01.patch, PIG-62-v02.patch, PIG-62-v03.patch


 We'd like to know what pig query each pig related map/reduce job is for, and 
 what input dir the pig job use.
 It will be nice to add pig script and input dirs (in clear text format) to 
 the jobconfs of the jobs created for a pig script.
 .

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (PIG-112) Allow Java packages to be searched for functions.

2010-04-29 Thread Olga Natkovich (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-112?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Olga Natkovich updated PIG-112:
---

Status: Resolved  (was: Patch Available)
Resolution: Fixed

This has been done as part of Pig 0.6.0 release

 Allow Java packages to be searched for functions.
 -

 Key: PIG-112
 URL: https://issues.apache.org/jira/browse/PIG-112
 Project: Pig
  Issue Type: Improvement
Reporter: Craig Macdonald
Priority: Minor
 Attachments: import.v2.patch, import.v3.patch, patch.import

   Original Estimate: 1h
  Remaining Estimate: 1h

 Motivation: Why should a user have to supply the fully qualified name to his 
 user defined function, if all the functions he ever uses are in a pre-defined 
 list of packages?
 Currently PigContext has a list of Java packages that are searched for 
 functions. Currently the list consists of the following 
 packages:
  * org.apache.pig.builtin.
  * org.apache.pig.impl.builtin.
  * com.yahoo.pig.yst.sds.ULT.
 Users should be able to add to this packagelist.
 Example in Grunt:
 {noformat}
 include com.pig.function.package.
 {noformat}
 This has connections to imports in Java classes, namespaces in XQuery, and 
 the PATH env var in shell scripting.
 Should be trivial to implement, or could be a property.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Resolved: (PIG-116) pig leaves temp files behind

2010-04-29 Thread Olga Natkovich (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-116?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Olga Natkovich resolved PIG-116.


Resolution: Won't Fix

We are not seeing this as causing problems since the data does not get cleaned 
under very rare circumstances

 pig leaves temp files behind
 

 Key: PIG-116
 URL: https://issues.apache.org/jira/browse/PIG-116
 Project: Pig
  Issue Type: Bug
Reporter: Olga Natkovich
Assignee: Olga Natkovich

 Currently, pig creates temp dirs via call to FileLocalizer.getTemporaryPath. 
 They are created on the client and are mainly used to store data between 2 
 M-R jobs. Pig then attempts to clean them up in the client's shutdown hook. 
 The problem with this approach is that, because there is now way to order the 
 shutdown hooks, in some cases, the DFS is already closed when we try to 
 delete the files in which case a substention amount of data can be left in 
 DFS. I see this issue more frequently with hadoop 0.16 perhaps because I had 
 to add an extra shutdown hook to handle hod disconnects.
 The short term, I would like to propose the approach below:
 (1) If trash is configured on the cluster, use trash location to create temp 
 directory that will expire in 7 days. The hope is that most jobs don't run 
 longer that 7 days. The user can specify a longer interval via a command line 
 switch
 (2) If trash is not enabled on the cluster, the location that we use now will 
 be used
 (3) In the shutdown hook, we will attempt to cleanup. If the attempt fails 
 and trash is enabled, we let trash handle it; otherwise we provide the list 
 of locations to the user to clean. (I realize that this is not ideal but 
 could not figure out a better way.)
 Longer term, I am talking with hadoop team to have better temp file support: 
 https://issues.apache.org/jira/browse/HADOOP-2815
 Comments? Suggestions?

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.