[jira] Updated: (PIG-1391) pig unit tests leave behind files in temp directory because MiniCluster files don't get deleted
[ https://issues.apache.org/jira/browse/PIG-1391?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thejas M Nair updated PIG-1391: --- Attachment: PIG-1391.06.2.patch New patch for 0.6 branch. Made a change to build.xml - A new dir will be created within then temp dir and this dir will be used as the temp dir for tests. After the tests are done, this dir will be deleted, so that no temp files are left behind. pig unit tests leave behind files in temp directory because MiniCluster files don't get deleted --- Key: PIG-1391 URL: https://issues.apache.org/jira/browse/PIG-1391 Project: Pig Issue Type: Bug Affects Versions: 0.7.0 Reporter: Thejas M Nair Assignee: Thejas M Nair Fix For: 0.7.0 Attachments: minicluster.patch, PIG-1391.06.2.patch, PIG-1391.06.patch Pig unit test runs leave behind files in temp dir (/tmp) and there are too many files in the directory over time. Most of the files are left behind by MiniCluster . It closes/shutsdown MiniDFSCluster, MiniMRCluster and the FileSystem that it has created when the constructor is called, only in finalize(). And java does not guarantee that finalize() will be called. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (PIG-1378) har url not usable in Pig scripts
[ https://issues.apache.org/jira/browse/PIG-1378?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pradeep Kamath updated PIG-1378: Status: Patch Available (was: Open) har url not usable in Pig scripts - Key: PIG-1378 URL: https://issues.apache.org/jira/browse/PIG-1378 Project: Pig Issue Type: Bug Components: impl Affects Versions: 0.7.0 Reporter: Viraj Bhat Assignee: Pradeep Kamath Fix For: 0.8.0 Attachments: PIG-1378-2.patch, PIG-1378-3.patch, PIG-1378.patch I am trying to use har (Hadoop Archives) in my Pig script. I can use them through the HDFS shell {noformat} $hadoop fs -ls 'har:///user/viraj/project/subproject/files/size/data' Found 1 items -rw--- 5 viraj users1537234 2010-04-14 09:49 user/viraj/project/subproject/files/size/data/part-1 {noformat} Using similar URL's in grunt yields {noformat} grunt a = load 'har:///user/viraj/project/subproject/files/size/data'; grunt dump a; {noformat} {noformat} 2010-04-14 22:08:48,814 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 2998: Unhandled internal error. org.apache.pig.impl.logicalLayer.FrontendException: ERROR 0: Incompatible file URI scheme: har : hdfs 2010-04-14 22:08:48,814 [main] WARN org.apache.pig.tools.grunt.Grunt - There is no log file to write to. 2010-04-14 22:08:48,814 [main] ERROR org.apache.pig.tools.grunt.Grunt - java.lang.Error: org.apache.pig.impl.logicalLayer.FrontendException: ERROR 0: Incompatible file URI scheme: har : hdfs at org.apache.pig.impl.logicalLayer.parser.QueryParser.LoadClause(QueryParser.java:1483) at org.apache.pig.impl.logicalLayer.parser.QueryParser.BaseExpr(QueryParser.java:1245) at org.apache.pig.impl.logicalLayer.parser.QueryParser.Expr(QueryParser.java:911) at org.apache.pig.impl.logicalLayer.parser.QueryParser.Parse(QueryParser.java:700) at org.apache.pig.impl.logicalLayer.LogicalPlanBuilder.parse(LogicalPlanBuilder.java:63) at org.apache.pig.PigServer$Graph.parseQuery(PigServer.java:1164) at org.apache.pig.PigServer$Graph.registerQuery(PigServer.java:1114) at org.apache.pig.PigServer.registerQuery(PigServer.java:425) at org.apache.pig.tools.grunt.GruntParser.processPig(GruntParser.java:737) at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:324) at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:162) at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:138) at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:75) at org.apache.pig.Main.main(Main.java:357) Caused by: org.apache.pig.impl.logicalLayer.FrontendException: ERROR 0: Incompatible file URI scheme: har : hdfs at org.apache.pig.LoadFunc.getAbsolutePath(LoadFunc.java:249) at org.apache.pig.LoadFunc.relativeToAbsolutePath(LoadFunc.java:62) at org.apache.pig.impl.logicalLayer.parser.QueryParser.LoadClause(QueryParser.java:1472) ... 13 more {noformat} According to Jira http://issues.apache.org/jira/browse/PIG-1234 I try the following as stated in the original description {noformat} grunt a = load 'har://namenode-location/user/viraj/project/subproject/files/size/data'; grunt dump a; {noformat} {noformat} Caused by: org.apache.pig.backend.executionengine.ExecException: ERROR 2118: Unable to create input splits for: har://namenode-location/user/viraj/project/subproject/files/size/data'; ... 8 more Caused by: java.io.IOException: No FileSystem for scheme: namenode-location at .apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1375) at .apache.hadoop.fs.FileSystem.access(200(FileSystem.java:66) at .apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1390) at .apache.hadoop.fs.FileSystem.get(FileSystem.java:196) at .apache.hadoop.fs.HarFileSystem.initialize(HarFileSystem.java:104) at .apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1378) at .apache.hadoop.fs.FileSystem.get(FileSystem.java:193) at .apache.hadoop.fs.Path.getFileSystem(Path.java:175) at .apache.hadoop.mapreduce.lib.input.FileInputFormat.listStatus(FileInputFormat.java:208) at .apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigTextInputFormat.listStatus(PigTextInputFormat.java:36) at .apache.hadoop.mapreduce.lib.input.FileInputFormat.getSplits(FileInputFormat.java:246) at .apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getSplits(PigInputFormat.java:245) {noformat} Viraj -- This message is automatically generated by JIRA. - You can reply to this email to add a comment
[jira] Updated: (PIG-1378) har url not usable in Pig scripts
[ https://issues.apache.org/jira/browse/PIG-1378?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pradeep Kamath updated PIG-1378: Attachment: PIG-1378-3.patch Looks like the golden file change from the last patch was not correct - updated patch with just that change attached - all unit tests ran successfully locally with this new patch - patch is ready for review har url not usable in Pig scripts - Key: PIG-1378 URL: https://issues.apache.org/jira/browse/PIG-1378 Project: Pig Issue Type: Bug Components: impl Affects Versions: 0.7.0 Reporter: Viraj Bhat Assignee: Pradeep Kamath Fix For: 0.8.0 Attachments: PIG-1378-2.patch, PIG-1378-3.patch, PIG-1378.patch I am trying to use har (Hadoop Archives) in my Pig script. I can use them through the HDFS shell {noformat} $hadoop fs -ls 'har:///user/viraj/project/subproject/files/size/data' Found 1 items -rw--- 5 viraj users1537234 2010-04-14 09:49 user/viraj/project/subproject/files/size/data/part-1 {noformat} Using similar URL's in grunt yields {noformat} grunt a = load 'har:///user/viraj/project/subproject/files/size/data'; grunt dump a; {noformat} {noformat} 2010-04-14 22:08:48,814 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 2998: Unhandled internal error. org.apache.pig.impl.logicalLayer.FrontendException: ERROR 0: Incompatible file URI scheme: har : hdfs 2010-04-14 22:08:48,814 [main] WARN org.apache.pig.tools.grunt.Grunt - There is no log file to write to. 2010-04-14 22:08:48,814 [main] ERROR org.apache.pig.tools.grunt.Grunt - java.lang.Error: org.apache.pig.impl.logicalLayer.FrontendException: ERROR 0: Incompatible file URI scheme: har : hdfs at org.apache.pig.impl.logicalLayer.parser.QueryParser.LoadClause(QueryParser.java:1483) at org.apache.pig.impl.logicalLayer.parser.QueryParser.BaseExpr(QueryParser.java:1245) at org.apache.pig.impl.logicalLayer.parser.QueryParser.Expr(QueryParser.java:911) at org.apache.pig.impl.logicalLayer.parser.QueryParser.Parse(QueryParser.java:700) at org.apache.pig.impl.logicalLayer.LogicalPlanBuilder.parse(LogicalPlanBuilder.java:63) at org.apache.pig.PigServer$Graph.parseQuery(PigServer.java:1164) at org.apache.pig.PigServer$Graph.registerQuery(PigServer.java:1114) at org.apache.pig.PigServer.registerQuery(PigServer.java:425) at org.apache.pig.tools.grunt.GruntParser.processPig(GruntParser.java:737) at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:324) at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:162) at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:138) at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:75) at org.apache.pig.Main.main(Main.java:357) Caused by: org.apache.pig.impl.logicalLayer.FrontendException: ERROR 0: Incompatible file URI scheme: har : hdfs at org.apache.pig.LoadFunc.getAbsolutePath(LoadFunc.java:249) at org.apache.pig.LoadFunc.relativeToAbsolutePath(LoadFunc.java:62) at org.apache.pig.impl.logicalLayer.parser.QueryParser.LoadClause(QueryParser.java:1472) ... 13 more {noformat} According to Jira http://issues.apache.org/jira/browse/PIG-1234 I try the following as stated in the original description {noformat} grunt a = load 'har://namenode-location/user/viraj/project/subproject/files/size/data'; grunt dump a; {noformat} {noformat} Caused by: org.apache.pig.backend.executionengine.ExecException: ERROR 2118: Unable to create input splits for: har://namenode-location/user/viraj/project/subproject/files/size/data'; ... 8 more Caused by: java.io.IOException: No FileSystem for scheme: namenode-location at .apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1375) at .apache.hadoop.fs.FileSystem.access(200(FileSystem.java:66) at .apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1390) at .apache.hadoop.fs.FileSystem.get(FileSystem.java:196) at .apache.hadoop.fs.HarFileSystem.initialize(HarFileSystem.java:104) at .apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1378) at .apache.hadoop.fs.FileSystem.get(FileSystem.java:193) at .apache.hadoop.fs.Path.getFileSystem(Path.java:175) at .apache.hadoop.mapreduce.lib.input.FileInputFormat.listStatus(FileInputFormat.java:208) at .apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigTextInputFormat.listStatus(PigTextInputFormat.java:36) at .apache.hadoop.mapreduce.lib.input.FileInputFormat.getSplits(FileInputFormat.java:246) at
[jira] Updated: (PIG-1378) har url not usable in Pig scripts
[ https://issues.apache.org/jira/browse/PIG-1378?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pradeep Kamath updated PIG-1378: Status: Open (was: Patch Available) har url not usable in Pig scripts - Key: PIG-1378 URL: https://issues.apache.org/jira/browse/PIG-1378 Project: Pig Issue Type: Bug Components: impl Affects Versions: 0.7.0 Reporter: Viraj Bhat Assignee: Pradeep Kamath Fix For: 0.8.0 Attachments: PIG-1378-2.patch, PIG-1378-3.patch, PIG-1378.patch I am trying to use har (Hadoop Archives) in my Pig script. I can use them through the HDFS shell {noformat} $hadoop fs -ls 'har:///user/viraj/project/subproject/files/size/data' Found 1 items -rw--- 5 viraj users1537234 2010-04-14 09:49 user/viraj/project/subproject/files/size/data/part-1 {noformat} Using similar URL's in grunt yields {noformat} grunt a = load 'har:///user/viraj/project/subproject/files/size/data'; grunt dump a; {noformat} {noformat} 2010-04-14 22:08:48,814 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 2998: Unhandled internal error. org.apache.pig.impl.logicalLayer.FrontendException: ERROR 0: Incompatible file URI scheme: har : hdfs 2010-04-14 22:08:48,814 [main] WARN org.apache.pig.tools.grunt.Grunt - There is no log file to write to. 2010-04-14 22:08:48,814 [main] ERROR org.apache.pig.tools.grunt.Grunt - java.lang.Error: org.apache.pig.impl.logicalLayer.FrontendException: ERROR 0: Incompatible file URI scheme: har : hdfs at org.apache.pig.impl.logicalLayer.parser.QueryParser.LoadClause(QueryParser.java:1483) at org.apache.pig.impl.logicalLayer.parser.QueryParser.BaseExpr(QueryParser.java:1245) at org.apache.pig.impl.logicalLayer.parser.QueryParser.Expr(QueryParser.java:911) at org.apache.pig.impl.logicalLayer.parser.QueryParser.Parse(QueryParser.java:700) at org.apache.pig.impl.logicalLayer.LogicalPlanBuilder.parse(LogicalPlanBuilder.java:63) at org.apache.pig.PigServer$Graph.parseQuery(PigServer.java:1164) at org.apache.pig.PigServer$Graph.registerQuery(PigServer.java:1114) at org.apache.pig.PigServer.registerQuery(PigServer.java:425) at org.apache.pig.tools.grunt.GruntParser.processPig(GruntParser.java:737) at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:324) at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:162) at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:138) at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:75) at org.apache.pig.Main.main(Main.java:357) Caused by: org.apache.pig.impl.logicalLayer.FrontendException: ERROR 0: Incompatible file URI scheme: har : hdfs at org.apache.pig.LoadFunc.getAbsolutePath(LoadFunc.java:249) at org.apache.pig.LoadFunc.relativeToAbsolutePath(LoadFunc.java:62) at org.apache.pig.impl.logicalLayer.parser.QueryParser.LoadClause(QueryParser.java:1472) ... 13 more {noformat} According to Jira http://issues.apache.org/jira/browse/PIG-1234 I try the following as stated in the original description {noformat} grunt a = load 'har://namenode-location/user/viraj/project/subproject/files/size/data'; grunt dump a; {noformat} {noformat} Caused by: org.apache.pig.backend.executionengine.ExecException: ERROR 2118: Unable to create input splits for: har://namenode-location/user/viraj/project/subproject/files/size/data'; ... 8 more Caused by: java.io.IOException: No FileSystem for scheme: namenode-location at .apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1375) at .apache.hadoop.fs.FileSystem.access(200(FileSystem.java:66) at .apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1390) at .apache.hadoop.fs.FileSystem.get(FileSystem.java:196) at .apache.hadoop.fs.HarFileSystem.initialize(HarFileSystem.java:104) at .apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1378) at .apache.hadoop.fs.FileSystem.get(FileSystem.java:193) at .apache.hadoop.fs.Path.getFileSystem(Path.java:175) at .apache.hadoop.mapreduce.lib.input.FileInputFormat.listStatus(FileInputFormat.java:208) at .apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigTextInputFormat.listStatus(PigTextInputFormat.java:36) at .apache.hadoop.mapreduce.lib.input.FileInputFormat.getSplits(FileInputFormat.java:246) at .apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getSplits(PigInputFormat.java:245) {noformat} Viraj -- This message is automatically generated by JIRA. - You can reply to this email to add a comment
[jira] Updated: (PIG-1396) eclipse-files target in build.xml fails to generate necessary classes in src-gen
[ https://issues.apache.org/jira/browse/PIG-1396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thejas M Nair updated PIG-1396: --- Status: Patch Available (was: Open) eclipse-files target in build.xml fails to generate necessary classes in src-gen Key: PIG-1396 URL: https://issues.apache.org/jira/browse/PIG-1396 Project: Pig Issue Type: Bug Components: build Affects Versions: 0.8.0 Reporter: V.V.Chaitanya Krishna Assignee: V.V.Chaitanya Krishna Fix For: 0.8.0 Attachments: PIG-1396.patch With PIG-1390 checked in, ant eclipse-files does only the ivy-download part but does not compile and generate the necessary classes in src-gen. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (PIG-1378) har url not usable in Pig scripts
[ https://issues.apache.org/jira/browse/PIG-1378?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pradeep Kamath updated PIG-1378: Status: Open (was: Patch Available) har url not usable in Pig scripts - Key: PIG-1378 URL: https://issues.apache.org/jira/browse/PIG-1378 Project: Pig Issue Type: Bug Components: impl Affects Versions: 0.7.0 Reporter: Viraj Bhat Assignee: Pradeep Kamath Fix For: 0.8.0 Attachments: PIG-1378-2.patch, PIG-1378-3.patch, PIG-1378-4.patch, PIG-1378.patch I am trying to use har (Hadoop Archives) in my Pig script. I can use them through the HDFS shell {noformat} $hadoop fs -ls 'har:///user/viraj/project/subproject/files/size/data' Found 1 items -rw--- 5 viraj users1537234 2010-04-14 09:49 user/viraj/project/subproject/files/size/data/part-1 {noformat} Using similar URL's in grunt yields {noformat} grunt a = load 'har:///user/viraj/project/subproject/files/size/data'; grunt dump a; {noformat} {noformat} 2010-04-14 22:08:48,814 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 2998: Unhandled internal error. org.apache.pig.impl.logicalLayer.FrontendException: ERROR 0: Incompatible file URI scheme: har : hdfs 2010-04-14 22:08:48,814 [main] WARN org.apache.pig.tools.grunt.Grunt - There is no log file to write to. 2010-04-14 22:08:48,814 [main] ERROR org.apache.pig.tools.grunt.Grunt - java.lang.Error: org.apache.pig.impl.logicalLayer.FrontendException: ERROR 0: Incompatible file URI scheme: har : hdfs at org.apache.pig.impl.logicalLayer.parser.QueryParser.LoadClause(QueryParser.java:1483) at org.apache.pig.impl.logicalLayer.parser.QueryParser.BaseExpr(QueryParser.java:1245) at org.apache.pig.impl.logicalLayer.parser.QueryParser.Expr(QueryParser.java:911) at org.apache.pig.impl.logicalLayer.parser.QueryParser.Parse(QueryParser.java:700) at org.apache.pig.impl.logicalLayer.LogicalPlanBuilder.parse(LogicalPlanBuilder.java:63) at org.apache.pig.PigServer$Graph.parseQuery(PigServer.java:1164) at org.apache.pig.PigServer$Graph.registerQuery(PigServer.java:1114) at org.apache.pig.PigServer.registerQuery(PigServer.java:425) at org.apache.pig.tools.grunt.GruntParser.processPig(GruntParser.java:737) at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:324) at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:162) at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:138) at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:75) at org.apache.pig.Main.main(Main.java:357) Caused by: org.apache.pig.impl.logicalLayer.FrontendException: ERROR 0: Incompatible file URI scheme: har : hdfs at org.apache.pig.LoadFunc.getAbsolutePath(LoadFunc.java:249) at org.apache.pig.LoadFunc.relativeToAbsolutePath(LoadFunc.java:62) at org.apache.pig.impl.logicalLayer.parser.QueryParser.LoadClause(QueryParser.java:1472) ... 13 more {noformat} According to Jira http://issues.apache.org/jira/browse/PIG-1234 I try the following as stated in the original description {noformat} grunt a = load 'har://namenode-location/user/viraj/project/subproject/files/size/data'; grunt dump a; {noformat} {noformat} Caused by: org.apache.pig.backend.executionengine.ExecException: ERROR 2118: Unable to create input splits for: har://namenode-location/user/viraj/project/subproject/files/size/data'; ... 8 more Caused by: java.io.IOException: No FileSystem for scheme: namenode-location at .apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1375) at .apache.hadoop.fs.FileSystem.access(200(FileSystem.java:66) at .apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1390) at .apache.hadoop.fs.FileSystem.get(FileSystem.java:196) at .apache.hadoop.fs.HarFileSystem.initialize(HarFileSystem.java:104) at .apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1378) at .apache.hadoop.fs.FileSystem.get(FileSystem.java:193) at .apache.hadoop.fs.Path.getFileSystem(Path.java:175) at .apache.hadoop.mapreduce.lib.input.FileInputFormat.listStatus(FileInputFormat.java:208) at .apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigTextInputFormat.listStatus(PigTextInputFormat.java:36) at .apache.hadoop.mapreduce.lib.input.FileInputFormat.getSplits(FileInputFormat.java:246) at .apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getSplits(PigInputFormat.java:245) {noformat} Viraj -- This message is automatically generated by JIRA. - You can reply to this
[jira] Updated: (PIG-1378) har url not usable in Pig scripts
[ https://issues.apache.org/jira/browse/PIG-1378?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pradeep Kamath updated PIG-1378: Attachment: PIG-1378-4.patch Realized that a stray change in TestMRCompiler got into my previous patch - attaching new patch with just that change removed. har url not usable in Pig scripts - Key: PIG-1378 URL: https://issues.apache.org/jira/browse/PIG-1378 Project: Pig Issue Type: Bug Components: impl Affects Versions: 0.7.0 Reporter: Viraj Bhat Assignee: Pradeep Kamath Fix For: 0.8.0 Attachments: PIG-1378-2.patch, PIG-1378-3.patch, PIG-1378-4.patch, PIG-1378.patch I am trying to use har (Hadoop Archives) in my Pig script. I can use them through the HDFS shell {noformat} $hadoop fs -ls 'har:///user/viraj/project/subproject/files/size/data' Found 1 items -rw--- 5 viraj users1537234 2010-04-14 09:49 user/viraj/project/subproject/files/size/data/part-1 {noformat} Using similar URL's in grunt yields {noformat} grunt a = load 'har:///user/viraj/project/subproject/files/size/data'; grunt dump a; {noformat} {noformat} 2010-04-14 22:08:48,814 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 2998: Unhandled internal error. org.apache.pig.impl.logicalLayer.FrontendException: ERROR 0: Incompatible file URI scheme: har : hdfs 2010-04-14 22:08:48,814 [main] WARN org.apache.pig.tools.grunt.Grunt - There is no log file to write to. 2010-04-14 22:08:48,814 [main] ERROR org.apache.pig.tools.grunt.Grunt - java.lang.Error: org.apache.pig.impl.logicalLayer.FrontendException: ERROR 0: Incompatible file URI scheme: har : hdfs at org.apache.pig.impl.logicalLayer.parser.QueryParser.LoadClause(QueryParser.java:1483) at org.apache.pig.impl.logicalLayer.parser.QueryParser.BaseExpr(QueryParser.java:1245) at org.apache.pig.impl.logicalLayer.parser.QueryParser.Expr(QueryParser.java:911) at org.apache.pig.impl.logicalLayer.parser.QueryParser.Parse(QueryParser.java:700) at org.apache.pig.impl.logicalLayer.LogicalPlanBuilder.parse(LogicalPlanBuilder.java:63) at org.apache.pig.PigServer$Graph.parseQuery(PigServer.java:1164) at org.apache.pig.PigServer$Graph.registerQuery(PigServer.java:1114) at org.apache.pig.PigServer.registerQuery(PigServer.java:425) at org.apache.pig.tools.grunt.GruntParser.processPig(GruntParser.java:737) at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:324) at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:162) at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:138) at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:75) at org.apache.pig.Main.main(Main.java:357) Caused by: org.apache.pig.impl.logicalLayer.FrontendException: ERROR 0: Incompatible file URI scheme: har : hdfs at org.apache.pig.LoadFunc.getAbsolutePath(LoadFunc.java:249) at org.apache.pig.LoadFunc.relativeToAbsolutePath(LoadFunc.java:62) at org.apache.pig.impl.logicalLayer.parser.QueryParser.LoadClause(QueryParser.java:1472) ... 13 more {noformat} According to Jira http://issues.apache.org/jira/browse/PIG-1234 I try the following as stated in the original description {noformat} grunt a = load 'har://namenode-location/user/viraj/project/subproject/files/size/data'; grunt dump a; {noformat} {noformat} Caused by: org.apache.pig.backend.executionengine.ExecException: ERROR 2118: Unable to create input splits for: har://namenode-location/user/viraj/project/subproject/files/size/data'; ... 8 more Caused by: java.io.IOException: No FileSystem for scheme: namenode-location at .apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1375) at .apache.hadoop.fs.FileSystem.access(200(FileSystem.java:66) at .apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1390) at .apache.hadoop.fs.FileSystem.get(FileSystem.java:196) at .apache.hadoop.fs.HarFileSystem.initialize(HarFileSystem.java:104) at .apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1378) at .apache.hadoop.fs.FileSystem.get(FileSystem.java:193) at .apache.hadoop.fs.Path.getFileSystem(Path.java:175) at .apache.hadoop.mapreduce.lib.input.FileInputFormat.listStatus(FileInputFormat.java:208) at .apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigTextInputFormat.listStatus(PigTextInputFormat.java:36) at .apache.hadoop.mapreduce.lib.input.FileInputFormat.getSplits(FileInputFormat.java:246) at
[jira] Commented: (PIG-1378) har url not usable in Pig scripts
[ https://issues.apache.org/jira/browse/PIG-1378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12862405#action_12862405 ] Richard Ding commented on PIG-1378: --- +1 har url not usable in Pig scripts - Key: PIG-1378 URL: https://issues.apache.org/jira/browse/PIG-1378 Project: Pig Issue Type: Bug Components: impl Affects Versions: 0.7.0 Reporter: Viraj Bhat Assignee: Pradeep Kamath Fix For: 0.8.0 Attachments: PIG-1378-2.patch, PIG-1378-3.patch, PIG-1378-4.patch, PIG-1378.patch I am trying to use har (Hadoop Archives) in my Pig script. I can use them through the HDFS shell {noformat} $hadoop fs -ls 'har:///user/viraj/project/subproject/files/size/data' Found 1 items -rw--- 5 viraj users1537234 2010-04-14 09:49 user/viraj/project/subproject/files/size/data/part-1 {noformat} Using similar URL's in grunt yields {noformat} grunt a = load 'har:///user/viraj/project/subproject/files/size/data'; grunt dump a; {noformat} {noformat} 2010-04-14 22:08:48,814 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 2998: Unhandled internal error. org.apache.pig.impl.logicalLayer.FrontendException: ERROR 0: Incompatible file URI scheme: har : hdfs 2010-04-14 22:08:48,814 [main] WARN org.apache.pig.tools.grunt.Grunt - There is no log file to write to. 2010-04-14 22:08:48,814 [main] ERROR org.apache.pig.tools.grunt.Grunt - java.lang.Error: org.apache.pig.impl.logicalLayer.FrontendException: ERROR 0: Incompatible file URI scheme: har : hdfs at org.apache.pig.impl.logicalLayer.parser.QueryParser.LoadClause(QueryParser.java:1483) at org.apache.pig.impl.logicalLayer.parser.QueryParser.BaseExpr(QueryParser.java:1245) at org.apache.pig.impl.logicalLayer.parser.QueryParser.Expr(QueryParser.java:911) at org.apache.pig.impl.logicalLayer.parser.QueryParser.Parse(QueryParser.java:700) at org.apache.pig.impl.logicalLayer.LogicalPlanBuilder.parse(LogicalPlanBuilder.java:63) at org.apache.pig.PigServer$Graph.parseQuery(PigServer.java:1164) at org.apache.pig.PigServer$Graph.registerQuery(PigServer.java:1114) at org.apache.pig.PigServer.registerQuery(PigServer.java:425) at org.apache.pig.tools.grunt.GruntParser.processPig(GruntParser.java:737) at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:324) at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:162) at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:138) at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:75) at org.apache.pig.Main.main(Main.java:357) Caused by: org.apache.pig.impl.logicalLayer.FrontendException: ERROR 0: Incompatible file URI scheme: har : hdfs at org.apache.pig.LoadFunc.getAbsolutePath(LoadFunc.java:249) at org.apache.pig.LoadFunc.relativeToAbsolutePath(LoadFunc.java:62) at org.apache.pig.impl.logicalLayer.parser.QueryParser.LoadClause(QueryParser.java:1472) ... 13 more {noformat} According to Jira http://issues.apache.org/jira/browse/PIG-1234 I try the following as stated in the original description {noformat} grunt a = load 'har://namenode-location/user/viraj/project/subproject/files/size/data'; grunt dump a; {noformat} {noformat} Caused by: org.apache.pig.backend.executionengine.ExecException: ERROR 2118: Unable to create input splits for: har://namenode-location/user/viraj/project/subproject/files/size/data'; ... 8 more Caused by: java.io.IOException: No FileSystem for scheme: namenode-location at .apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1375) at .apache.hadoop.fs.FileSystem.access(200(FileSystem.java:66) at .apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1390) at .apache.hadoop.fs.FileSystem.get(FileSystem.java:196) at .apache.hadoop.fs.HarFileSystem.initialize(HarFileSystem.java:104) at .apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1378) at .apache.hadoop.fs.FileSystem.get(FileSystem.java:193) at .apache.hadoop.fs.Path.getFileSystem(Path.java:175) at .apache.hadoop.mapreduce.lib.input.FileInputFormat.listStatus(FileInputFormat.java:208) at .apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigTextInputFormat.listStatus(PigTextInputFormat.java:36) at .apache.hadoop.mapreduce.lib.input.FileInputFormat.getSplits(FileInputFormat.java:246) at .apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getSplits(PigInputFormat.java:245) {noformat} Viraj -- This message is automatically generated by JIRA. - You can
[jira] Commented: (PIG-1378) har url not usable in Pig scripts
[ https://issues.apache.org/jira/browse/PIG-1378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12862417#action_12862417 ] Hadoop QA commented on PIG-1378: -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12443202/PIG-1378-3.patch against trunk revision 938733. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 64 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/308/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/308/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/308/console This message is automatically generated. har url not usable in Pig scripts - Key: PIG-1378 URL: https://issues.apache.org/jira/browse/PIG-1378 Project: Pig Issue Type: Bug Components: impl Affects Versions: 0.7.0 Reporter: Viraj Bhat Assignee: Pradeep Kamath Fix For: 0.8.0 Attachments: PIG-1378-2.patch, PIG-1378-3.patch, PIG-1378-4.patch, PIG-1378.patch I am trying to use har (Hadoop Archives) in my Pig script. I can use them through the HDFS shell {noformat} $hadoop fs -ls 'har:///user/viraj/project/subproject/files/size/data' Found 1 items -rw--- 5 viraj users1537234 2010-04-14 09:49 user/viraj/project/subproject/files/size/data/part-1 {noformat} Using similar URL's in grunt yields {noformat} grunt a = load 'har:///user/viraj/project/subproject/files/size/data'; grunt dump a; {noformat} {noformat} 2010-04-14 22:08:48,814 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 2998: Unhandled internal error. org.apache.pig.impl.logicalLayer.FrontendException: ERROR 0: Incompatible file URI scheme: har : hdfs 2010-04-14 22:08:48,814 [main] WARN org.apache.pig.tools.grunt.Grunt - There is no log file to write to. 2010-04-14 22:08:48,814 [main] ERROR org.apache.pig.tools.grunt.Grunt - java.lang.Error: org.apache.pig.impl.logicalLayer.FrontendException: ERROR 0: Incompatible file URI scheme: har : hdfs at org.apache.pig.impl.logicalLayer.parser.QueryParser.LoadClause(QueryParser.java:1483) at org.apache.pig.impl.logicalLayer.parser.QueryParser.BaseExpr(QueryParser.java:1245) at org.apache.pig.impl.logicalLayer.parser.QueryParser.Expr(QueryParser.java:911) at org.apache.pig.impl.logicalLayer.parser.QueryParser.Parse(QueryParser.java:700) at org.apache.pig.impl.logicalLayer.LogicalPlanBuilder.parse(LogicalPlanBuilder.java:63) at org.apache.pig.PigServer$Graph.parseQuery(PigServer.java:1164) at org.apache.pig.PigServer$Graph.registerQuery(PigServer.java:1114) at org.apache.pig.PigServer.registerQuery(PigServer.java:425) at org.apache.pig.tools.grunt.GruntParser.processPig(GruntParser.java:737) at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:324) at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:162) at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:138) at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:75) at org.apache.pig.Main.main(Main.java:357) Caused by: org.apache.pig.impl.logicalLayer.FrontendException: ERROR 0: Incompatible file URI scheme: har : hdfs at org.apache.pig.LoadFunc.getAbsolutePath(LoadFunc.java:249) at org.apache.pig.LoadFunc.relativeToAbsolutePath(LoadFunc.java:62) at org.apache.pig.impl.logicalLayer.parser.QueryParser.LoadClause(QueryParser.java:1472) ... 13 more {noformat} According to Jira http://issues.apache.org/jira/browse/PIG-1234 I try the following as stated in the original description {noformat} grunt a = load 'har://namenode-location/user/viraj/project/subproject/files/size/data'; grunt dump a; {noformat} {noformat} Caused by: org.apache.pig.backend.executionengine.ExecException: ERROR 2118: Unable to create input splits for: har://namenode-location/user/viraj/project/subproject/files/size/data'; ... 8 more
[jira] Commented: (PIG-1396) eclipse-files target in build.xml fails to generate necessary classes in src-gen
[ https://issues.apache.org/jira/browse/PIG-1396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12862418#action_12862418 ] Hadoop QA commented on PIG-1396: -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12443165/PIG-1396.patch against trunk revision 938733. +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no tests are needed for this patch. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/304/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/304/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/304/console This message is automatically generated. eclipse-files target in build.xml fails to generate necessary classes in src-gen Key: PIG-1396 URL: https://issues.apache.org/jira/browse/PIG-1396 Project: Pig Issue Type: Bug Components: build Affects Versions: 0.8.0 Reporter: V.V.Chaitanya Krishna Assignee: V.V.Chaitanya Krishna Fix For: 0.8.0 Attachments: PIG-1396.patch With PIG-1390 checked in, ant eclipse-files does only the ivy-download part but does not compile and generate the necessary classes in src-gen. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (PIG-740) Incorrect line number is generated when a string with double quotes is used instead of single quotes and is passed to UDF
[ https://issues.apache.org/jira/browse/PIG-740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pradeep Kamath updated PIG-740: --- Attachment: PIG-740.patch GruntParser was not handling double quotes within foreach blocks correctly by incorrectly treating them the same way as single quote for the starting double quote and not handling the end double quote - the patch addresses the bug by treating double quotes correctly. Incorrect line number is generated when a string with double quotes is used instead of single quotes and is passed to UDF -- Key: PIG-740 URL: https://issues.apache.org/jira/browse/PIG-740 Project: Pig Issue Type: Bug Components: grunt Affects Versions: 0.3.0 Reporter: Viraj Bhat Priority: Minor Attachments: PIG-740.patch Consider the Pig script with the error that a String with double quotes {code}www\\.{code} is used instead of a single quote {code}'www\\.'{code} in the UDF string.REPLACEALL() {code} register string-2.0.jar; A = load 'inputdata' using PigStorage() as ( curr_searchQuery ); B = foreach A { domain = string.REPLACEALL(curr_searchQuery,^www\\.,''); generate domain; }; dump B; {code} I get the following error message where Line 11 points to the end of file. The error message should point to Line 5. === 2009-03-31 01:33:38,403 [main] INFO org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to hadoop file system at: hdfs://localhost:9000 2009-03-31 01:33:39,168 [main] INFO org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to map-reduce job tracker at: localhost:9001 2009-03-31 01:33:39,589 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1000: Error during parsing. Lexical error at line 11, column 0. Encountered: EOF after : Details at logfile: /home/viraj/pig-svn/trunk/pig_1238463218046.log === The log file contains the following contents === ERROR 1000: Error during parsing. Lexical error at line 11, column 0. Encountered: EOF after : org.apache.pig.tools.pigscript.parser.TokenMgrError: Lexical error at line 11, column 0. Encountered: EOF after : at org.apache.pig.tools.pigscript.parser.PigScriptParserTokenManager.getNextToken(PigScriptParserTokenManager.java:2739) at org.apache.pig.tools.pigscript.parser.PigScriptParser.jj_ntk(PigScriptParser.java:778) at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:89) at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:99) at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:88) at org.apache.pig.Main.main(Main.java:352) === -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Created: (PIG-1397) GruntParser should invoke executeBatch() first in processFsCommand()
GruntParser should invoke executeBatch() first in processFsCommand() Key: PIG-1397 URL: https://issues.apache.org/jira/browse/PIG-1397 Project: Pig Issue Type: Bug Affects Versions: 0.6.0, 0.7.0 Reporter: Pradeep Kamath Fix For: 0.8.0 If a script has multiple stores which can be optimized using multiquery optimization and if the script also has some file system state modifiying commands like cp, mv, rm then currently Gruntparser executes the plan until the filesystem command so that multiquery optimization will work on the file system after it has been modified (for example some portion of the multi query optimized script might be depending on the cp/mv or rm command to have run first). This is not done for fs ... commands - GruntParser should do the same even for fs .. commands in processFsCommand() -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (PIG-1394) POCombinerPackage hold too much memory for InternalCachedBag
[ https://issues.apache.org/jira/browse/PIG-1394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12862430#action_12862430 ] Pradeep Kamath commented on PIG-1394: - +1 POCombinerPackage hold too much memory for InternalCachedBag Key: PIG-1394 URL: https://issues.apache.org/jira/browse/PIG-1394 Project: Pig Issue Type: Bug Components: impl Affects Versions: 0.7.0 Reporter: Daniel Dai Assignee: Daniel Dai Fix For: 0.7.0 Attachments: PIG-1394-1.patch, PIG-1394-2.patch In POCombinerPackage, we create bunch of InternalCachedBag, the number of which is the number of algebraic UDFs we use. However, when we create InternalCachedBag, we use the default construct which assume we only create 1 InternalCachedBag in the system. It turns out we reserve way to much memory to InternalCachedBag. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
pig 0.6
Is anyone running unpatched 0.6 anywhere? I am in the process of putting together a jar for us, and getting worried about all the PruneColumns optimization fixes that came after the 0.6 release. -D
[jira] Created: (PIG-1398) Marking Pig interfaces for org.apache.pig.data package
Marking Pig interfaces for org.apache.pig.data package -- Key: PIG-1398 URL: https://issues.apache.org/jira/browse/PIG-1398 Project: Pig Issue Type: Bug Components: documentation Affects Versions: 0.8.0 Reporter: Alan Gates Assignee: Alan Gates Priority: Minor Fix For: 0.8.0 Marking Pig interfaces for stability and audience, as well as javadoc cleanup, for the data package. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (PIG-1398) Marking Pig interfaces for org.apache.pig.data package
[ https://issues.apache.org/jira/browse/PIG-1398?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alan Gates updated PIG-1398: Parent: PIG-1311 Issue Type: Sub-task (was: Bug) Marking Pig interfaces for org.apache.pig.data package -- Key: PIG-1398 URL: https://issues.apache.org/jira/browse/PIG-1398 Project: Pig Issue Type: Sub-task Components: documentation Affects Versions: 0.8.0 Reporter: Alan Gates Assignee: Alan Gates Priority: Minor Fix For: 0.8.0 Marking Pig interfaces for stability and audience, as well as javadoc cleanup, for the data package. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (PIG-1398) Marking Pig interfaces for org.apache.pig.data package
[ https://issues.apache.org/jira/browse/PIG-1398?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alan Gates updated PIG-1398: Attachment: PIG-1398.patch Marking Pig interfaces for org.apache.pig.data package -- Key: PIG-1398 URL: https://issues.apache.org/jira/browse/PIG-1398 Project: Pig Issue Type: Sub-task Components: documentation Affects Versions: 0.8.0 Reporter: Alan Gates Assignee: Alan Gates Priority: Minor Fix For: 0.8.0 Attachments: PIG-1398.patch Marking Pig interfaces for stability and audience, as well as javadoc cleanup, for the data package. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (PIG-62) Need to add pig script and input dirs (in clear text format) to jobconf
[ https://issues.apache.org/jira/browse/PIG-62?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Olga Natkovich updated PIG-62: -- Fix Version/s: 0.8.0 Additional information including this one will be added to job.xml Need to add pig script and input dirs (in clear text format) to jobconf --- Key: PIG-62 URL: https://issues.apache.org/jira/browse/PIG-62 Project: Pig Issue Type: Improvement Reporter: Runping Qi Assignee: Richard Ding Fix For: 0.8.0 Attachments: PIG-62-v01.patch, PIG-62-v02.patch, PIG-62-v03.patch We'd like to know what pig query each pig related map/reduce job is for, and what input dir the pig job use. It will be nice to add pig script and input dirs (in clear text format) to the jobconfs of the jobs created for a pig script. . -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (PIG-112) Allow Java packages to be searched for functions.
[ https://issues.apache.org/jira/browse/PIG-112?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Olga Natkovich updated PIG-112: --- Status: Resolved (was: Patch Available) Resolution: Fixed This has been done as part of Pig 0.6.0 release Allow Java packages to be searched for functions. - Key: PIG-112 URL: https://issues.apache.org/jira/browse/PIG-112 Project: Pig Issue Type: Improvement Reporter: Craig Macdonald Priority: Minor Attachments: import.v2.patch, import.v3.patch, patch.import Original Estimate: 1h Remaining Estimate: 1h Motivation: Why should a user have to supply the fully qualified name to his user defined function, if all the functions he ever uses are in a pre-defined list of packages? Currently PigContext has a list of Java packages that are searched for functions. Currently the list consists of the following packages: * org.apache.pig.builtin. * org.apache.pig.impl.builtin. * com.yahoo.pig.yst.sds.ULT. Users should be able to add to this packagelist. Example in Grunt: {noformat} include com.pig.function.package. {noformat} This has connections to imports in Java classes, namespaces in XQuery, and the PATH env var in shell scripting. Should be trivial to implement, or could be a property. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Resolved: (PIG-116) pig leaves temp files behind
[ https://issues.apache.org/jira/browse/PIG-116?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Olga Natkovich resolved PIG-116. Resolution: Won't Fix We are not seeing this as causing problems since the data does not get cleaned under very rare circumstances pig leaves temp files behind Key: PIG-116 URL: https://issues.apache.org/jira/browse/PIG-116 Project: Pig Issue Type: Bug Reporter: Olga Natkovich Assignee: Olga Natkovich Currently, pig creates temp dirs via call to FileLocalizer.getTemporaryPath. They are created on the client and are mainly used to store data between 2 M-R jobs. Pig then attempts to clean them up in the client's shutdown hook. The problem with this approach is that, because there is now way to order the shutdown hooks, in some cases, the DFS is already closed when we try to delete the files in which case a substention amount of data can be left in DFS. I see this issue more frequently with hadoop 0.16 perhaps because I had to add an extra shutdown hook to handle hod disconnects. The short term, I would like to propose the approach below: (1) If trash is configured on the cluster, use trash location to create temp directory that will expire in 7 days. The hope is that most jobs don't run longer that 7 days. The user can specify a longer interval via a command line switch (2) If trash is not enabled on the cluster, the location that we use now will be used (3) In the shutdown hook, we will attempt to cleanup. If the attempt fails and trash is enabled, we let trash handle it; otherwise we provide the list of locations to the user to clean. (I realize that this is not ideal but could not figure out a better way.) Longer term, I am talking with hadoop team to have better temp file support: https://issues.apache.org/jira/browse/HADOOP-2815 Comments? Suggestions? -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.