[jira] Updated: (PIG-1336) Optimize POStore serialized into JobConf
[ https://issues.apache.org/jira/browse/PIG-1336?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daniel Dai updated PIG-1336:

Description:
We serialize POStore too early in the JobControlCompiler. At that point, the storeFunc has unconstrained links to other operators; in the worst case, these links chain in the whole physical plan. Also, in the multi-store case, POStore has a link to its data source, which is not needed and increases the footprint of the serialized POStore.

Worse, leaving POStore unoptimized can cause failures. Suppose we have two map-reduce jobs, and the first job needs a LoadFunc from an external jar. The first job ships the jar to the backend but the second job does not. However, since the POStore of the second job has a link chain to the LoadFunc of the first job, deserializing it requires that external jar. Since we do not ship the external jar for the second map-reduce job, the job dies in this case. So this is more than an optimization; it is also a bug fix.

was:
We serialize POStore too early in the JobControlCompiler. At that point, the storeFunc has unconstrained links to other operators; in the worst case, these links chain in the whole physical plan. Also, in the multi-store case, POStore has a link to its data source, which is not needed and increases the footprint of the serialized POStore.

Optimize POStore serialized into JobConf
Key: PIG-1336
URL: https://issues.apache.org/jira/browse/PIG-1336
Project: Pig
Issue Type: Improvement
Components: impl
Affects Versions: 0.7.0
Reporter: Daniel Dai
Assignee: Daniel Dai
Fix For: 0.7.0
Attachments: PIG-1336-1.patch, PIG-1336-2.patch, PIG-1336-3.patch, PIG-1336-4.patch

--
This message is automatically generated by JIRA.
- If you think it was sent incorrectly contact one of the administrators: https://issues.apache.org/jira/secure/Administrators.jspa
- For more information on JIRA, see: http://www.atlassian.com/software/jira
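To make the footprint argument in PIG-1336 concrete, here is a minimal Java sketch of how default Java serialization follows every reachable reference. The `Operator` class and the helper names are hypothetical stand-ins, not Pig's POStore or physical-plan classes. The sketch only illustrates that serializing one node with live links drags in the whole chain, while cutting the links first keeps the payload small.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a physical operator; this is NOT a Pig class.
class Operator implements Serializable {
    private static final long serialVersionUID = 1L;
    final String name;
    // Links to upstream operators; default serialization follows all of them.
    List<Operator> inputs = new ArrayList<>();

    Operator(String name) {
        this.name = name;
    }
}

final class SerializationFootprint {
    // Serialize an object graph in memory and return its size in bytes.
    static int serializedSize(Serializable root) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(root);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.size();
    }

    // Build a chain load -> op0 -> ... -> store and return the store node.
    static Operator buildChain(int length) {
        Operator current = new Operator("load");
        for (int i = 0; i < length; i++) {
            Operator next = new Operator("op" + i);
            next.inputs.add(current);
            current = next;
        }
        Operator store = new Operator("store");
        store.inputs.add(current);
        return store;
    }

    public static void main(String[] args) {
        Operator store = buildChain(50);
        int linked = serializedSize(store);
        // Cut the upstream links before serializing the store node.
        store.inputs = new ArrayList<>();
        int detached = serializedSize(store);
        System.out.println("linked=" + linked + " bytes, detached=" + detached + " bytes");
    }
}
```

The same mechanism explains the failure mode in the description: every class reachable from the serialized object must be loadable wherever it is deserialized, so a stray link to a class from a jar that was not shipped breaks deserialization.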
[jira] Created: (PIG-1378) har url not usable in Pig scripts
har url not usable in Pig scripts
Key: PIG-1378
URL: https://issues.apache.org/jira/browse/PIG-1378
Project: Pig
Issue Type: Bug
Components: impl
Affects Versions: 0.7.0
Reporter: Viraj Bhat
Fix For: 0.7.0

I am trying to use har (Hadoop Archives) in my Pig script. I can use them through the HDFS shell
{noformat}
$hadoop fs -ls 'har:///user/viraj/project/subproject/files/size/data'
Found 1 items
-rw--- 5 viraj users1537234 2010-04-14 09:49 user/viraj/project/subproject/files/size/data/part-1
{noformat}
Using similar URL's in grunt yields
{noformat}
grunt> a = load 'har:///user/viraj/project/subproject/files/size/data';
grunt> dump a;
{noformat}
{noformat}
2010-04-14 22:08:48,814 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 2998: Unhandled internal error. org.apache.pig.impl.logicalLayer.FrontendException: ERROR 0: Incompatible file URI scheme: har : hdfs
2010-04-14 22:08:48,814 [main] WARN org.apache.pig.tools.grunt.Grunt - There is no log file to write to.
2010-04-14 22:08:48,814 [main] ERROR org.apache.pig.tools.grunt.Grunt - java.lang.Error: org.apache.pig.impl.logicalLayer.FrontendException: ERROR 0: Incompatible file URI scheme: har : hdfs
    at org.apache.pig.impl.logicalLayer.parser.QueryParser.LoadClause(QueryParser.java:1483)
    at org.apache.pig.impl.logicalLayer.parser.QueryParser.BaseExpr(QueryParser.java:1245)
    at org.apache.pig.impl.logicalLayer.parser.QueryParser.Expr(QueryParser.java:911)
    at org.apache.pig.impl.logicalLayer.parser.QueryParser.Parse(QueryParser.java:700)
    at org.apache.pig.impl.logicalLayer.LogicalPlanBuilder.parse(LogicalPlanBuilder.java:63)
    at org.apache.pig.PigServer$Graph.parseQuery(PigServer.java:1164)
    at org.apache.pig.PigServer$Graph.registerQuery(PigServer.java:1114)
    at org.apache.pig.PigServer.registerQuery(PigServer.java:425)
    at org.apache.pig.tools.grunt.GruntParser.processPig(GruntParser.java:737)
    at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:324)
    at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:162)
    at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:138)
    at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:75)
    at org.apache.pig.Main.main(Main.java:357)
Caused by: org.apache.pig.impl.logicalLayer.FrontendException: ERROR 0: Incompatible file URI scheme: har : hdfs
    at org.apache.pig.LoadFunc.getAbsolutePath(LoadFunc.java:249)
    at org.apache.pig.LoadFunc.relativeToAbsolutePath(LoadFunc.java:62)
    at org.apache.pig.impl.logicalLayer.parser.QueryParser.LoadClause(QueryParser.java:1472)
    ... 13 more
{noformat}
According to Jira http://issues.apache.org/jira/browse/PIG-1234 I try the following as stated in the original description
{noformat}
grunt> a = load 'har://namenode-location/user/viraj/project/subproject/files/size/data';
grunt> dump a;
{noformat}
{noformat}
Caused by: org.apache.pig.backend.executionengine.ExecException: ERROR 2118: Unable to create input splits for: har://namenode-location/user/viraj/project/subproject/files/size/data';
    ... 8 more
Caused by: java.io.IOException: No FileSystem for scheme: mithrilgold
    at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1375)
    at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:66)
    at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1390)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:196)
    at org.apache.hadoop.fs.HarFileSystem.initialize(HarFileSystem.java:104)
    at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1378)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:193)
    at org.apache.hadoop.fs.Path.getFileSystem(Path.java:175)
    at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.listStatus(FileInputFormat.java:208)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigTextInputFormat.listStatus(PigTextInputFormat.java:36)
    at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.getSplits(FileInputFormat.java:246)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getSplits(PigInputFormat.java:245)
{noformat}
Viraj
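As a small aid for reading the two PIG-1378 failures, the sketch below uses only `java.net.URI` from the JDK to show how the two load locations decompose into scheme, authority and path. The class name `HarUriParts` is made up for illustration. The first error message compares the `har` scheme against `hdfs`. The second trace fails inside `HarFileSystem.initialize` with a scheme name that looks like it came from the host part, which suggests (as an inference from the trace, not a confirmed reading of the Hadoop source) that the authority is expected to name the underlying file system.

```java
import java.net.URI;

final class HarUriParts {
    // Split a location string into the URI components the error messages refer to.
    static String describe(String location) {
        URI uri = URI.create(location);
        return "scheme=" + uri.getScheme()
                + ", authority=" + uri.getAuthority()
                + ", path=" + uri.getPath();
    }

    public static void main(String[] args) {
        // No authority: three slashes after the scheme.
        System.out.println(describe("har:///user/viraj/project/subproject/files/size/data"));
        // Authority present: the part between "//" and the next "/".
        System.out.println(describe("har://namenode-location/user/viraj/project/subproject/files/size/data"));
    }
}
```

For the first location the authority is `null`; for the second it is `namenode-location`, and the path is the same in both cases.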
[jira] Updated: (PIG-1378) har url not usable in Pig scripts
[ https://issues.apache.org/jira/browse/PIG-1378?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Viraj Bhat updated PIG-1378:

Description:
I am trying to use har (Hadoop Archives) in my Pig script. I can use them through the HDFS shell
{noformat}
$hadoop fs -ls 'har:///user/viraj/project/subproject/files/size/data'
Found 1 items
-rw--- 5 viraj users1537234 2010-04-14 09:49 user/viraj/project/subproject/files/size/data/part-1
{noformat}
Using similar URL's in grunt yields
{noformat}
grunt> a = load 'har:///user/viraj/project/subproject/files/size/data';
grunt> dump a;
{noformat}
{noformat}
2010-04-14 22:08:48,814 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 2998: Unhandled internal error. org.apache.pig.impl.logicalLayer.FrontendException: ERROR 0: Incompatible file URI scheme: har : hdfs
2010-04-14 22:08:48,814 [main] WARN org.apache.pig.tools.grunt.Grunt - There is no log file to write to.
2010-04-14 22:08:48,814 [main] ERROR org.apache.pig.tools.grunt.Grunt - java.lang.Error: org.apache.pig.impl.logicalLayer.FrontendException: ERROR 0: Incompatible file URI scheme: har : hdfs
    at org.apache.pig.impl.logicalLayer.parser.QueryParser.LoadClause(QueryParser.java:1483)
    at org.apache.pig.impl.logicalLayer.parser.QueryParser.BaseExpr(QueryParser.java:1245)
    at org.apache.pig.impl.logicalLayer.parser.QueryParser.Expr(QueryParser.java:911)
    at org.apache.pig.impl.logicalLayer.parser.QueryParser.Parse(QueryParser.java:700)
    at org.apache.pig.impl.logicalLayer.LogicalPlanBuilder.parse(LogicalPlanBuilder.java:63)
    at org.apache.pig.PigServer$Graph.parseQuery(PigServer.java:1164)
    at org.apache.pig.PigServer$Graph.registerQuery(PigServer.java:1114)
    at org.apache.pig.PigServer.registerQuery(PigServer.java:425)
    at org.apache.pig.tools.grunt.GruntParser.processPig(GruntParser.java:737)
    at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:324)
    at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:162)
    at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:138)
    at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:75)
    at org.apache.pig.Main.main(Main.java:357)
Caused by: org.apache.pig.impl.logicalLayer.FrontendException: ERROR 0: Incompatible file URI scheme: har : hdfs
    at org.apache.pig.LoadFunc.getAbsolutePath(LoadFunc.java:249)
    at org.apache.pig.LoadFunc.relativeToAbsolutePath(LoadFunc.java:62)
    at org.apache.pig.impl.logicalLayer.parser.QueryParser.LoadClause(QueryParser.java:1472)
    ... 13 more
{noformat}
According to Jira http://issues.apache.org/jira/browse/PIG-1234 I try the following as stated in the original description
{noformat}
grunt> a = load 'har://namenode-location/user/viraj/project/subproject/files/size/data';
grunt> dump a;
{noformat}
{noformat}
Caused by: org.apache.pig.backend.executionengine.ExecException: ERROR 2118: Unable to create input splits for: har://namenode-location/user/viraj/project/subproject/files/size/data';
    ... 8 more
Caused by: java.io.IOException: No FileSystem for scheme: namenode-location
    at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1375)
    at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:66)
    at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1390)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:196)
    at org.apache.hadoop.fs.HarFileSystem.initialize(HarFileSystem.java:104)
    at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1378)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:193)
    at org.apache.hadoop.fs.Path.getFileSystem(Path.java:175)
    at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.listStatus(FileInputFormat.java:208)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigTextInputFormat.listStatus(PigTextInputFormat.java:36)
    at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.getSplits(FileInputFormat.java:246)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getSplits(PigInputFormat.java:245)
{noformat}
Viraj

was:
I am trying to use har (Hadoop Archives) in my Pig script. I can use them through the HDFS shell
{noformat}
$hadoop fs -ls 'har:///user/viraj/project/subproject/files/size/data'
Found 1 items
-rw--- 5 viraj users1537234 2010-04-14 09:49 user/viraj/project/subproject/files/size/data/part-1
{noformat}
Using similar URL's in grunt yields
{noformat}
grunt> a = load 'har:///user/viraj/project/subproject/files/size/data';
grunt> dump a;
{noformat}
{noformat}
2010-04-14 22:08:48,814 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 2998: Unhandled internal error. org.apache.pig.impl.logicalLayer.FrontendException: ERROR 0:
[jira] Commented: (PIG-939) Checkstyle pulls in junit3.7 which causes the build of test code to fail.
[ https://issues.apache.org/jira/browse/PIG-939?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12857132#action_12857132 ]

Hadoop QA commented on PIG-939:

To test jira cli

Checkstyle pulls in junit3.7 which causes the build of test code to fail.
Key: PIG-939
URL: https://issues.apache.org/jira/browse/PIG-939
Project: Pig
Issue Type: Bug
Components: build
Affects Versions: 0.3.0
Reporter: Lee Tucker
Assignee: Giridharan Kesavan
Fix For: 0.4.0
Attachments: pig-939.patch

Pig fails to compile if you execute:
ant -Dassociated flags for various components clean findbugs checkstyle test
It gets the error:
[javac] Compiling 153 source files to /export/crawlspace/kryptonite/hadoopqa/workspace/workspace/CCDI-Pig-2.3/pig-2.3.0.0.20.0.2967040009/build/test/classes
[javac] /export/crawlspace/kryptonite/hadoopqa/workspace/workspace/CCDI-Pig-2.3/pig-2.3.0.0.20.0.2967040009/test/org/apache/pig/test/PigExecTestCase.java:31: cannot find symbol
[javac] symbol : constructor TestCase()
[javac] location: class junit.framework.TestCase
[javac] public abstract class PigExecTestCase extends TestCase {
[javac] ^
Once that's done, there's a copy of junit 3.7 cached from ivy that will continue to cause the build to fail. It will succeed, if you remove it, and then do:
ant -Dassociated flags for various components clean findbugs test
This proves it's running checkstyle that pulls in junit 3.7
[jira] Commented: (PIG-939) Checkstyle pulls in junit3.7 which causes the build of test code to fail.
[ https://issues.apache.org/jira/browse/PIG-939?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12857136#action_12857136 ]

Hadoop QA commented on PIG-939:

To test jira cli

Checkstyle pulls in junit3.7 which causes the build of test code to fail.
Key: PIG-939
URL: https://issues.apache.org/jira/browse/PIG-939
Project: Pig
Issue Type: Bug
Components: build
Affects Versions: 0.3.0
Reporter: Lee Tucker
Assignee: Giridharan Kesavan
Fix For: 0.4.0
Attachments: pig-939.patch

Pig fails to compile if you execute:
ant -Dassociated flags for various components clean findbugs checkstyle test
It gets the error:
[javac] Compiling 153 source files to /export/crawlspace/kryptonite/hadoopqa/workspace/workspace/CCDI-Pig-2.3/pig-2.3.0.0.20.0.2967040009/build/test/classes
[javac] /export/crawlspace/kryptonite/hadoopqa/workspace/workspace/CCDI-Pig-2.3/pig-2.3.0.0.20.0.2967040009/test/org/apache/pig/test/PigExecTestCase.java:31: cannot find symbol
[javac] symbol : constructor TestCase()
[javac] location: class junit.framework.TestCase
[javac] public abstract class PigExecTestCase extends TestCase {
[javac] ^
Once that's done, there's a copy of junit 3.7 cached from ivy that will continue to cause the build to fail. It will succeed, if you remove it, and then do:
ant -Dassociated flags for various components clean findbugs test
This proves it's running checkstyle that pulls in junit 3.7
[jira] Commented: (PIG-1372) Restore PigInputFormat.sJob for backward compatibility
[ https://issues.apache.org/jira/browse/PIG-1372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12857137#action_12857137 ]

Olga Natkovich commented on PIG-1372:

+1

Restore PigInputFormat.sJob for backward compatibility
Key: PIG-1372
URL: https://issues.apache.org/jira/browse/PIG-1372
Project: Pig
Issue Type: Bug
Affects Versions: 0.7.0
Reporter: Pradeep Kamath
Assignee: Pradeep Kamath
Fix For: 0.7.0
Attachments: PIG-1372.patch

The preferred method to get the job's Configuration object would be to use UDFContext.getJobConf(). This jira is to restore PigInputFormat.sJob (but we will be marking it deprecated and indicating to use UDFContext.getJobConf() instead) to be backward compatible - we can remove it from pig in a future release.
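The compatibility approach described in PIG-1372 (keep the old static field, mark it deprecated, and point callers at the preferred accessor) might look roughly like the sketch below. `ContextSketch` and `LegacyInputFormatSketch` are hypothetical stand-ins that use `java.util.Properties` in place of a Hadoop Configuration; they are not Pig's `UDFContext` or `PigInputFormat`, and the actual patch may be structured differently.

```java
import java.util.Properties;

// Hypothetical stand-in for the class that still exposes the old static field.
final class LegacyInputFormatSketch {
    /** @deprecated read the configuration through ContextSketch.getJobConf() instead. */
    @Deprecated
    static Properties sJob;
}

// Hypothetical stand-in for the preferred access point.
final class ContextSketch {
    private static Properties jobConf = new Properties();

    // Publish the job configuration once; the deprecated field is kept in sync
    // so that older callers continue to work until it is removed.
    @SuppressWarnings("deprecation")
    static void setJobConf(Properties conf) {
        jobConf = conf;
        LegacyInputFormatSketch.sJob = conf;
    }

    // Preferred way to read the configuration.
    static Properties getJobConf() {
        return jobConf;
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        conf.setProperty("example.key", "example-value"); // made-up key for illustration
        setJobConf(conf);
        System.out.println(getJobConf().getProperty("example.key"));
    }
}
```

Keeping both access paths backed by the same object means old and new callers cannot observe different configurations during the deprecation period.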
[jira] Commented: (PIG-1363) Unnecessary loadFunc instantiations
[ https://issues.apache.org/jira/browse/PIG-1363?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12857139#action_12857139 ]

Pradeep Kamath commented on PIG-1363:

+1

Unnecessary loadFunc instantiations
Key: PIG-1363
URL: https://issues.apache.org/jira/browse/PIG-1363
Project: Pig
Issue Type: Bug
Affects Versions: 0.7.0
Reporter: Ashutosh Chauhan
Assignee: Ashutosh Chauhan
Fix For: 0.8.0
Attachments: pig-1363.patch

In MRCompiler loadfuncs are instantiated at multiple locations in different visit methods. This is inconsistent and confusing. LoadFunc should be instantiated at only one place, ideally in LogToPhyTanslation#visit(LOLoad). A getter should be added to POLoad to retrieve this instantiated loadFunc wherever it is needed in later stages of compilation.
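The design proposed in PIG-1363 (instantiate the load function in one place and hand it out through a getter) can be sketched as follows. `LoaderSketch` and `LoadOperatorSketch` are hypothetical names used only for illustration; they are not Pig's `LoadFunc` or `POLoad`, and the counter exists only to make the single instantiation visible.

```java
import java.util.function.Supplier;

// Hypothetical stand-in for a load function; not Pig's LoadFunc.
interface LoaderSketch {
    String location();
}

// Hypothetical stand-in for a physical load operator.
final class LoadOperatorSketch {
    // Counts how many loaders were created through operators (for the example only).
    static int instantiations = 0;

    private final LoaderSketch loader;

    // The loader is created exactly once, when the operator is built.
    LoadOperatorSketch(Supplier<LoaderSketch> factory) {
        this.loader = factory.get();
        instantiations++;
    }

    // Later compilation stages reuse the same instance through this getter
    // instead of instantiating their own copy.
    LoaderSketch getLoader() {
        return loader;
    }

    // Small helper that builds a loader for a given path.
    static LoaderSketch loaderFor(String path) {
        return () -> path;
    }

    public static void main(String[] args) {
        LoadOperatorSketch op = new LoadOperatorSketch(() -> loaderFor("input/path"));
        // Two "stages" ask for the loader; both receive the same object.
        LoaderSketch first = op.getLoader();
        LoaderSketch second = op.getLoader();
        System.out.println(first == second);
        System.out.println(instantiations);
    }
}
```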
[jira] Updated: (PIG-1372) Restore PigInputFormat.sJob for backward compatibility
[ https://issues.apache.org/jira/browse/PIG-1372?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Pradeep Kamath updated PIG-1372:

Attachment: PIG-1372-2.patch

Regenerated patch against latest trunk (same changes). Here are the results of running test-patch ant target:
[exec] -1 overall.
[exec]
[exec] +1 @author. The patch does not contain any @author tags.
[exec]
[exec] -1 tests included. The patch doesn't appear to include any new or modified tests.
[exec] Please justify why no tests are needed for this patch.
[exec]
[exec] +1 javadoc. The javadoc tool did not generate any warning messages.
[exec]
[exec] +1 javac. The applied patch does not increase the total number of javac compiler warnings.
[exec]
[exec] +1 findbugs. The patch does not introduce any new Findbugs warnings.
[exec]
[exec] +1 release audit. The applied patch does not increase the total number of release audit warnings.
[exec]

Restore PigInputFormat.sJob for backward compatibility
Key: PIG-1372
URL: https://issues.apache.org/jira/browse/PIG-1372
Project: Pig
Issue Type: Bug
Affects Versions: 0.7.0
Reporter: Pradeep Kamath
Assignee: Pradeep Kamath
Fix For: 0.7.0
Attachments: PIG-1372-2.patch, PIG-1372.patch

The preferred method to get the job's Configuration object would be to use UDFContext.getJobConf(). This jira is to restore PigInputFormat.sJob (but we will be marking it deprecated and indicating to use UDFContext.getJobConf() instead) to be backward compatible - we can remove it from pig in a future release.
[jira] Updated: (PIG-1372) Restore PigInputFormat.sJob for backward compatibility
[ https://issues.apache.org/jira/browse/PIG-1372?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Pradeep Kamath updated PIG-1372:

Status: Patch Available (was: Open)

Restore PigInputFormat.sJob for backward compatibility
Key: PIG-1372
URL: https://issues.apache.org/jira/browse/PIG-1372
Project: Pig
Issue Type: Bug
Affects Versions: 0.7.0
Reporter: Pradeep Kamath
Assignee: Pradeep Kamath
Fix For: 0.7.0
Attachments: PIG-1372-2.patch, PIG-1372.patch

The preferred method to get the job's Configuration object would be to use UDFContext.getJobConf(). This jira is to restore PigInputFormat.sJob (but we will be marking it deprecated and indicating to use UDFContext.getJobConf() instead) to be backward compatible - we can remove it from pig in a future release.
[jira] Updated: (PIG-1372) Restore PigInputFormat.sJob for backward compatibility
[ https://issues.apache.org/jira/browse/PIG-1372?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Pradeep Kamath updated PIG-1372:

Status: Open (was: Patch Available)

Restore PigInputFormat.sJob for backward compatibility
Key: PIG-1372
URL: https://issues.apache.org/jira/browse/PIG-1372
Project: Pig
Issue Type: Bug
Affects Versions: 0.7.0
Reporter: Pradeep Kamath
Assignee: Pradeep Kamath
Fix For: 0.7.0
Attachments: PIG-1372-2.patch, PIG-1372.patch

The preferred method to get the job's Configuration object would be to use UDFContext.getJobConf(). This jira is to restore PigInputFormat.sJob (but we will be marking it deprecated and indicating to use UDFContext.getJobConf() instead) to be backward compatible - we can remove it from pig in a future release.
[jira] Commented: (PIG-1353) Map-side joins
[ https://issues.apache.org/jira/browse/PIG-1353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12857149#action_12857149 ]

Ashutosh Chauhan commented on PIG-1353:

Hudson.. Oh Hudson.. when y'll get better ! Ran the full test suite. All of them passed. Ran test-patch:
{noformat}
[exec] +1 overall.
[exec]
[exec] +1 @author. The patch does not contain any @author tags.
[exec]
[exec] +1 tests included. The patch appears to include 12 new or modified tests.
[exec]
[exec] +1 javadoc. The javadoc tool did not generate any warning messages.
[exec]
[exec] +1 javac. The applied patch does not increase the total number of javac compiler warnings.
[exec]
[exec] +1 findbugs. The patch does not introduce any new Findbugs warnings.
[exec]
[exec] +1 release audit. The applied patch does not increase the total number of release audit warnings.
{noformat}
Patch is ready for review.

Map-side joins
Key: PIG-1353
URL: https://issues.apache.org/jira/browse/PIG-1353
Project: Pig
Issue Type: Improvement
Components: impl
Reporter: Ashutosh Chauhan
Assignee: Ashutosh Chauhan
Fix For: 0.8.0
Attachments: pig-1353.patch, pig-1353.patch

Pig already has a couple of map-side join implementations: Merge Join and Fragmented-Replicate Join. But both of them are pretty restrictive. Merge Join can only join two tables, and it can only do inner joins. FR Join can join multiple relations, but it can only do inner and left outer joins. Further, it restricts the sizes of the side relations. It would be nice if we could do map-side joins on multiple tables and also do inner, left outer, right outer and full outer joins. A lot of the groundwork for this has already been done in PIG-1309. The remainder will be tracked in this jira.
[jira] Updated: (PIG-1370) Marking Pig interfaces for org.apache.pig package
[ https://issues.apache.org/jira/browse/PIG-1370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alan Gates updated PIG-1370:

Status: Resolved (was: Patch Available)
Resolution: Fixed

Changed SortInfo and SortColInfo to private as indicated in Pradeep's comments. Marked constructor of ResourceSchema that uses SortInfo as private as well. Patch checked in.

Marking Pig interfaces for org.apache.pig package
Key: PIG-1370
URL: https://issues.apache.org/jira/browse/PIG-1370
Project: Pig
Issue Type: Sub-task
Components: documentation
Reporter: Alan Gates
Assignee: Alan Gates
Fix For: 0.8.0
Attachments: PIG-1370.patch, PIG-1370_2.patch

Done as a separate JIRA from PIG-1311 since this alone contains quite a lot of changes.
[jira] Updated: (PIG-1363) Unnecessary loadFunc instantiations
[ https://issues.apache.org/jira/browse/PIG-1363?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ashutosh Chauhan updated PIG-1363:

Status: Resolved (was: Patch Available)
Resolution: Fixed

Patch checked-in.

Unnecessary loadFunc instantiations
Key: PIG-1363
URL: https://issues.apache.org/jira/browse/PIG-1363
Project: Pig
Issue Type: Bug
Affects Versions: 0.7.0
Reporter: Ashutosh Chauhan
Assignee: Ashutosh Chauhan
Fix For: 0.8.0
Attachments: pig-1363.patch

In MRCompiler loadfuncs are instantiated at multiple locations in different visit methods. This is inconsistent and confusing. LoadFunc should be instantiated at only one place, ideally in LogToPhyTanslation#visit(LOLoad). A getter should be added to POLoad to retrieve this instantiated loadFunc wherever it is needed in later stages of compilation.
[jira] Commented: (PIG-1229) allow pig to write output into a JDBC db
[ https://issues.apache.org/jira/browse/PIG-1229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12857154#action_12857154 ]

Ashutosh Chauhan commented on PIG-1229:

As per http://www.mail-archive.com/pig-u...@hadoop.apache.org/msg02257.html thread I am wondering if it will be safe and possible to make sure that job using this storage has speculative execution turned-off. Otherwise, with S.E. turned on, there are too many scenarios we would have to handle. What do you think?

allow pig to write output into a JDBC db
Key: PIG-1229
URL: https://issues.apache.org/jira/browse/PIG-1229
Project: Pig
Issue Type: New Feature
Components: impl
Reporter: Ian Holsman
Assignee: Ankur
Priority: Minor
Fix For: 0.8.0
Attachments: jira-1229-v2.patch

UDF to store data into a DB
[jira] Commented: (PIG-518) LOBinCond exception in LogicalPlanValidationExecutor when providing default values for bag
[ https://issues.apache.org/jira/browse/PIG-518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12857157#action_12857157 ]

Viraj Bhat commented on PIG-518:

The above script generates the following error in Pig 0.7

2010-04-14 17:10:49,807 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1048: Two inputs of BinCond must have compatible schemas. left hand side: b: bag({colb2: bytearray,colb3: bytearray}) right hand side: bag({(chararray,chararray)})

A type cast to the right type solves the problem.
{code}
a = load 'sports_views.txt' as (col1:chararray, col2:chararray, col3:chararray);
b = load 'queries.txt' as (colb1:chararray,colb2:chararray,colb3:chararray);
mycogroup = cogroup a by col1 inner, b by colb1;
mynewalias = foreach mycogroup generate flatten(a), flatten((COUNT(b) > 0L ? b.(colb2,colb3) : {('','')}));
dump mynewalias;
{code}
(alice,lakers,3,ipod,3)
(alice,warriors,7,ipod,3)
(peter,sun,7,sun,4)
(peter,nets,7,sun,4)

Closing bug as Pig yields the correct error message which the user can use to recode his script

LOBinCond exception in LogicalPlanValidationExecutor when providing default values for bag
Key: PIG-518
URL: https://issues.apache.org/jira/browse/PIG-518
Project: Pig
Issue Type: Bug
Components: impl
Affects Versions: 0.2.0
Reporter: Viraj Bhat
Attachments: queries.txt, sports_views.txt

The following piece of Pig script, which provides default values for bags {('','')} when the COUNT returns 0 fails with the following error. (Note: Files used in this script are enclosed on this Jira.)

a = load 'sports_views.txt' as (col1, col2, col3);
b = load 'queries.txt' as (colb1,colb2,colb3);
mycogroup = cogroup a by col1 inner, b by colb1;
mynewalias = foreach mycogroup generate flatten(a), flatten((COUNT(b) > 0L ? b.(colb2,colb3) : {('','')}));
dump mynewalias;

java.io.IOException: Unable to open iterator for alias: mynewalias [Unable to store for alias: mynewalias [Can't overwrite cause]]
    at java.lang.Throwable.initCause(Throwable.java:320)
    at org.apache.pig.impl.logicalLayer.validators.TypeCheckingVisitor.visit(TypeCheckingVisitor.java:1494)
    at org.apache.pig.impl.logicalLayer.LOBinCond.visit(LOBinCond.java:85)
    at org.apache.pig.impl.logicalLayer.LOBinCond.visit(LOBinCond.java:28)
    at org.apache.pig.impl.plan.DependencyOrderWalker.walk(DependencyOrderWalker.java:68)
    at org.apache.pig.impl.plan.PlanVisitor.visit(PlanVisitor.java:51)
    at org.apache.pig.impl.logicalLayer.validators.TypeCheckingVisitor.checkInnerPlan(TypeCheckingVisitor.java:2345)
    at org.apache.pig.impl.logicalLayer.validators.TypeCheckingVisitor.visit(TypeCheckingVisitor.java:2252)
    at org.apache.pig.impl.logicalLayer.LOForEach.visit(LOForEach.java:121)
    at org.apache.pig.impl.logicalLayer.LOForEach.visit(LOForEach.java:40)
    at org.apache.pig.impl.plan.DependencyOrderWalker.walk(DependencyOrderWalker.java:68)
    at org.apache.pig.impl.plan.PlanVisitor.visit(PlanVisitor.java:51)
    at org.apache.pig.impl.plan.PlanValidator.validateSkipCollectException(PlanValidator.java:101)
    at org.apache.pig.impl.logicalLayer.validators.TypeCheckingValidator.validate(TypeCheckingValidator.java:40)
    at org.apache.pig.impl.logicalLayer.validators.TypeCheckingValidator.validate(TypeCheckingValidator.java:30)
    at org.apache.pig.impl.logicalLayer.validators.LogicalPlanValidationExecutor.validate(LogicalPlanValidationExecutor.java:79)
    at org.apache.pig.PigServer.compileLp(PigServer.java:684)
    at org.apache.pig.PigServer.compileLp(PigServer.java:655)
    at org.apache.pig.PigServer.store(PigServer.java:433)
    at org.apache.pig.PigServer.store(PigServer.java:421)
    at org.apache.pig.PigServer.openIterator(PigServer.java:384)
    at org.apache.pig.tools.grunt.GruntParser.processDump(GruntParser.java:269)
    at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:178)
    at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:84)
    at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:64)
    at org.apache.pig.Main.main(Main.java:306)
Caused by: java.io.IOException: Unable to store for alias: mynewalias [Can't overwrite cause]
    ... 26 more
Caused by: java.lang.IllegalStateException: Can't overwrite cause
    ... 26 more
[jira] Resolved: (PIG-518) LOBinCond exception in LogicalPlanValidationExecutor when providing default values for bag
[ https://issues.apache.org/jira/browse/PIG-518?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Viraj Bhat resolved PIG-518.
    Fix Version/s: 0.7.0
    Resolution: Fixed

LOBinCond exception in LogicalPlanValidationExecutor when providing default values for bag
---
Key: PIG-518
URL: https://issues.apache.org/jira/browse/PIG-518
Project: Pig
Issue Type: Bug
Components: impl
Affects Versions: 0.2.0
Reporter: Viraj Bhat
Fix For: 0.7.0
Attachments: queries.txt, sports_views.txt

The following Pig script, which supplies the default value {('','')} for a bag when COUNT returns 0, fails with the error below. (Note: the files used in this script are attached to this Jira.)

a = load 'sports_views.txt' as (col1, col2, col3);
b = load 'queries.txt' as (colb1,colb2,colb3);
mycogroup = cogroup a by col1 inner, b by colb1;
mynewalias = foreach mycogroup generate flatten(a), flatten((COUNT(b) > 0L ? b.(colb2,colb3) : {('','')}));
dump mynewalias;

java.io.IOException: Unable to open iterator for alias: mynewalias [Unable to store for alias: mynewalias [Can't overwrite cause]]
    at java.lang.Throwable.initCause(Throwable.java:320)
    at org.apache.pig.impl.logicalLayer.validators.TypeCheckingVisitor.visit(TypeCheckingVisitor.java:1494)
    at org.apache.pig.impl.logicalLayer.LOBinCond.visit(LOBinCond.java:85)
    at org.apache.pig.impl.logicalLayer.LOBinCond.visit(LOBinCond.java:28)
    at org.apache.pig.impl.plan.DependencyOrderWalker.walk(DependencyOrderWalker.java:68)
    at org.apache.pig.impl.plan.PlanVisitor.visit(PlanVisitor.java:51)
    at org.apache.pig.impl.logicalLayer.validators.TypeCheckingVisitor.checkInnerPlan(TypeCheckingVisitor.java:2345)
    at org.apache.pig.impl.logicalLayer.validators.TypeCheckingVisitor.visit(TypeCheckingVisitor.java:2252)
    at org.apache.pig.impl.logicalLayer.LOForEach.visit(LOForEach.java:121)
    at org.apache.pig.impl.logicalLayer.LOForEach.visit(LOForEach.java:40)
    at org.apache.pig.impl.plan.DependencyOrderWalker.walk(DependencyOrderWalker.java:68)
    at org.apache.pig.impl.plan.PlanVisitor.visit(PlanVisitor.java:51)
    at org.apache.pig.impl.plan.PlanValidator.validateSkipCollectException(PlanValidator.java:101)
    at org.apache.pig.impl.logicalLayer.validators.TypeCheckingValidator.validate(TypeCheckingValidator.java:40)
    at org.apache.pig.impl.logicalLayer.validators.TypeCheckingValidator.validate(TypeCheckingValidator.java:30)
    at org.apache.pig.impl.logicalLayer.validators.LogicalPlanValidationExecutor.validate(LogicalPlanValidationExecutor.java:79)
    at org.apache.pig.PigServer.compileLp(PigServer.java:684)
    at org.apache.pig.PigServer.compileLp(PigServer.java:655)
    at org.apache.pig.PigServer.store(PigServer.java:433)
    at org.apache.pig.PigServer.store(PigServer.java:421)
    at org.apache.pig.PigServer.openIterator(PigServer.java:384)
    at org.apache.pig.tools.grunt.GruntParser.processDump(GruntParser.java:269)
    at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:178)
    at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:84)
    at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:64)
    at org.apache.pig.Main.main(Main.java:306)
Caused by: java.io.IOException: Unable to store for alias: mynewalias [Can't overwrite cause]
    ... 26 more
Caused by: java.lang.IllegalStateException: Can't overwrite cause
    ... 26 more

--
This message is automatically generated by JIRA.
- If you think it was sent incorrectly contact one of the administrators: https://issues.apache.org/jira/secure/Administrators.jspa
- For more information on JIRA, see: http://www.atlassian.com/software/jira
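Editorial note on the PIG-518 trace: the innermost cause, IllegalStateException: Can't overwrite cause, is what java.lang.Throwable.initCause raises when it is called on a throwable whose cause has already been set, either by an earlier initCause call or by a constructor that takes a cause. The sketch below only illustrates that JDK behaviour. It is not Pig's TypeCheckingVisitor code, and the class name, method names, and messages are hypothetical.

```java
import java.io.IOException;

public class InitCauseDemo {

    // Returns true if a second initCause call on the same throwable is rejected.
    static boolean secondInitCauseFails() {
        IOException e = new IOException("Unable to store for alias");
        e.initCause(new RuntimeException("first cause")); // allowed: no cause set yet
        try {
            e.initCause(new RuntimeException("second cause"));
            return false;
        } catch (IllegalStateException expected) {
            return true;
        }
    }

    // Returns true if initCause is rejected when the cause was already
    // supplied through the (message, cause) constructor.
    static boolean initCauseAfterConstructorFails() {
        IOException e = new IOException("Unable to open iterator",
                new RuntimeException("root cause"));
        try {
            e.initCause(new RuntimeException("another cause"));
            return false;
        } catch (IllegalStateException expected) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(secondInitCauseFails());
        System.out.println(initCauseAfterConstructorFails());
    }
}
```

In error-handling code, passing the cause to the constructor once, and not calling initCause afterwards, avoids masking the original failure with this secondary exception.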
[jira] Resolved: (PIG-829) DECLARE statement stop processing after special characters such as dot . , + % etc..
[ https://issues.apache.org/jira/browse/PIG-829?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Viraj Bhat resolved PIG-829.
    Fix Version/s: 0.7.0
    Resolution: Fixed

Pig 0.7 yields the correct result.
{code}
x = LOAD 'something' as (a:chararray, b:chararray);
y = FILTER x BY ( a MATCHES '^.*yahoo.*$' );
STORE y INTO 'foo.bar';
{code}

DECLARE statement stop processing after special characters such as dot . , + % etc..
--
Key: PIG-829
URL: https://issues.apache.org/jira/browse/PIG-829
Project: Pig
Issue Type: Bug
Components: grunt
Affects Versions: 0.3.0
Reporter: Viraj Bhat
Fix For: 0.7.0

The Pig script below does not work correctly when special characters are used in the DECLARE statement.
{code}
%DECLARE OUT foo.bar
x = LOAD 'something' as (a:chararray, b:chararray);
y = FILTER x BY ( a MATCHES '^.*yahoo.*$' );
STORE y INTO '$OUT';
{code}

When the above script is run in dry-run mode, the substituted file drops the special character and everything after it.
{code}
java -cp pig.jar:/homes/viraj/hadoop-0.18.0-dev/conf -Dhod.server='' org.apache.pig.Main -r declaresp.pig
{code}

Resulting file: declaresp.pig.substituted
{code}
x = LOAD 'something' as (a:chararray, b:chararray);
y = FILTER x BY ( a MATCHES '^.*yahoo.*$' );
STORE y INTO 'foo';
{code}