[jira] Commented: (PIG-935) Skewed join throws an exception when used with map keys
[ https://issues.apache.org/jira/browse/PIG-935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12748201#action_12748201 ]

Hadoop QA commented on PIG-935:
---

-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12417766/skjoinmapbug.patch
against trunk revision 806668.

+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 3 new or modified tests.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
+1 findbugs. The patch does not introduce any new Findbugs warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
-1 core tests. The patch failed core unit tests.
+1 contrib tests. The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-minerva.apache.org/179/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-minerva.apache.org/179/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-minerva.apache.org/179/console

This message is automatically generated.
> Skewed join throws an exception when used with map keys
> ---
>
> Key: PIG-935
> URL: https://issues.apache.org/jira/browse/PIG-935
> Project: Pig
> Issue Type: Bug
> Reporter: Sriranjan Manjunath
> Attachments: skjoinmapbug.patch
>
>
> Skewed join throws a runtime exception for the following query:
> A = load 'map.txt' as (e);
> B = load 'map.txt' as (f);
> C = join A by (chararray)e#'a', B by (chararray)f#'a' using "skewed";
> explain C;
> Exception:
> Caused by: java.lang.ClassCastException: org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POCast cannot be cast to org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POProject
>     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler.getSortCols(MRCompiler.java:1492)
>     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler.getSamplingJob(MRCompiler.java:1894)
>     ... 27 more

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
Build failed in Hudson: Pig-Patch-minerva.apache.org #179
See http://hudson.zones.apache.org/hudson/job/Pig-Patch-minerva.apache.org/179/ -- [...truncated 112853 lines...] [exec] [junit] 09/08/26 23:20:26 INFO dfs.StateChange: BLOCK* NameSystem.addStoredBlock: blockMap updated: 127.0.0.1:36021 is added to blk_2072952556046770198_1011 size 6 [exec] [junit] 09/08/26 23:20:26 INFO dfs.DataNode: PacketResponder 2 for block blk_2072952556046770198_1011 terminating [exec] [junit] 09/08/26 23:20:26 INFO executionengine.HExecutionEngine: Connecting to hadoop file system at: hdfs://localhost:33549 [exec] [junit] 09/08/26 23:20:26 INFO executionengine.HExecutionEngine: Connecting to map-reduce job tracker at: localhost:59257 [exec] [junit] 09/08/26 23:20:26 INFO mapReduceLayer.MultiQueryOptimizer: MR plan size before optimization: 1 [exec] [junit] 09/08/26 23:20:26 INFO mapReduceLayer.MultiQueryOptimizer: MR plan size after optimization: 1 [exec] [junit] 09/08/26 23:20:26 INFO dfs.StateChange: BLOCK* ask 127.0.0.1:36021 to delete blk_1407819332858915959_1006 blk_-3865270409021269559_1005 blk_-5210738776223878005_1004 [exec] [junit] 09/08/26 23:20:26 INFO dfs.StateChange: BLOCK* ask 127.0.0.1:43725 to delete blk_1407819332858915959_1006 blk_-5210738776223878005_1004 [exec] [junit] 09/08/26 23:20:26 WARN dfs.DataNode: Unexpected error trying to delete block blk_-5210738776223878005_1004. BlockInfo not found in volumeMap. [exec] [junit] 09/08/26 23:20:26 INFO dfs.DataNode: Deleting block blk_1407819332858915959_1006 file dfs/data/data7/current/blk_1407819332858915959 [exec] [junit] 09/08/26 23:20:26 WARN dfs.DataNode: java.io.IOException: Error in deleting blocks. 
[exec] [junit] at org.apache.hadoop.dfs.FSDataset.invalidate(FSDataset.java:1146) [exec] [junit] at org.apache.hadoop.dfs.DataNode.processCommand(DataNode.java:793) [exec] [junit] at org.apache.hadoop.dfs.DataNode.offerService(DataNode.java:663) [exec] [junit] at org.apache.hadoop.dfs.DataNode.run(DataNode.java:2888) [exec] [junit] at java.lang.Thread.run(Thread.java:619) [exec] [junit] [exec] [junit] 09/08/26 23:20:27 INFO mapReduceLayer.JobControlCompiler: Setting up single store job [exec] [junit] 09/08/26 23:20:27 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same. [exec] [junit] 09/08/26 23:20:27 INFO dfs.StateChange: BLOCK* NameSystem.allocateBlock: /tmp/hadoop-hudson/mapred/system/job_200908262319_0002/job.jar. blk_347023940785646545_1012 [exec] [junit] 09/08/26 23:20:27 INFO dfs.DataNode: Receiving block blk_347023940785646545_1012 src: /127.0.0.1:55940 dest: /127.0.0.1:43941 [exec] [junit] 09/08/26 23:20:27 INFO dfs.DataNode: Receiving block blk_347023940785646545_1012 src: /127.0.0.1:49064 dest: /127.0.0.1:36021 [exec] [junit] 09/08/26 23:20:27 INFO dfs.DataNode: Receiving block blk_347023940785646545_1012 src: /127.0.0.1:42096 dest: /127.0.0.1:43725 [exec] [junit] 09/08/26 23:20:27 INFO dfs.DataNode: Received block blk_347023940785646545_1012 of size 1498963 from /127.0.0.1 [exec] [junit] 09/08/26 23:20:27 INFO dfs.DataNode: PacketResponder 0 for block blk_347023940785646545_1012 terminating [exec] [junit] 09/08/26 23:20:27 INFO dfs.StateChange: BLOCK* NameSystem.addStoredBlock: blockMap updated: 127.0.0.1:43725 is added to blk_347023940785646545_1012 size 1498963 [exec] [junit] 09/08/26 23:20:27 INFO dfs.DataNode: Received block blk_347023940785646545_1012 of size 1498963 from /127.0.0.1 [exec] [junit] 09/08/26 23:20:27 INFO dfs.DataNode: PacketResponder 1 for block blk_347023940785646545_1012 terminating [exec] [junit] 09/08/26 23:20:27 INFO dfs.StateChange: BLOCK* 
NameSystem.addStoredBlock: blockMap updated: 127.0.0.1:36021 is added to blk_347023940785646545_1012 size 1498963 [exec] [junit] 09/08/26 23:20:27 INFO dfs.DataNode: Received block blk_347023940785646545_1012 of size 1498963 from /127.0.0.1 [exec] [junit] 09/08/26 23:20:27 INFO dfs.StateChange: BLOCK* NameSystem.addStoredBlock: blockMap updated: 127.0.0.1:43941 is added to blk_347023940785646545_1012 size 1498963 [exec] [junit] 09/08/26 23:20:27 INFO dfs.DataNode: PacketResponder 2 for block blk_347023940785646545_1012 terminating [exec] [junit] 09/08/26 23:20:27 INFO fs.FSNamesystem: Increasing replication for file /tmp/hadoop-hudson/mapred/system/job_200908262319_0002/job.jar. New replication is 2 [exec] [junit] 09/08/26 23:20:27 INFO fs.FSNamesystem: Reducing replication for file /tmp/hadoop-hudson/mapred/system/job_200908262319_0002/job.jar. New replication is 2 [exec] [junit] 09/08/26 23:20:27 INFO dfs.StateChange: BLOCK* NameS
[jira] Commented: (PIG-922) Logical optimizer: push up project
[ https://issues.apache.org/jira/browse/PIG-922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12748131#action_12748131 ]

Hadoop QA commented on PIG-922:
---

+1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12417760/PIG-922-p1_4.patch
against trunk revision 806668.

+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 3 new or modified tests.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
+1 findbugs. The patch does not introduce any new Findbugs warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
+1 core tests. The patch passed core unit tests.
+1 contrib tests. The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-minerva.apache.org/178/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-minerva.apache.org/178/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-minerva.apache.org/178/console

This message is automatically generated.

> Logical optimizer: push up project
> --
>
> Key: PIG-922
> URL: https://issues.apache.org/jira/browse/PIG-922
> Project: Pig
> Issue Type: New Feature
> Components: impl
> Affects Versions: 0.3.0
> Reporter: Daniel Dai
> Assignee: Daniel Dai
> Attachments: PIG-922-p1_0.patch, PIG-922-p1_1.patch, PIG-922-p1_2.patch, PIG-922-p1_3.patch, PIG-922-p1_4.patch
>
>
> This is a continuation of [PIG-697|https://issues.apache.org/jira/browse/PIG-697].
> We need to add another rule to the logical optimizer: push up project, i.e., prune columns as early as possible.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
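
To make "prune columns as early as possible" concrete: if the operators downstream of a load only reference a few fields, the projection can be applied right after the load so that later operators carry narrower tuples. The following is a toy sketch of that idea only. It is not Pig's logical optimizer; the class and method names are hypothetical, and tuples are modelled as plain lists.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.SortedSet;
import java.util.TreeSet;

public class PushUpProjectSketch {
    // Union of the column indexes referenced by each downstream operator.
    // Each int[] holds the columns one (hypothetical) operator reads.
    static SortedSet<Integer> requiredColumns(int[]... referencedPerOperator) {
        SortedSet<Integer> required = new TreeSet<>();
        for (int[] cols : referencedPerOperator) {
            for (int col : cols) {
                required.add(col);
            }
        }
        return required;
    }

    // The projection pushed up next to the load: keep only the required
    // columns, in ascending column order.
    static List<Object> prune(List<Object> tuple, SortedSet<Integer> required) {
        List<Object> out = new ArrayList<>();
        for (int col : required) {
            out.add(tuple.get(col));
        }
        return out;
    }

    public static void main(String[] args) {
        // A filter reads column 2; the final projection reads columns 0 and 2.
        SortedSet<Integer> required = requiredColumns(new int[] {2}, new int[] {0, 2});
        List<Object> loaded = Arrays.asList("alice", "unused", 42, "also unused");
        System.out.println(prune(loaded, required));
    }
}
```

In this sketch columns 1 and 3 are dropped before any other operator sees the tuple, which is the effect the proposed rule aims for.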
Hudson build is back to normal: Pig-Patch-minerva.apache.org #178
See http://hudson.zones.apache.org/hudson/job/Pig-Patch-minerva.apache.org/178/
[jira] Updated: (PIG-935) Skewed join throws an exception when used with map keys
[ https://issues.apache.org/jira/browse/PIG-935?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sriranjan Manjunath updated PIG-935:

    Status: Patch Available (was: Open)

The attached patch solves this issue.

> Skewed join throws an exception when used with map keys
> ---
>
> Key: PIG-935
> URL: https://issues.apache.org/jira/browse/PIG-935
> Project: Pig
> Issue Type: Bug
> Reporter: Sriranjan Manjunath
> Attachments: skjoinmapbug.patch
>
>
> Skewed join throws a runtime exception for the following query:
> A = load 'map.txt' as (e);
> B = load 'map.txt' as (f);
> C = join A by (chararray)e#'a', B by (chararray)f#'a' using "skewed";
> explain C;
> Exception:
> Caused by: java.lang.ClassCastException: org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POCast cannot be cast to org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POProject
>     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler.getSortCols(MRCompiler.java:1492)
>     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler.getSamplingJob(MRCompiler.java:1894)
>     ... 27 more

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Created: (PIG-935) Skewed join throws an exception when used with map keys
Skewed join throws an exception when used with map keys
---

Key: PIG-935
URL: https://issues.apache.org/jira/browse/PIG-935
Project: Pig
Issue Type: Bug
Reporter: Sriranjan Manjunath
Attachments: skjoinmapbug.patch

Skewed join throws a runtime exception for the following query:

A = load 'map.txt' as (e);
B = load 'map.txt' as (f);
C = join A by (chararray)e#'a', B by (chararray)f#'a' using "skewed";
explain C;

Exception:
Caused by: java.lang.ClassCastException: org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POCast cannot be cast to org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POProject
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler.getSortCols(MRCompiler.java:1492)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler.getSamplingJob(MRCompiler.java:1894)
    ... 27 more

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
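
The stack trace above says that MRCompiler.getSortCols found a POCast where it expected a POProject, which fits a join key written as a cast over a map lookup, (chararray)e#'a'. The sketch below reproduces that kind of failure in isolation. Expr, Project and Cast are hypothetical stand-ins, not Pig's operator classes, and the instanceof guard is only one possible way to avoid the exception; it is not necessarily what skjoinmapbug.patch does.

```java
import java.util.Optional;

public class SortColSketch {
    // Hypothetical stand-ins for expression operators; not Pig's classes.
    interface Expr {}

    static final class Project implements Expr {
        final int column;
        Project(int column) { this.column = column; }
    }

    static final class Cast implements Expr {
        final Expr input;
        Cast(Expr input) { this.input = input; }
    }

    // Mirrors the failing pattern: assumes every key expression is a plain
    // projection and downcasts without checking.
    static int unsafeSortCol(Expr key) {
        return ((Project) key).column; // ClassCastException when key is a Cast
    }

    // Guarded variant: yields a column only when the key is a bare projection.
    static Optional<Integer> safeSortCol(Expr key) {
        if (key instanceof Project) {
            return Optional.of(((Project) key).column);
        }
        return Optional.empty(); // the caller must handle other key shapes
    }

    public static void main(String[] args) {
        Expr plain = new Project(0);
        Expr casted = new Cast(new Project(0));
        System.out.println(safeSortCol(plain));
        System.out.println(safeSortCol(casted));
        try {
            unsafeSortCol(casted);
        } catch (ClassCastException e) {
            System.out.println("unchecked downcast failed: " + e.getClass().getSimpleName());
        }
    }
}
```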
[jira] Updated: (PIG-935) Skewed join throws an exception when used with map keys
[ https://issues.apache.org/jira/browse/PIG-935?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sriranjan Manjunath updated PIG-935:

    Attachment: skjoinmapbug.patch

> Skewed join throws an exception when used with map keys
> ---
>
> Key: PIG-935
> URL: https://issues.apache.org/jira/browse/PIG-935
> Project: Pig
> Issue Type: Bug
> Reporter: Sriranjan Manjunath
> Attachments: skjoinmapbug.patch
>
>
> Skewed join throws a runtime exception for the following query:
> A = load 'map.txt' as (e);
> B = load 'map.txt' as (f);
> C = join A by (chararray)e#'a', B by (chararray)f#'a' using "skewed";
> explain C;
> Exception:
> Caused by: java.lang.ClassCastException: org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POCast cannot be cast to org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POProject
>     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler.getSortCols(MRCompiler.java:1492)
>     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler.getSamplingJob(MRCompiler.java:1894)
>     ... 27 more

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Commented: (PIG-934) Merge join implementation currently does not seek to right point on the right side input based on the offset provided by the index
[ https://issues.apache.org/jira/browse/PIG-934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12748070#action_12748070 ]

Pradeep Kamath commented on PIG-934:

To get an idea of how this seeking happens for regular loads in map tasks, I looked at PigSlice.java. The seek happens in the init() code before bindTo():

{code}
public void init(DataStorage base) throws IOException {
    ..
    fsis = base.asElement(base.getActiveContainer(), file).sopen();
    fsis.seek(start, FLAGS.SEEK_CUR);
    end = start + getLength();
    if (file.endsWith(".bz") || file.endsWith(".bz2")) {
        is = new CBZip2InputStream(fsis, 9);
    } else if (file.endsWith(".gz")) {
        is = new GZIPInputStream(fsis);
        // We can't tell how much of the underlying stream GZIPInputStream
        // has actually consumed
        end = Long.MAX_VALUE;
    } else {
        is = fsis;
    }
    loader.bindTo(file.toString(), new BufferedPositionedInputStream(is, start), start, end);
}
{code}

I think we need a FileLocalizer.sOpenSingleFile() method which can return a SeekableInputStream, and we can use that in setUp() in POLoad. Something along the lines of:

{code}
static public InputStream open(String fileSpec, PigContext pigContext) throws IOException {
    fileSpec = checkDefaultPrefix(pigContext.getExecType(), fileSpec);
    if (!fileSpec.startsWith(LOCAL_PREFIX)) {
        init(pigContext);
        ElementDescriptor elem = pigContext.getDfs().asElement(fullPath(fileSpec, pigContext));
        return elem.sopen();
    } else {
        fileSpec = fileSpec.substring(LOCAL_PREFIX.length());
        //buffering because we only want buffered streams to be passed to load functions.
        /*return new BufferedInputStream(new FileInputStream(fileSpec));*/
        init(pigContext);
        ElementDescriptor elem = pigContext.getLfs().asElement(fullPath(fileSpec, pigContext));
        return elem.sopen();
    }
}
{code}

The above code would only work with single files and not directories, which should be OK for merge join. We should probably set this up with a new constructor in POLoad which also indicates that a single file is being processed.

> Merge join implementation currently does not seek to right point on the right side input based on the offset provided by the index
> --
>
> Key: PIG-934
> URL: https://issues.apache.org/jira/browse/PIG-934
> Project: Pig
> Issue Type: Bug
> Affects Versions: 0.3.1
> Reporter: Pradeep Kamath
>
> We use POLoad to seek into the right-side file, which has the following code:
> {noformat}
> public void setUp() throws IOException{
>     String filename = lFile.getFileName();
>     loader = (LoadFunc)PigContext.instantiateFuncFromSpec(lFile.getFuncSpec());
>     is = FileLocalizer.open(filename, pc);
>     loader.bindTo(filename , new BufferedPositionedInputStream(is), this.offset, Long.MAX_VALUE);
> }
> {noformat}
> Between opening the stream and bindTo we do not seek to the right offset.
> bindTo itself does not perform any seek.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
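
The underlying problem in this issue is that the stream is opened at position 0 and the loader is only told the offset, so reading still starts at the beginning of the file. A minimal sketch of "seek first, then hand the stream on" follows. It uses only java.io and java.nio rather than Pig's FileLocalizer, SeekableInputStream or LoadFunc, and the class and method names are hypothetical.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;
import java.nio.channels.Channels;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class SeekBeforeBindSketch {
    // Opens the file, seeks to the given byte offset, and returns the first
    // line read from that position (null if the offset is at end of file).
    static String firstLineFrom(Path file, long offset) {
        try (RandomAccessFile raf = new RandomAccessFile(file.toFile(), "r")) {
            raf.seek(offset); // the step missing between open and bindTo
            BufferedReader reader = new BufferedReader(new InputStreamReader(
                    Channels.newInputStream(raf.getChannel()), StandardCharsets.UTF_8));
            return reader.readLine();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Helper for the demo: writes the contents to a temporary file.
    static Path writeTempFile(String contents) {
        try {
            Path p = Files.createTempFile("seek-sketch", ".txt");
            Files.write(p, contents.getBytes(StandardCharsets.UTF_8));
            p.toFile().deleteOnExit();
            return p;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Each record is 4 bytes long, so record n starts at byte 4 * n.
        Path p = writeTempFile("a\t1\nb\t2\nc\t3\n");
        System.out.println(firstLineFrom(p, 0));
        System.out.println(firstLineFrom(p, 4));
    }
}
```

Without the seek call, every offset would return the first record, which is the symptom the issue describes for the right-side input of a merge join.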
[jira] Created: (PIG-934) Merge join implementation currently does not seek to right point on the right side input based on the offset provided by the index
Merge join implementation currently does not seek to right point on the right side input based on the offset provided by the index
--

Key: PIG-934
URL: https://issues.apache.org/jira/browse/PIG-934
Project: Pig
Issue Type: Bug
Affects Versions: 0.3.1
Reporter: Pradeep Kamath

We use POLoad to seek into the right-side file, which has the following code:

{noformat}
public void setUp() throws IOException{
    String filename = lFile.getFileName();
    loader = (LoadFunc)PigContext.instantiateFuncFromSpec(lFile.getFuncSpec());
    is = FileLocalizer.open(filename, pc);
    loader.bindTo(filename , new BufferedPositionedInputStream(is), this.offset, Long.MAX_VALUE);
}
{noformat}

Between opening the stream and bindTo we do not seek to the right offset. bindTo itself does not perform any seek.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Updated: (PIG-922) Logical optimizer: push up project
[ https://issues.apache.org/jira/browse/PIG-922?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daniel Dai updated PIG-922:
---

    Status: Open (was: Patch Available)

> Logical optimizer: push up project
> --
>
> Key: PIG-922
> URL: https://issues.apache.org/jira/browse/PIG-922
> Project: Pig
> Issue Type: New Feature
> Components: impl
> Affects Versions: 0.3.0
> Reporter: Daniel Dai
> Assignee: Daniel Dai
> Fix For: 0.4.0
>
> Attachments: PIG-922-p1_0.patch, PIG-922-p1_1.patch, PIG-922-p1_2.patch, PIG-922-p1_3.patch, PIG-922-p1_4.patch
>
>
> This is a continuation of [PIG-697|https://issues.apache.org/jira/browse/PIG-697].
> We need to add another rule to the logical optimizer: push up project, i.e., prune columns as early as possible.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Updated: (PIG-922) Logical optimizer: push up project
[ https://issues.apache.org/jira/browse/PIG-922?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daniel Dai updated PIG-922:
---

    Fix Version/s: (was: 0.4.0)
    Status: Patch Available (was: Open)

> Logical optimizer: push up project
> --
>
> Key: PIG-922
> URL: https://issues.apache.org/jira/browse/PIG-922
> Project: Pig
> Issue Type: New Feature
> Components: impl
> Affects Versions: 0.3.0
> Reporter: Daniel Dai
> Assignee: Daniel Dai
> Attachments: PIG-922-p1_0.patch, PIG-922-p1_1.patch, PIG-922-p1_2.patch, PIG-922-p1_3.patch, PIG-922-p1_4.patch
>
>
> This is a continuation of [PIG-697|https://issues.apache.org/jira/browse/PIG-697].
> We need to add another rule to the logical optimizer: push up project, i.e., prune columns as early as possible.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Updated: (PIG-922) Logical optimizer: push up project
[ https://issues.apache.org/jira/browse/PIG-922?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daniel Dai updated PIG-922:
---

    Attachment: PIG-922-p1_4.patch

Address Findbugs warnings.

> Logical optimizer: push up project
> --
>
> Key: PIG-922
> URL: https://issues.apache.org/jira/browse/PIG-922
> Project: Pig
> Issue Type: New Feature
> Components: impl
> Affects Versions: 0.3.0
> Reporter: Daniel Dai
> Assignee: Daniel Dai
> Attachments: PIG-922-p1_0.patch, PIG-922-p1_1.patch, PIG-922-p1_2.patch, PIG-922-p1_3.patch, PIG-922-p1_4.patch
>
>
> This is a continuation of [PIG-697|https://issues.apache.org/jira/browse/PIG-697].
> We need to add another rule to the logical optimizer: push up project, i.e., prune columns as early as possible.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
RE: pig trunk build
It's only the trunk build that I moved to h7 AND NOT THE PATCH BUILDS. This is not going to change anything from the dev point of view.

The trunk build URL remains the same:
http://hudson.zones.apache.org/hudson/view/Pig/job/Pig-trunk/

Patch builds still run on minerva, with the same URL as before. I will keep you posted as I move the patch builds.

> -Original Message-
> From: Pradeep Kamath [mailto:prade...@yahoo-inc.com]
> Sent: Wednesday, August 26, 2009 10:37 PM
> To: pig-dev@hadoop.apache.org
> Subject: RE: pig trunk build
>
> What is the URL for the Hudson UI? I tried
> http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net
> but that did not work.
>
> Pradeep
>
> -Original Message-
> From: Giridharan Kesavan [mailto:gkesa...@yahoo-inc.com]
> Sent: Wednesday, August 26, 2009 7:41 AM
> To: pig-dev@hadoop.apache.org
> Subject: pig trunk build
>
> Pig trunk build is now moved from minerva.apache.org to
> h7.grid.sp2.yahoo.net machine.
>
> tnx,
> Giri
RE: pig trunk build
What is the URL for the Hudson UI? I tried
http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net
but that did not work.

Pradeep

-Original Message-
From: Giridharan Kesavan [mailto:gkesa...@yahoo-inc.com]
Sent: Wednesday, August 26, 2009 7:41 AM
To: pig-dev@hadoop.apache.org
Subject: pig trunk build

Pig trunk build is now moved from minerva.apache.org to h7.grid.sp2.yahoo.net machine.

tnx,
Giri
[jira] Created: (PIG-933) broken link in pig-latin reference manual to hadoop file glob pattern documentation
broken link in pig-latin reference manual to hadoop file glob pattern documentation
---

Key: PIG-933
URL: https://issues.apache.org/jira/browse/PIG-933
Project: Pig
Issue Type: Bug
Components: documentation
Reporter: Thejas M Nair
Priority: Minor

http://wiki.apache.org/pig-data/attachments/FrontPage/attachments/plrm.htm#_LOAD has a link to
http://lucene.apache.org/hadoop/api/org/apache/hadoop/fs/FileSystem.html#globPaths(org.apache.hadoop.fs.Path)
It should instead link to
http://hadoop.apache.org/common/docs/current/api/org/apache/hadoop/fs/FileSystem.html#globStatus(org.apache.hadoop.fs.Path)

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
pig trunk build
Pig trunk build is now moved from minerva.apache.org to h7.grid.sp2.yahoo.net machine. tnx, Giri