Build failed in Hudson: Pig-trunk #633
See http://hudson.zones.apache.org/hudson/job/Pig-trunk/633/changes Changes: [gates] PIG-1098 Zebra Performance Optimizations. -- [...truncated 2691 lines...] ivy-init-dirs: ivy-probe-antlib: ivy-init-antlib: ivy-init: ivy-buildJar: [ivy:resolve] :: resolving dependencies :: org.apache.pig#Pig;2009-12-02_10-05-59 [ivy:resolve] confs: [buildJar] [ivy:resolve] found com.jcraft#jsch;0.1.38 in maven2 [ivy:resolve] found jline#jline;0.9.94 in maven2 [ivy:resolve] found net.java.dev.javacc#javacc;4.2 in maven2 [ivy:resolve] found junit#junit;4.5 in default [ivy:resolve] :: resolution report :: resolve 70ms :: artifacts dl 4ms - | |modules|| artifacts | | conf | number| search|dwnlded|evicted|| number|dwnlded| - | buildJar | 4 | 0 | 0 | 0 || 4 | 0 | - [ivy:retrieve] :: retrieving :: org.apache.pig#Pig [ivy:retrieve] confs: [buildJar] [ivy:retrieve] 1 artifacts copied, 3 already retrieved (288kB/4ms) buildJar: [echo] svnString 886097 [jar] Building jar: http://hudson.zones.apache.org/hudson/job/Pig-trunk/ws/trunk/build/pig-2009-12-02_10-05-59.jar [copy] Copying 1 file to http://hudson.zones.apache.org/hudson/job/Pig-trunk/ws/trunk jarWithOutSvn: findbugs: [mkdir] Created dir: http://hudson.zones.apache.org/hudson/job/Pig-trunk/ws/trunk/build/test/findbugs [findbugs] Executing findbugs from ant task [findbugs] Running FindBugs... 
[findbugs] The following classes needed for analysis were missing: [findbugs] com.jcraft.jsch.SocketFactory [findbugs] com.jcraft.jsch.Logger [findbugs] jline.Completor [findbugs] com.jcraft.jsch.Session [findbugs] com.jcraft.jsch.HostKeyRepository [findbugs] com.jcraft.jsch.JSch [findbugs] com.jcraft.jsch.UserInfo [findbugs] jline.ConsoleReaderInputStream [findbugs] com.jcraft.jsch.HostKey [findbugs] jline.ConsoleReader [findbugs] com.jcraft.jsch.ChannelExec [findbugs] jline.History [findbugs] com.jcraft.jsch.ChannelDirectTCPIP [findbugs] com.jcraft.jsch.JSchException [findbugs] com.jcraft.jsch.Channel [findbugs] Warnings generated: 20 [findbugs] Missing classes: 16 [findbugs] Calculating exit code... [findbugs] Setting 'missing class' flag (2) [findbugs] Setting 'bugs found' flag (1) [findbugs] Exit code set to: 3 [findbugs] Java Result: 3 [findbugs] Classes needed for analysis were missing [findbugs] Output saved to http://hudson.zones.apache.org/hudson/job/Pig-trunk/ws/trunk/build/test/findbugs/pig-findbugs-report.xml [xslt] Processing http://hudson.zones.apache.org/hudson/job/Pig-trunk/ws/trunk/build/test/findbugs/pig-findbugs-report.xml to http://hudson.zones.apache.org/hudson/job/Pig-trunk/ws/trunk/build/test/findbugs/pig-findbugs-report.html [xslt] Loading stylesheet /homes/gkesavan/tools/findbugs/latest/src/xsl/default.xsl BUILD SUCCESSFUL Total time: 2 minutes 55 seconds + mv build/pig-2009-12-02_10-05-59.tar.gz http://hudson.zones.apache.org/hudson/job/Pig-trunk/ws/trunk + mv build/test/findbugs http://hudson.zones.apache.org/hudson/job/Pig-trunk/ws/trunk + mv build/docs/api http://hudson.zones.apache.org/hudson/job/Pig-trunk/ws/trunk + /homes/hudson/tools/ant/apache-ant-1.7.0/bin/ant clean Buildfile: build.xml clean: [delete] Deleting directory http://hudson.zones.apache.org/hudson/job/Pig-trunk/ws/trunk/src-gen [delete] Deleting directory http://hudson.zones.apache.org/hudson/job/Pig-trunk/ws/trunk/src/docs/build [delete] Deleting directory 
http://hudson.zones.apache.org/hudson/job/Pig-trunk/ws/trunk/build [delete] Deleting directory http://hudson.zones.apache.org/hudson/job/Pig-trunk/ws/trunk/test/org/apache/pig/test/utils/dotGraph/parser BUILD SUCCESSFUL Total time: 0 seconds + /homes/hudson/tools/ant/apache-ant-1.7.0/bin/ant -Dtest.junit.output.format=xml -Dtest.output=yes -Dcheckstyle.home=/homes/hudson/tools/checkstyle/latest -Drun.clover=true -Dclover.home=/homes/hudson/tools/clover/latest clover test generate-clover-reports Buildfile: build.xml clover.setup: [mkdir] Created dir: http://hudson.zones.apache.org/hudson/job/Pig-trunk/ws/trunk/build/test/clover/db [clover-setup] Clover Version 2.4.3, built on March 09 2009 (build-756) [clover-setup] Loaded from: /homes/hudson/tools/clover/latest/lib/clover.jar [clover-setup] Clover: Open Source License registered to Apache. [clover-setup] Clover is enabled with initstring 'http://hudson.zones.apache.org/hudson/job/Pig-trunk/ws/trunk/build/test/clover/db/pig_coverage.db' clover.info: clover: test: ivy-download: [get] Getting:
[jira] Commented: (PIG-965) PERFORMANCE: optimize common case in matches (PORegex)
[ https://issues.apache.org/jira/browse/PIG-965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12784849#action_12784849 ]

Thejas M Nair commented on PIG-965:
---

In the above performance numbers, I assume optimization 2 (custom string comparison) is used only for the regex .*ABCD.*, while optimization 1 (re-using the compiled pattern) is used with dk.brics.automaton as well. Can you please confirm?

From the performance numbers, it looks like we don't need optimization 2. We can just use dk.brics.automaton for the common regexes as well and keep the Pig code simpler.

PERFORMANCE: optimize common case in matches (PORegex)
--
Key: PIG-965
URL: https://issues.apache.org/jira/browse/PIG-965
Project: Pig
Issue Type: Improvement
Components: impl
Reporter: Thejas M Nair
Assignee: Ankit Modi

Some frequently seen use cases of the 'matches' comparison operator have the following properties:
1. The rhs is a constant string, e.g. c1 matches 'abc%'
2. Regexes that look for a matching prefix, suffix, etc. are very common, e.g. 'abc%', '%abc', '%abc%'

To optimize for these common cases, PORegex.java can be changed to:
1. Compile the pattern (rhs of matches) and re-use it if the pattern string has not changed.
2. Use string comparisons for simple common regexes (in 2 above).

The implementation of the Hive like clause uses similar optimizations.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
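For readers following the PIG-965 thread, the two optimizations under discussion can be sketched roughly as below. This is only an illustration under stated assumptions: the class MatchesSketch and its methods are hypothetical, it is not the PORegex code or the attached patch, and it uses java.util.regex rather than dk.brics.automaton. The fast path handles only the ".*LITERAL.*" shape mentioned in the comment and falls back to the regex engine whenever the shortcut might change the result.

```java
import java.util.regex.Pattern;

/** Hypothetical sketch of the two PIG-965 optimizations; not the actual PORegex code. */
final class MatchesSketch {
    // Optimization 1: remember the last compiled pattern and its source string.
    private String cachedRegex;
    private Pattern cachedPattern;

    /** Recompiles only when the pattern string differs from the previous call. */
    boolean matchesCached(String input, String regex) {
        if (!regex.equals(cachedRegex)) {
            cachedPattern = Pattern.compile(regex);
            cachedRegex = regex;
        }
        return cachedPattern.matcher(input).matches();
    }

    /** Optimization 2: answer ".*LITERAL.*" with a substring search when that is safe. */
    static boolean matchesWithFastPath(String input, String regex) {
        if (regex.length() >= 4 && regex.startsWith(".*") && regex.endsWith(".*")) {
            String literal = regex.substring(2, regex.length() - 2);
            // '.' does not match line terminators by default, so inputs that
            // contain one are left to the regex engine to keep results identical.
            if (isPlainLiteral(literal) && !hasLineTerminator(input)) {
                return input.contains(literal);
            }
        }
        return Pattern.matches(regex, input);
    }

    /** True when every character is a letter or digit, i.e. has no regex meaning. */
    private static boolean isPlainLiteral(String s) {
        for (int i = 0; i < s.length(); i++) {
            if (!Character.isLetterOrDigit(s.charAt(i))) {
                return false;
            }
        }
        return true;
    }

    /** True when the input contains a character that '.' refuses to match by default. */
    private static boolean hasLineTerminator(String s) {
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == '\n' || c == '\r' || c == '\u0085' || c == '\u2028' || c == '\u2029') {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        MatchesSketch sketch = new MatchesSketch();
        System.out.println(sketch.matchesCached("xxABCDyy", ".*ABCD.*"));
        System.out.println(matchesWithFastPath("xxABCDyy", ".*ABCD.*"));
    }
}
```

The extra checks in the fast path are the kind of special-casing the comment suggests avoiding if an automaton-based matcher is already fast enough for these patterns.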
[jira] Commented: (PIG-922) Logical optimizer: push up project
[ https://issues.apache.org/jira/browse/PIG-922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12784879#action_12784879 ]

Hadoop QA commented on PIG-922:
---

-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12426641/PIG-922-p3_13.patch
against trunk revision 886015.

+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 60 new or modified tests.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
+1 findbugs. The patch does not introduce any new Findbugs warnings.
-1 release audit. The applied patch generated 368 release audit warnings (more than the trunk's current 362 warnings).
+1 core tests. The patch passed core unit tests.
+1 contrib tests. The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/75/testReport/
Release audit warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/75/artifact/trunk/patchprocess/releaseAuditDiffWarnings.txt
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/75/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/75/console

This message is automatically generated.

Logical optimizer: push up project
--
Key: PIG-922
URL: https://issues.apache.org/jira/browse/PIG-922
Project: Pig
Issue Type: New Feature
Components: impl
Affects Versions: 0.3.0
Reporter: Daniel Dai
Assignee: Daniel Dai
Fix For: 0.6.0
Attachments: PIG-922-p1_0.patch, PIG-922-p1_1.patch, PIG-922-p1_2.patch, PIG-922-p1_3.patch, PIG-922-p1_4.patch, PIG-922-p2_preview.patch, PIG-922-p2_preview2.patch, PIG-922-p3_1.patch, PIG-922-p3_10.patch, PIG-922-p3_11.patch, PIG-922-p3_12.patch, PIG-922-p3_13.patch, PIG-922-p3_2.patch, PIG-922-p3_3.patch, PIG-922-p3_4.patch, PIG-922-p3_5.patch, PIG-922-p3_6.patch, PIG-922-p3_7.patch, PIG-922-p3_8.patch, PIG-922-p3_9.patch

This is a continuation of the work in [PIG-697|https://issues.apache.org/jira/browse/PIG-697]. We need to add another rule to the logical optimizer: push up project, i.e., prune columns as early as possible.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Commented: (PIG-966) Proposed rework for LoadFunc, StoreFunc, and Slice/r interfaces
[ https://issues.apache.org/jira/browse/PIG-966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12784901#action_12784901 ]

Dmitriy V. Ryaboy commented on PIG-966:
---

Quick question: I don't remember if we've gone over this before -- why is the sortedness information considered part of the schema? Shouldn't it be part of the statistics?

Proposed rework for LoadFunc, StoreFunc, and Slice/r interfaces
---
Key: PIG-966
URL: https://issues.apache.org/jira/browse/PIG-966
Project: Pig
Issue Type: Improvement
Components: impl
Reporter: Alan Gates
Assignee: Alan Gates

I propose that we rework the LoadFunc, StoreFunc, and Slice/r interfaces significantly. See http://wiki.apache.org/pig/LoadStoreRedesignProposal for full details.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Commented: (PIG-966) Proposed rework for LoadFunc, StoreFunc, and Slice/r interfaces
[ https://issues.apache.org/jira/browse/PIG-966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12784931#action_12784931 ]

Alan Gates commented on PIG-966:

You can make an argument for putting it in either place. I argue for putting it in the schema for a couple of reasons:
1. It is useful to a large number of potential optimizations.
2. Unlike most other statistics, it can be used in correctness checks (e.g. the user asked for a merge join; is the data sorted on the join key?).

The only downside I can see is that some systems that understand column names and types won't necessarily understand sortedness (like JSON). But it's no harder for the loader to figure out sortedness for the schema than it is for the statistics.

Proposed rework for LoadFunc, StoreFunc, and Slice/r interfaces
---
Key: PIG-966
URL: https://issues.apache.org/jira/browse/PIG-966
Project: Pig
Issue Type: Improvement
Components: impl
Reporter: Alan Gates
Assignee: Alan Gates

I propose that we rework the LoadFunc, StoreFunc, and Slice/r interfaces significantly. See http://wiki.apache.org/pig/LoadStoreRedesignProposal for full details.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
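The correctness check mentioned in this comment (a merge join requested on data that may or may not be sorted on the join key) might look something like the sketch below. Everything here is hypothetical: SchemaSketch is not Pig's schema class and is not taken from the LoadStoreRedesignProposal. The sketch also makes the simplifying assumption that a merge join is acceptable only when the join keys form a leading prefix of the declared sort keys, and it ignores sort direction.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical illustration of using sortedness from a schema as a correctness check. */
final class SortednessCheckSketch {

    /** Hypothetical schema: column names plus the columns the data is sorted on, in order. */
    static final class SchemaSketch {
        final List<String> columns;
        final List<String> sortKeys;

        SchemaSketch(List<String> columns, List<String> sortKeys) {
            this.columns = columns;
            this.sortKeys = sortKeys;
        }
    }

    /**
     * Returns true when the join keys are a leading prefix of the sort keys,
     * which is the simplifying condition this sketch uses for "sorted on the join key".
     */
    static boolean canMergeJoin(SchemaSketch schema, List<String> joinKeys) {
        if (joinKeys.isEmpty() || joinKeys.size() > schema.sortKeys.size()) {
            return false;
        }
        return schema.sortKeys.subList(0, joinKeys.size()).equals(joinKeys);
    }

    public static void main(String[] args) {
        SchemaSketch schema = new SchemaSketch(
                Arrays.asList("id", "ts", "name"),
                Arrays.asList("id", "ts"));
        System.out.println(canMergeJoin(schema, Arrays.asList("id")));
        System.out.println(canMergeJoin(schema, Arrays.asList("name")));
    }
}
```

A check like this has to reject the plan rather than merely choose a slower one, which is why the comment distinguishes sortedness from ordinary statistics.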
[jira] Updated: (PIG-1068) COGROUP fails with 'Type mismatch in key from map: expected org.apache.pig.impl.io.NullableText, recieved org.apache.pig.impl.io.NullableTuple'
[ https://issues.apache.org/jira/browse/PIG-1068?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Richard Ding updated PIG-1068:
--
Attachment: PIG-1068.patch

This patch fixes the problem by moving the unwrapping logic from the demuxer to the packager.

COGROUP fails with 'Type mismatch in key from map: expected org.apache.pig.impl.io.NullableText, recieved org.apache.pig.impl.io.NullableTuple'
---
Key: PIG-1068
URL: https://issues.apache.org/jira/browse/PIG-1068
Project: Pig
Issue Type: Bug
Affects Versions: 0.4.0
Reporter: Vikram Oberoi
Assignee: Richard Ding
Fix For: 0.6.0
Attachments: cogroup-bug.pig, log, PIG-1068.patch

The COGROUP in the following script fails in its map:

{code}
logs = LOAD '$LOGS' USING PigStorage() AS (ts:int, id:chararray, command:chararray, comments:chararray);

SPLIT logs INTO logins IF command == 'login', all_quits IF command == 'quit';

-- Project login clients and count them by ID.
login_info = FOREACH logins { GENERATE id as id, comments AS client; };
logins_grouped = GROUP login_info BY (id, client);
count_logins_by_client = FOREACH logins_grouped { generate group.id AS id, group.client AS client, COUNT($1) AS count; }

-- Get the first quit.
all_quits_grouped = GROUP all_quits BY id;
[jira] Updated: (PIG-1068) COGROUP fails with 'Type mismatch in key from map: expected org.apache.pig.impl.io.NullableText, recieved org.apache.pig.impl.io.NullableTuple'
[ https://issues.apache.org/jira/browse/PIG-1068?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Richard Ding updated PIG-1068:
--
Status: Patch Available (was: Open)

COGROUP fails with 'Type mismatch in key from map: expected org.apache.pig.impl.io.NullableText, recieved org.apache.pig.impl.io.NullableTuple'
---
Key: PIG-1068
URL: https://issues.apache.org/jira/browse/PIG-1068
Project: Pig
Issue Type: Bug
Affects Versions: 0.4.0
Reporter: Vikram Oberoi
Assignee: Richard Ding
Fix For: 0.6.0
Attachments: cogroup-bug.pig, log, PIG-1068.patch

The COGROUP in the following script fails in its map:

{code}
logs = LOAD '$LOGS' USING PigStorage() AS (ts:int, id:chararray, command:chararray, comments:chararray);

SPLIT logs INTO logins IF command == 'login', all_quits IF command == 'quit';

-- Project login clients and count them by ID.
login_info = FOREACH logins { GENERATE id as id, comments AS client; };
logins_grouped = GROUP login_info BY (id, client);
count_logins_by_client = FOREACH logins_grouped { generate group.id AS id, group.client AS client, COUNT($1) AS count; }

-- Get the first quit.
all_quits_grouped = GROUP all_quits BY id;
quits = FOREACH
[jira] Updated: (PIG-1111) [Zebra] multiple outputs support
[ https://issues.apache.org/jira/browse/PIG-?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gaurav Jain updated PIG-:
-
Affects Version/s: 0.7.0
                   0.6.0
Status: Patch Available (was: Open)

Please review and provide feedback at your earliest convenience.

[Zebra] multiple outputs support

Key: PIG-
URL: https://issues.apache.org/jira/browse/PIG-
Project: Pig
Issue Type: New Feature
Affects Versions: 0.6.0, 0.7.0
Reporter: Gaurav Jain
Assignee: Gaurav Jain
Fix For: 0.6.0, 0.7.0
Attachments: PIG-.patch

Zebra enables an application to stream data into different Zebra table instances.

New interface added:
setMultipleOutputs(JobConf jobconf, String commaSeparatedLocation, Class<? extends ZebraOutputPartitioner> theClass)

Zebra maintains a list of table instances based on commaSeparatedLocation (in that order). The ZebraOutputPartitioner interface has a getOutputPartition method, which is implemented by the application. It returns an index into the list, and Zebra writes to that instance.

We also introduce a new mapred property for setting multiple outputs: mapred.lib.table.multi.output.dirs

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
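The routing contract described in this issue (an ordered list of locations, and an application-supplied method that returns an index into it) can be pictured with the self-contained sketch below. It deliberately avoids the Hadoop and Zebra classes: the nested Partitioner interface is a hypothetical stand-in for ZebraOutputPartitioner, and its method signature is an assumption, since the issue text does not show the real one.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical illustration of index-based routing to multiple outputs. */
final class MultiOutputSketch {

    /** Stand-in for ZebraOutputPartitioner; the real method signature is not shown in the issue. */
    interface Partitioner {
        int getOutputPartition(String record);
    }

    /** Splits the comma-separated locations, preserving their order. */
    static List<String> parseLocations(String commaSeparatedLocation) {
        return Arrays.asList(commaSeparatedLocation.trim().split("\\s*,\\s*"));
    }

    /** Picks the output location for a record using the index returned by the partitioner. */
    static String route(String record, List<String> locations, Partitioner partitioner) {
        int index = partitioner.getOutputPartition(record);
        if (index < 0 || index >= locations.size()) {
            throw new IllegalArgumentException("partition index out of range: " + index);
        }
        return locations.get(index);
    }

    public static void main(String[] args) {
        // The paths and the "us" prefix rule are made-up example values.
        List<String> locations = parseLocations("/out/us, /out/other");
        Partitioner byPrefix = record -> record.startsWith("us") ? 0 : 1;
        System.out.println(route("us,42", locations, byPrefix));
        System.out.println(route("fr,7", locations, byPrefix));
    }
}
```

Because the index refers to a position in the list, the order of the comma-separated locations is part of the contract between the job setup and the partitioner.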
[jira] Updated: (PIG-1118) expression with aggregate functions returning null, with accumulate interface
[ https://issues.apache.org/jira/browse/PIG-1118?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ying He updated PIG-1118:
-
Attachment: PIG_1118.patch

bug fix.

expression with aggregate functions returning null, with accumulate interface
-
Key: PIG-1118
URL: https://issues.apache.org/jira/browse/PIG-1118
Project: Pig
Issue Type: Bug
Reporter: Thejas M Nair
Assignee: Ying He
Fix For: 0.7.0
Attachments: PIG_1118.patch

The problem is in trunk. It works fine in the 0.6 branch.

l = load '/tmp/students.txt' as (a : chararray,b : chararray,c : int);
grunt> g = group l by 1;
grunt> dump g;
(1,{(asdfxc,M,23),(qwer,F,21),(uhsdf,M,34),(zxldf,M,21),(qwer,F,23),(oiue,M,54)})
grunt> f = foreach g generate SUM(l.c), 1 + SUM(l.c) + SUM(l.c);
grunt> dump f;
(176L,)

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
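As a cross-check on the transcript in this report, the expected value of the second field can be worked out outside Pig. The class and method names below are hypothetical and only mirror the two expressions in the script; this is not Pig code. With the six values of c shown in the dump of g, SUM(l.c) is 176, so 1 + SUM(l.c) + SUM(l.c) should come out as 353 rather than the empty (null) field shown in the output.

```java
/** Hypothetical arithmetic check for the PIG-1118 example; not Pig code. */
final class SumExpressionSketch {

    /** Values of column c, copied from the dump of g in the report. */
    static final int[] C_VALUES = {23, 21, 34, 21, 23, 54};

    /** Mirrors SUM(l.c): an int column summed into a long. */
    static long sum(int[] values) {
        long total = 0L;
        for (int value : values) {
            total += value;
        }
        return total;
    }

    /** Mirrors the second generated expression: 1 + SUM(l.c) + SUM(l.c). */
    static long expression(int[] values) {
        return 1 + sum(values) + sum(values);
    }

    public static void main(String[] args) {
        System.out.println(sum(C_VALUES));        // prints 176
        System.out.println(expression(C_VALUES)); // prints 353
    }
}
```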
[jira] Commented: (PIG-922) Logical optimizer: push up project
[ https://issues.apache.org/jira/browse/PIG-922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12784985#action_12784985 ]

Pradeep Kamath commented on PIG-922:

I reviewed the changes to pass the load signature to the slicer/Slice and PigStorage for column pruning to work on the backend - the changes look good. The one change I wasn't clear about was the use of signature in order by since currently LOSort's alias is used as the signature and that would not be useful to the Slicer/slice or PigStorage in the backend since they would expect the LOLoad's alias.

Logical optimizer: push up project
--
Key: PIG-922
URL: https://issues.apache.org/jira/browse/PIG-922
Project: Pig
Issue Type: New Feature
Components: impl
Affects Versions: 0.3.0
Reporter: Daniel Dai
Assignee: Daniel Dai
Fix For: 0.6.0
Attachments: PIG-922-p1_0.patch, PIG-922-p1_1.patch, PIG-922-p1_2.patch, PIG-922-p1_3.patch, PIG-922-p1_4.patch, PIG-922-p2_preview.patch, PIG-922-p2_preview2.patch, PIG-922-p3_1.patch, PIG-922-p3_10.patch, PIG-922-p3_11.patch, PIG-922-p3_12.patch, PIG-922-p3_13.patch, PIG-922-p3_2.patch, PIG-922-p3_3.patch, PIG-922-p3_4.patch, PIG-922-p3_5.patch, PIG-922-p3_6.patch, PIG-922-p3_7.patch, PIG-922-p3_8.patch, PIG-922-p3_9.patch

This is a continuation work of [PIG-697|https://issues.apache.org/jira/browse/PIG-697]. We need to add another rule to the logical optimizer: Push up project, ie, prune columns as early as possible.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Commented: (PIG-1086) Nested sort by * throw exception
[ https://issues.apache.org/jira/browse/PIG-1086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12784992#action_12784992 ]

Richard Ding commented on PIG-1086:
---

The script works if the input has no schema:

{code}
A = load '1.txt';
B = group A by a0;
C = foreach B { D = order A by *; generate group, D;};
explain C;
{code}

Nested sort by * throw exception

Key: PIG-1086
URL: https://issues.apache.org/jira/browse/PIG-1086
Project: Pig
Issue Type: Bug
Affects Versions: 0.5.0
Reporter: Daniel Dai

The following script fails:

A = load '1.txt' as (a0, a1, a2);
B = group A by a0;
C = foreach B { D = order A by *; generate group, D;};
explain C;

Here is the stack:

Caused by: java.lang.ArrayIndexOutOfBoundsException: -1
    at java.util.ArrayList.get(ArrayList.java:324)
    at org.apache.pig.impl.logicalLayer.schema.Schema.getField(Schema.java:752)
    at org.apache.pig.impl.logicalLayer.LOSort.getSortInfo(LOSort.java:332)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.LogToPhyTranslationVisitor.visit(LogToPhyTranslationVisitor.java:1365)
    at org.apache.pig.impl.logicalLayer.LOSort.visit(LOSort.java:176)
    at org.apache.pig.impl.logicalLayer.LOSort.visit(LOSort.java:43)
    at org.apache.pig.impl.plan.DependencyOrderWalkerWOSeenChk.walk(DependencyOrderWalkerWOSeenChk.java:69)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.LogToPhyTranslationVisitor.visit(LogToPhyTranslationVisitor.java:1274)
    at org.apache.pig.impl.logicalLayer.LOForEach.visit(LOForEach.java:130)
    at org.apache.pig.impl.logicalLayer.LOForEach.visit(LOForEach.java:45)
    at org.apache.pig.impl.plan.DependencyOrderWalker.walk(DependencyOrderWalker.java:69)
    at org.apache.pig.impl.plan.PlanVisitor.visit(PlanVisitor.java:51)
    at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.compile(HExecutionEngine.java:234)
    at org.apache.pig.PigServer.compilePp(PigServer.java:864)
    at org.apache.pig.PigServer.explain(PigServer.java:583)
    ... 8 more

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Updated: (PIG-1111) [Zebra] multiple outputs support
[ https://issues.apache.org/jira/browse/PIG-?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gaurav Jain updated PIG-:
-
Status: Open (was: Patch Available)

Submitting an update

[Zebra] multiple outputs support

Key: PIG-
URL: https://issues.apache.org/jira/browse/PIG-
Project: Pig
Issue Type: New Feature
Affects Versions: 0.6.0, 0.7.0
Reporter: Gaurav Jain
Assignee: Gaurav Jain
Fix For: 0.6.0, 0.7.0
Attachments: PIG-.patch

Zebra enables an application to stream data into different Zebra table instances.

New interface added:
setMultipleOutputs(JobConf jobconf, String commaSeparatedLocation, Class<? extends ZebraOutputPartitioner> theClass)

Zebra maintains a list of table instances based on commaSeparatedLocation (in that order). The ZebraOutputPartitioner interface has a getOutputPartition method, which is implemented by the application. It returns an index into the list, and Zebra writes to that instance.

We also introduce a new mapred property for setting multiple outputs: mapred.lib.table.multi.output.dirs

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Commented: (PIG-1118) expression with aggregate functions returning null, with accumulate interface
[ https://issues.apache.org/jira/browse/PIG-1118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12785025#action_12785025 ]

Olga Natkovich commented on PIG-1118:
-

Ying, the change looks good. Please add a unit test for this bug.

expression with aggregate functions returning null, with accumulate interface
-
Key: PIG-1118
URL: https://issues.apache.org/jira/browse/PIG-1118
Project: Pig
Issue Type: Bug
Reporter: Thejas M Nair
Assignee: Ying He
Fix For: 0.7.0
Attachments: PIG_1118.patch

The problem is in trunk. It works fine in the 0.6 branch.

l = load '/tmp/students.txt' as (a : chararray,b : chararray,c : int);
grunt> g = group l by 1;
grunt> dump g;
(1,{(asdfxc,M,23),(qwer,F,21),(uhsdf,M,34),(zxldf,M,21),(qwer,F,23),(oiue,M,54)})
grunt> f = foreach g generate SUM(l.c), 1 + SUM(l.c) + SUM(l.c);
grunt> dump f;
(176L,)

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Updated: (PIG-1118) expression with aggregate functions returning null, with accumulate interface
[ https://issues.apache.org/jira/browse/PIG-1118?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Olga Natkovich updated PIG-1118:

Status: Open (was: Patch Available)

expression with aggregate functions returning null, with accumulate interface
-
Key: PIG-1118
URL: https://issues.apache.org/jira/browse/PIG-1118
Project: Pig
Issue Type: Bug
Reporter: Thejas M Nair
Assignee: Ying He
Fix For: 0.7.0
Attachments: PIG_1118.patch

The problem is in trunk. It works fine in the 0.6 branch.

l = load '/tmp/students.txt' as (a : chararray,b : chararray,c : int);
grunt> g = group l by 1;
grunt> dump g;
(1,{(asdfxc,M,23),(qwer,F,21),(uhsdf,M,34),(zxldf,M,21),(qwer,F,23),(oiue,M,54)})
grunt> f = foreach g generate SUM(l.c), 1 + SUM(l.c) + SUM(l.c);
grunt> dump f;
(176L,)

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Commented: (PIG-1068) COGROUP fails with 'Type mismatch in key from map: expected org.apache.pig.impl.io.NullableText, recieved org.apache.pig.impl.io.NullableTuple'
[ https://issues.apache.org/jira/browse/PIG-1068?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12785029#action_12785029 ]

Hadoop QA commented on PIG-1068:

+1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12426691/PIG-1068.patch
against trunk revision 886015.

+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 3 new or modified tests.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
+1 findbugs. The patch does not introduce any new Findbugs warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
+1 core tests. The patch passed core unit tests.
+1 contrib tests. The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/76/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/76/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/76/console

This message is automatically generated.

COGROUP fails with 'Type mismatch in key from map: expected org.apache.pig.impl.io.NullableText, recieved org.apache.pig.impl.io.NullableTuple'
---
Key: PIG-1068
URL: https://issues.apache.org/jira/browse/PIG-1068
Project: Pig
Issue Type: Bug
Affects Versions: 0.4.0
Reporter: Vikram Oberoi
Assignee: Richard Ding
Fix For: 0.6.0
Attachments: cogroup-bug.pig, log, PIG-1068.patch

The COGROUP in the following script fails in its map:

{code}
logs = LOAD '$LOGS' USING PigStorage() AS (ts:int, id:chararray, command:chararray, comments:chararray);

SPLIT logs INTO logins IF command == 'login', all_quits IF command == 'quit';

-- Project login clients and count them by ID.
login_info = FOREACH logins { GENERATE id as id, comments AS client; };
logins_grouped = GROUP login_info BY (id, client);
count_logins_by_client = FOREACH logins_grouped { generate group.id AS id, group.client AS client, COUNT($1) AS count;
[jira] Commented: (PIG-1118) expression with aggregate functions returning null, with accumulate interface
[ https://issues.apache.org/jira/browse/PIG-1118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12785043#action_12785043 ]

Ying He commented on PIG-1118:
--

Olga, thanks for the review. A unit test is in the patch: TestAccumulator.

expression with aggregate functions returning null, with accumulate interface
-
Key: PIG-1118
URL: https://issues.apache.org/jira/browse/PIG-1118
Project: Pig
Issue Type: Bug
Reporter: Thejas M Nair
Assignee: Ying He
Fix For: 0.7.0
Attachments: PIG_1118.patch

The problem is in trunk. It works fine in the 0.6 branch.

l = load '/tmp/students.txt' as (a : chararray,b : chararray,c : int);
grunt> g = group l by 1;
grunt> dump g;
(1,{(asdfxc,M,23),(qwer,F,21),(uhsdf,M,34),(zxldf,M,21),(qwer,F,23),(oiue,M,54)})
grunt> f = foreach g generate SUM(l.c), 1 + SUM(l.c) + SUM(l.c);
grunt> dump f;
(176L,)

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Created: (PIG-1120) [zebra] should support using org.apache.hadoop.zebra.pig.TableStorer() if user does not want to specify storage hint
[zebra] should support using org.apache.hadoop.zebra.pig.TableStorer() if user does not want to specify storage hint
-
Key: PIG-1120
URL: https://issues.apache.org/jira/browse/PIG-1120
Project: Pig
Issue Type: Bug
Affects Versions: 0.6.0
Reporter: Jing Huang
Fix For: 0.6.0

If the user doesn't want to specify a storage hint, the current Zebra implementation only supports using org.apache.hadoop.zebra.pig.TableStorer(''). Note the empty string in TableStorer('').

We should support using org.apache.hadoop.zebra.pig.TableStorer(), as we do for org.apache.hadoop.zebra.pig.TableLoader().

Sample pig script:

register /grid/0/dev/hadoopqa/jars/zebra.jar;
a = load '1.txt' as (a:int, b:float,c:long,d:double,e:chararray,f:bytearray,r1(f1:chararray,f2:chararray),m1:map[]);
b = load '2.txt' as (a:int, b:float,c:long,d:double,e:chararray,f:bytearray,r1(f1:chararray,f2:chararray),m1:map[]);
c = join a by a, b by a;
d = foreach c generate a::a, a::b, b::c;
describe d;
dump d;
store d into 'join3' using org.apache.hadoop.zebra.pig.TableStorer('');
--this will fail
--store d into 'join3' using org.apache.hadoop.zebra.pig.TableStorer( );

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Updated: (PIG-1118) expression with aggregate functions returning null, with accumulate interface
[ https://issues.apache.org/jira/browse/PIG-1118?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Olga Natkovich updated PIG-1118:

Status: Patch Available (was: Open)

Thanks, I am not sure how I missed it :)

expression with aggregate functions returning null, with accumulate interface
-
Key: PIG-1118
URL: https://issues.apache.org/jira/browse/PIG-1118
Project: Pig
Issue Type: Bug
Reporter: Thejas M Nair
Assignee: Ying He
Fix For: 0.7.0
Attachments: PIG_1118.patch

The problem is in trunk. It works fine in the 0.6 branch.

l = load '/tmp/students.txt' as (a : chararray,b : chararray,c : int);
grunt> g = group l by 1;
grunt> dump g;
(1,{(asdfxc,M,23),(qwer,F,21),(uhsdf,M,34),(zxldf,M,21),(qwer,F,23),(oiue,M,54)})
grunt> f = foreach g generate SUM(l.c), 1 + SUM(l.c) + SUM(l.c);
grunt> dump f;
(176L,)

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Created: (PIG-1121) [zebre] zebra user forces pig script to have 'as xxx' in foreach statement in order to be able to store successfully
[zebre] zebra user forces pig script to have 'as xxx' in foreach statement in order to be able to store successfully

Key: PIG-1121
URL: https://issues.apache.org/jira/browse/PIG-1121
Project: Pig
Issue Type: Bug
Affects Versions: 0.6.0
Reporter: Jing Huang
Fix For: 0.6.0

In the following pig script, if the user does

b = foreach a generate m1#'a';

then describe b shows:

b: {bytearray}

The Zebra store will fail, since no name is passed to Zebra, and Zebra needs not only the type but also the name in order to store.

If the user does

b = foreach a generate m1#'a' as ms1;

then describe b shows:

b: {ms1: bytearray}

and the Zebra store succeeds.

Here is the full pig script:

register /grid/0/dev/hadoopqa/jars/zebra.jar;
a = load '1.txt' as (a:int, b:float,c:long,d:double,e:chararray,f:bytearray,r1(f1:chararray,f2:chararray),m1:map[]);
b = foreach a generate m1#'a' as ms1;
describe b;
store b into 'map1' using org.apache.hadoop.zebra.pig.TableStorer('');

So, we should either fix it or document it.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Created: (PIG-1122) [zebra] Zebra build.xml still uses 0.6 version
[zebra] Zebra build.xml still uses 0.6 version
--
Key: PIG-1122
URL: https://issues.apache.org/jira/browse/PIG-1122
Project: Pig
Issue Type: Bug
Affects Versions: 0.7.0
Reporter: Yan Zhou
Assignee: Yan Zhou
Fix For: 0.7.0

Zebra still uses pig-0.6.0-dev-core.jar in build-contrib.xml. It should be changed to pig-0.7.0-dev-core.jar on APACHE trunk only.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Updated: (PIG-1122) [zebra] Zebra build.xml still uses 0.6 version
[ https://issues.apache.org/jira/browse/PIG-1122?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yan Zhou updated PIG-1122:
--
Attachment: PIG-1122.patch

[zebra] Zebra build.xml still uses 0.6 version
--
Key: PIG-1122
URL: https://issues.apache.org/jira/browse/PIG-1122
Project: Pig
Issue Type: Bug
Affects Versions: 0.7.0
Reporter: Yan Zhou
Assignee: Yan Zhou
Fix For: 0.7.0
Attachments: PIG-1122.patch

Zebra still uses pig-0.6.0-dev-core.jar in build-contrib.xml. It should be changed to pig-0.7.0-dev-core.jar on APACHE trunk only.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Updated: (PIG-1122) [zebra] Zebra build.xml still uses 0.6 version
[ https://issues.apache.org/jira/browse/PIG-1122?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yan Zhou updated PIG-1122:
--
Attachment: (was: PIG-1122.patch)

[zebra] Zebra build.xml still uses 0.6 version
--
Key: PIG-1122
URL: https://issues.apache.org/jira/browse/PIG-1122
Project: Pig
Issue Type: Bug
Affects Versions: 0.7.0
Reporter: Yan Zhou
Assignee: Yan Zhou
Fix For: 0.7.0

Zebra still uses pig-0.6.0-dev-core.jar in build-contrib.xml. It should be changed to pig-0.7.0-dev-core.jar on APACHE trunk only.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Commented: (PIG-1114) MultiQuery optimization throws error when merging 2 level splits
[ https://issues.apache.org/jira/browse/PIG-1114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12785115#action_12785115 ]

Olga Natkovich commented on PIG-1114:

+1 on the changes. Will be committing now to trunk and the 0.6 branch.

MultiQuery optimization throws error when merging 2 level splits

Key: PIG-1114 URL: https://issues.apache.org/jira/browse/PIG-1114 Project: Pig Issue Type: Bug Reporter: Ankur Assignee: Richard Ding Priority: Critical Fix For: 0.6.0 Attachments: PIG-1114.patch, Pig_1114_Client.log

Multi-query optimization throws an error when merging 2 level splits. Following is the script to reproduce the error:

data = LOAD 'data' USING PigStorage() AS (id:int, name:chararray);
ids = FOREACH data GENERATE id;
allId = GROUP ids all;
allIdCount = FOREACH allId GENERATE group as allId, COUNT(ids) as total;
idGroup = GROUP ids by id;
idGroupCount = FOREACH idGroup GENERATE group as id, COUNT(ids) as count;
countTotal = cross idGroupCount, allIdCount;
idCountTotal = foreach countTotal generate id, count, total, (double)count / (double)total as proportion;
orderedCounts = order idCountTotal by count desc;
STORE orderedCounts INTO 'mq_problem/ids';
names = FOREACH data GENERATE name;
allNames = GROUP names all;
allNamesCount = FOREACH allNames GENERATE group as namesAll, COUNT(names) as total;
nameGroup = GROUP names by name;
nameGroupCount = FOREACH nameGroup GENERATE group as name, COUNT(names) as count;
namesCrossed = cross nameGroupCount, allNamesCount;
nameCountTotal = foreach namesCrossed generate name, count, total, (double)count / (double)total as proportion;
nameCountsOrdered = order nameCountTotal by count desc;
STORE nameCountsOrdered INTO 'mq_problem/names';
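For readers who want to check what each branch of the reproduction script is meant to produce, the following is a minimal Python sketch of the ids branch only (per-value count, overall total, proportion, ordered by count descending). It is not Pig code and says nothing about the multi-query merge itself. The function name `count_proportions` and the sample ids are made up for illustration, and the sketch assumes there are no null values, so a plain length stands in for Pig's COUNT.

```python
from collections import Counter


def count_proportions(values):
    """Mimic one branch of the script: (value, count, total, proportion) rows.

    Assumption: no null values, so len() stands in for Pig's COUNT.
    """
    counts = Counter(values)   # GROUP ... BY value, then COUNT per group
    total = len(values)        # GROUP ... ALL, then COUNT
    # CROSS of per-value counts with the single total row, then the FOREACH
    rows = [(value, n, total, n / total) for value, n in counts.items()]
    # ORDER ... BY count DESC (ties keep first-seen order in this sketch)
    rows.sort(key=lambda row: row[1], reverse=True)
    return rows


if __name__ == "__main__":
    # Hypothetical sample ids, not data from the report.
    for row in count_proportions([7, 7, 7, 3, 5, 5]):
        print(row)
```

The names branch has the same shape, so the same helper applies to a list of names.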
[jira] Updated: (PIG-1122) [zebra] Zebra build.xml still uses 0.6 version
[ https://issues.apache.org/jira/browse/PIG-1122?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yan Zhou updated PIG-1122: -- Attachment: PIG-1122.patch [zebra] Zebra build.xml still uses 0.6 version -- Key: PIG-1122 URL: https://issues.apache.org/jira/browse/PIG-1122 Project: Pig Issue Type: Bug Affects Versions: 0.7.0 Reporter: Yan Zhou Assignee: Yan Zhou Fix For: 0.7.0 Attachments: PIG-1122.patch Zebra still uses pig-0.6.0-dev-core.jar in build-contrib.xml. It should be changed to pig-0.7.0-dev-core.jar on APACHE trunk only.
[jira] Commented: (PIG-1122) [zebra] Zebra build.xml still uses 0.6 version
[ https://issues.apache.org/jira/browse/PIG-1122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12785118#action_12785118 ] Yan Zhou commented on PIG-1122: --- Note that the patch should be applied to trunk. Also note that there is no test case for this trivial versioning change, so any Hudson grievance in that regard should be ignored. [zebra] Zebra build.xml still uses 0.6 version -- Key: PIG-1122 URL: https://issues.apache.org/jira/browse/PIG-1122 Project: Pig Issue Type: Bug Affects Versions: 0.7.0 Reporter: Yan Zhou Assignee: Yan Zhou Fix For: 0.7.0 Attachments: PIG-1122.patch Zebra still uses pig-0.6.0-dev-core.jar in build-contrib.xml. It should be changed to pig-0.7.0-dev-core.jar on APACHE trunk only.
[jira] Updated: (PIG-1122) [zebra] Zebra build.xml still uses 0.6 version
[ https://issues.apache.org/jira/browse/PIG-1122?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yan Zhou updated PIG-1122: -- Status: Open (was: Patch Available) [zebra] Zebra build.xml still uses 0.6 version -- Key: PIG-1122 URL: https://issues.apache.org/jira/browse/PIG-1122 Project: Pig Issue Type: Bug Affects Versions: 0.7.0 Reporter: Yan Zhou Assignee: Yan Zhou Fix For: 0.7.0 Attachments: PIG-1122.patch Zebra still uses pig-0.6.0-dev-core.jar in build-contrib.xml. It should be changed to pig-0.7.0-dev-core.jar on APACHE trunk only.
[jira] Updated: (PIG-1122) [zebra] Zebra build.xml still uses 0.6 version
[ https://issues.apache.org/jira/browse/PIG-1122?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yan Zhou updated PIG-1122: -- Status: Patch Available (was: Open) [zebra] Zebra build.xml still uses 0.6 version -- Key: PIG-1122 URL: https://issues.apache.org/jira/browse/PIG-1122 Project: Pig Issue Type: Bug Affects Versions: 0.7.0 Reporter: Yan Zhou Assignee: Yan Zhou Fix For: 0.7.0 Attachments: PIG-1122.patch Zebra still uses pig-0.6.0-dev-core.jar in build-contrib.xml. It should be changed to pig-0.7.0-dev-core.jar on APACHE trunk only.
[jira] Commented: (PIG-1122) [zebra] Zebra build.xml still uses 0.6 version
[ https://issues.apache.org/jira/browse/PIG-1122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12785123#action_12785123 ] Yan Zhou commented on PIG-1122: --- This has not caused any problems so far, since the CLASSPATH also contains the path to the directory holding the Pig classes directly. Still, it is not ideal and could cause nasty headaches if someone has a leftover 0.6 Pig jar when building Zebra. [zebra] Zebra build.xml still uses 0.6 version -- Key: PIG-1122 URL: https://issues.apache.org/jira/browse/PIG-1122 Project: Pig Issue Type: Bug Affects Versions: 0.7.0 Reporter: Yan Zhou Assignee: Yan Zhou Fix For: 0.7.0 Attachments: PIG-1122.patch Zebra still uses pig-0.6.0-dev-core.jar in build-contrib.xml. It should be changed to pig-0.7.0-dev-core.jar on APACHE trunk only.
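The risk described in this comment follows from how a classpath is searched: entries are consulted in order, and the first entry that provides a class wins. The toy model below illustrates that in Python. The entry label `build/classes`, the contents assigned to each entry, and the two orderings are assumptions for illustration only; the actual ordering in the Zebra build is not stated in this thread.

```python
def resolve(class_name, classpath):
    """Return the name of the first classpath entry providing class_name.

    classpath is an ordered list of (entry_name, set_of_class_names) pairs.
    Returns None when no entry provides the class.
    """
    for entry_name, provided in classpath:
        if class_name in provided:
            return entry_name
    return None


# Hypothetical entries: a directory of freshly compiled classes and a
# leftover jar from an older build that contains the same class name.
CLASSES_DIR = ("build/classes", {"org.apache.pig.PigServer"})
STALE_JAR = ("pig-0.6.0-dev-core.jar", {"org.apache.pig.PigServer"})

if __name__ == "__main__":
    target = "org.apache.pig.PigServer"
    # Fresh classes listed first: the stale jar is never consulted.
    print(resolve(target, [CLASSES_DIR, STALE_JAR]))
    # Stale jar listed first: the old class silently shadows the new one.
    print(resolve(target, [STALE_JAR, CLASSES_DIR]))
```

Pointing the build at the current jar name removes the dependence on entry order.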
[jira] Commented: (PIG-1122) [zebra] Zebra build.xml still uses 0.6 version
[ https://issues.apache.org/jira/browse/PIG-1122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12785127#action_12785127 ] Chao Wang commented on PIG-1122: +1 [zebra] Zebra build.xml still uses 0.6 version -- Key: PIG-1122 URL: https://issues.apache.org/jira/browse/PIG-1122 Project: Pig Issue Type: Bug Affects Versions: 0.7.0 Reporter: Yan Zhou Assignee: Yan Zhou Fix For: 0.7.0 Attachments: PIG-1122.patch Zebra still uses pig-0.6.0-dev-core.jar in build-contrib.xml. It should be changed to pig-0.7.0-dev-core.jar on APACHE trunk only.
[jira] Commented: (PIG-1116) Remove redundant map-reduce job for merge join
[ https://issues.apache.org/jira/browse/PIG-1116?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12785134#action_12785134 ]

Olga Natkovich commented on PIG-1116:

+1

Remove redundant map-reduce job for merge join

Key: PIG-1116 URL: https://issues.apache.org/jira/browse/PIG-1116 Project: Pig Issue Type: Bug Reporter: Daniel Dai Assignee: Pradeep Kamath Fix For: 0.6.0 Attachments: PIG-1116.patch

In merge join, when we convert the right-hand side file into a side file, we do not remove it from the map-reduce plan; we only disconnect it from the plan. When we run the query, the redundant load still loads the data but does nothing with it. This operation should be removed entirely. E.g.:

a = load '/user/pig/tests/data/zebra/singlefile/studentsortedtab10k' using org.apache.hadoop.zebra.pig.TableLoader('', 'sorted') as (name, age, gpa);
b = load '/user/pig/tests/data/zebra/singlefile/votersortedtab10k' using org.apache.hadoop.zebra.pig.TableLoader('', 'sorted') as (name, age, registration, contributions);
c = join a by name, b by name using merge;
explain c;

{code}
#--
# Map Reduce Plan
#--
MapReduce node 1-21
Map Plan
Load(hdfs://wilbur20.labs.corp.sp1.yahoo.com:9020/user/pig/tests/data/zebra/singlefile/votersortedtab10k:org.apache.hadoop.zebra.pig.TableLoader('','sorted')) - 1-13
Global sort: false

MapReduce node 1-20
Map Plan
Store(fakefile:org.apache.pig.builtin.PigStorage) - 1-19
|
|---MergeJoin[tuple] - 1-16
    |
    |---Load(hdfs://wilbur20.labs.corp.sp1.yahoo.com:9020/user/pig/tests/data/zebra/singlefile/studentsortedtab10k:org.apache.hadoop.zebra.pig.TableLoader('','sorted')) - 1-12
Global sort: false
{code}

1-21 should be removed.
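The distinction the issue draws between disconnecting a node and removing it can be shown with a toy operator graph. This is a sketch in Python, not Pig's plan classes: the `Plan` class, its method names, and the assumption that node 1-21 was originally connected to node 1-20 are all hypothetical. The only point taken from the report is that a node left in the plan still gets executed even when nothing is connected to it.

```python
class Plan:
    """Toy operator graph for illustration; not Pig's actual plan classes."""

    def __init__(self):
        self.nodes = set()
        self.edges = set()  # (source, destination) pairs

    def add(self, node):
        self.nodes.add(node)

    def connect(self, src, dst):
        self.edges.add((src, dst))

    def disconnect(self, node):
        """Drop every edge touching node, but keep the node in the plan."""
        self.edges = {(s, d) for s, d in self.edges if node not in (s, d)}

    def remove(self, node):
        """Drop the node together with all of its edges."""
        self.disconnect(node)
        self.nodes.discard(node)

    def jobs_to_run(self):
        """Every node still in the plan is executed, connected or not."""
        return sorted(self.nodes)


def build_plan():
    """Build a two-node plan using the node ids from the explain output."""
    plan = Plan()
    plan.add("1-21")  # load of the right-hand side input
    plan.add("1-20")  # merge join job
    plan.connect("1-21", "1-20")  # assumed original connection
    return plan


if __name__ == "__main__":
    disconnected = build_plan()
    disconnected.disconnect("1-21")
    print(disconnected.jobs_to_run())  # 1-21 is still scheduled

    removed = build_plan()
    removed.remove("1-21")
    print(removed.jobs_to_run())       # only 1-20 remains
```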
[jira] Updated: (PIG-922) Logical optimizer: push up project
[ https://issues.apache.org/jira/browse/PIG-922?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated PIG-922: --- Attachment: PIG-922-p3_14.patch Addressed the review comments from Pradeep. Actually, we do not need to do anything special for order by. We only prune columns on the upfront LOLoad in the logical plan. Order by reads the intermediate input file, so nothing will be pruned there. (If there is no input schema, order by reads the user input file directly; however, column pruning only kicks in when the user gives an input schema, so that case does not arise.) Logical optimizer: push up project -- Key: PIG-922 URL: https://issues.apache.org/jira/browse/PIG-922 Project: Pig Issue Type: New Feature Components: impl Affects Versions: 0.3.0 Reporter: Daniel Dai Assignee: Daniel Dai Fix For: 0.6.0 Attachments: PIG-922-p1_0.patch, PIG-922-p1_1.patch, PIG-922-p1_2.patch, PIG-922-p1_3.patch, PIG-922-p1_4.patch, PIG-922-p2_preview.patch, PIG-922-p2_preview2.patch, PIG-922-p3_1.patch, PIG-922-p3_10.patch, PIG-922-p3_11.patch, PIG-922-p3_12.patch, PIG-922-p3_13.patch, PIG-922-p3_14.patch, PIG-922-p3_2.patch, PIG-922-p3_3.patch, PIG-922-p3_4.patch, PIG-922-p3_5.patch, PIG-922-p3_6.patch, PIG-922-p3_7.patch, PIG-922-p3_8.patch, PIG-922-p3_9.patch This is a continuation work of [PIG-697|https://issues.apache.org/jira/browse/PIG-697]. We need to add another rule to the logical optimizer: Push up project, ie, prune columns as early as possible.
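As a rough picture of the rule discussed in this comment, the sketch below decides which columns a load needs to keep, given the declared schema and the set of columns referenced later in the script. It is a Python illustration only: the function `columns_to_load` and its arguments are hypothetical and do not correspond to anything in the PIG-922 patches. It encodes just the two points made in the comment, that pruning happens at the load and that it only applies when an input schema was given.

```python
def columns_to_load(schema, used):
    """Return the columns a load should keep, in schema order.

    schema: list of column names declared on the load, or None if the
            user gave no schema.
    used:   set of column names referenced later in the script.
    Returns None when there is no schema, meaning nothing is pruned.
    """
    if schema is None:
        return None  # no schema: pruning does not kick in
    return [column for column in schema if column in used]


if __name__ == "__main__":
    # Hypothetical schema and usage, not taken from the patch.
    print(columns_to_load(["name", "age", "gpa"], {"gpa", "name"}))
    print(columns_to_load(None, {"gpa"}))
```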
[jira] Updated: (PIG-922) Logical optimizer: push up project
[ https://issues.apache.org/jira/browse/PIG-922?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated PIG-922: --- Status: Open (was: Patch Available) Logical optimizer: push up project -- Key: PIG-922 URL: https://issues.apache.org/jira/browse/PIG-922 Project: Pig Issue Type: New Feature Components: impl Affects Versions: 0.3.0 Reporter: Daniel Dai Assignee: Daniel Dai Fix For: 0.6.0 Attachments: PIG-922-p1_0.patch, PIG-922-p1_1.patch, PIG-922-p1_2.patch, PIG-922-p1_3.patch, PIG-922-p1_4.patch, PIG-922-p2_preview.patch, PIG-922-p2_preview2.patch, PIG-922-p3_1.patch, PIG-922-p3_10.patch, PIG-922-p3_11.patch, PIG-922-p3_12.patch, PIG-922-p3_13.patch, PIG-922-p3_14.patch, PIG-922-p3_2.patch, PIG-922-p3_3.patch, PIG-922-p3_4.patch, PIG-922-p3_5.patch, PIG-922-p3_6.patch, PIG-922-p3_7.patch, PIG-922-p3_8.patch, PIG-922-p3_9.patch This is a continuation work of [PIG-697|https://issues.apache.org/jira/browse/PIG-697]. We need to add another rule to the logical optimizer: Push up project, ie, prune columns as early as possible.
[jira] Updated: (PIG-922) Logical optimizer: push up project
[ https://issues.apache.org/jira/browse/PIG-922?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated PIG-922: --- Status: Patch Available (was: Open) Logical optimizer: push up project -- Key: PIG-922 URL: https://issues.apache.org/jira/browse/PIG-922 Project: Pig Issue Type: New Feature Components: impl Affects Versions: 0.3.0 Reporter: Daniel Dai Assignee: Daniel Dai Fix For: 0.6.0 Attachments: PIG-922-p1_0.patch, PIG-922-p1_1.patch, PIG-922-p1_2.patch, PIG-922-p1_3.patch, PIG-922-p1_4.patch, PIG-922-p2_preview.patch, PIG-922-p2_preview2.patch, PIG-922-p3_1.patch, PIG-922-p3_10.patch, PIG-922-p3_11.patch, PIG-922-p3_12.patch, PIG-922-p3_13.patch, PIG-922-p3_14.patch, PIG-922-p3_2.patch, PIG-922-p3_3.patch, PIG-922-p3_4.patch, PIG-922-p3_5.patch, PIG-922-p3_6.patch, PIG-922-p3_7.patch, PIG-922-p3_8.patch, PIG-922-p3_9.patch This is a continuation work of [PIG-697|https://issues.apache.org/jira/browse/PIG-697]. We need to add another rule to the logical optimizer: Push up project, ie, prune columns as early as possible.
[jira] Commented: (PIG-1118) expression with aggregate functions returning null, with accumulate interface
[ https://issues.apache.org/jira/browse/PIG-1118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12785166#action_12785166 ]

Hadoop QA commented on PIG-1118:

-1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12426698/PIG_1118.patch against trunk revision 886015.

+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 3 new or modified tests.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
+1 findbugs. The patch does not introduce any new Findbugs warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
-1 core tests. The patch failed core unit tests.
+1 contrib tests. The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/79/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/79/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/79/console

This message is automatically generated.

expression with aggregate functions returning null, with accumulate interface

Key: PIG-1118 URL: https://issues.apache.org/jira/browse/PIG-1118 Project: Pig Issue Type: Bug Reporter: Thejas M Nair Assignee: Ying He Fix For: 0.7.0 Attachments: PIG_1118.patch

The problem is in trunk. It works fine in 0.6 branch.

l = load '/tmp/students.txt' as (a : chararray,b : chararray,c : int);
grunt> g = group l by 1;
grunt> dump g;
(1,{(asdfxc,M,23),(qwer,F,21),(uhsdf,M,34),(zxldf,M,21),(qwer,F,23),(oiue,M,54)})
grunt> f = foreach g generate SUM(l.c), 1 + SUM(l.c) + SUM(l.c);
grunt> dump f;
(176L,)
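The value that the empty second field should hold can be worked out by hand from the bag printed by `dump g`. The Python sketch below does that arithmetic. It assumes the six tuples in the dump are the complete input and that the expression is plain integer addition; the helper name `expected_fields` is made up for illustration.

```python
# Tuples copied from the `dump g` output in the report.
ROWS = [
    ("asdfxc", "M", 23),
    ("qwer", "F", 21),
    ("uhsdf", "M", 34),
    ("zxldf", "M", 21),
    ("qwer", "F", 23),
    ("oiue", "M", 54),
]


def expected_fields(rows):
    """Return (SUM(l.c), 1 + SUM(l.c) + SUM(l.c)) computed in plain Python."""
    total = sum(c for _, _, c in rows)
    return total, 1 + total + total


if __name__ == "__main__":
    print(expected_fields(ROWS))
```

The first value matches the 176L shown in the dump; under the assumptions above the second field would be 353 rather than null.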