[jira] Commented: (PIG-972) Make describe work with nested foreach
[ https://issues.apache.org/jira/browse/PIG-972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12878545#action_12878545 ]

Hadoop QA commented on PIG-972:
-------------------------------

-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12446735/NestedDescribeFinale.patch
against trunk revision 953798.

    +1 @author. The patch does not contain any @author tags.
    +1 tests included. The patch appears to include 3 new or modified tests.
    +1 javadoc. The javadoc tool did not generate any warning messages.
    +1 javac. The applied patch does not increase the total number of javac compiler warnings.
    -1 findbugs. The patch appears to introduce 1 new Findbugs warnings.
    -1 release audit. The applied patch generated 384 release audit warnings (more than the trunk's current 383 warnings).
    +1 core tests. The patch passed core unit tests.
    +1 contrib tests. The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/324/testReport/
Release audit warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/324/artifact/trunk/patchprocess/releaseAuditDiffWarnings.txt
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/324/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/324/console

This message is automatically generated.

Make describe work with nested foreach
--------------------------------------

    Key: PIG-972
    URL: https://issues.apache.org/jira/browse/PIG-972
    Project: Pig
    Issue Type: Improvement
    Reporter: Olga Natkovich
    Assignee: Aniket Mokashi
    Fix For: 0.8.0
    Attachments: NestedDescribeFinale.patch, NestedDescribeProp1.patch, NestedDescribeProp2Initial.patch

Currently Parser can't deal with that. This is because describe is part of Grunt parser while the rest of nested foreach is handled by the QueryParser

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Updated: (PIG-972) Make describe work with nested foreach
[ https://issues.apache.org/jira/browse/PIG-972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Aniket Mokashi updated PIG-972:
------------------------------

    Attachment: NestedDescribeFinale1.patch

Findbugs warning fixed.
[jira] Updated: (PIG-972) Make describe work with nested foreach
[ https://issues.apache.org/jira/browse/PIG-972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Aniket Mokashi updated PIG-972:
------------------------------

    Status: Open (was: Patch Available)
[jira] Updated: (PIG-972) Make describe work with nested foreach
[ https://issues.apache.org/jira/browse/PIG-972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Aniket Mokashi updated PIG-972:
------------------------------

    Status: Patch Available (was: Open)
[jira] Created: (PIG-1449) RegExLoader hangs on lines that don't match the regular expression
RegExLoader hangs on lines that don't match the regular expression
------------------------------------------------------------------

    Key: PIG-1449
    URL: https://issues.apache.org/jira/browse/PIG-1449
    Project: Pig
    Issue Type: Bug
    Affects Versions: 0.7.0
    Reporter: Justin Sanders
    Priority: Minor

The 0.7.0 changes to RegExLoader introduced a bug where the code stays in the while loop if a line isn't matched. Before 0.7.0, lines that didn't match the regular expression were skipped. As a result, the mapper stops responding and times out with "Task attempt_X failed to report status for 600 seconds. Killing!".

Here are the steps to recreate the bug:

Create a text file in HDFS with the following lines:

    test1
    testA
    test2

Run the following pig script:

    REGISTER /usr/local/pig/contrib/piggybank/java/piggybank.jar;
    test = LOAD '/path/to/test.txt' using org.apache.pig.piggybank.storage.MyRegExLoader('(test\\d)') AS (line);
    dump test;

Expected result:

    (test1)
    (test2)

Actual result:

The job fails to complete after the 600 second timeout waiting on the mapper to complete. The mapper hangs at 33% since it can process the first line but gets stuck in the while loop on the second line.
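The hang described above is the usual symptom of a read loop that only advances its input when a line matches. The following is a minimal sketch of the skipping behaviour the report expects, written in plain Java. It is not the piggybank code: `SkippingRegexReader`, `next()` and `readAll()` are hypothetical names, and the use of `Matcher.find()` is an assumption about how the pattern is applied.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical stand-in for a regex-based record reader; not the piggybank class.
class SkippingRegexReader {
    private final Pattern pattern;
    private final Iterator<String> lines;

    SkippingRegexReader(String regex, Iterator<String> lines) {
        this.pattern = Pattern.compile(regex);
        this.lines = lines;
    }

    // Returns the captured groups of the next matching line, or null at end of input.
    List<String> next() {
        while (lines.hasNext()) {
            // Consume one line on every iteration, match or not, so the loop
            // can never spin forever on the same non-matching line.
            String line = lines.next();
            Matcher m = pattern.matcher(line);
            if (m.find()) {
                List<String> fields = new ArrayList<>();
                for (int i = 1; i <= m.groupCount(); i++) {
                    fields.add(m.group(i));
                }
                return fields;
            }
            // No match: skip this line and try the next one.
        }
        return null;
    }

    // Drains the reader and collects every matching record.
    static List<List<String>> readAll(String regex, List<String> input) {
        SkippingRegexReader reader = new SkippingRegexReader(regex, input.iterator());
        List<List<String>> records = new ArrayList<>();
        for (List<String> r = reader.next(); r != null; r = reader.next()) {
            records.add(r);
        }
        return records;
    }

    public static void main(String[] args) {
        // Same three lines as in the reproduction steps.
        System.out.println(readAll("(test\\d)", Arrays.asList("test1", "testA", "test2")));
    }
}
```

The point of the sketch is only that the line is consumed before the match is tested; whether the actual patch restructures the loop this way is not stated in the report.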
[jira] Updated: (PIG-1449) RegExLoader hangs on lines that don't match the regular expression
[ https://issues.apache.org/jira/browse/PIG-1449?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Justin Sanders updated PIG-1449:
-------------------------------

    Status: Patch Available (was: Open)
    Release Note: Fixed hanging in RegExLoader if line didn't match regular expression.
[jira] Updated: (PIG-1449) RegExLoader hangs on lines that don't match the regular expression
[ https://issues.apache.org/jira/browse/PIG-1449?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Justin Sanders updated PIG-1449:
-------------------------------

    Attachment: RegExLoader.patch
[jira] Commented: (PIG-1449) RegExLoader hangs on lines that don't match the regular expression
[ https://issues.apache.org/jira/browse/PIG-1449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12878663#action_12878663 ]

Ashutosh Chauhan commented on PIG-1449:
---------------------------------------

Justin,

Good catch. Can you add your test case as a JUnit test in either piggybank/test/storage/TestRegExLoader or TestMyRegExLoader? That way we'll have a regression test for the issue.
[jira] Commented: (PIG-1428) Add getPigStatusReporter() to PigHadoopLogger
[ https://issues.apache.org/jira/browse/PIG-1428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12878674#action_12878674 ]

Richard Ding commented on PIG-1428:
-----------------------------------

+1

Add getPigStatusReporter() to PigHadoopLogger
---------------------------------------------

    Key: PIG-1428
    URL: https://issues.apache.org/jira/browse/PIG-1428
    Project: Pig
    Issue Type: Bug
    Affects Versions: 0.7.0
    Reporter: Ashutosh Chauhan
    Assignee: Dmitriy V. Ryaboy
    Fix For: 0.8.0
    Attachments: PIG-1428.patch, PIG-1428.patch, PIG-1428.patch

Without this getter method, it's not possible to get counters, report progress, etc. from UDFs.
[jira] Updated: (PIG-1333) API interface to Pig
[ https://issues.apache.org/jira/browse/PIG-1333?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Richard Ding updated PIG-1333:
-----------------------------

    Attachment: PIG-1333_1.patch

API interface to Pig
--------------------

    Key: PIG-1333
    URL: https://issues.apache.org/jira/browse/PIG-1333
    Project: Pig
    Issue Type: Improvement
    Reporter: Olga Natkovich
    Assignee: Richard Ding
    Fix For: 0.8.0
    Attachments: PIG-1333.patch, PIG-1333_1.patch

It would be nice to make Pig more friendly to applications, such as workflow systems, that execute pig scripts on a user's behalf. Currently, they have to use the pig command line to execute the code; however, this limits the kind of output that can be delivered. For instance, it is hard to produce error information that is easy to use programmatically, or to collect statistics.

The proposal is to create a class that mimics the behavior of Main but gives users a status object back. The main code of pig would look something like:

    public static void main(String args[]) {
        PigStatus ps = PigMain.exec(args);
        System.exit(ps.rc);
    }

We need to define the following:

- Content of PigStatus. It should at least include
    * return code
    * error string
    * exception
    * statistics
- A way to propagate the status class through pig code
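To make the proposal above concrete, here is a minimal sketch of what such a status object could look like, assuming it simply bundles the four listed items. Everything in it is hypothetical: `PigStatusSketch`, its field names, and the stand-in `exec` method are illustrative only and are not the API the attached patches define.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical shape for the proposed status object; all names are assumptions.
final class PigStatusSketch {
    final int returnCode;               // 0 on success, non-zero on failure
    final String errorMessage;          // null when the run succeeded
    final Throwable exception;          // null when the run succeeded
    final Map<String, Long> statistics; // e.g. record counts, keyed by name

    PigStatusSketch(int returnCode, String errorMessage, Throwable exception,
                    Map<String, Long> statistics) {
        this.returnCode = returnCode;
        this.errorMessage = errorMessage;
        this.exception = exception;
        this.statistics = Collections.unmodifiableMap(new HashMap<>(statistics));
    }

    // Stand-in for "run the script": failures are captured in the status
    // object instead of escaping to the caller as an exception.
    static PigStatusSketch exec(String[] args) {
        try {
            if (args.length == 0) {
                throw new IllegalArgumentException("no script given");
            }
            Map<String, Long> stats = new HashMap<>();
            stats.put("argumentsSeen", (long) args.length);
            return new PigStatusSketch(0, null, null, stats);
        } catch (RuntimeException e) {
            return new PigStatusSketch(1, e.getMessage(), e,
                    Collections.<String, Long>emptyMap());
        }
    }

    public static void main(String[] args) {
        PigStatusSketch ps = exec(args);
        if (ps.returnCode != 0) {
            System.err.println("failed: " + ps.errorMessage);
        }
        System.exit(ps.returnCode);
    }
}
```

Returning an immutable value object is one possible design; it lets a workflow engine inspect the return code, error and statistics without parsing command-line output, which is the limitation the issue describes.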
[jira] Updated: (PIG-1333) API interface to Pig
[ https://issues.apache.org/jira/browse/PIG-1333?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Richard Ding updated PIG-1333:
-----------------------------

    Status: Patch Available (was: Open)
[jira] Updated: (PIG-1333) API interface to Pig
[ https://issues.apache.org/jira/browse/PIG-1333?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Richard Ding updated PIG-1333:
-----------------------------

    Status: Open (was: Patch Available)
[jira] Updated: (PIG-1428) Make a StatusReporter singleton available for incrementing counters
[ https://issues.apache.org/jira/browse/PIG-1428?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dmitriy V. Ryaboy updated PIG-1428:
----------------------------------

    Summary: Make a StatusReporter singleton available for incrementing counters (was: Add getPigStatusReporter() to PigHadoopLogger)
    Patch Info: [Patch Available]
[jira] Commented: (PIG-1302) Include zebra's pigtest ant target as a part of pig's ant test target
[ https://issues.apache.org/jira/browse/PIG-1302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12878702#action_12878702 ]

Hadoop QA commented on PIG-1302:
--------------------------------

-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12446596/PIG-1302.patch
against trunk revision 953798.

    +1 @author. The patch does not contain any @author tags.
    -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no tests are needed for this patch.
    +1 javadoc. The javadoc tool did not generate any warning messages.
    +1 javac. The applied patch does not increase the total number of javac compiler warnings.
    +1 findbugs. The patch does not introduce any new Findbugs warnings.
    +1 release audit. The applied patch does not increase the total number of release audit warnings.
    -1 core tests. The patch failed core unit tests.
    -1 contrib tests. The patch failed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/326/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/326/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/326/console

This message is automatically generated.

Include zebra's pigtest ant target as a part of pig's ant test target
---------------------------------------------------------------------

    Key: PIG-1302
    URL: https://issues.apache.org/jira/browse/PIG-1302
    Project: Pig
    Issue Type: Improvement
    Affects Versions: 0.7.0
    Reporter: Pradeep Kamath
    Assignee: Giridharan Kesavan
    Attachments: PIG-1302.patch

There are changes made in Pig interfaces which break zebra loaders/storers. It would be good to run the pig tests in the zebra unit tests as part of running pig's core-test for each patch submission. So essentially in the test ant target in pig, we would need to invoke zebra's pigtest target.
[jira] Commented: (PIG-1333) API interface to Pig
[ https://issues.apache.org/jira/browse/PIG-1333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12878732#action_12878732 ]

Dmitriy V. Ryaboy commented on PIG-1333:
----------------------------------------

bq. I'm not sure we should make all Hadoop counters available through the new API. How useful will it be to the users? I'm open to suggestions.

Can't speak for other users, but we use counters quite a bit with Elephant Bird and some internal code for keeping track of timed out service requests, unparsable records, and more. The @MonitoredUDF annotation I proposed in PIG-1427 uses counters to report on runaway UDFs that get killed.

I think the question isn't so much why would you expose them, as why wouldn't you expose them...
[jira] Commented: (PIG-1333) API interface to Pig
[ https://issues.apache.org/jira/browse/PIG-1333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12878776#action_12878776 ]

Richard Ding commented on PIG-1333:
-----------------------------------

Hi Dmitriy,

Let me try to clarify the requirement: what you are asking for is to have the new API expose the Hadoop counters, such as

{code}
public class JobStats {
    public Counters getCounters();
    ..
}
{code}

so users can get the counter they are interested in.
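As an illustration of the accessor discussed above, the sketch below shows how a caller might look up a single counter once the per-job statistics expose them. It is an assumption-laden stand-in: a plain `Map<String, Long>` plays the role of Hadoop's `Counters` object so the example runs without Hadoop on the classpath, and `JobStatsSketch`, `getCounter` and the counter names are hypothetical.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in: a plain map plays the role of Hadoop's Counters object.
final class JobStatsSketch {
    private final Map<String, Long> counters;

    JobStatsSketch(Map<String, Long> counters) {
        // Defensive copy so callers cannot mutate the job's recorded values.
        this.counters = Collections.unmodifiableMap(new LinkedHashMap<>(counters));
    }

    // Exposes every counter, mirroring the getCounters() accessor in the comment.
    Map<String, Long> getCounters() {
        return counters;
    }

    // Convenience lookup; a counter that was never incremented reads as 0.
    long getCounter(String name) {
        return counters.getOrDefault(name, 0L);
    }

    public static void main(String[] args) {
        Map<String, Long> raw = new LinkedHashMap<>();
        raw.put("UNPARSABLE_RECORDS", 17L);   // hypothetical counter names
        raw.put("SERVICE_TIMEOUTS", 3L);
        JobStatsSketch stats = new JobStatsSketch(raw);
        System.out.println(stats.getCounter("UNPARSABLE_RECORDS"));
        System.out.println(stats.getCounter("NEVER_INCREMENTED"));
    }
}
```

Exposing the whole counter collection, rather than a fixed list of fields, matches the request in the thread: user code that increments its own counters can read them back by name.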
[jira] Commented: (PIG-972) Make describe work with nested foreach
[ https://issues.apache.org/jira/browse/PIG-972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12878810#action_12878810 ]

Hadoop QA commented on PIG-972:
-------------------------------

-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12447041/NestedDescribeFinale1.patch
against trunk revision 953798.

    +1 @author. The patch does not contain any @author tags.
    +1 tests included. The patch appears to include 3 new or modified tests.
    +1 javadoc. The javadoc tool did not generate any warning messages.
    +1 javac. The applied patch does not increase the total number of javac compiler warnings.
    +1 findbugs. The patch does not introduce any new Findbugs warnings.
    -1 release audit. The applied patch generated 384 release audit warnings (more than the trunk's current 383 warnings).
    +1 core tests. The patch passed core unit tests.
    +1 contrib tests. The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/327/testReport/
Release audit warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/327/artifact/trunk/patchprocess/releaseAuditDiffWarnings.txt
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/327/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/327/console

This message is automatically generated.
[jira] Commented: (PIG-1449) RegExLoader hangs on lines that don't match the regular expression
[ https://issues.apache.org/jira/browse/PIG-1449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12878812#action_12878812 ]

Hadoop QA commented on PIG-1449:
--------------------------------

-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12447045/RegExLoader.patch
against trunk revision 953798.

    +1 @author. The patch does not contain any @author tags.
    -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no tests are needed for this patch.
    -1 patch. The patch command could not apply the patch.

Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/328/console

This message is automatically generated.
[jira] Commented: (PIG-972) Make describe work with nested foreach
[ https://issues.apache.org/jira/browse/PIG-972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12878814#action_12878814 ]

Daniel Dai commented on PIG-972:
--------------------------------

The release audit warning is due to an added jdiff file, about which we cannot do anything. Patch committed.

With this patch, we can describe a nested alias using the syntax:

{code}
describe c::d
{code}

If d is used multiple times in a nested foreach, only the last occurrence will be considered.

Thanks Aniket!
[jira] Updated: (PIG-972) Make describe work with nested foreach
[ https://issues.apache.org/jira/browse/PIG-972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daniel Dai updated PIG-972:
--------------------------

    Status: Resolved (was: Patch Available)
    Hadoop Flags: [Reviewed]
    Resolution: Fixed
[jira] Commented: (PIG-1333) API interface to Pig
[ https://issues.apache.org/jira/browse/PIG-1333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12878826#action_12878826 ]

Dmitriy V. Ryaboy commented on PIG-1333:
----------------------------------------

Yup.