[jira] Commented: (PIG-1438) [Performance] MultiQueryOptimizer should also merge DISTINCT jobs
[ https://issues.apache.org/jira/browse/PIG-1438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12876980#action_12876980 ]

Hadoop QA commented on PIG-1438:

+1 overall. Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12446652/PIG-1438_1.patch
  against trunk revision 952098.

    +1 @author. The patch does not contain any @author tags.
    +1 tests included. The patch appears to include 3 new or modified tests.
    +1 javadoc. The javadoc tool did not generate any warning messages.
    +1 javac. The applied patch does not increase the total number of javac compiler warnings.
    +1 findbugs. The patch does not introduce any new Findbugs warnings.
    +1 release audit. The applied patch does not increase the total number of release audit warnings.
    +1 core tests. The patch passed core unit tests.
    +1 contrib tests. The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/334/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/334/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/334/console

This message is automatically generated.

[Performance] MultiQueryOptimizer should also merge DISTINCT jobs
    Key: PIG-1438
    URL: https://issues.apache.org/jira/browse/PIG-1438
    Project: Pig
    Issue Type: Improvement
    Components: impl
    Affects Versions: 0.7.0
    Reporter: Richard Ding
    Assignee: Richard Ding
    Fix For: 0.8.0
    Attachments: PIG-1438.patch, PIG-1438_1.patch

Current implementation doesn't merge jobs derived from DISTINCT statements. The reason is that DISTINCT jobs are implemented using a special combiner (DistinctCombiner). But we should be able to merge jobs that have the same type of combiner (e.g. merge multiple DISTINCT jobs into one).

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Created: (PIG-1443) DefaultTuple underestimate the memory footprint for string
DefaultTuple underestimate the memory footprint for string
    Key: PIG-1443
    URL: https://issues.apache.org/jira/browse/PIG-1443
    Project: Pig
    Issue Type: Bug
    Components: impl
    Affects Versions: 0.7.0
    Reporter: Daniel Dai
    Assignee: Daniel Dai
    Fix For: 0.8.0

Currently, in DefaultTuple, we estimate the memory footprint of a string as if it were a char array. The formula we use is: length * 2 + 12. It turns out that this underestimates the memory usage of a string. Here are the real memory footprints of strings, taken from a memory dump:

| length of string | memory in bytes |
| 7 | 56 |
| 3 | 48 |
| 1 | 40 |

I did a search and found that the following formula can accurately estimate the memory footprint of a string:

{code}
8 * (int) (((length * 2) + 45) / 8)
{code}
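The arithmetic in the PIG-1443 description can be checked with a short Java sketch. It compares the estimate the issue attributes to DefaultTuple (length * 2 + 12) with the proposed formula for the three string lengths listed in the table. The class and method names (`StringFootprint`, `oldEstimate`, `proposedEstimate`) are hypothetical and chosen only for illustration; this is not the code from the attached patch.

```java
public class StringFootprint {
    // Estimate the issue says DefaultTuple used: length * 2 + 12.
    static long oldEstimate(int length) {
        return length * 2L + 12;
    }

    // Formula proposed in the issue: 8 * (int) (((length * 2) + 45) / 8).
    // Integer division truncates, so the result is always a multiple of 8.
    static long proposedEstimate(int length) {
        return 8L * ((length * 2L + 45) / 8);
    }

    public static void main(String[] args) {
        int[] lengths = {7, 3, 1};
        for (int n : lengths) {
            // Prints: length, old estimate, proposed estimate.
            System.out.println(n + " " + oldEstimate(n) + " " + proposedEstimate(n));
        }
        // Prints:
        // 7 26 56
        // 3 18 48
        // 1 14 40
    }
}
```

For these three lengths the proposed formula reproduces the measured values (56, 48, 40), while the old estimate comes out at less than half of each.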
[jira] Commented: (PIG-1443) DefaultTuple underestimate the memory footprint for string
[ https://issues.apache.org/jira/browse/PIG-1443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12877143#action_12877143 ]

Daniel Dai commented on PIG-1443:

Reference: http://www.javamex.com/tutorials/memory/string_memory_usage.shtml

DefaultTuple underestimate the memory footprint for string
    Key: PIG-1443
    URL: https://issues.apache.org/jira/browse/PIG-1443
    Project: Pig
    Issue Type: Bug
    Components: impl
    Affects Versions: 0.7.0
    Reporter: Daniel Dai
    Assignee: Daniel Dai
    Fix For: 0.8.0

Currently, in DefaultTuple, we estimate the memory footprint of a string as if it were a char array. The formula we use is: length * 2 + 12. It turns out that this underestimates the memory usage of a string. Here are the real memory footprints of strings, taken from a memory dump:

| length of string | memory in bytes |
| 7 | 56 |
| 3 | 48 |
| 1 | 40 |

I did a search and found that the following formula can accurately estimate the memory footprint of a string:

{code}
8 * (int) (((length * 2) + 45) / 8)
{code}
[jira] Created: (PIG-1444) [Zebra] Zebra build should have a test-smoke target
[Zebra] Zebra build should have a test-smoke target
    Key: PIG-1444
    URL: https://issues.apache.org/jira/browse/PIG-1444
    Project: Pig
    Issue Type: Task
    Components: build
    Affects Versions: 0.8.0
    Reporter: Gaurav Jain
    Priority: Minor
    Fix For: 0.8.0

Zebra build should have a test-smoke target that should at least use minicluster for its test cases.
[jira] Updated: (PIG-1443) DefaultTuple underestimate the memory footprint for string
[ https://issues.apache.org/jira/browse/PIG-1443?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daniel Dai updated PIG-1443:
    Status: Patch Available (was: Open)

DefaultTuple underestimate the memory footprint for string
    Key: PIG-1443
    URL: https://issues.apache.org/jira/browse/PIG-1443
    Project: Pig
    Issue Type: Bug
    Components: impl
    Affects Versions: 0.7.0
    Reporter: Daniel Dai
    Assignee: Daniel Dai
    Fix For: 0.8.0
    Attachments: PIG-1443-1.patch

Currently, in DefaultTuple, we estimate the memory footprint of a string as if it were a char array. The formula we use is: length * 2 + 12. It turns out that this underestimates the memory usage of a string. Here are the real memory footprints of strings, taken from a memory dump:

| length of string | memory in bytes |
| 7 | 56 |
| 3 | 48 |
| 1 | 40 |

I did a search and found that the following formula can accurately estimate the memory footprint of a string:

{code}
8 * (int) (((length * 2) + 45) / 8)
{code}
[jira] Updated: (PIG-1443) DefaultTuple underestimate the memory footprint for string
[ https://issues.apache.org/jira/browse/PIG-1443?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daniel Dai updated PIG-1443:
    Attachment: PIG-1443-1.patch

DefaultTuple underestimate the memory footprint for string
    Key: PIG-1443
    URL: https://issues.apache.org/jira/browse/PIG-1443
    Project: Pig
    Issue Type: Bug
    Components: impl
    Affects Versions: 0.7.0
    Reporter: Daniel Dai
    Assignee: Daniel Dai
    Fix For: 0.8.0
    Attachments: PIG-1443-1.patch

Currently, in DefaultTuple, we estimate the memory footprint of a string as if it were a char array. The formula we use is: length * 2 + 12. It turns out that this underestimates the memory usage of a string. Here are the real memory footprints of strings, taken from a memory dump:

| length of string | memory in bytes |
| 7 | 56 |
| 3 | 48 |
| 1 | 40 |

I did a search and found that the following formula can accurately estimate the memory footprint of a string:

{code}
8 * (int) (((length * 2) + 45) / 8)
{code}
[jira] Commented: (PIG-1438) [Performance] MultiQueryOptimizer should also merge DISTINCT jobs
[ https://issues.apache.org/jira/browse/PIG-1438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12877150#action_12877150 ]

Ashutosh Chauhan commented on PIG-1438:

+1 please commit.

[Performance] MultiQueryOptimizer should also merge DISTINCT jobs
    Key: PIG-1438
    URL: https://issues.apache.org/jira/browse/PIG-1438
    Project: Pig
    Issue Type: Improvement
    Components: impl
    Affects Versions: 0.7.0
    Reporter: Richard Ding
    Assignee: Richard Ding
    Fix For: 0.8.0
    Attachments: PIG-1438.patch, PIG-1438_1.patch

Current implementation doesn't merge jobs derived from DISTINCT statements. The reason is that DISTINCT jobs are implemented using a special combiner (DistinctCombiner). But we should be able to merge jobs that have the same type of combiner (e.g. merge multiple DISTINCT jobs into one).
[jira] Created: (PIG-1445) Pig error: ERROR 2013: Moving LOLimit in front of LOStream is not implemented
Pig error: ERROR 2013: Moving LOLimit in front of LOStream is not implemented
    Key: PIG-1445
    URL: https://issues.apache.org/jira/browse/PIG-1445
    Project: Pig
    Issue Type: Bug
    Components: impl
    Affects Versions: 0.7.0
    Reporter: Daniel Dai
    Assignee: Daniel Dai
    Fix For: 0.8.0

The following script fails with ERROR 2013: Moving LOLimit in front of LOStream is not implemented.

{code}
A = LOAD 'data';
B = STREAM A THROUGH `stream.pl`;
C = LIMIT B 10;
explain C;
{code}
[jira] Updated: (PIG-928) UDFs in scripting languages
[ https://issues.apache.org/jira/browse/PIG-928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Aniket Mokashi updated PIG-928:
    Attachment: RegisterPythonUDF2.patch

UDFs in scripting languages
    Key: PIG-928
    URL: https://issues.apache.org/jira/browse/PIG-928
    Project: Pig
    Issue Type: New Feature
    Reporter: Alan Gates
    Assignee: Aniket Mokashi
    Fix For: 0.8.0
    Attachments: calltrace.png, package.zip, pig-greek.tgz, pig.scripting.patch.arnab, pyg.tgz, RegisterPythonUDF2.patch, scripting.tgz, scripting.tgz, test.zip

It should be possible to write UDFs in scripting languages such as python, ruby, etc. This frees users from needing to compile Java, generate a jar, etc. It also opens Pig to programmers who prefer scripting languages over Java.
[jira] Updated: (PIG-928) UDFs in scripting languages
[ https://issues.apache.org/jira/browse/PIG-928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Aniket Mokashi updated PIG-928:
    Attachment: RegisterScriptUDFDefineParse.patch

UDFs in scripting languages
    Key: PIG-928
    URL: https://issues.apache.org/jira/browse/PIG-928
    Project: Pig
    Issue Type: New Feature
    Reporter: Alan Gates
    Assignee: Aniket Mokashi
    Fix For: 0.8.0
    Attachments: calltrace.png, package.zip, pig-greek.tgz, pig.scripting.patch.arnab, pyg.tgz, RegisterPythonUDF2.patch, RegisterScriptUDFDefineParse.patch, scripting.tgz, scripting.tgz, test.zip

It should be possible to write UDFs in scripting languages such as python, ruby, etc. This frees users from needing to compile Java, generate a jar, etc. It also opens Pig to programmers who prefer scripting languages over Java.
[jira] Updated: (PIG-1445) Pig error: ERROR 2013: Moving LOLimit in front of LOStream is not implemented
[ https://issues.apache.org/jira/browse/PIG-1445?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daniel Dai updated PIG-1445:
    Attachment: PIG-1445-1.patch

We should not push LOLimit in front of LOStream. Patch attached.

Pig error: ERROR 2013: Moving LOLimit in front of LOStream is not implemented
    Key: PIG-1445
    URL: https://issues.apache.org/jira/browse/PIG-1445
    Project: Pig
    Issue Type: Bug
    Components: impl
    Affects Versions: 0.7.0
    Reporter: Daniel Dai
    Assignee: Daniel Dai
    Fix For: 0.8.0
    Attachments: PIG-1445-1.patch

The following script fails with ERROR 2013: Moving LOLimit in front of LOStream is not implemented.

{code}
A = LOAD 'data';
B = STREAM A THROUGH `stream.pl`;
C = LIMIT B 10;
explain C;
{code}
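The PIG-1445 comment says LOLimit should not be pushed in front of LOStream but does not spell out why. One plausible reason, assumed here rather than taken from the issue, is that an external streaming command may drop or add rows, so applying LIMIT before it can change the result. The Java sketch below illustrates that with plain lists; `streamThrough` is a hypothetical stand-in for a command such as `stream.pl`, and none of this is Pig optimizer code.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class LimitStreamOrder {
    // Hypothetical stand-in for an external streaming command: keeps even values only.
    static List<Integer> streamThrough(List<Integer> rows) {
        return rows.stream().filter(x -> x % 2 == 0).collect(Collectors.toList());
    }

    // Keeps the first n rows, like LIMIT.
    static List<Integer> limit(List<Integer> rows, int n) {
        return rows.stream().limit(n).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Integer> data = Arrays.asList(1, 2, 3, 4, 5, 6);
        // Order written in the script: STREAM first, then LIMIT.
        System.out.println(limit(streamThrough(data), 2)); // [2, 4]
        // Order after a push-up of LIMIT: LIMIT first, then STREAM.
        System.out.println(streamThrough(limit(data, 2))); // [2]
    }
}
```

Because the two orders produce different rows whenever the streaming step is not one-to-one, leaving LIMIT after STREAM is the conservative choice in this sketch.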
[jira] Updated: (PIG-1445) Pig error: ERROR 2013: Moving LOLimit in front of LOStream is not implemented
[ https://issues.apache.org/jira/browse/PIG-1445?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daniel Dai updated PIG-1445:
    Status: Patch Available (was: Open)

Pig error: ERROR 2013: Moving LOLimit in front of LOStream is not implemented
    Key: PIG-1445
    URL: https://issues.apache.org/jira/browse/PIG-1445
    Project: Pig
    Issue Type: Bug
    Components: impl
    Affects Versions: 0.7.0
    Reporter: Daniel Dai
    Assignee: Daniel Dai
    Fix For: 0.8.0
    Attachments: PIG-1445-1.patch

The following script fails with ERROR 2013: Moving LOLimit in front of LOStream is not implemented.

{code}
A = LOAD 'data';
B = STREAM A THROUGH `stream.pl`;
C = LIMIT B 10;
explain C;
{code}
[jira] Resolved: (PIG-1441) New test targets: unit and smoke
[ https://issues.apache.org/jira/browse/PIG-1441?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Olga Natkovich resolved PIG-1441.
    Resolution: Fixed

Patch committed to trunk and the 0.6 and 0.7 branches. Thanks Daniel for the review.

New test targets: unit and smoke
    Key: PIG-1441
    URL: https://issues.apache.org/jira/browse/PIG-1441
    Project: Pig
    Issue Type: Improvement
    Reporter: Olga Natkovich
    Assignee: Olga Natkovich
    Fix For: 0.8.0
    Attachments: PIG-1441.patch, PIG-1441_2.patch

As we get more and more tests, adding more structure would help us to minimize time spent on testing. Here are 2 new targets I propose we add. (Hadoop has the same targets for the same purposes.)

unit - to run all true unit tests (those that truly test APIs and internal functionality, rather than running e2e tests through junit). This target should run relatively quickly (10-15 minutes) and, if we are good at adding unit tests, will give good coverage.

smoke - this would be a set of a few e2e tests that provide good overall coverage within about 30 minutes.

I would say that for simple patches, we would still require only commit tests, while for more involved patches, the developers should run both unit and smoke before submitting the patch.
[jira] Commented: (PIG-928) UDFs in scripting languages
[ https://issues.apache.org/jira/browse/PIG-928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12877197#action_12877197 ]

Arnab Nandi commented on PIG-928:

register 'test.py' lang python;

How does one define an arbitrary lang? e.g. I would like to introduce Scala as a UDF engine, preferably as a jar itself. i.e. something like:

register scalascript.jar;
register 'test.py' USING scala.Engine();

UDFs in scripting languages
    Key: PIG-928
    URL: https://issues.apache.org/jira/browse/PIG-928
    Project: Pig
    Issue Type: New Feature
    Reporter: Alan Gates
    Assignee: Aniket Mokashi
    Fix For: 0.8.0
    Attachments: calltrace.png, package.zip, pig-greek.tgz, pig.scripting.patch.arnab, pyg.tgz, RegisterPythonUDF2.patch, RegisterScriptUDFDefineParse.patch, scripting.tgz, scripting.tgz, test.zip

It should be possible to write UDFs in scripting languages such as python, ruby, etc. This frees users from needing to compile Java, generate a jar, etc. It also opens Pig to programmers who prefer scripting languages over Java.
[jira] Updated: (PIG-1444) [Zebra] Zebra build should have a test-smoke target
[ https://issues.apache.org/jira/browse/PIG-1444?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gaurav Jain updated PIG-1444:
    Attachment: PIG-1444.patch

patch 1

[Zebra] Zebra build should have a test-smoke target
    Key: PIG-1444
    URL: https://issues.apache.org/jira/browse/PIG-1444
    Project: Pig
    Issue Type: Task
    Components: build
    Affects Versions: 0.8.0
    Reporter: Gaurav Jain
    Priority: Minor
    Fix For: 0.8.0
    Attachments: PIG-1444.patch

Zebra build should have a test-smoke target that should at least use minicluster for its test cases.
[jira] Updated: (PIG-1444) [Zebra] Zebra build should have a test-smoke target
[ https://issues.apache.org/jira/browse/PIG-1444?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yan Zhou updated PIG-1444:
    Status: Patch Available (was: Open)

[Zebra] Zebra build should have a test-smoke target
    Key: PIG-1444
    URL: https://issues.apache.org/jira/browse/PIG-1444
    Project: Pig
    Issue Type: Task
    Components: build
    Affects Versions: 0.8.0
    Reporter: Gaurav Jain
    Priority: Minor
    Fix For: 0.8.0
    Attachments: PIG-1444.patch

Zebra build should have a test-smoke target that should at least use minicluster for its test cases.
[jira] Commented: (PIG-1302) Include zebra's pigtest ant target as a part of pig's ant test target
[ https://issues.apache.org/jira/browse/PIG-1302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12877229#action_12877229 ]

Pradeep Kamath commented on PIG-1302:

+1

Include zebra's pigtest ant target as a part of pig's ant test target
    Key: PIG-1302
    URL: https://issues.apache.org/jira/browse/PIG-1302
    Project: Pig
    Issue Type: Improvement
    Affects Versions: 0.7.0
    Reporter: Pradeep Kamath
    Assignee: Giridharan Kesavan
    Attachments: PIG-1302.patch

There are changes made in Pig interfaces which break zebra loaders/storers. It would be good to run the pig tests in the zebra unit tests as part of running pig's core-test for each patch submission. So essentially in the test ant target in pig, we would need to invoke zebra's pigtest target.
[jira] Commented: (PIG-1443) DefaultTuple underestimate the memory footprint for string
[ https://issues.apache.org/jira/browse/PIG-1443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12877256#action_12877256 ]

Hadoop QA commented on PIG-1443:

-1 overall. Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12446712/PIG-1443-1.patch
  against trunk revision 952098.

    +1 @author. The patch does not contain any @author tags.
    +1 tests included. The patch appears to include 6 new or modified tests.
    +1 javadoc. The javadoc tool did not generate any warning messages.
    -1 javac. The applied patch generated 139 javac compiler warnings (more than the trunk's current 138 warnings).
    +1 findbugs. The patch does not introduce any new Findbugs warnings.
    +1 release audit. The applied patch does not increase the total number of release audit warnings.
    +1 core tests. The patch passed core unit tests.
    +1 contrib tests. The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/321/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/321/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/321/console

This message is automatically generated.

DefaultTuple underestimate the memory footprint for string
    Key: PIG-1443
    URL: https://issues.apache.org/jira/browse/PIG-1443
    Project: Pig
    Issue Type: Bug
    Components: impl
    Affects Versions: 0.7.0
    Reporter: Daniel Dai
    Assignee: Daniel Dai
    Fix For: 0.8.0
    Attachments: PIG-1443-1.patch

Currently, in DefaultTuple, we estimate the memory footprint of a string as if it were a char array. The formula we use is: length * 2 + 12. It turns out that this underestimates the memory usage of a string. Here are the real memory footprints of strings, taken from a memory dump:

| length of string | memory in bytes |
| 7 | 56 |
| 3 | 48 |
| 1 | 40 |

I did a search and found that the following formula can accurately estimate the memory footprint of a string:

{code}
8 * (int) (((length * 2) + 45) / 8)
{code}
[jira] Updated: (PIG-1443) DefaultTuple underestimate the memory footprint for string
[ https://issues.apache.org/jira/browse/PIG-1443?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daniel Dai updated PIG-1443:
    Attachment: PIG-1443-2.patch

Deal with javac warning.

DefaultTuple underestimate the memory footprint for string
    Key: PIG-1443
    URL: https://issues.apache.org/jira/browse/PIG-1443
    Project: Pig
    Issue Type: Bug
    Components: impl
    Affects Versions: 0.7.0
    Reporter: Daniel Dai
    Assignee: Daniel Dai
    Fix For: 0.8.0
    Attachments: PIG-1443-1.patch, PIG-1443-2.patch

Currently, in DefaultTuple, we estimate the memory footprint of a string as if it were a char array. The formula we use is: length * 2 + 12. It turns out that this underestimates the memory usage of a string. Here are the real memory footprints of strings, taken from a memory dump:

| length of string | memory in bytes |
| 7 | 56 |
| 3 | 48 |
| 1 | 40 |

I did a search and found that the following formula can accurately estimate the memory footprint of a string:

{code}
8 * (int) (((length * 2) + 45) / 8)
{code}
[jira] Updated: (PIG-1443) DefaultTuple underestimate the memory footprint for string
[ https://issues.apache.org/jira/browse/PIG-1443?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daniel Dai updated PIG-1443:
    Status: Open (was: Patch Available)

DefaultTuple underestimate the memory footprint for string
    Key: PIG-1443
    URL: https://issues.apache.org/jira/browse/PIG-1443
    Project: Pig
    Issue Type: Bug
    Components: impl
    Affects Versions: 0.7.0
    Reporter: Daniel Dai
    Assignee: Daniel Dai
    Fix For: 0.8.0
    Attachments: PIG-1443-1.patch, PIG-1443-2.patch

Currently, in DefaultTuple, we estimate the memory footprint of a string as if it were a char array. The formula we use is: length * 2 + 12. It turns out that this underestimates the memory usage of a string. Here are the real memory footprints of strings, taken from a memory dump:

| length of string | memory in bytes |
| 7 | 56 |
| 3 | 48 |
| 1 | 40 |

I did a search and found that the following formula can accurately estimate the memory footprint of a string:

{code}
8 * (int) (((length * 2) + 45) / 8)
{code}
[jira] Updated: (PIG-1443) DefaultTuple underestimate the memory footprint for string
[ https://issues.apache.org/jira/browse/PIG-1443?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daniel Dai updated PIG-1443:
    Status: Patch Available (was: Open)

DefaultTuple underestimate the memory footprint for string
    Key: PIG-1443
    URL: https://issues.apache.org/jira/browse/PIG-1443
    Project: Pig
    Issue Type: Bug
    Components: impl
    Affects Versions: 0.7.0
    Reporter: Daniel Dai
    Assignee: Daniel Dai
    Fix For: 0.8.0
    Attachments: PIG-1443-1.patch, PIG-1443-2.patch

Currently, in DefaultTuple, we estimate the memory footprint of a string as if it were a char array. The formula we use is: length * 2 + 12. It turns out that this underestimates the memory usage of a string. Here are the real memory footprints of strings, taken from a memory dump:

| length of string | memory in bytes |
| 7 | 56 |
| 3 | 48 |
| 1 | 40 |

I did a search and found that the following formula can accurately estimate the memory footprint of a string:

{code}
8 * (int) (((length * 2) + 45) / 8)
{code}
[jira] Updated: (PIG-1443) DefaultTuple underestimate the memory footprint for string
[ https://issues.apache.org/jira/browse/PIG-1443?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daniel Dai updated PIG-1443:
    Fix Version/s: 0.7.0 (was: 0.8.0)

DefaultTuple underestimate the memory footprint for string
    Key: PIG-1443
    URL: https://issues.apache.org/jira/browse/PIG-1443
    Project: Pig
    Issue Type: Bug
    Components: impl
    Affects Versions: 0.7.0
    Reporter: Daniel Dai
    Assignee: Daniel Dai
    Fix For: 0.7.0
    Attachments: PIG-1443-1.patch, PIG-1443-2.patch

Currently, in DefaultTuple, we estimate the memory footprint of a string as if it were a char array. The formula we use is: length * 2 + 12. It turns out that this underestimates the memory usage of a string. Here are the real memory footprints of strings, taken from a memory dump:

| length of string | memory in bytes |
| 7 | 56 |
| 3 | 48 |
| 1 | 40 |

I did a search and found that the following formula can accurately estimate the memory footprint of a string:

{code}
8 * (int) (((length * 2) + 45) / 8)
{code}
[jira] Updated: (PIG-972) Make describe work with nested foreach
[ https://issues.apache.org/jira/browse/PIG-972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Aniket Mokashi updated PIG-972:
    Attachment: NestedDescribeFinale.patch

Make describe work with nested foreach
    Key: PIG-972
    URL: https://issues.apache.org/jira/browse/PIG-972
    Project: Pig
    Issue Type: Improvement
    Reporter: Olga Natkovich
    Assignee: Aniket Mokashi
    Fix For: 0.8.0
    Attachments: NestedDescribeFinale.patch, NestedDescribeProp1.patch, NestedDescribeProp2Initial.patch

Currently Parser can't deal with that. This is because describe is part of Grunt parser while the rest of nested foreach is handled by the QueryParser
[jira] Commented: (PIG-972) Make describe work with nested foreach
[ https://issues.apache.org/jira/browse/PIG-972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12877293#action_12877293 ]

Aniket Mokashi commented on PIG-972:

Submitted patch with above changes. Also added test cases to test different scenarios.

{code}
grunt describe c: c::d: {a0: int,a1: int}
{code}

It does not print any nested aliases. For printing nested aliases, we have describe c::d;

Make describe work with nested foreach
    Key: PIG-972
    URL: https://issues.apache.org/jira/browse/PIG-972
    Project: Pig
    Issue Type: Improvement
    Reporter: Olga Natkovich
    Assignee: Aniket Mokashi
    Fix For: 0.8.0
    Attachments: NestedDescribeFinale.patch, NestedDescribeProp1.patch, NestedDescribeProp2Initial.patch

Currently Parser can't deal with that. This is because describe is part of Grunt parser while the rest of nested foreach is handled by the QueryParser
[jira] Updated: (PIG-972) Make describe work with nested foreach
[ https://issues.apache.org/jira/browse/PIG-972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Aniket Mokashi updated PIG-972:
    Status: Patch Available (was: Open)

Make describe work with nested foreach
    Key: PIG-972
    URL: https://issues.apache.org/jira/browse/PIG-972
    Project: Pig
    Issue Type: Improvement
    Reporter: Olga Natkovich
    Assignee: Aniket Mokashi
    Fix For: 0.8.0
    Attachments: NestedDescribeFinale.patch, NestedDescribeProp1.patch, NestedDescribeProp2Initial.patch

Currently Parser can't deal with that. This is because describe is part of Grunt parser while the rest of nested foreach is handled by the QueryParser
[jira] Commented: (PIG-1445) Pig error: ERROR 2013: Moving LOLimit in front of LOStream is not implemented
[ https://issues.apache.org/jira/browse/PIG-1445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12877318#action_12877318 ]

Hadoop QA commented on PIG-1445:

-1 overall. Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12446718/PIG-1445-1.patch
  against trunk revision 953109.

    +1 @author. The patch does not contain any @author tags.
    +1 tests included. The patch appears to include 9 new or modified tests.
    +1 javadoc. The javadoc tool did not generate any warning messages.
    +1 javac. The applied patch does not increase the total number of javac compiler warnings.
    +1 findbugs. The patch does not introduce any new Findbugs warnings.
    -1 release audit. The applied patch generated 383 release audit warnings (more than the trunk's current 382 warnings).
    -1 core tests. The patch failed core unit tests.
    +1 contrib tests. The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/322/testReport/
Release audit warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/322/artifact/trunk/patchprocess/releaseAuditDiffWarnings.txt
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/322/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/322/console

This message is automatically generated.

Pig error: ERROR 2013: Moving LOLimit in front of LOStream is not implemented
    Key: PIG-1445
    URL: https://issues.apache.org/jira/browse/PIG-1445
    Project: Pig
    Issue Type: Bug
    Components: impl
    Affects Versions: 0.7.0
    Reporter: Daniel Dai
    Assignee: Daniel Dai
    Fix For: 0.8.0
    Attachments: PIG-1445-1.patch

The following script fails with ERROR 2013: Moving LOLimit in front of LOStream is not implemented.

{code}
A = LOAD 'data';
B = STREAM A THROUGH `stream.pl`;
C = LIMIT B 10;
explain C;
{code}