[jira] Updated: (PIG-532) Casting a field removes its alias.
[ https://issues.apache.org/jira/browse/PIG-532?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Thejas M Nair updated PIG-532:

Attachment: 532.patch

Patch:
QueryParser.jjt - Passes alias information to LOCast.
TestLogicalPlanBuilder.java - Added test cases for this patch. Corrected other test cases that were passing multiple statements at a time to buildPlan().

Casting a field removes its alias.

Key: PIG-532
URL: https://issues.apache.org/jira/browse/PIG-532
Project: Pig
Issue Type: Bug
Components: impl
Affects Versions: types_branch
Reporter: Alan Gates
Assignee: Thejas M Nair
Priority: Minor
Fix For: types_branch
Attachments: 532.patch

Given a script like:
{code}
a = load 'myfile' as (x, y);
b = foreach a generate (int)x, (double)y;
c = group b by x;
{code}
you will get an error that x is an unknown alias. The cast operator is not carrying through the alias. It should.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Resolved: (PIG-663) parameter substitution: fails without error if substitution value is not found
[ https://issues.apache.org/jira/browse/PIG-663?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Olga Natkovich resolved PIG-663.

Resolution: Duplicate

This is a duplicate of PIG-680.

parameter substitution: fails without error if substitution value is not found

Key: PIG-663
URL: https://issues.apache.org/jira/browse/PIG-663
Project: Pig
Issue Type: Bug
Reporter: Olga Natkovich

When the preprocessor does not find a value to substitute for a variable, the generated .substituted file is empty, and no errors are reported when it then goes through pig2.
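The behaviour the report asks for (fail loudly instead of writing an empty file) can be sketched roughly as below. This is only an illustration in Python, not Pig's preprocessor: the function name `substitute`, the `$name` pattern, and the rule that parameter names start with a letter or underscore are all assumptions made for the example.

```python
import re

# Hypothetical sketch, not Pig's preprocessor API.
# Assumption: parameters look like $name and start with a letter or underscore,
# so positional field references such as $0 are left untouched.
PARAM = re.compile(r"\$([A-Za-z_]\w*)")


def substitute(script, params):
    """Replace each $name with params[name]; raise if any name is undefined."""
    missing = sorted({m.group(1) for m in PARAM.finditer(script)
                      if m.group(1) not in params})
    if missing:
        # Report the problem instead of silently producing empty output.
        raise KeyError("undefined parameter(s): " + ", ".join(missing))
    return PARAM.sub(lambda m: params[m.group(1)], script)
```

Calling `substitute("a = load '$input';", {})` raises a `KeyError` naming `input`, so the missing value is reported before any later stage runs.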
[jira] Assigned: (PIG-650) pig should look for and use the pig specific 'pig-cluster-hadoop-site.xml' in the non HOD case just like it does in the HOD case
[ https://issues.apache.org/jira/browse/PIG-650?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Olga Natkovich reassigned PIG-650:

Assignee: Santhosh Srinivasan

pig should look for and use the pig specific 'pig-cluster-hadoop-site.xml' in the non HOD case just like it does in the HOD case

Key: PIG-650
URL: https://issues.apache.org/jira/browse/PIG-650
Project: Pig
Issue Type: Bug
Affects Versions: types_branch
Reporter: Pradeep Kamath
Assignee: Santhosh Srinivasan
Fix For: types_branch

Currently users can create a pig-cluster-hadoop-site.xml with pig specific overrides for hadoop properties for use on the cluster. This file is searched for in the classpath and used in the HOD case but not in the non HOD case. We should also do the same in the non HOD case.
[jira] Commented: (PIG-641) Fragment replicate join does not work in local mode
[ https://issues.apache.org/jira/browse/PIG-641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12679379#action_12679379 ]

Olga Natkovich commented on PIG-641:

Shubham, could you please reply to Alan's question? Thanks.

Fragment replicate join does not work in local mode

Key: PIG-641
URL: https://issues.apache.org/jira/browse/PIG-641
Project: Pig
Issue Type: Bug
Reporter: Olga Natkovich
Assignee: Shubham Chopra
Attachments: 641.patch, 641.patch
[jira] Updated: (PIG-544) Utf8StorageConverter.java does not always produce NULLs when data is malformed
[ https://issues.apache.org/jira/browse/PIG-544?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Thejas M Nair updated PIG-544:

Attachment: 544.patch

Santhosh,
Thanks for reviewing the changes and for the suggestions. They have been incorporated in this new patch, which replaces the old one.

Re: Index: src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/POProject.java
I have removed the changes to that file, as they are not necessary for fixing this JIRA. (I had accidentally included them in the patch.) They will be useful for fixing the similar problem described in PIG-696.

Re: The semantics of bytearray to numeric types
A best-effort conversion will be done, i.e. if the bytearray contains the minimum number of bytes required for the conversion, it will be converted to the specified numeric type.

Utf8StorageConverter.java does not always produce NULLs when data is malformed

Key: PIG-544
URL: https://issues.apache.org/jira/browse/PIG-544
Project: Pig
Issue Type: Bug
Components: impl
Affects Versions: types_branch
Reporter: Olga Natkovich
Assignee: Thejas M Nair
Fix For: types_branch
Attachments: 544.patch, PIG-544.txt

It does so for scalar types but not for complex types, and not for the fields inside of complex types. This is because it uses different code to parse scalar types by themselves and scalar types inside of a complex type. It should really use the same code (its own) for both. The code it currently uses for the latter is inside TextDataParser.jjt and is also used to parse constants, so we need to be careful if we want to make changes to it.
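The idea in the issue description, one scalar parser that yields a null for malformed input and is reused for fields inside a complex type, might look roughly like this. It is a Python sketch for illustration only and does not reflect the code in 544.patch; `bytes_to_int`, `bytes_to_int_tuple`, and the `(a,b,c)` text layout are assumptions.

```python
# Hypothetical sketch, not Utf8StorageConverter's actual code.

def bytes_to_int(raw):
    """Return the int encoded as UTF-8 text in raw, or None if it is malformed."""
    try:
        return int(raw.decode("utf-8").strip())
    except ValueError:  # also covers UnicodeDecodeError, a ValueError subclass
        return None


def bytes_to_int_tuple(raw):
    """Parse an assumed '(1,2,3)' layout, reusing bytes_to_int for every field."""
    text = raw.strip()
    if not (text.startswith(b"(") and text.endswith(b")")):
        return None  # the tuple itself is malformed
    # Each field goes through the same scalar parser, so a bad field becomes
    # None instead of raising or being handled by a separate code path.
    return tuple(bytes_to_int(field) for field in text[1:-1].split(b","))
```

Because both paths share `bytes_to_int`, a malformed field inside a tuple is treated exactly like a malformed top-level scalar.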
[jira] Updated: (PIG-544) Utf8StorageConverter.java does not always produce NULLs when data is malformed
[ https://issues.apache.org/jira/browse/PIG-544?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Thejas M Nair updated PIG-544:

Attachment: (was: PIG-544.txt)

Utf8StorageConverter.java does not always produce NULLs when data is malformed

Key: PIG-544
URL: https://issues.apache.org/jira/browse/PIG-544
Project: Pig
Issue Type: Bug
Components: impl
Affects Versions: types_branch
Reporter: Olga Natkovich
Assignee: Thejas M Nair
Fix For: types_branch
Attachments: 544.patch

It does so for scalar types but not for complex types, and not for the fields inside of complex types. This is because it uses different code to parse scalar types by themselves and scalar types inside of a complex type. It should really use the same code (its own) for both. The code it currently uses for the latter is inside TextDataParser.jjt and is also used to parse constants, so we need to be careful if we want to make changes to it.
[jira] Created: (PIG-698) Simple join fails on variable-length records
Simple join fails on variable-length records

Key: PIG-698
URL: https://issues.apache.org/jira/browse/PIG-698
Project: Pig
Issue Type: Bug
Components: impl
Environment: Yahoo! clusters.
Reporter: Peter Arthur Ciccolo

Joins can fail with an out-of-bounds access to fields that are not referenced in the script when variable-length records are involved.

Example by Ben Reed:

i1: 1 c D E 1 a B
i2: 0 0 Q 1 x z 1 a b c

i1 = load 'i1';
i2 = load 'i2';
j = join i1 by $0, i2 by $0;
dump j
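For comparison, a join that only ever touches the key field cannot run off the end of a short record. The Python sketch below illustrates that expected behaviour and is not Pig's join implementation; `join_on_first_field` and the sample records are made up for the example.

```python
from collections import defaultdict

# Hypothetical sketch: records are lists of fields with differing lengths.

def join_on_first_field(left, right):
    """Inner-join two lists of variable-length records on field 0."""
    index = defaultdict(list)
    for rec in right:
        if rec:  # an empty record has no key, so skip it
            index[rec[0]].append(rec)
    # Only rec[0] is ever indexed; the remaining fields are copied as a whole,
    # so record length never matters.
    return [l + r for l in left if l for r in index.get(l[0], [])]


# Made-up data, not the i1/i2 files from the report.
left = [["1", "a"], ["2", "b", "c"]]
right = [["1", "x", "y", "z"], ["3"]]
joined = join_on_first_field(left, right)
```

Here `joined` holds a single six-field record built from the two rows whose first field is "1".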
[jira] Updated: (PIG-627) PERFORMANCE: multi-query optimization
[ https://issues.apache.org/jira/browse/PIG-627?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gunther Hagleitner updated PIG-627:

Attachment: file_cmds-0305.patch

This patch is for the multi query branch again. It mostly fixes the problem with certain commands in the script that require immediate execution (in batch mode). So if you do stuff like:

...
store a into 'tmp_foo';
...
rm tmp_foo
...

The rm will trigger execution and the file will be there for you to delete, copyToLocal, move, etc.

You can also use the exec statement without params in a script now, to force execution of what we've seen so far.

This patch also contains a minor fix with the computation of progress in MR jobs (which I screwed up in the last patch).

PERFORMANCE: multi-query optimization

Key: PIG-627
URL: https://issues.apache.org/jira/browse/PIG-627
Project: Pig
Issue Type: Improvement
Affects Versions: types_branch
Reporter: Olga Natkovich
Fix For: types_branch
Attachments: file_cmds-0305.patch, multi-store-0303.patch, multi-store-0304.patch, multiquery_0223.patch, multiquery_0224.patch

Currently, if your Pig script contains multiple stores and some shared computation, Pig will execute several independent queries. For instance:

A = load 'data' as (a, b, c);
B = filter A by a 5;
store B into 'output1';
C = group B by b;
store C into 'output2';

This script will result in a map-only job that generates output1, followed by a map-reduce job that generates output2. As a result, the data is read, parsed and filtered twice, which is unnecessary and costly.
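The saving described in the issue comes from doing the shared load and filter once and feeding both stores from that single pass. The Python sketch below shows only that idea, not how the multi-query patches build their plans; `run_shared`, the predicate, and the sample rows are assumptions.

```python
# Hypothetical sketch of sharing one filtered pass between two outputs.

def run_shared(rows, keep, key):
    """Filter rows once, then derive both outputs from the same filtered list."""
    filtered = [r for r in rows if keep(r)]  # shared work, done a single time
    groups = {}
    for r in filtered:  # second output reuses the filtered rows
        groups.setdefault(key(r), []).append(r)
    return filtered, groups


# Made-up rows; the predicate is illustrative and is not the filter condition
# from the script in the issue.
rows = [(1, "x"), (7, "x"), (9, "y")]
output1, output2 = run_shared(rows, keep=lambda r: r[0] > 5, key=lambda r: r[1])
```

`output1` plays the role of the first store and `output2` the grouped second store; the input is scanned and filtered only once.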
[jira] Commented: (PIG-627) PERFORMANCE: multi-query optimization
[ https://issues.apache.org/jira/browse/PIG-627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12679500#action_12679500 ]

Gunther Hagleitner commented on PIG-627:

Oh. I also took out the restriction on openIterator in batch mode. That was no longer needed.

PERFORMANCE: multi-query optimization

Key: PIG-627
URL: https://issues.apache.org/jira/browse/PIG-627
Project: Pig
Issue Type: Improvement
Affects Versions: types_branch
Reporter: Olga Natkovich
Fix For: types_branch
Attachments: file_cmds-0305.patch, multi-store-0303.patch, multi-store-0304.patch, multiquery_0223.patch, multiquery_0224.patch

Currently, if your Pig script contains multiple stores and some shared computation, Pig will execute several independent queries. For instance:

A = load 'data' as (a, b, c);
B = filter A by a 5;
store B into 'output1';
C = group B by b;
store C into 'output2';

This script will result in a map-only job that generates output1, followed by a map-reduce job that generates output2. As a result, the data is read, parsed and filtered twice, which is unnecessary and costly.