[jira] Updated: (PIG-532) Casting a field removes its alias.

2009-03-05 Thread Thejas M Nair (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-532?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thejas M Nair updated PIG-532:
--

Attachment: 532.patch

Patch
QueryParser.jjt - Passing alias information to LOCast
TestLogicalPlanBuilder.java - Added test cases for this patch. Corrected other 
test cases that were passing multiple statements at a time to buildPlan().


 Casting a field removes its alias.
 --

 Key: PIG-532
 URL: https://issues.apache.org/jira/browse/PIG-532
 Project: Pig
  Issue Type: Bug
  Components: impl
Affects Versions: types_branch
Reporter: Alan Gates
Assignee: Thejas M Nair
Priority: Minor
 Fix For: types_branch

 Attachments: 532.patch


 Given a script like:
 {code}
 a = load 'myfile' as (x, y);
 b = foreach a generate (int)x, (double)y;
 c = group b by x;
 {code}
 you will get an error that x is an unknown alias.  The cast operator is not 
 carrying through the alias.  It should.
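
The difference between dropping and carrying the alias can be illustrated with a small model. This is only a sketch in Python, not Pig's implementation: Field, cast_dropping_alias, cast_keeping_alias and resolve are hypothetical names invented for the illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Field:
    alias: Optional[str]  # name a later statement can refer to, if any
    type: str


def cast_dropping_alias(field: Field, target_type: str) -> Field:
    # Models the reported behaviour: the cast result has no alias.
    return Field(alias=None, type=target_type)


def cast_keeping_alias(field: Field, target_type: str) -> Field:
    # Models the requested behaviour: the alias survives the cast.
    return Field(alias=field.alias, type=target_type)


def resolve(schema: List[Field], alias: str) -> int:
    # Return the position of the field with this alias, as "group b by x" needs.
    for position, field in enumerate(schema):
        if field.alias == alias:
            return position
    raise KeyError(f"unknown alias: {alias}")


a = [Field("x", "bytearray"), Field("y", "bytearray")]
b_buggy = [cast_dropping_alias(a[0], "int"), cast_dropping_alias(a[1], "double")]
b_fixed = [cast_keeping_alias(a[0], "int"), cast_keeping_alias(a[1], "double")]
```

In this model, resolve(b_fixed, "x") finds the field, while resolve(b_buggy, "x") raises the "unknown alias" error.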

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Resolved: (PIG-663) parameter substitution: fails without error if substitution value is not found

2009-03-05 Thread Olga Natkovich (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-663?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Olga Natkovich resolved PIG-663.


Resolution: Duplicate

This is a duplicate of PIG-680.

 parameter substitution: fails without error if substitution value is not found
 --

 Key: PIG-663
 URL: https://issues.apache.org/jira/browse/PIG-663
 Project: Pig
  Issue Type: Bug
Reporter: Olga Natkovich

 When the preprocessor does not find a value to substitute for a variable, 
 the generated .substituted file is empty, and when that file then goes 
 through pig2, no errors are reported. 
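
One way to make a missing value fail loudly is sketched below in Python. It is not Pig's preprocessor: substitute, PARAM_PATTERN and UndefinedParameterError are hypothetical names, and the $name parameter syntax is assumed for the illustration.

```python
import re

# Matches $name but not positional references such as $0.
PARAM_PATTERN = re.compile(r"\$([A-Za-z_][A-Za-z0-9_]*)")


class UndefinedParameterError(Exception):
    """Raised when a script refers to a parameter that has no value."""


def substitute(script: str, params: dict) -> str:
    """Replace every $name with its value, failing on the first missing one."""

    def replace(match: re.Match) -> str:
        name = match.group(1)
        if name not in params:
            raise UndefinedParameterError(f"undefined parameter: {name}")
        return params[name]

    return PARAM_PATTERN.sub(replace, script)
```

With this approach an undefined parameter stops processing with an error instead of producing an empty output file.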




[jira] Assigned: (PIG-650) pig should look for and use the pig specific 'pig-cluster-hadoop-site.xml' in the non HOD case just like it does in the HOD case

2009-03-05 Thread Olga Natkovich (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-650?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Olga Natkovich reassigned PIG-650:
--

Assignee: Santhosh Srinivasan

 pig should look for and use the pig specific 'pig-cluster-hadoop-site.xml' in 
 the non HOD case just like it does in the HOD case
 

 Key: PIG-650
 URL: https://issues.apache.org/jira/browse/PIG-650
 Project: Pig
  Issue Type: Bug
Affects Versions: types_branch
Reporter: Pradeep Kamath
Assignee: Santhosh Srinivasan
 Fix For: types_branch


 Currently, users can create a pig-cluster-hadoop-site.xml with Pig-specific 
 overrides of Hadoop properties for use on the cluster. This file is searched 
 for in the classpath and used in the HOD case, but not in the non-HOD case. 
 We should do the same in the non-HOD case.
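
The requested lookup can be sketched as follows. This is a Python illustration, not Pig's code: the search_path list stands in for the classpath, the helper names are hypothetical, and the file is assumed to use the usual Hadoop configuration layout of property elements with name and value children.

```python
import os
import xml.etree.ElementTree as ET

OVERRIDE_FILE = "pig-cluster-hadoop-site.xml"


def find_on_path(search_path, filename=OVERRIDE_FILE):
    """Return the first matching file on search_path, or None."""
    for directory in search_path:
        candidate = os.path.join(directory, filename)
        if os.path.isfile(candidate):
            return candidate
    return None


def read_properties(path):
    """Read name/value pairs from the property elements of an XML file."""
    root = ET.parse(path).getroot()
    return {
        prop.findtext("name"): prop.findtext("value")
        for prop in root.iter("property")
    }


def cluster_properties(base, search_path):
    """Apply the overrides whenever the file is found, HOD or not."""
    merged = dict(base)
    override = find_on_path(search_path)
    if override is not None:
        merged.update(read_properties(override))
    return merged
```

The point of the sketch is that cluster_properties takes no HOD flag, so both cases go through the same lookup.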




[jira] Commented: (PIG-641) Fragment replicate join does not work in local mode

2009-03-05 Thread Olga Natkovich (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12679379#action_12679379
 ] 

Olga Natkovich commented on PIG-641:


Shubham, could you please reply to Alan's question? Thanks.

 Fragment replicate join does not work in local mode
 ---

 Key: PIG-641
 URL: https://issues.apache.org/jira/browse/PIG-641
 Project: Pig
  Issue Type: Bug
Reporter: Olga Natkovich
Assignee: Shubham Chopra
 Attachments: 641.patch, 641.patch







[jira] Updated: (PIG-544) Utf8StorageConverter.java does not always produce NULLs when data is malformed

2009-03-05 Thread Thejas M Nair (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-544?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thejas M Nair updated PIG-544:
--

Attachment: 544.patch

Santhosh, thanks for reviewing the changes and for the suggestions.
The suggestions have been incorporated in this new patch, which replaces the 
old one.
Re: Index: 
src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/POProject.java:
I have removed the changes to that file, as they are not necessary for fixing 
this JIRA (I had accidentally included them in the patch). They will be useful 
for fixing the similar problem described in PIG-696.

Re: the semantics of bytearray to numeric conversion: a best-effort conversion 
will be done, i.e. if the bytearray has at least the minimum number of bytes 
required for the conversion, it will be converted to the specified numeric type.




 Utf8StorageConverter.java does not always produce NULLs when data is malformed
 --

 Key: PIG-544
 URL: https://issues.apache.org/jira/browse/PIG-544
 Project: Pig
  Issue Type: Bug
  Components: impl
Affects Versions: types_branch
Reporter: Olga Natkovich
Assignee: Thejas M Nair
 Fix For: types_branch

 Attachments: 544.patch, PIG-544.txt


 It does so for scalar types, but not for complex types and not for the 
 fields inside complex types.
 This is because it uses different code to parse scalar types by themselves 
 and scalar types inside a complex type. It should really use the same code 
 (its own) for both.
 The code it currently uses is inside TextDataParser.jjt and is also used to 
 parse constants, so we need to be careful if we want to make changes to it.
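
The idea of one scalar parser shared by top-level fields and fields inside a complex type can be sketched as below. This is a Python illustration, not Utf8StorageConverter: the function names, the "(f1,f2)" tuple text format and the choice to return None for a tuple with the wrong field count are all assumptions made for the example.

```python
def parse_int(raw: bytes):
    """Return an int, or None when the bytes do not parse as an integer."""
    try:
        return int(raw.decode("utf-8").strip())
    except (UnicodeDecodeError, ValueError):
        return None


def parse_double(raw: bytes):
    """Return a float, or None when the bytes do not parse as a number."""
    try:
        return float(raw.decode("utf-8").strip())
    except (UnicodeDecodeError, ValueError):
        return None


SCALAR_PARSERS = {"int": parse_int, "double": parse_double}


def parse_tuple(raw: bytes, field_types, delimiter: bytes = b","):
    """Parse '(f1,f2,...)', reusing the scalar parsers for every field."""
    text = raw.strip()
    if not (text.startswith(b"(") and text.endswith(b")")):
        return None  # malformed tuple: assumed to become a null
    parts = text[1:-1].split(delimiter)
    if len(parts) != len(field_types):
        return None  # assumption: a wrong field count is also malformed
    return tuple(SCALAR_PARSERS[t](p) for t, p in zip(field_types, parts))
```

Because parse_tuple delegates to the same scalar functions, a malformed field yields None whether it appears on its own or inside a tuple.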




[jira] Updated: (PIG-544) Utf8StorageConverter.java does not always produce NULLs when data is malformed

2009-03-05 Thread Thejas M Nair (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-544?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thejas M Nair updated PIG-544:
--

Attachment: (was: PIG-544.txt)

 Utf8StorageConverter.java does not always produce NULLs when data is malformed
 --

 Key: PIG-544
 URL: https://issues.apache.org/jira/browse/PIG-544
 Project: Pig
  Issue Type: Bug
  Components: impl
Affects Versions: types_branch
Reporter: Olga Natkovich
Assignee: Thejas M Nair
 Fix For: types_branch

 Attachments: 544.patch






[jira] Created: (PIG-698) Simple join fails on variable-length records

2009-03-05 Thread Peter Arthur Ciccolo (JIRA)
Simple join fails on variable-length records


 Key: PIG-698
 URL: https://issues.apache.org/jira/browse/PIG-698
 Project: Pig
  Issue Type: Bug
  Components: impl
 Environment: Yahoo! clusters.
Reporter: Peter Arthur Ciccolo


Joins can fail with an out-of-bounds access to fields that are not referenced 
in the script when variable-length records are involved.
Example by Ben Reed:
i1:
1   c   D   E
1   a   B

i2:
0
0   Q
1   x   z
1   a   b   c


i1 = load 'i1';
i2 = load 'i2';
j = join i1 by $0, i2 by $0;
dump j;
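
For comparison, a join that only ever touches the key field handles these records without any out-of-bounds access. The following is a minimal Python sketch of such a join over the example data, not Pig's join implementation; join_on_first_field is a hypothetical name.

```python
from collections import defaultdict


def join_on_first_field(left, right):
    """Inner hash join on field 0; no field other than 0 is ever indexed."""
    index = defaultdict(list)
    for record in right:
        if record:  # an empty record has no join key, so skip it
            index[record[0]].append(record)
    joined = []
    for record in left:
        if not record:
            continue
        for match in index.get(record[0], []):
            # Concatenate whole records, whatever their lengths are.
            joined.append(tuple(record) + tuple(match))
    return joined


i1 = [("1", "c", "D", "E"), ("1", "a", "B")]
i2 = [("0",), ("0", "Q"), ("1", "x", "z"), ("1", "a", "b", "c")]
j = join_on_first_field(i1, i2)
```

Each of the two i1 records matches the two i2 records with key 1, so the sketch produces four joined records of differing lengths.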




[jira] Updated: (PIG-627) PERFORMANCE: multi-query optimization

2009-03-05 Thread Gunther Hagleitner (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-627?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gunther Hagleitner updated PIG-627:
---

Attachment: file_cmds-0305.patch

This patch is for the multi query branch again. It mostly fixes the problem 
with certain commands in the script that require immediate execution (in batch 
mode).

So if you do stuff like:

...
store a into 'tmp_foo';
...
rm tmp_foo
...

The rm will trigger execution, so the file will be there for you to delete, 
copyToLocal, move, etc. You can now also use the exec statement without params 
in a script to force execution of everything seen so far.
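
The behaviour described above can be pictured with a toy model in Python. It is not the code in the patch: BatchSession and its methods are hypothetical names, and the file system is reduced to a set of paths.

```python
class BatchSession:
    """Toy model: stores are deferred; file commands force execution first."""

    def __init__(self):
        self.pending = []   # deferred (alias, path) store statements
        self.files = set()  # paths that exist in the toy file system
        self.log = []       # what actually ran, in order

    def store(self, alias, path):
        # In batch mode a store is only queued, not run.
        self.pending.append((alias, path))

    def execute(self):
        """Run everything queued so far, as a bare exec would force."""
        for alias, path in self.pending:
            self.files.add(path)
            self.log.append(f"stored {alias} into {path}")
        self.pending.clear()

    def rm(self, path):
        self.execute()  # run pending stores so the file exists
        if path not in self.files:
            raise FileNotFoundError(path)
        self.files.remove(path)
        self.log.append(f"removed {path}")
```

Calling store("a", "tmp_foo") and then rm("tmp_foo") on one session runs the store first and then removes the file, which mirrors the script fragment above.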

This patch also contains a minor fix with the computation of progress in MR 
jobs (which I screwed up in the last patch).



 PERFORMANCE: multi-query optimization
 -

 Key: PIG-627
 URL: https://issues.apache.org/jira/browse/PIG-627
 Project: Pig
  Issue Type: Improvement
Affects Versions: types_branch
Reporter: Olga Natkovich
 Fix For: types_branch

 Attachments: file_cmds-0305.patch, multi-store-0303.patch, 
 multi-store-0304.patch, multiquery_0223.patch, multiquery_0224.patch


 Currently, if your Pig script contains multiple stores and some shared 
 computation, Pig will execute several independent queries. For instance:
 A = load 'data' as (a, b, c);
 B = filter A by a > 5;
 store B into 'output1';
 C = group B by b;
 store C into 'output2';
 This script will result in a map-only job that generates output1 followed by 
 a map-reduce job that generates output2. As a result, the data is read, 
 parsed, and filtered twice, which is unnecessary and costly. 
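
The cost being described can be made concrete with a small Python model that counts how often the input is scanned. It is only an illustration of the idea, not the optimizer: the function names are hypothetical, rows are plain (a, b, c) tuples, and the filter is written here as a > 5.

```python
from collections import defaultdict


def load_and_filter(rows, counter):
    """Scan the input once and keep the rows whose first field exceeds 5."""
    counter["scans"] += 1
    return [row for row in rows if row[0] > 5]


def group_by_b(rows):
    """Group rows by their second field."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[1]].append(row)
    return dict(groups)


def run_independently(rows):
    """One query per store: the load and filter run twice."""
    counter = {"scans": 0}
    output1 = load_and_filter(rows, counter)
    output2 = group_by_b(load_and_filter(rows, counter))
    return output1, output2, counter["scans"]


def run_shared(rows):
    """Shared plan: the filtered rows feed both stores."""
    counter = {"scans": 0}
    filtered = load_and_filter(rows, counter)
    return filtered, group_by_b(filtered), counter["scans"]
```

Both functions return the same two outputs; the only difference is that run_independently scans the input twice and run_shared scans it once.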




[jira] Commented: (PIG-627) PERFORMANCE: multi-query optimization

2009-03-05 Thread Gunther Hagleitner (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12679500#action_12679500
 ] 

Gunther Hagleitner commented on PIG-627:


Oh, I also took out the restriction on openIterator in batch mode. It is no 
longer needed.

 PERFORMANCE: multi-query optimization
 -

 Key: PIG-627
 URL: https://issues.apache.org/jira/browse/PIG-627
 Project: Pig
  Issue Type: Improvement
Affects Versions: types_branch
Reporter: Olga Natkovich
 Fix For: types_branch

 Attachments: file_cmds-0305.patch, multi-store-0303.patch, 
 multi-store-0304.patch, multiquery_0223.patch, multiquery_0224.patch


