[ 
https://issues.apache.org/jira/browse/PIG-2315?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Koji Noguchi updated PIG-2315:
------------------------------
    Attachment: PIG-2315-1-rebase.patch

This has been on my todo list forever... Finally took a look at the patch.

Uploading a rebased patch to trunk. (PIG-2315-1-rebase.patch)

[~daijy], can you take a look?  

One piece of feedback: with the current patch, the added typecast foreach is a no-op.
For example:

{code:title=input.txt}
1.1
2.2
3.3
4.6
5.7
{code}

{code:title=test.pig}
A = load 'input.txt' as a1;
B = FOREACH A generate (double) a1 as (a2:int) , 20151205 as (generated_date:chararray);
store B into '/tmp/deleteme';
{code}

An extra typecast foreach is added, and the logical plan looks like this:
{noformat}
#-----------------------------------------------
# New Logical Plan:
#-----------------------------------------------
B: (Name: LOStore Schema: a2#1:int,generated_date#7:chararray)
|
|---B: (Name: LOForEach Schema: a2#1:int,generated_date#7:chararray)
    |   |
    |   (Name: LOGenerate[false,false] Schema: a2#1:int,generated_date#7:chararray)ColumnPrune:OutputUids=[1, 7]ColumnPrune:InputUids=[1, 7]
    |   |   |
    |   |   (Name: Cast Type: int Uid: 1)
    |   |   |
    |   |   |---a2:(Name: Project Type: int Uid: 1 Input: 0 Column: 0)
    |   |   |
    |   |   (Name: Cast Type: chararray Uid: 7)
    |   |   |
    |   |   |---generated_date:(Name: Project Type: chararray Uid: 7 Input: 1 Column: 0)
    |   |
    |   |---(Name: LOInnerLoad[0] Schema: a2#1:int)
    |   |
    |   |---(Name: LOInnerLoad[1] Schema: generated_date#7:chararray)
    |
    |---****B: (Name: LOForEach Schema: a2#1:int,generated_date#11:chararray) 
        |   |
        |   (Name: LOGenerate[false,false] Schema: a2#1:int,generated_date#11:chararray)
        |   |   |
        |   |   (Name: Cast Type: double Uid: 1)
        |   |   |
        |   |   |---a1:(Name: Project Type: bytearray Uid: 1 Input: 0 Column: (*))
        |   |   |
        |   |   (Name: Constant Type: int Uid: 11)
        |   |
        |   |---(Name: LOInnerLoad[0] Schema: a1#1:bytearray)
        |
        |---A: (Name: LOLoad Schema: a1#1:bytearray)RequiredFields:null
{noformat}

The original foreach (marked with **** in the plan above)
{{B: (Name: LOForEach Schema: a2#1:int,generated_date#11:chararray)}}
should have been
{{B: (Name: LOForEach Schema: a2#1:double,generated_date#11:int)}}

Otherwise the extra typecasts only do double2double and chararray2chararray.

I believe we need to drop the types from userdefinedschema for the _original foreach_, since the inserted foreach-typecast handles that part.
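
To illustrate what I would expect the plan to be equivalent to, here is a hand-written sketch of the same script split into two foreach statements. This is only an illustration of the intended semantics, not what the patch generates; the intermediate alias B0 is made up for the example.

{code:title=expected_equivalent.pig}
A = load 'input.txt' as a1;
-- original foreach: the as clause contributes names only, so the
-- fields keep the natural types of their expressions (double, int)
B0 = FOREACH A generate (double) a1 as a2, 20151205 as generated_date;
-- inserted typecast foreach: the casts now do real work
-- (double to int, int to chararray) instead of being no-ops
B = FOREACH B0 generate (int) a2 as a2, (chararray) generated_date as generated_date;
store B into '/tmp/deleteme';
{code}

With the types dropped from the first foreach, the second one is where the conversion to the user-declared types actually happens.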

> Make as clause work in generate
> -------------------------------
>
>                 Key: PIG-2315
>                 URL: https://issues.apache.org/jira/browse/PIG-2315
>             Project: Pig
>          Issue Type: Bug
>            Reporter: Olga Natkovich
>            Assignee: Daniel Dai
>             Fix For: 0.17.0
>
>         Attachments: PIG-2315-1-rebase.patch, PIG-2315-1.patch, 
> PIG-2315-1.patch
>
>
> Currently, the following syntax is supported but ignored, causing confusion for users:
> A1 = foreach A1 generate a as a:chararray ;
> After this statement, a just retains its previous type.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
