[ 
https://issues.apache.org/jira/browse/PIG-2315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15311296#comment-15311296
 ] 

Daniel Dai commented on PIG-2315:
---------------------------------

+1.

Also note there is a performance regression in some cases. For example:
{code}
crawl = load 'webcrawl' as (url, pageid);
extracted = foreach crawl generate flatten(REGEX_EXTRACT_ALL(url, 
'(http|https)://(.*?)/(.*)')) as (protocol:chararray, host:chararray, 
path:chararray);
{code}

Here the user is only trying to give Pig additional type information, 
since REGEX_EXTRACT_ALL does not declare the types inside its tuple; the 
user does not intend a cast. With the change, Pig forces a cast and there 
is no way to avoid it. The performance hit should be small, and I believe 
it is worth it to clarify the syntax.
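
As a rough sketch of the semantics, using the statement from the issue 
description: this assumes the type in the as clause is now handled like an 
explicit cast. The second statement is only an illustrative equivalent and 
is not taken from the patch.
{code}
-- as clause with a type: ignored before the change, forces a cast after it
A1 = foreach A1 generate a as a:chararray;
-- assumed equivalent after the change: the same cast written explicitly
A1 = foreach A1 generate (chararray)a as a;
{code}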

> Make as clause work in generate
> -------------------------------
>
>                 Key: PIG-2315
>                 URL: https://issues.apache.org/jira/browse/PIG-2315
>             Project: Pig
>          Issue Type: Bug
>            Reporter: Olga Natkovich
>            Assignee: Daniel Dai
>             Fix For: 0.17.0
>
>         Attachments: PIG-2315-1-rebase.patch, PIG-2315-1.patch, 
> PIG-2315-1.patch, pig-2315-2-after-rebase.patch, pig-2315-3-merged.patch
>
>
> Currently, the following syntax is accepted but ignored, causing confusion 
> for users:
> A1 = foreach A1 generate a as a:chararray ;
> After this statement, a just retains its previous type.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
