Gianmarco De Francisci Morales created PIG-2691:
---------------------------------------------------

             Summary: Duplicate TOKENIZE schema
                 Key: PIG-2691
                 URL: https://issues.apache.org/jira/browse/PIG-2691
             Project: Pig
          Issue Type: Bug
            Reporter: Gianmarco De Francisci Morales


TOKENIZE produces a schema with a fixed alias, which results in duplicate
aliases if it is used more than once in the same GENERATE statement.
We could parameterize the schema on the name of the field being tokenized.

{code}
grunt> q = LOAD 'file' AS (source:chararray, target:chararray);
grunt> e = FOREACH q GENERATE TOKENIZE(source), TOKENIZE(target);
2012-05-09 20:18:37,235 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1108: 
<line 2, column 14> Duplicate schema alias: bag_of_tokenTuples
grunt> e = FOREACH q GENERATE TOKENIZE(source) as s_entities, TOKENIZE(target) as t_entities;
grunt> describe e
e: {s_entities: {tuple_of_tokens: (token: chararray)},t_entities: {tuple_of_tokens: (token: chararray)}}
{code}
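
As a rough illustration of the parameterization idea, the naming logic might
look something like the Java sketch below. This is only a sketch under stated
assumptions: the class TokenizeAliasSketch, the methods bagAlias and
allDistinct, and the "_from_" separator are all hypothetical and do not come
from the Pig code base. Only the base alias bag_of_tokenTuples is taken from
the error message above.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch: none of these names exist in Pig itself.
final class TokenizeAliasSketch {

    // The fixed alias that appears in the error message of this report.
    static final String BASE_ALIAS = "bag_of_tokenTuples";

    // Derive a per-field alias. Falls back to the fixed alias when the
    // input field has no usable name (assumption: such inputs can occur).
    static String bagAlias(String fieldName) {
        if (fieldName == null || fieldName.isEmpty()) {
            return BASE_ALIAS;
        }
        return BASE_ALIAS + "_from_" + fieldName;
    }

    // Returns true when the aliases derived from the given field names
    // are all distinct, i.e. they would not clash with one another.
    static boolean allDistinct(List<String> fieldNames) {
        Set<String> seen = new HashSet<>();
        for (String name : fieldNames) {
            if (!seen.add(bagAlias(name))) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Mirrors the two fields used in the report: source and target.
        System.out.println(bagAlias("source"));
        System.out.println(bagAlias("target"));
        System.out.println(allDistinct(Arrays.asList("source", "target")));
    }
}
```

Note that under this scheme tokenizing the same field twice in one statement
would still produce the same alias twice, so an explicit "as" clause would
remain necessary in that case.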

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
