Parser fails to recognize semicolons in quoted strings
------------------------------------------------------
Key: PIG-1581
URL: https://issues.apache.org/jira/browse/PIG-1581
Project: Pig
Issue Type: Bug
Components: grunt
Affects Versions: 0.7.0
Environment: CentOS 5.5
Reporter: Christopher Hackman
Priority: Minor
In some contexts, the parser mishandles semicolons inside quoted strings and
treats them as the end of the statement rather than as literal characters.
Given an input file /test1.txt (in HDFS):
1;a
2;b
3;c
4;d
5;e
And the following Pig script:
REGISTER /tmp/piggybank.jar ;
DEFINE REGEXEXTRACTALL org.apache.pig.piggybank.evaluation.string.RegexExtractAll();
lines = LOAD '/test1.txt' AS (line:chararray);
delimited = FOREACH lines GENERATE FLATTEN (
REGEXEXTRACTALL(line, '^(\\d+);(\\w+)$')
) AS (
digit:int,
word:chararray
);
DUMP delimited;
I receive the following error:
ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1000: Error during parsing.
Lexical error at line 5, column 40. Encountered: <EOF> after : "\'^(\\\\d+);"
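
The error text stops right after the semicolon inside the quoted regex, which would be consistent with the input being cut at that semicolon. The Python sketch below is an illustration of that symptom only, not Grunt's actual code: it assumes, without confirmation from the Pig source, that statements are split on semicolons without tracking quotes. The function names split_naive and split_quote_aware are hypothetical, and the FOREACH statement is collapsed onto one line for brevity.

```python
def split_naive(script):
    """Split on every semicolon, ignoring quotes (the ASSUMED faulty behaviour)."""
    return [part.strip() for part in script.split(";") if part.strip()]


def split_quote_aware(script):
    """Split only on semicolons that fall outside single-quoted strings."""
    statements = []
    current = []
    in_quote = False
    escaped = False
    for ch in script:
        if escaped:
            # Character following a backslash inside a quoted string: keep as-is.
            current.append(ch)
            escaped = False
            continue
        if ch == "\\" and in_quote:
            current.append(ch)
            escaped = True
            continue
        if ch == "'":
            in_quote = not in_quote
        if ch == ";" and not in_quote:
            # Statement terminator: flush what has been collected so far.
            statement = "".join(current).strip()
            if statement:
                statements.append(statement)
            current = []
        else:
            current.append(ch)
    tail = "".join(current).strip()
    if tail:
        statements.append(tail)
    return statements


# The failing statement from the script above, on a single line.
statement = (
    r"delimited = FOREACH lines GENERATE FLATTEN("
    r"REGEXEXTRACTALL(line, '^(\\d+);(\\w+)$'));"
)

print(split_naive(statement))        # two fragments, the first with an unclosed quote
print(split_quote_aware(statement))  # one complete statement
```

With the naive split, the first fragment ends in an unterminated string literal, which resembles the reported "Encountered: <EOF>" lexical error. The quote-aware split keeps the statement intact.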