[ https://issues.apache.org/jira/browse/PIG-1991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13865023#comment-13865023 ]
Edmund Dorsey commented on PIG-1991:
------------------------------------
I stumbled across this request while trying to figure out why I couldn't access
my fields in Avro using Pig 0.11. I'm not sure whether it's appropriate to leave
a comment here, but given what seems to be the widespread adoption of Avro with
Hadoop, not supporting leading underscores seems to mean that an Avro schema
used with Pig cannot have field names that start with an underscore. In our
case, all our "reserved" field names start with a leading underscore, and as a
result we have not been able to use Pig with Avro: we can't access any of the
fields with the leading underscore (we get the error "Unexpected character
'_'").
It seems that allowing a leading underscore in variable names would be
completely backwards compatible, and it would also bring the variable naming
convention closer in line with the Java naming conventions used by Avro.
(Note that I'm still pretty new to Pig, so maybe there is a workaround I'm not
aware of that makes this whole point moot.)
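One possible workaround, sketched here as an assumption rather than a confirmed
fix: Pig's positional notation ($0, $1, ...) refers to fields without naming
them, so a field can be given a new alias that has no leading underscore. The
snippet below reuses the 'test.txt' example from the issue description; the
aliases f0, raw, renamed and a_field are hypothetical, and this has not been
verified against AvroStorage.
{code}
-- Option 1: declare the schema without the leading underscore.
a = load 'test.txt' as (f0:long, b:chararray);
dump a;

-- Option 2: load without names, then cast and alias fields by position.
raw = load 'test.txt';
renamed = foreach raw generate (long)$0 as a_field, (chararray)$1 as b;
dump renamed;
{code}
Neither option removes the need for the parser change requested here; both only
avoid writing the underscore-prefixed name in the script.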
> Leading Underscore (_) not allowed in schema names
> --------------------------------------------------
>
> Key: PIG-1991
> URL: https://issues.apache.org/jira/browse/PIG-1991
> Project: Pig
> Issue Type: Wish
> Components: grunt
> Affects Versions: 0.9.0
> Reporter: Viraj Bhat
>
> I have a Pig script which uses a leading underscore in a schema field name (_a):
> {code}
> a = load 'test.txt' as (_a:long, b:chararray);
> dump a;
> {code}
> This causes an error in Pig:
> {quote}
> <line 1, column 24> Unexpected character '_'
> 2011-04-12 11:58:59,624 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1200: <line 1, column 24> Unexpected character '_'
> {quote}
> Stack trace:
> Pig Stack Trace
> ---------------
> ERROR 1200: <line 1, column 24> Unexpected character '_'
> Failed to parse: <line 1, column 24> Unexpected character '_'
> at org.apache.pig.parser.QueryParserDriver.parse(QueryParserDriver.java:83)
> at org.apache.pig.PigServer$Graph.validateQuery(PigServer.java:1555)
> at org.apache.pig.PigServer$Graph.registerQuery(PigServer.java:1527)
> at org.apache.pig.PigServer.registerQuery(PigServer.java:582)
> at org.apache.pig.tools.grunt.GruntParser.processPig(GruntParser.java:917)
> at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:386)
> at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:176)
> at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:152)
> at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:76)
> at org.apache.pig.Main.run(Main.java:489)
> at org.apache.pig.Main.main(Main.java:108)
> ================================================================================
> Schema names should be allowed to have leading underscores.
> Viraj