[
https://issues.apache.org/jira/browse/HIVE-23150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17077291#comment-17077291
]
David Mollitor commented on HIVE-23150:
---------------------------------------
[~kgyrtkirk] Thank you for your interest in the topic.
As I understand the problem, it goes beyond just the table name. Any
identifier (table, column, view, database, etc.) follows this same format so I
wanted to create a generic tool for parsing all of these items.
The problem, as I have come to understand it, is that there have been several
places where the code does not handle a dot in the identifier at all and parses
the TAB_TOKEN as a possible 'db.table' name. The solution therefore was to
simply deny the use of a dot in the identifier name. This is a fine stop-gap
measure, but it does not fix the fact that the available parsing logic is
scattered across the code base and is implemented in a fairly naive way.
There are several examples, but the one I *think* is being used most is:
[https://github.com/apache/hive/blob/master/storage-api/src/java/org/apache/hadoop/hive/common/TableName.java#L70]
It tries to split the table name on the dot ('.'). This is incorrect:
{{`my.table.name`}} is a valid table name, and this code would fail on it.
Also, the ANTLR parsing code *should* (and does, in at least the cases I've
specifically looked at) parse the statement and create a {{TOK_TABNAME}} that
has one or two children (Table or DB/Table), so there should be no need to
split/parse the table name in this way inside Hive.
[https://github.com/apache/hive/blob/dea35b4fd65fc6b4573133aa0b83000bcddd42b6/parser/src/java/org/apache/hadoop/hive/ql/parse/FromClauseParser.g#L212-L221]
Denying the dot is not the best long-term solution; to better interoperate
with MySQL, we should support the same semantics. The solution I propose here
supports the dot in the name and properly handles it (unless someone can
present a unit test showing otherwise). How we integrate and leverage this
parser will need to be done in a phased approach for sure. It's a good backup
that areas that do not get this integration will instead throw an Exception.
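To illustrate the difference between splitting on every dot and honoring
backtick quoting, here is a minimal sketch. It is not the code in the attached
patch; the class and method names ({{IdentifierSplitSketch}}, {{naiveSplit}},
{{quoteAwareSplit}}) are hypothetical, and it assumes MySQL-style quoting where
a doubled backtick inside a quoted identifier stands for a literal backtick. It
does no further validation (length limits, empty parts, illegal characters).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class IdentifierSplitSketch {

  /** Naive approach: split on every dot, ignoring backtick quoting. */
  public static List<String> naiveSplit(String name) {
    return Arrays.asList(name.split("\\."));
  }

  /**
   * Quote-aware approach (illustrative only): a dot separates parts only
   * when it appears outside backticks. A doubled backtick inside a quoted
   * part is treated as one literal backtick (assumed MySQL-style escaping).
   */
  public static List<String> quoteAwareSplit(String name) {
    List<String> parts = new ArrayList<>();
    StringBuilder current = new StringBuilder();
    boolean quoted = false;
    for (int i = 0; i < name.length(); i++) {
      char c = name.charAt(i);
      if (c == '`') {
        if (quoted && i + 1 < name.length() && name.charAt(i + 1) == '`') {
          current.append('`'); // escaped backtick inside a quoted part
          i++;
        } else {
          quoted = !quoted;    // opening or closing quote
        }
      } else if (c == '.' && !quoted) {
        parts.add(current.toString()); // unquoted dot ends the current part
        current.setLength(0);
      } else {
        current.append(c);
      }
    }
    if (quoted) {
      throw new IllegalArgumentException("Unterminated backtick in: " + name);
    }
    parts.add(current.toString());
    return parts;
  }

  public static void main(String[] args) {
    // Prints [`my, table, name`]: the quoted name is torn into three pieces.
    System.out.println(naiveSplit("`my.table.name`"));
    // Prints [my.table.name]: one identifier, dots preserved.
    System.out.println(quoteAwareSplit("`my.table.name`"));
    // Prints [my.db, my.table]: two parts, each containing a dot.
    System.out.println(quoteAwareSplit("`my.db`.`my.table`"));
  }
}
```

Throwing on malformed input, rather than guessing, matches the idea above that
unhandled cases should fail loudly instead of being silently mis-parsed.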
> Create an Object Identifier Parser for All Components to Use
> ------------------------------------------------------------
>
> Key: HIVE-23150
> URL: https://issues.apache.org/jira/browse/HIVE-23150
> Project: Hive
> Issue Type: Sub-task
> Reporter: David Mollitor
> Assignee: David Mollitor
> Priority: Major
> Attachments: HIVE-23150.1.patch
>
>
> Create a parser for parsing (and validating) MySQL/MariaDB style object
> identifiers.