[ 
https://issues.apache.org/jira/browse/HIVE-23150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17077291#comment-17077291
 ] 

David Mollitor commented on HIVE-23150:
---------------------------------------

[~kgyrtkirk] Thank you for your interest in the topic.

As I understand the problem, it goes beyond just the table name.  Any 
identifier (table, column, view, database, etc.) follows this same format so I 
wanted to create a generic tool for parsing all of these items.

 

The problem, as I have come to understand it, is that there have been several 
places where the code does not handle a dot in the identifier at all and parses 
the TAB_TOKEN as a possible 'db.table' name.  The solution therefore was to 
simply deny the use of a dot in the identifier name.  This is a fine stop-gap 
measure, but it does not fix the fact that the available parsing logic is 
scattered across the code base and is implemented in a pretty naive way.

 

There are several examples, but the one I *think* is being used most is:

[https://github.com/apache/hive/blob/master/storage-api/src/java/org/apache/hadoop/hive/common/TableName.java#L70]

 

It tries to split the table name on dot ('.').  This is incorrect.  A valid 
table name is: {{`my.table.name`}} and it would fail this code.  Also, the 
ANTLR parsing code *should* (and does, in at least the cases I've specifically 
looked at) parse the statement and create a {{TOK_TABNAME}} that has one or two 
children (Table or DB/Table), so there should be no need to split/parse the 
table name in this way, internal to Hive.

[https://github.com/apache/hive/blob/dea35b4fd65fc6b4573133aa0b83000bcddd42b6/parser/src/java/org/apache/hadoop/hive/ql/parse/FromClauseParser.g#L212-L221]
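
To make the failure concrete, here is a minimal sketch of what a split on dot 
does to a backtick-quoted name.  {{NaiveSplitDemo}} and {{naiveSplit}} are 
hypothetical names used only for illustration; this is not the Hive 
{{TableName}} code, just the same general idea.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical demo class for illustration; not part of Hive. */
final class NaiveSplitDemo {

    /** Splits on every dot and ignores backtick quoting entirely. */
    static List<String> naiveSplit(String identifier) {
        return Arrays.asList(identifier.split("\\."));
    }

    public static void main(String[] args) {
        // A single quoted identifier is wrongly broken into three pieces.
        System.out.println(naiveSplit("`my.table.name`"));
        // A plain db.table reference happens to work.
        System.out.println(naiveSplit("db.tbl"));
    }
}
```

The quoted name comes back as three fragments, which is neither a valid 
'table' nor a valid 'db.table' pair.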

 

Denying the dot is not the best long-term solution; to better inter-operate 
with MySQL, we should support the same semantics.  The solution I propose here 
supports the dot in the name and properly handles it (unless someone can 
present a unit test showing otherwise).  How we integrate and leverage this 
parser will need to be done in a phased approach for sure.  In the meantime, it 
is a good fallback for areas that have not yet been integrated to throw an 
Exception instead.
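
As a rough illustration of the quoting rules involved, the sketch below parses 
a dot-separated identifier where each part may be wrapped in backticks and a 
doubled backtick inside quotes stands for a literal backtick, as in MySQL.  It 
is not the attached patch: {{IdentifierSketch}} and {{parse}} are hypothetical 
names, and the set of characters allowed in unquoted parts (letters, digits, 
'_' and '$') is a simplifying assumption.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch for illustration; not the HIVE-23150 patch. */
final class IdentifierSketch {

    /** Returns the unquoted parts of a dot-separated, optionally quoted identifier. */
    static List<String> parse(String input) {
        List<String> parts = new ArrayList<>();
        final int n = input.length();
        int i = 0;
        while (true) {
            StringBuilder part = new StringBuilder();
            if (i < n && input.charAt(i) == '`') {
                i++; // skip the opening backtick
                boolean closed = false;
                while (i < n) {
                    char c = input.charAt(i++);
                    if (c != '`') {
                        part.append(c); // dots are ordinary characters inside quotes
                    } else if (i < n && input.charAt(i) == '`') {
                        part.append('`'); // doubled backtick is an escaped backtick
                        i++;
                    } else {
                        closed = true;
                        break;
                    }
                }
                if (!closed) {
                    throw new IllegalArgumentException("Unterminated quote: " + input);
                }
            } else {
                // Unquoted part: assumed character set, see the note above.
                while (i < n && input.charAt(i) != '.') {
                    char c = input.charAt(i++);
                    if (!Character.isLetterOrDigit(c) && c != '_' && c != '$') {
                        throw new IllegalArgumentException("Illegal character '" + c + "' in: " + input);
                    }
                    part.append(c);
                }
            }
            if (part.length() == 0) {
                throw new IllegalArgumentException("Empty identifier part in: " + input);
            }
            parts.add(part.toString());
            if (i == n) {
                return parts;
            }
            if (input.charAt(i) != '.') {
                throw new IllegalArgumentException("Expected '.' at index " + i + " in: " + input);
            }
            i++; // skip the separator and parse the next part
        }
    }
}
```

With this approach {{`my.table.name`}} yields a single part, while 
{{db.`my.table`}} yields two, and malformed input is rejected with an exception 
rather than silently mis-split.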

> Create an Object Identifier Parser for All Components to Use
> ------------------------------------------------------------
>
>                 Key: HIVE-23150
>                 URL: https://issues.apache.org/jira/browse/HIVE-23150
>             Project: Hive
>          Issue Type: Sub-task
>            Reporter: David Mollitor
>            Assignee: David Mollitor
>            Priority: Major
>         Attachments: HIVE-23150.1.patch
>
>
> Create a parser for parsing (and validating) MySQL/MariaDB style object 
> identifiers.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
