I've got all tests passing on this. Are there other questions?
Is anyone willing to +1? Thanks.

On Tue, Apr 14, 2020, 12:28 PM David Mollitor <dam6...@gmail.com> wrote:

> Hey Zoltan,
>
> Thanks for the feedback and for sharing HIVE-16496.
>
> I think HIVE-16496 is a better approach because it allows for the standard
> SQL behavior of object identifiers, but the SQL syntax is expanded (instead
> of overloaded) to provide this feature.
>
> Also, if a user would like to do some sort of regex, they can query the
> information_schema (if/when Hive gets that).
>
> Also, I just re-read my previous email and I do apologize, I provided the
> wrong jira. The correct one for removal is:
>
> https://issues.apache.org/jira/browse/HIVE-23176
>
> Thanks.
>
> David
>
> On Tue, Apr 14, 2020 at 12:16 PM Zoltan Haindrich <k...@rxd.hu> wrote:
>
>> Hey,
>>
>> I don't want to protect this feature - but I think it could be useful;
>> probably it would be ok to remove it but we should provide something else
>> instead - I think this is the only way to "exclude" some specific columns
>> from the output - without listing all the columns.
>>
>> How much do users actually use this feature?
>>
>> We had a somewhat related discussion a few years ago:
>> https://issues.apache.org/jira/browse/HIVE-16496
>>
>> cheers,
>> Zoltan
>>
>> On 4/13/20 3:56 PM, David Mollitor wrote:
>> > Hello Gang,
>> >
>> > I've been tracking a lot of issues recently regarding qualified table
>> > names, table names using back ticks, and other similar circumstances.
>> >
>> > I've looked into trying to address some of these and noted that these
>> > issues go way back and go all the way down to the core of Hive.
>> >
>> > To start with, I wanted to use the ANTLR grammar to address some of
>> > these issues and to standardize behavior across all queries. For
>> > example, there is currently a patch that disallows table names from
>> > having a 'dot' in the name. I'm not 100% sure it applies to all
>> > queries, so I wanted to codify this restriction in the parser grammar.
>> > So it got me looking at the grammar.
>> >
>> > In parallel, I also tried to build a supplemental parser in Java for
>> > parsing table names (HIVE-23150) and I was hitting some weird, and
>> > confusing, edge cases bubbling up from the parser. I eventually traced
>> > it back to the fact that there are a lot of weird rules around table
>> > names in the grammar, including something called "REGEX Column
>> > Specification."
>> >
>> > This feature is problematic as it blindly labels most table names as
>> > being a regex. It really should only apply to column names, but the
>> > grammar defines a table name as also possibly being a regex. There is
>> > a lot of ambiguity because a table named "a" could be a literal value
>> > or a legal regex. When a table name is treated as a regex, a different
>> > code path is taken than when it is considered to be a literal value.
>> > Where I first saw this issue was in a qtest where a table name `s/c`
>> > was producing a different result than a table named `s+c`.
>> >
>> > This regex feature is not something I've seen in MySQL or Postgres. In
>> > MySQL, any table name surrounded with back ticks can contain just
>> > about any UTF-8 character, so it's not really feasible to tell,
>> > without some kind of SQL hint, whether this table name is a regex or a
>> > literal value.
>> >
>> > This feature adds a lot of ambiguity and complexity, it is not
>> > supported by other major RDBMS, and it adds only very minor benefit.
>> > I also hope to move Hive in a direction of fully supporting UTF-8.
>> >
>> > I have put a patch up to remove it:
>> > https://issues.apache.org/jira/browse/HIVE-23183
>> >
>> > References:
>> > https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Select#LanguageManualSelect-REGEXColumnSpecification
>> >
>> > https://dev.mysql.com/doc/refman/8.0/en/identifiers.html
>> >
>> > Thanks,
>> > David
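
For anyone following along, here is a rough sketch of why names like `s/c` and `s+c` can diverge once an identifier is allowed to be read as a regex. It is only an illustration, not Hive's code path: Hive is written in Java, while this uses Python's `re` module for brevity, and the helper name `matches_itself_as_regex` is made up for the example. The idea is simply that some identifiers, when compiled as a pattern, no longer match their own literal text.

```python
import re


def matches_itself_as_regex(identifier: str) -> bool:
    """Return True if `identifier`, compiled as a regex, fully matches its own text.

    Hypothetical helper for illustration only; it does not mirror Hive's parser.
    """
    try:
        return re.fullmatch(identifier, identifier) is not None
    except re.error:
        # The identifier is not even a valid pattern (e.g. an unbalanced "(").
        return False


if __name__ == "__main__":
    for name in ["a", "s/c", "s+c"]:
        # "a" and "s/c" contain no regex metacharacters, so they match themselves.
        # In "s+c" the "+" is a quantifier, so the pattern matches "sc", "ssc", ...
        # but not the literal three-character name "s+c".
        print(name, matches_itself_as_regex(name))
```

A name such as `a` is indistinguishable from a literal, which is the ambiguity described in the original message, while `s+c` silently changes meaning as soon as it is interpreted as a pattern.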