I think if there is any change that one of the interpretations could be to
allow `user` (backtick user backtick) that we should do that... Drill uses
backticks as the quoted identifier (I gather Postgres uses double quotes as
it's identifier) having user be the column name will same many
organizations from errorless "wrong" results... I just worry about the down
stream there... I'd probably feel less strongly about this if A. the word
was less likely to be a column name (like current_drill_user vs user) and
B. I wasn't a user with a background in security and see all the data
sources that use "user" as a column.   I just have this sinking feeling it
will lead to bad things in data analysis for many users.

I guess it comes down to the principle of least surprise, I believe that in
this case, `user` providing the the column name user is less surprising
then `user` providing the current drill user, and if we can make it align
with out SQL systems (like how Postgres behaves with ITS quoted identifier)
than I think we have a compelling case for changing how drill reacts here.

Thoughts?

(Once again, thanks for continued follow-up here, I love talking about
issues like these)

John



On Mon, May 23, 2016 at 10:39 PM, Jinfeng Ni <[email protected]> wrote:

> mydb=# select  "user" from t1;
>  user
> ------
>  ABC
>
> I should take back what I said. With quoted identifier, Postgres
> behaved different from Drill. Both of the interpretations seem to be
> reasonable, since the identifier could represent two different things.
>
>
> On Mon, May 23, 2016 at 7:41 PM, Zelaine Fong <[email protected]> wrote:
> > Jinfeng,
> >
> > What does postgres return for the following query in your example?
> >
> > select "user" from t1;
> >
> > -- Zelaine
> >
> > On Mon, May 23, 2016 at 7:39 PM, John Omernik <[email protected]> wrote:
> >
> >> Hmm, you are correct, I don't have to like it :) but there is both logic
> >> and precedence here.  Thanks for following up
> >>
> >> John
> >>
> >> On Monday, May 23, 2016, Jinfeng Ni <[email protected]> wrote:
> >>
> >> > An quoted identifier is still an identifier (Drill uses back tick as
> >> > quote). Per SQL standard,  identifier CURRENT_USER / USER/
> >> > CURRENT_SESSION/etc are implicit function calls; no () is required.
> >> >
> >> > I checked Postgre, and seems it has the same behavior.
> >> >
> >> > mydb=# create table t1 (id int, "user" varchar(10));
> >> >
> >> > mydb=# insert into t1 values(100, 'ABC');
> >> > INSERT 0 1
> >> >
> >> > mydb=# select * from t1;
> >> >  id  | user
> >> > -----+------
> >> >  100 | ABC
> >> > (1 row)
> >> >
> >> > mydb=# select user from t1;
> >> >  current_user
> >> > --------------
> >> >  postgres
> >> > (1 row)
> >> >
> >> > mydb=# select t1.user from t1;
> >> >  user
> >> > ------
> >> >  ABC
> >> > (1 row)
> >> >
> >> >
> >> >
> >> > On Mon, May 23, 2016 at 5:12 PM, John Omernik <[email protected]
> >> > <javascript:;>> wrote:
> >> > > Can (should) things inside back ticks be callable? I guess this
> makes a
> >> > > very difficult situation from a usability standpoint because user
> is a
> >> > not
> >> > > uncommon column name (think security logs, web logs, etc) yet in the
> >> > > current setup there is lots of possibility for assumptions on
> calling
> >> > back
> >> > > tick user back tick and without an error users may have wrong, but
> >> > "error"
> >> > > free results.
> >> > > On May 23, 2016 4:54 PM, "Jinfeng Ni" <[email protected]
> >> > <javascript:;>> wrote:
> >> > >
> >> > >> The problem here is that identifier 'user' is not only a reserved
> >> > >> word, but also represents a special function ==  current_user()
> call.
> >> > >> The identifier 'user', whether it's quoted or not, could mean
> either
> >> > >> column name or the function call.  Without the table alias, it
> could
> >> > >> be ambiguous to sql parser. The table alias informs the parser that
> >> > >> this identifier is not a function call, but a regular identifier,
> thus
> >> > >> removes the ambiguity.
> >> > >>
> >> > >> This is different from other cases you use quoted reserved word to
> >> > >> represent a column name, since those reserved words do not
> represent a
> >> > >> special function, thus no ambiguity.
> >> > >>
> >> > >> select `update`, `insert` from dfs.tmp.`1.json`;
> >> > >> +---------+---------+
> >> > >> | update  | insert  |
> >> > >> +---------+---------+
> >> > >> | abc     | 100     |
> >> > >> +---------+---------+
> >> > >>
> >> > >>
> >> > >>
> >> > >> On Mon, May 23, 2016 at 10:44 AM, John Omernik <[email protected]
> >> > <javascript:;>> wrote:
> >> > >> > Ya, as I am testing, this works, however, the users of the system
> >> > expect
> >> > >> to
> >> > >> > be able to use `user` and while I can provide them instructions
> to
> >> > use a
> >> > >> > table alias, I am very worried that they will forget and since it
> >> > doesn't
> >> > >> > error, but instead puts in a different string, this could lead to
> >> bad
> >> > >> > downstream results...
> >> > >> >
> >> > >> >
> >> > >> >
> >> > >> >
> >> > >> > On Mon, May 23, 2016 at 12:41 PM, John Omernik <[email protected]
> >> > <javascript:;>> wrote:
> >> > >> >
> >> > >> >> I filed https://issues.apache.org/jira/browse/DRILL-4692
> >> > >> >>
> >> > >> >> I see an alias would work as a tmp fix, but this should be
> address
> >> (I
> >> > >> >> wonder if other words may have a problem too?)
> >> > >> >>
> >> > >> >>
> >> > >> >>
> >> > >> >> On Mon, May 23, 2016 at 12:38 PM, Andries Engelbrecht <
> >> > >> >> [email protected] <javascript:;>> wrote:
> >> > >> >>
> >> > >> >>> Hmm interesting.
> >> > >> >>>
> >> > >> >>> As a workaround just use a table alias when referencing the
> >> column.
> >> > >> >>>
> >> > >> >>>
> >> > >> >>> Might be good to se if there is a JIRA for this, or file one if
> >> not.
> >> > >> >>>
> >> > >> >>> --Andries
> >> > >> >>>
> >> > >> >>> > On May 23, 2016, at 10:28 AM, John Omernik <[email protected]
> >> > <javascript:;>> wrote:
> >> > >> >>> >
> >> > >> >>> > I have data with a field name user.
> >> > >> >>> >
> >> > >> >>> > When I select, with backticks, it doesn't show the field, but
> >> > >> instead my
> >> > >> >>> > current logged in user...
> >> > >> >>> >
> >> > >> >>> >
> >> > >> >>> > select CONVERT_FROM(`user`, 'UTF8') as `user` from table
> limit
> >> 10;
> >> > >> >>> >
> >> > >> >>> >
> >> > >> >>> > Shouldn't the backticks allow me to reference the field
> >> properly?
> >> > >> >>> >
> >> > >> >>> > John
> >> > >> >>>
> >> > >> >>>
> >> > >> >>
> >> > >>
> >> >
> >>
> >>
> >> --
> >> Sent from my iThing
> >>
>

Reply via email to